0% found this document useful (0 votes)
1 views48 pages

DBMS Module-4 Notes

The document provides notes on advanced SQL queries within the Database Management Systems course (BCS403) for the B.E. IV Semester at Sai Vidya Institute of Technology. It covers topics such as handling NULL values, nested queries, correlated queries, aggregate functions, and the use of GROUP BY and HAVING clauses. Additionally, it discusses the importance of SQL syntax and logical operations in constructing complex queries.

Uploaded by

akashn.23cs
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
1 views48 pages

DBMS Module-4 Notes

The document provides notes on advanced SQL queries within the Database Management Systems course (BCS403) for the B.E. IV Semester at Sai Vidya Institute of Technology. It covers topics such as handling NULL values, nested queries, correlated queries, aggregate functions, and the use of GROUP BY and HAVING clauses. Additionally, it discusses the importance of SQL syntax and logical operations in constructing complex queries.

Uploaded by

akashn.23cs
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

MODULE –4 NOTES DBMS- BCS403

MODULE 4- NOTES
ON
DATABASE MANAGEMENT SYSTEMS( BCS403)
2024 – 2025

B. E IV Semester

Department of Computer Science & Engineering

Sai Vidya Institute of Technology


Bengaluru -560064
Madhura N, Assistant Professor
Dr. Varun E, Associate Professor
Dr. Shashikumar D R, Professor and HOD

Web: www.saividya.ac.in
https://www.facebook.com/SaiVidyaInstituteOfTechnology

\ Dept. of CSE, SVIT


Page 1
MODULE –4 NOTES DBMS- BCS403

CHAPTER 1

SQL: Advanced Queries: More complex SQL retrieval queries, Specifying


constraints as assertions and action triggers, Views in SQL

7.1 More Complex SQL Retrieval Queries


7.1.1 Comparisons Involving NULL and Three-Valued Logic

▪ SQL has various rules for dealing with NULL values.


▪ Meanings of NULL
✓ Unknown value (value exists but is not known, or it is not known whether or not the
value exists)
✓ Unavailable or withheld value (value exists but is purposely withheld)
✓ Not applicable attribute (the attribute does not apply to this tuple or is undefined for this
tuple).

▪ Consider the following examples to illustrate each of the meanings of NULL.

1. Unknown value. A person’s date of birth is not known, so it is represented by NULL in the
database. An example of the other case of unknown would be NULL for a person’s home
phone because it is not known whether or not the person has a home phone.

2. Unavailable or withheld value. A person has a home phone but does not want it to be
listed, so it is withheld and represented as NULL in the database.

3. Not applicable attribute. An attribute LastCollegeDegree would be NULL for a person


who has no college degrees because it does not apply to that person.

▪ Each individual NULL value considered to be different from every other NULL value.

\ Dept. of CSE, SVIT


Page 2
MODULE –4 NOTES DBMS- BCS403

▪ SQL uses a three-valued logic: TRUE, FALSE, and UNKNOWN instead of the standard
two-valued (Boolean) logic with values TRUE or FALSE.

▪ When a record with NULL in one of its attributes is involved in a comparison operation, the
result is considered to be UNKNOWN (it may be TRUE or it may be FALSE).

▪ It is therefore necessary to define the results (or truth values) of three-valued logical
expressions when the logical connectives AND, OR, and NOT are used. Table 7.1 shows the
resulting values.

▪ In Tables 7.1(a) and 7.1(b), the rows and columns represent the values of the results of
comparison conditions, which would typically appear in the WHERE clause of an SQL query.

▪ Each expression result would have a value of TRUE, FALSE, or UNKNOWN.

▪ The result of combining the two values using the AND logical connective is shown by the
entries in Table 7.1(a). Table 7.1(b) shows the result of using the OR logical connective.

▪ For example, the result of (FALSE AND UNKNOWN) is FALSE, whereas the result of
(FALSE OR UNKNOWN) is UNKNOWN.

\ Dept. of CSE, SVIT


Page 3
MODULE –4 NOTES DBMS- BCS403

▪ Table 7.1(c) shows the result of the NOT logical operation.

➢ SQL allows queries that check whether an attribute value is NULL. Rather than using = or <>
to compare an attribute value to NULL, SQL uses the comparison operators IS or IS NOT.

➢ Query 18 illustrates NULL com parison by retrieving any employees who do not have a
supervisor.

Query 18. Retrieve the names of all employees who do not have supervisors

Q18: SELECT Fname, Lname


FROM EMPLOYEE
WHERE Super_ssn IS NULL;

7.1.2 Nested Queries, Tuples, and Set/Multiset Comparisons.

▪ Some queries require that existing values in the database be fetched and then used in a
comparison condition. Such queries can be formulated by using nested queries, which are
complete select-from-where blocks within another SQL query. That other query is called
the outer query.

▪ Q4A introduces the comparison operator IN, which compares a value v with a set (or multiset)
of values V and evaluates to TRUE if v is one of the elements in V.

▪ In Q4A, the first nested query selects the project numbers of projects that have an employee
with last name ‘Smith’ involved as manager, whereas the second nested query selects the
project numbers of projects that have an employee with last name ‘Smith’ involved as worker.

▪ In the outer query, we use the OR logical connective to retrieve a PROJECT tuple if the
PNUMBER value of that tuple is in the result of either nested query.

\ Dept. of CSE, SVIT


Page 4
MODULE –4 NOTES DBMS- BCS403

Make a list of all project numbers for projects that involve employee Smith either as worker
or as a manager of the department that controls the project:

▪ SQL allows the use of tuples of values in comparisons by placing them within parentheses.
To illustrate this, consider the following query.

▪ This query will select the Essns of all employees who work the same (project, hours)
combination on some project that employee ‘John Smith’ (whose Ssn = ‘123456789’) works
on.

▪ In addition to the IN operator, other comparison operators can be used to compare a single
value v (typically an attribute name) to a set or multiset v (typically a nested query). The =
ANY (or = SOME) operator returns TRUE if the value v is equal to some value in the set V
and is hence equivalent to IN.

▪ The two keywords ANY and SOME have the same effect. Other operators that can be
combined with ANY (or SOME) include >, >=, <=, and <>.

\ Dept. of CSE, SVIT


Page 5
MODULE –4 NOTES DBMS- BCS403

▪ The keyword ALL can also be combined with each of these operators. An example is the
following query, which returns the names of employees whose salary is greater than the
salary of all the employees in department 5:

To illustrate the potential ambiguity of attribute names in nested queries, consider Query 16.

Query 16. Retrieve the name of each employee who has a dependent with the same first name
and is the same sex as the employee.

Note: It is generally advisable to create tuple variables (aliases) for all the tables referenced
in an SQL query to avoid potential errors and ambiguities.

7.1.3 Correlated Nested Queries

Whenever a condition in the WHERE clause of a nested query references some attribute of a
relation declared in the outer query, the two queries are said to be correlated.
A nested query is a query inside another query. The inner query is executed first, and its result is
passed to the outer query.
• Inner query runs independently of the outer query.
• Inner query is executed once.
• Outer query uses the result to filter or process.

\ Dept. of CSE, SVIT


Page 6
MODULE –4 NOTES DBMS- BCS403

A correlated query is a subquery that depends on the outer query. The inner query is executed
once per row from the outer query.
• Inner query refers to a column of the outer query.
• Executed multiple times, once for each row from the outer query.
• Often slower than simple nested queries due to repeated execution.

• In general, a query written with nested select-from-where blocks and using the = or IN
comparison operators can always be expressed as a single block query. For example, Q16 may
be written as in Q16A:

7.1.4 The EXISTS and UNIQUE Functions in SQL


• EXISTS and UNIQUE are Boolean functions that return TRUE or FALSE; hence, they can be used in a
WHERE clause condition.

• The EXISTS function in SQL is used to check whether the result of a nested query is empty (contains no tuples)
or not.

• The result of EXISTS is a Boolean value TRUE if the nested query result contains at least one tuple, or FALSE
if the nested query result contains no tuples.

• Query 16 in an alternative form that uses EXISTS as in Q16B:

\ Dept. of CSE, SVIT


Page 7
MODULE –4 NOTES DBMS- BCS403

• EXISTS and NOT EXISTS : Typically used in conjunction with a correlated nested query.

• EXISTS(Q) returns TRUE if there is at least one tuple in the result of the nested query Q, and returns FALSE
otherwise.

• On the other hand, NOT EXISTS(Q) returns TRUE if there are no tuples in the result of nested query Q, and
returns FALSE otherwise.

• In Q6, the correlated nested query retrieves all DEPENDENT tuples related to a particular EMPLOYEE tuple. If
none exist, the EMPLOYEE tuple is selected because the WHERE-clause condition will evaluate to TRUE in
this case.

• For each EMPLOYEE tuple, the correlated nested query selects all DEPENDENT tuples whose Essn value
matches the EMPLOYEE Ssn; if the result is empty, no dependents are related to the employee, so we select that
EMPLOYEE tuple and retrieve its Fname and Lname.

• In Q7, where we specify two nested correlated queries; the first selects all DEPENDENT tuples related to an
EMPLOYEE, and the second selects all DEPARTMENT tuples managed by the EMPLOYEE. If at least one of
the first and at least one of the second exists, we select the EMPLOYEE.

\ Dept. of CSE, SVIT


Page 8
MODULE –4 NOTES DBMS- BCS403

• The query Q3: Retrieve the name of each employee who works on all the projects controlled by department
number 5 can be written using EXISTS and NOT EXISTS in SQL systems.

7.1.5 Explicit Sets and Renaming in SQL


• It is also possible to use an explicit set of values in the WHERE clause, rather than a nested query. Such a set is
enclosed in parentheses in SQL.

• Query 17. Retrieve the Social Security numbers of all employees who work on project numbers 1, 2, or 3.

• In SQL, it is possible to rename any attribute that appears in the result of a query by adding the qualifier AS
followed by the desired new name.

7.1.6 Joined Tables in SQL and Outer Joins


• The concept of a joined table (or joined relation) was incorporated into SQL to permit users to specify a table
resulting from a join operation in the FROM clause of a query.

• For example, consider query Q1, which retrieves the name and address of every employee who works for the
‘Research’ department.

\ Dept. of CSE, SVIT


Page 9
MODULE –4 NOTES DBMS- BCS403

• The concept of a joined table also allows the user to specify different types of join, such as NATURAL JOIN
and various types of OUTER JOIN. In a NATURAL JOIN on two relations R and S, no join condition is
specified; an implicit EQUIJOIN condition for each pair of attributes with the same name from R and S is created.

• If the names of the join attributes are not the same in the base relations, it is possible to rename the attributes so
that they match, and then to apply NATURAL JOIN. In this case, the AS construct can be used to rename a
relation and all its attributes in the FROM clause.

• The default type of join in a joined table is called an inner join, where a tuple is included in the result only if a
matching tuple exists in the other relation.

• If the user requires that all employees be included, a different type of join called OUTER JOIN must be used
explicitly.

• It is also possible to nest join specifications; that is, one of the tables in a join may itself be a joined table. This
allows the specification of the join of three or more tables as a single joined table, which is called a multiway join.

7.1.7 Aggregate Functions in SQL


• Aggregate functions are used to summarize information from multiple tuples into a single-tuple summary.
Grouping is used to create subgroups of tuples before summarization.

• A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG.

\ Dept. of CSE, SVIT


Page 10
MODULE –4 NOTES DBMS- BCS403

• The COUNT function returns the number of tuples or values as specified in a query.

• The functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and return,
respectively, the sum, maximum value, minimum value, and average (mean) of those values.

• These functions can be used in the SELECT clause or in a HAVING clause.

Query 19. Find the sum of the salaries of all employees, the maximum salary, the minimum salary, and the
average salary.

• This query returns a single-row summary of all the rows in the EMPLOYEE table. We could use AS to rename
the column names in the resulting single-row table; for example, as in Q19A.

Query 20. Find the sum of the salaries of all employees of the ‘Research’ department, as well as the maximum
salary, the minimum salary, and the average salary in this department.

Queries 21 and 22. Retrieve the total number of employees in the company (Q21) and the number of
employees in the ‘Research’ department (Q22).

Note :Here the asterisk (*) refers to the rows (tuples), so COUNT (*) returns the number of rows in the result of the
query. We may also use the COUNT function to count values in a column rather than tuples, as in the next example.

\ Dept. of CSE, SVIT


Page 11
MODULE –4 NOTES DBMS- BCS403

Geneal rule of aggregate function


When an aggregate function is applied to a collection of values, NULLs are removed from the collection before the
calculation; if the collection becomes empty because all values are NULL, the aggregate function will return NULL
(except in the case of COUNT, where it will return 0 for an empty collection of values).

Correlated nested query can also be specified with an aggregate function, and then use the nested query in the
WHERE clause of an outer query.

For example, to retrieve the names of all employees who have two or more dependents (Query 5), we can write the
following.

The correlated nested query counts the number of dependents that each employee has; if this is greater than or equal to
two, the employee tuple is selected.

7.1.8 Grouping: The GROUP BY and HAVING Clauses

• If we want to apply the aggregate functions to subgroups of tuples in a relation, where the
subgroups are based on some attribute values, SQL has GROUP BY clause for this purpose.

• For example, we may want to find the average salary of employees in each department or the
number of employees who work on each project.

• In these cases, we need to partition the relation into nonoverlapping subsets (or groups) of
tuples. Each group (partition) will consist of the tuples that have the same value of some
attribute(s), called the grouping attribute(s).

\ Dept. of CSE, SVIT


Page 12
MODULE –4 NOTES DBMS- BCS403

• The GROUP BY clause specifies the grouping attributes, which should also appear in the
SELECT clause, so that the value resulting from applying each aggregate function to a group of
tuples appears along with the value of the grouping attribute(s).

Figure 7.1(a) illustrates how grouping works and shows the result of Q24.

Note: If NULLs exist in the grouping attribute, then a separate group is created for all tuples with a
NULL value in the grouping attribute.

Query 25. For each project, retrieve the project number, the project name, and the number
of employees who work on that project.

\ Dept. of CSE, SVIT


Page 13
MODULE –4 NOTES DBMS- BCS403

Suppose, that we want to modify Query 25 so that only projects with more than two employees
appear in the result. SQL provides a HAVING clause, which can appear in conjunction with a
GROUP BY clause, for this purpose. This is illustrated by Query 26.

Query 26. For each project on which more than two employees work, retrieve the project
number, the project name, and the number of employees who work on the project.

\ Dept. of CSE, SVIT


Page 14
MODULE –4 NOTES DBMS- BCS403

Query 27. For each project, retrieve the project number, the project name, and the number of
employees from department 5 who work on the project.

Query 28. For each department that has more than five employees, retrieve the department
number and the number of its employees who are making more than $40,000.

7.1.9 Other SQL Constructs: WITH and CASE

WITH Constructs:

• Creates a temporary named result (like a view), used only in that specific query.
• Simplifies complex queries, especially with multiple subqueries or recursive logic.
• For example, we can rewrite Q28 as Q28′:

\ Dept. of CSE, SVIT


Page 15
MODULE –4 NOTES DBMS- BCS403

In Q28′, we defined in the WITH clause a temporary table BIG_DEPTS whose result holds the Dno’s of
departments with more than five employees, then used this table in the subsequent query. Once this query
is executed, the temporary table BIGDEPTS is discarded.

CASE Construct

• It Acts like an IF-ELSE in SQL.


• Used to conditionally return values in SELECT, WHERE, ORDER BY, and even UPDATE
statements.

For example: we want to give employees different raise amounts depending on which department
they work for; for example, employees in department 5 get a $2,000 raise, those in department 4 get
$1,500 and those in department 1 get $3,000.

7.1.10 Recursive Queries in SQL

An example of a recursive operation is to retrieve all supervisees of a supervisory employee e at all


levels—that is, all employees e′ directly supervised by e, all employees e′ directly supervised by
each employee e′, all employees e″′ directly supervised by each employee e″, and so on. In
SQL:99, this query can be written as follows.

\ Dept. of CSE, SVIT


Page 16
MODULE –4 NOTES DBMS- BCS403

• In Q29, we are defining a view SUP_EMP that will hold the result of the recursive query.
• The view is initially empty.
• It is first loaded with the first level (supervisor, supervisee) Ssn combinations via the first part
(SELECT SupervisorSss, Ssn FROM EMPLOYEE), which is called the base query.
• This will be combined via UNION with each successive level of supervisees through the second
part, where the view contents are joined again with the base values to get the second level
combinations, which are UNIONed with the first level.
• This is repeated with successive levels until a fixed point is reached, where no more tuples are
added to the view. At this point, the result of the recursive query is in the view SUP_EMP.

7.1.11 Discussion and Summary of SQL Queries

• A retrieval query in SQL can consist of up to six clauses, but only the first two— SELECT and
FROM—are mandatory.

• The query can span several lines, and is ended by a semicolon.

• Query terms are separated by spaces, and parentheses can be used to group relevant parts of a
query in the standard way.

• The clauses are specified in the following order, with the clauses between square brackets [ … ]
being optional:

• The SELECT clause lists the attributes or functions to be retrieved.

• The FROM clause specifies all relations (tables) needed in the query, including joined relations,
but not those in nested queries.

\ Dept. of CSE, SVIT


Page 17
MODULE –4 NOTES DBMS- BCS403

• The WHERE clause specifies the conditions for selecting the tuples from these relations,
including join conditions if needed.

• GROUP BY specifies grouping attributes.

• HAVING specifies a condition on the groups being selected rather than on the individual
tuples.

• The built-in aggregate functions COUNT, SUM, MIN, MAX, and AVG are used in
conjunction with grouping, but they can also be applied to all the selected tuples in a query
without a GROUP BY clause.

• ORDER BY specifies an order for displaying the result of a query.

7.2 Specifying Constraints as Assertions and Actions as Triggers

➢ CREATE ASSERTION
Specify additional types of constraints outside scope of built-in relational model
constraints.

➢ CREATE TRIGGER
Specify automatic actions that database system will perform when certain events and
conditions occur.

7.2.1 Specifying General Constraints as Assertions in SQL

• Specify a query that selects any tuples that violate the desired condition.

• Use only in cases where it goes beyond a simple CHECK which applies to individual attributes and domains

• Each assertion is given a constraint name and is specified via a condition similar to the WHERE clause of an
SQL query.

• For example, to specify the constraint that the salary of an employee must not be greater than the salary of the
manager of the department that the employee works for in SQL, we can write the following assertion:

\ Dept. of CSE, SVIT


Page 18
MODULE –4 NOTES DBMS- BCS403

• The constraint name SALARY_CONSTRAINT is followed by the keyword CHECK, which is


followed by a condition in parentheses that must hold true on every database state for the assertion to
be satisfied.

7.2.2 Introduction to Triggers in SQL

• It may be useful to specify a condition that, if violated, causes some user to be informed of the
violation.

• For Example ,A manager may want to be informed if an employee’s travel expenses exceed a certain
limit by receiving a message whenever this occurs.

• The action that the DBMS must take in this case is to send an appropriate message to that user. The
condition is thus used to monitor the database.

• Other actions may be specified, such as executing a specific stored procedure or triggering other
updates.

• The CREATE TRIGGER statement is used to implement such actions in SQL.

• Suppose we want to check whenever an employee’s salary is greater than the salary of his or her direct
supervisor in the COMPANY database .

• Several events can trigger this rule: inserting a new employee record, changing an employee’s salary,
or changing an employee’s supervisor.

• Suppose that the action to take would be to call an external stored procedure
SALARY_VIOLATION, which will notify the supervisor.

• Note: Below syntax in for Oracle database.

\ Dept. of CSE, SVIT


Page 19
MODULE –4 NOTES DBMS- BCS403

A typical trigger which is regarded as an ECA (Event, Condition, Action) rule has three components:

1) The event(s): These are usually database update operations that are explicitly applied to the
database.
• In this example the events are: inserting a new employee record, changing an employee’s
salary, or changing an employee’s supervisor.
• The person who writes the trigger must make sure that all possible events are accounted
for.
• These events are specified after the keyword BEFORE in our example, which means that
the trigger should be executed before the triggering operation is executed.
• An alternative is to use the keyword AFTER, which specifies that the trigger should be
executed after the operation specified in the event is completed.

2) The condition :That determines whether the rule action should be executed: Once the triggering
event has occurred, an optional condition may be evaluated.
• If no condition is specified, the action will be executed once the event occurs.
• If a condition is specified, it is first evaluated, and only if it evaluates to true will the rule
action be executed.
• The condition is specified in the WHEN clause of the trigger.

3) The action to be taken: The action is usually a sequence of SQL statements, but it could also be a
database transaction or an external program that will be automatically executed.
• In this example, the action is to execute the stored procedure INFORM_SUPERVISOR.

\ Dept. of CSE, SVIT


Page 20
MODULE –4 NOTES DBMS- BCS403

7.3 Views (Virtual Tables) in SQL

7.3.1 Concept of a View in SQL

• A view in SQL terminology is a single table that is derived from other tables. These other
tables can be base tables or previously defined views.

• A view does not necessarily exist in physical form; it is considered to be a virtual table, in
contrast to base tables, whose tuples are always physically stored in the database.

7.3.2 Specification of Views in SQL

• In SQL, the command to specify a view is CREATE VIEW.

• The view is given a (virtual) table name (or view name), a list of attribute names, and a query to
specify the contents of the view.

• In V1, attributes retain the names from base tables. In V2, attributes are assigned names.

\ Dept. of CSE, SVIT


Page 21
MODULE –4 NOTES DBMS- BCS403
• We can now specify SQL queries on a view—or virtual table—in the same way we specify
queries involving base tables.

• For example, to retrieve the last name and first name of all employees who work on the
‘ProductX’ project, we can utilize the WORKS_ON1 view and specify the query as in QV1.

• A view is supposed to be always up-to-date; if we modify the tuples in the base tables on
which the view is defined, the view must automatically reflect these changes. Hence, the
view does not have to be realized or materialized at the time of view definition but rather at
the time when we specify a query on the view. It is the responsibility of the DBMS and not
the user to make sure that the view is kept up to-date.

• If we do not need a view anymore, we can use the DROP VIEW command to dispose of it.
For example, to get rid of the view V1, we can use the SQL statement in V1A:

7.3.3 View Implementation, View Update, and Inline Views


There are two main View Implementation Strategies:

1) Strategy1: Query modification approach

• Compute the view as and when needed. Do not store permanently.

• Modify view query into a query on underlying base tables

• Disadvantage: inefficient for views defined via complex queries that are time
consuming to execute.

\ Dept. of CSE, SVIT


Page 22
MODULE –4 NOTES DBMS- BCS403

2) Strategy 2 : View materialization

• Physically create a temporary view table when the view is first queried.

• Keep that table on the assumption that other queries on the view will follow.

• Requires efficient strategy for automatically updating the view table when the base tables
are updated.

• Techniques using the concept of incremental update have been developed for this
purpose, where the DBMS can determine what new tuples must be inserted, deleted, or
modified in a materialized view table when a database update is applied to one of the
defining base tables.

Multiple ways to handle materialization:

• Immediate update strategy updates a view as soon as the base tables are changed.

• Lazy update strategy updates the view when needed by a view query.

• Periodic update strategy updates the view periodically (in the latter strategy, a view
query may get a result that is not up-to-date). This is commonly used in Banks, Retail
store operations, etc.

View Update

• Update on a view defined on a single table without any aggregate functions.


Can be mapped to an update on underlying base table- possible if the primary key is
preserved in the view.

• Update not permitted on aggregate views. E.g.,

UV2: UPDATE DEPT_INFO


SET Total_sal=100000
WHERE Dname=‘Research’;

• This cannot be processed because Total_sal is a computed value in the view definition.

\ Dept. of CSE, SVIT


Page 23
MODULE –4 NOTES DBMS- BCS403
In summary, we can make the following observations:

■ A view with a single defining table is updatable if the view attributes contain the primary key of
the base relation, as well as all attributes with the NOT NULL constraint that do not have default
values specified.

■Views defined on multiple tables using joins are generally not updatable.

■ Views defined using grouping and aggregate functions are not updatable.

In SQL, the clause WITH CHECK OPTION should be added at the end of the view definition if a
view is to be updated by INSERT, DELETE, or UPDATE statements.

It is also possible to define a view table in the FROM clause of an SQL query. This is known as an in-
line view. In this case, the view is defined within the query itself.

7.3.4 Views as Authorization Mechanisms

• Views can be used to hide certain attributes or tuples from unauthorized users.

• E.g., For a user who is only allowed to see employee information for those who work for department
5, he may only access the view :

• A view can restrict a user to only see certain columns; for example, only the first name, last name,
and address of an employee may be visible as follows:

\ Dept. of CSE, SVIT


Page 24
MODULE –4 NOTES DBMS- BCS403

CHAPTER 2
TRANSACTION PROCESSING

Introduction

The concept of transaction provides a mechanism for describing logical units of database processing.
Transaction processing systems are systems with large databases and hundreds of concurrent users
executing database transactions. Examples: Airline reservations banking, credit card processing,
online retail purchasing, Stock markets, supermarket checkouts etc.

These systems require high availability and fast response time for hundreds of concurrent users. A
transaction is typically implemented by a computer program, which includes database commands
such as retrievals, insertions, deletions, and updates.

20.1 Introduction to Transaction Processing

20.1.1 Single-User versus Multiuser Systems

One criterion for classifying a database system is according to the number of users who can
use the system concurrently.

Single-User versus Multiuser Systems


A DBMS is single-user means - at most one user at a time can use the system
- Eg: Personal Computer System

Multiuser means- many users can use the system and hence access the database concurrently
- Eg: Airline reservation database

Concurrent access is possible because of Multiprogramming. Multiprogramming can be achieved


by:
Interleaved execution
Parallel Processing

Multiprogramming operating systems execute some commands from one process, then suspend that
process and execute some commands from the next process, and so on.

A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU
again. Hence, concurrent execution of processes is actually interleaved, as illustrated in Figure 20.1.

\ Dept. of CSE, SVIT


Page 25
MODULE –4 NOTES DBMS- BCS403

• Figure 20.1, shows two processes, A and B, executing concurrently in an interleaved fashion
Interleaving keeps the CPU busy when a process requires an input or output (I/O) operation,
such as reading a block from disk.

• The CPU is switched to execute another process rather than remaining idle during I/O time
Interleaving also prevents a long process from delaying other processes.

• If the computer system has multiple hardware processors (CPUs), parallel processing of
multiple processes is possible, as illustrated by processes C and D in Figure 20.1 Most of the
theory concerning concurrency control in databases is developed in terms of interleaved
concurrency.

• In a multiuser DBMS, the stored data items are the primary resources that may be accessed
concurrently by interactive users or application programs, which are constantly retrieving
information from and modifying the database.

20.1.2 Transactions, Database Items, Read and Write Operations, and DBMS
Buffers

• A Transaction is an executing program that forms a logical unit of database processing.

• It includes one or more DB access operations such as insertion, deletion, modification or


retrieval operation.

• It can be either embedded within an application program using begin transaction and end
transaction statements or specified interactively via a high-level query language such as SQL.

• Transaction which does not update database are known as read only transactions.

• Transaction which does update database are known as read write transactions.

• A database is basically represented as a collection of named data items .The size of a data
item is called its granularity.
\ Dept. of CSE, SVIT
Page 26
MODULE –4 NOTES DBMS- BCS403
• A data item can be a database record, but it can also be a larger unit such as a whole disk block,
or even a smaller unit such as an individual field (attribute) value of some record in the
database.

• Each data item has a unique name.

• Basic DB access operations that a transaction can include are:


▪ read_item(X): Reads a DB item named X into a program variable.
▪ write_item(X): Writes the value of a program variable into the DB item named X.

• Executing read_item(X) include the following steps:

1. Find the address of the disk block that contains item X


2. Copy the block into a buffer in main memory
3. Copy the item X from the buffer to program variable named X.

• Executing write_item(X) include the following steps:

1. Find the address of the disk block that contains item X


2. Copy the disk block into a buffer in main memory
3. Copy item X from program variable named X into its correct location in buffer.
4. Store the updated disk block from buffer back to disk (either immediately or later).

• Decision of when to store a modified disk block is handled by recovery manager of the DBMS
in cooperation with operating system.

• A DB cache includes a number of data buffers.

• When the buffers are all occupied a buffer replacement policy is used to choose one of the
buffers to be replaced. EG: LRU.

• A transaction includes read_item and write_item operations to access and update DB.

\ Dept. of CSE, SVIT


Page 27
MODULE –4 NOTES DBMS- BCS403

• The read-set of a transaction is the set of all items that the transaction reads.

• The write-set is the set of all items that the transaction writes.

• For example, the read-set of T1 in Figure 20.2 is {X, Y} and its write-set is also {X, Y}.

20.1.3 Why Concurrency Control Is Needed

• Several problems can occur when concurrent transactions execute in an uncontrolled manner

Example: We consider an Airline reservation DB

Each record is stored for an airline flight which includes Number of reserved seats among other
information.

Types of problems we may encounter:


1. The Lost Update Problem
2. The Temporary Update (or Dirty Read) Problem
3. The Incorrect Summary Problem
4. The Unrepeatable Read Problem

• Transaction T1
Transfers N reservations from one flight whose number of reserved seats is stored in the
database item named X to another flight whose number of reserved seats is stored in the database
item named Y.

• Transaction T2
Reserves M seats on the first flight (X).

\ Dept. of CSE, SVIT


Page 28
MODULE –4 NOTES DBMS- BCS403
1. The Lost Update Problem.

• Occurs when two transactions that access the same DB items have their operations interleaved
in a way that makes the value of some DB item incorrect.

• Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose
that their operations are interleaved as shown in Figure below.

• Final value of item X is incorrect because T2 reads the value of X before T1 changes it in the
database, and hence the updated value resulting from T1 is lost.

• For example:

X = 80 at the start (there were 80 reservations on the flight)


N = 5 (T1 transfers 5 seat reservations from the flight corresponding to X to the flight
corresponding to Y)
M = 4 (T2 reserves 4 seats on X) The final result should be X = 79.

• The interleaving of operations shown in Figure is X = 84 because the update in T1 that removed
the five seats from X was lost.

2. The Temporary Update (or Dirty Read) Problem

• Occurs when one transaction updates a database item and then the transaction fails for some
reason.
• Meanwhile the updated item is accessed by another transaction before it is changed back to its
original value.

\ Dept. of CSE, SVIT


Page 29
MODULE –4 NOTES DBMS- BCS403

3. The Incorrect Summary Problem


If one transaction is calculating an aggregate summary function on a number of db items while
other transactions are updating some of these items, the aggregate function may calculate some
values before they are updated and others after they are updated.

4. The Unrepeatable Read Problem

• Transaction T reads the same item twice and gets different values on each read, since the
item was modified by another transaction T` between the two reads.
• for example, if during an airline reservation transaction, a customer inquiry about seat
availability on several flights.
• When the customer decides on a particular flight, the transaction then reads the number of
seats on that flight a second time before completing the reservation, and it may end up
reading a different value for the item.

\ Dept. of CSE, SVIT


Page 30
MODULE –4 NOTES DBMS- BCS403

20.1.4 Why Recovery Is Needed

• Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either
1. All the operations in the transaction are completed successfully and their effect is
recorded permanently in the database or
2.The transaction does not have any effect on the database or any other transactions.

• In the first case, the transaction is said to be committed, whereas in the second case, the
transaction is aborted.

• If a transaction fails after executing some of its operations but before executing all of them, the
operations already executed must be undone and have no lasting effect.

Types of failures
1. A computer failure (system crash):
• A hardware, software, or network error occurs in the computer system during transaction
execution.
• Hardware crashes are usually media failures for example, main memory failure.

2. A transaction or system error:


• Some operation in the transaction may cause it to fail, such as integer overflow or
division by zero.
• Also occur because of erroneous parameter values

3. Local errors or exception conditions detected by the transaction:


• During transaction execution, certain conditions may occur that necessitate cancellation
of the transaction.
• For example, data for the transaction may not be found

4. Concurrency control enforcement:


• The concurrency control may decide to abort a transaction because itviolates
serializability or several transactions are in a state of deadlock.

5. Disk failure:
• Some disk blocks may lose their data because of a read or write malfunction or because
of a disk read/write head crash.

6. Physical problems and catastrophes:


• Refers to an endless list of problems that includes power or air-conditioning failure, fire,
theft, overwriting disks or tapes by mistake,
• Failures of types 1, 2, 3, and 4 are more common than those of types 5 or 6.
\ Dept. of CSE, SVIT
Page 31
MODULE –4 NOTES DBMS- BCS403

• Whenever a failure of type 1 through 4 occurs, the system must keep sufficient
information to quickly recover from the failure.
• Disk failure or other catastrophic failures of type 5 or 6 do not happen frequently; if they
do occur, recovery is a major task.

20.2 Transaction and System Concepts

20.2.1 Transaction States and Additional Operations


• A transaction is an atomic unit of work that should either be completed in its entirety or not
done at all. For recovery purposes, the system keeps track of start of a transaction,
termination, commit or aborts.

BEGIN_TRANSACTION: marks the beginning of transaction execution.

READ or WRITE: specify read or write operations on the database items that are executed
as part of a transaction.

END_TRANSACTION: specifies that READ and WRITE transaction operations have


ended and marks the end of transaction execution.

COMMIT_TRANSACTION: signals a successful end of the transaction so that any


changes (updates) executed by the transaction can be safely committed to the database and
will not be undone

ROLLBACK: signals that the transaction has ended unsuccessfully, so that any changes or
effects that the transaction may have applied to the database must be undone.

\ Dept. of CSE, SVIT


Page 32
MODULE –4 NOTES DBMS- BCS403

• A transaction goes into active state immediately after it starts execution and can execute read
and write operations.

• When the transaction ends it moves to partially committed state.

• At this end additional checks are done to see if the transaction can be committed or not.

If these checks are successful the transaction is said to have reached commit point and
enters committed state. All the changes are recorded permanently in the db.

• A transaction can go to the failed state if one of the checks fails or if the transaction is
aborted during its active state.

• The transaction may then have to be rolled back to undo the effect of its write operation.

• Terminated state corresponds to the transaction leaving the system. All the information about
the transaction is removed from system tables.

20.2.2 The System Log

• Log or Journal keeps track of all transaction operations that affect the values of database
items.

• This information may be needed to permit recovery from transaction failures.

• The log is kept on disk, so it is not affected by any type of failure except for disk or
catastrophic failure

• one (or more) main memory buffers hold the last part of the log file, so that log entries are
first added to the main memory buffer .

• When the log buffer is filled, or when certain other conditions occur, the log buffer is
appended to the end of the log file on disk.

• In addition, the log is periodically backed up to archival storage (tape) to guard against such
catastrophic failures.

• The following are the types of entries called log records that are written to the log file and the
corresponding action for each log record.

• In these entries, T refers to a unique transaction-id that is generated automatically by the


system for each transaction and that is used to identify each transaction:

\ Dept. of CSE, SVIT


Page 33
MODULE –4 NOTES DBMS- BCS403
1. [start_transaction, T]. Indicates that transaction T has started execution.

2. [write_item, T, X, old_value, new_value]. Indicates that transaction T has changed the


value of database item X from old_value to new_value.

3. [read_item, T, X]. Indicates that transaction T has read the value of database item X.

4. [commit, T]. Indicates that transaction T has completed successfully, and affirms that its
effect can be committed (recorded permanently) to the database.

5. [abort, T]. Indicates that transaction T has been aborted.

20.2.3 Commit Point of a Transaction:

• Definition a Commit Point:

➢ A transaction T reaches its commit point when all its operations that access the database
have been executed successfully and the effect of all the transaction operations on the
database has been recorded in the log.

➢ Beyond the commit point, the transaction is said to be committed, and its effect is
assumed to be permanently recorded in the database. The transaction then writes an entry
[commit,T] into the log.

• Roll Back of transactions: Needed for transactions that have a [start_transaction,T] entry
into the log but no commit entry [commit,T] into the log.

20.2.4 DBMS specific buffer Replacement policies

Domain Separation(DS) method

• DBMS cache is divided into separate domains, each handle one type of disk pages
and replacements within each domain are handled via basic LRU page replacement.

• LRU is a static algorithm and does not adopt to dynamically changing loads because
the number of available buffers for each domain is predetermined.

• Group LRU adds dynamically load balancing feature since it gives each domain a
priority and selects pages from lower priority level domain first for replacement.

Hot Set Method:

• This is useful in queries that have to scan a set of pages repeatedly.

\ Dept. of CSE, SVIT


Page 34
MODULE –4 NOTES DBMS- BCS403
• The hot set method determines for each db processing algorithm the set of disk pages
that will be accessed repeatedly and it does not replace them until their processing is
completed.

The DBMIN method:


• Uses a model known as QLSM (Query Locality set model), which predetermines the
pattern of page references for each algorithm for a particular db operation.
• Depending on the type of access method, the file characteristics, and the algorithm used
the QLSM will estimate the number of main memory buffers needed for each file
involved in the operation.

20.3 Desirable properties of Transactions


• Transactions should possess several properties, often called the ACID properties

A Atomicity: a transaction is an atomic unit of processing and it is either performed entirely or


not at all.

C Consistency Preservation: a transaction should be consistency preserving that is it must take


the database from one consistent state to another.

I Isolation/Independence: A transaction should appear as though it is being executed in


isolation from other transactions, even though many transactions are executed concurrently.

D Durability (or Permanency): if a transaction changes the database and is committed, the
changes must never be lost because of any failure.

The atomicity property requires that we execute a transaction to completion. It is the


responsibility of the transaction recovery subsystem of a DBMS to ensure atomicity.

• The preservation of consistency is generally considered to be the responsibility of the


programmers who write the database programs or of the DBMS module that enforces
integrity constraints.

• The isolation property is enforced by the concurrency control subsystem of the DBMS. If
every transaction does not make its updates (write operations) visible to other transactions
until it is committed, one form of isolation is enforced that solves the temporary update
problem and eliminates cascading rollbacks .

• Durability is the responsibility of recovery subsystem.

\ Dept. of CSE, SVIT


Page 35
MODULE –4 NOTES DBMS- BCS403
20.4 Characterizing Schedules Based on Recoverability

Schedule (or history): the order of execution of operations from all the various transactions.

Schedules (Histories) of Transactions: A schedule S of n transactions T 1, T2, ……Tn is a


sequential ordering of the operations of the n transactions.

• The transactions are interleaved, for each transaction Ti that participates in the schedule S, the
operations of Ti in S must appear in the same order in which they occur in Ti.

• The order of operations in S is considered to be a total ordering, meaning that for any two
operations in the schedule, one must occur before the other. It is possible theoretically to deal
with schedules whose operations form partial orders, but we will assume for now total ordering
of the operations in a schedule.

• A shorthand notation for describing a schedule uses the symbols b, r, w, e, c, and a for the
operations begin_transaction, read_item, write_item, end_transaction, commit, and abort,
respectively.

• The schedule in Figure lost update problem, which we shall call Sa, can be written as follows
in this notation:

Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);

• Similarly, the schedule for Figure temporary update problem, which we call Sb, can be written
as follows, if we assume that transaction T1 aborted after its read_item(Y) operation:

Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;

Conflicting Operations in a Schedule:

• Two operations in a schedule are said to conflict if they satisfy all three of the following
conditions:

(1) they belong to different transactions;


(2) they access the same item X; and
(3) at least one of the operations is a write_item(X)

Conflicting operations:

\ Dept. of CSE, SVIT


Page 36
MODULE –4 NOTES DBMS- BCS403
A schedule S of n transactions T1, T2, ... , Tn is said to be a complete schedule if the following
conditions hold:

1. The operations in S are exactly those operations in T1, T2, ... , Tn, including a commit or abort
operation as the last operation for each transaction in the schedule.

2. For any pair of operations from the same transaction Ti, their relative order of appearance in S is
the same as their order of appearance in Ti.

3. For any two conflicting operations, one of the two must occur before the other in the schedule.

• Committed projection C(S) of a schedule S, which includes only the operations in S that belong
to committed transactions that is, transactions Ti whose commit operation ci is in S.

4.4.2 Schedules classified on recoverability:


• Recoverable schedule:

----One where no transaction needs to be rolled back.


-----A schedule S is recoverable if no transaction T in S commits until all transactions T’ that have
written an item that T reads have committed.

Example:
o Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
o Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;

• Cascading rollback (or cascading abort)


---- Occur in some recoverable schedules, where an uncommitted transaction has to be rolled back
because it read an item from a transaction that failed. This is illustrated in schedule Se, where
transaction T2 has to be rolled back because it read item X from T1, and T1 then aborted.

- Se: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;

• Cascadeless schedule:
---- One where every transaction reads only the items that are written by committed transactions.

• Schedules requiring cascaded rollback:


----- A schedule in which uncommitted transactions that read an item from a failed transaction
must be rolled back.

• Strict Schedules:
------A schedule in which a transaction can neither read or write an item X until the last
transaction that wrote X has committed.

For example, consider schedule Sf : Sf : w1(X, 5); w2(X, 8); a1; Suppose that the value of X was
originally 9, which is the before image stored in the system log along with the w1(X, 5) operation.
\ Dept. of CSE, SVIT
Page 37
MODULE –4 NOTES DBMS- BCS403
If T1 aborts, as in Sf, the recovery procedure that restores the before image of an aborted write
operation will restore the value of X to 9, even though it has already been changed to 8 by
transaction T2, thus leading to potentially incorrect results. Although schedule Sf is cascade less, it is
not a strict schedule, since it permits T2 to write item X even though the transaction T1that last
wrote X had not yet committed (or aborted).

20.5 Characterizing Schedules Based on Serializability

Schedules that are always considered to be correct when concurrent transactions are executing are
known as serializable schedules.

Suppose that two users for example, two airline reservations agents submit to the DBMS
transactions T1 and T2 at approximately the same time. If no interleaving of operations is permitted,
there are only two possible outcomes:

1. Execute all the operations of transaction T1 (in sequence) followed by all the operations of
transaction T2 (in sequence).

2. Execute all the operations of transaction T2 (in sequence) followed by all the operations of
transaction T1 (in sequence).

\ Dept. of CSE, SVIT


Page 38
MODULE –4 NOTES DBMS- BCS403
Serial schedule:
A schedule S is serial if, for every transaction T participating in the schedule, all the operations of T
are executed consecutively in the schedule. Otherwise, the schedule is called non serial schedule.

Serializable schedule:
A schedule S is serializable if it is equivalent to some serial schedule of the same n transactions.

Result equivalent:
Two schedules are called result equivalent if they produce the same final state of the database.

• For example, in Figure 20.6, schedules S1 and S2 will produce the same final database state if they
execute on a database with an initial value of X = 100; however, for other initial values of X, the
schedules are not result equivalent.

Conflict equivalent:
Two schedules are said to be conflict equivalent if the order of any two conflicting operations is the
same in both schedules.

Conflict serializable:

A schedule S is said to be conflict serializable if it is conflict equivalent to some serial schedule S’.

• Being serializable is not the same as being serial.

• Being serializable implies that the schedule is a correct schedule.

✓ It will leave the database in a consistent state.


✓ The interleaving is appropriate and will result in a state as if the transactions were serially
executed, yet will achieve efficiency due to concurrent execution.

\ Dept. of CSE, SVIT


Page 39
MODULE –4 NOTES DBMS- BCS403
20.5.2 Testing for Serializability of a Schedule
Algorithm: Testing Conflict Serializability of a Schedule S

1. For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence
graph.

2. For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an
edge (Ti → Tj) in the precedence graph.

3. For each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an
edge (Ti → Tj) in the precedence graph.

4. For each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an
edge (Ti → Tj) in the precedence graph.

5. The schedule S is serializable if and only if the precedence graph has no cycles.

• The algorithm looks at only the read_item and write_item operations in a schedule to construct
a precedence graph (or serialization graph), which is a directed graph G = (N, E) that consists of
a set of nodes N = {T1, T2, ... , Tn } and a set of directed edges E = {e1, e2, ... , em }.

• Each edge ei in the graph is of the form (Tj → Tk ), 1 ≤ j ≤ n, 1 ≤ k ≤ n, where Tj is the starting
node of ei and Tk is the ending node of ei.

• If there is a cycle in the precedence graph, schedule S is not (conflict) serializable; if there is
no cycle, S is serializable.

• A cycle in a directed graph is a sequence of edges C = ((Tj → Tk), (Tk → Tp), ... , (Ti → Tj))
with the property that the starting node of each edge except the first edge is the same as the
ending node of the previous edge, and the starting node of the first edge is the same as the ending
node of the last edge.

• The edges (Ti → Tj) in a precedence graph can optionally be labeled by the name(s) of the data
item(s) that led to creating the edge. Figure shows such labels on the edges.

\ Dept. of CSE, SVIT


Page 40
MODULE –4 NOTES DBMS- BCS403

Fig: Constructing the precedence graphs for schedules A and D from fig 20.5 to test for conflict
serializability.

(a) Precedence graph for serial schedule A.


(b) Precedence graph for serial schedule B.
(c) Precedence graph for schedule C (not serializable).
(d) Precedence graph for schedule D (serializable, equivalent to schedule A).

• Another example of serializability testing. (a) The READ and WRITE operations of three
transactions T1, T2, and T3.

• Schedule E is not serializable because the corresponding precedence graph has cycles. Schedule F
is serializable, and the serial schedule equivalent to F is shown in Figure 20.8(e).

\ Dept. of CSE, SVIT


Page 41
MODULE –4 NOTES DBMS- BCS403

\ Dept. of CSE, SVIT


Page 42
MODULE –4 NOTES DBMS- BCS403
20.5.4 View Equivalence and View Serializability
• Two schedules S and S′ are said to be view equivalent if the following three conditions hold:

1. The same set of transactions participates in S and S′, and S and S′ include the same operations
of those transactions.

2. For any operation ri(X) of Ti in S, if the value of X read by the operation has been written by
an operation wj(X) of Tj (or if it is the original value of X before the schedule started), the same
condition must hold for the value of X read by operation ri(X) of Ti in S′.

3. If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk must
also be the last operation to write item Y in S′.

• The idea behind view equivalence is that, as long as each read operation of a transaction reads the
result of the same write operation in both schedules, the write operations of each transaction must
produce the same results.

• The read operations are hence said to see the same view in both schedules.

• Condition 3 ensures that the final write operation on each data item is the same in both schedules,
so the database state should be the same at the end of both schedules.

• A schedule S is said to be view serializable if it is view equivalent to a serial schedule.

• The definitions of conflict serializability and view serializability are similar if a condition known as
the constrained write assumption (or no blind writes) holds on all transactions in the schedule.
This condition states that any write operation wi(X) in Ti is preceded by a ri(X) in Ti and that the
value written by wi(X) in Ti depends only on the value of X read by ri(X).

• A blind write is a write operation in a transaction T on an item X that is not dependent on the old
value of X, so it is not preceded by a read of X in the transaction T.

• The definition of view serializability is less restrictive than that of conflict serializability under the
unconstrained write assumption, where the value written by an operation wi(X) in Ti can be
independent of its old value.

• This is possible when blind writes are allowed, and it is illustrated by the following schedule Sg of
three transactions T1: r1(X); w1(X); T2: w2(X); and T3: w3(X):
Sg: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;

• In Sg the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value of X.
The schedule Sg is view serializable, since it is view equivalent to the serial schedule T1, T2, T3.
However, Sg is not conflict serializable.

\ Dept. of CSE, SVIT


Page 43
MODULE –4 NOTES DBMS- BCS403

20.5.5 Other Types of Equivalence of Schedules

• Serializability of schedules is sometimes considered to be too restrictive as a condition for


ensuring the correctness of concurrent executions. Some applications can produce schedules
that are correct by satisfying conditions less stringent than either conflict serializability or
view serializability.

• An example is the type of transactions known as debit-credit transactions, those that apply
deposits and withdrawals to a data item whose value is the current balance of a bank account.

• The semantics of debit-credit operations is that they update the value of a data item X by
either subtracting from or adding to the value of the data item. Because addition and
subtraction operations are commutative that is, they can be applied in any order it is possible
to produce correct schedules that are not serializable.

• For example, consider the following transactions, each of which may be used to transfer an
amount of money between two bank accounts:
T1: r1(X); X :{equal} X − 10; w1(X); r1(Y); Y :{equal} Y + 10; w1(Y);
T2: r2(Y); Y :{equal} Y − 20; w2(Y); r2(X); X :{equal} X + 20; w2(X);

• Consider the following non serializable schedule Sh for the two transactions:
Sh: r1(X); w1(X); r2(Y); w2(Y); r1(Y); w1(Y); r2(X); w2(X);

• With the additional knowledge, or semantics, that the operations between each ri(I) and wi(I)
are commutative, the order of executing the sequences consisting of (read, update, write) is
not important as long as each (read, update, write) sequence by a particular transaction Ti on a
particular item I is not interrupted by conflicting operations.

• Hence, the schedule Sh is considered to be correct even though it is not serializable.


Researchers have been working on extending concurrency control theory to deal with cases
where serializability is considered to be too restrictive as a condition for correctness of
schedules.

• Also, in certain domains of applications, such as computer-aided design (CAD) of complex


systems like aircraft, design transactions last over a long time period. In such applications,
more relaxed schemes of concurrency control have been proposed to maintain consistency of
the database, such as eventual consistency.

\ Dept. of CSE, SVIT


Page 44
MODULE –4 NOTES DBMS- BCS403
20.6 Transaction Support in SQL
• The basic definition of an SQL transaction is, it is a logical unit of work and is guaranteed to be
atomic.

• A single SQL statement is always considered to be atomic either it completes execution without
an error or it fails and leaves the database unchanged.

• With SQL, there is no explicit Begin_Transaction statement. Transaction initiation is done


implicitly when particular SQL statements are encountered.

• Every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK.

• Every transaction has certain characteristics attributed to it and are specified by a SET
TRANSACTION statement in SQL

The characteristics are :


The access mode
- can be specified as READ ONLY or READ WRITE
- The default is READ WRITE
- A mode of READ WRITE allows select, update, insert, delete, and create commands to
be executed
- A mode of READ ONLY, as the name implies, is simply for data retrieval.

The diagnostic area size


- DIAGNOSTIC SIZE n, specifies an integer value n, which indicates the number of
conditions that can be held simultaneously in the diagnostic area.
- These conditions supply feedback information (errors or exceptions) to the user or
program on the n most recently executed SQL statement

The isolation level


- specified using the statement ISOLATION LEVEL <isolation>, where the value for
<isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE
READ, or SERIALIZABLE
- The default isolation level is SERIALIZABLE
- The use of the term SERIALIZABLE here is based on not allowing violations that
cause dirty read, unrepeatable read, and phantoms
- If a transaction executes at a lower isolation level than SERIALIZABLE, then one or
more of the following three violations may occur:

1. Dirty read. A transaction T1 may read the update of a transaction T2, which has not
yet committed. If T2 fails and is aborted, then T1 would have read a value that does not
exist and is incorrect.

\ Dept. of CSE, SVIT


Page 45
MODULE –4 NOTES DBMS- BCS403
2. Nonrepeatable read. A transaction T1 may read a given value from a table. If another
transaction T2 later updates that value and T1 reads that value again, T1 will see a
different value.

3. Phantoms. A transaction T1 may read a set of rows from a table, perhaps based on
some condition specified in the SQL WHERE-clause. Now suppose that a transaction T2
inserts a new row that also satisfies the WHERE-clause condition used in T1, into the
table used by T1.
If T1 is repeated, then T1 will see a phantom, a row that previously did not exist.

The transaction consists of first inserting a new row in the EMPLOYEE table and then updating the
salary of all employees who work in department 2 . If an error occurs on any of the SQL statements,
the entire transaction is rolled back This implies that any updated salary (by this transaction) would
be restored to its previous value and that the newly inserted row would be remove.

\ Dept. of CSE, SVIT


Page 46
Page 47
Page 48

You might also like