DBMS Module-4 Notes
DBMS Module-4 Notes
MODULE 4- NOTES
ON
DATABASE MANAGEMENT SYSTEMS( BCS403)
2024 – 2025
B. E IV Semester
Web: www.saividya.ac.in
https://www.facebook.com/SaiVidyaInstituteOfTechnology
CHAPTER 1
1. Unknown value. A person’s date of birth is not known, so it is represented by NULL in the
database. An example of the other case of unknown would be NULL for a person’s home
phone because it is not known whether or not the person has a home phone.
2. Unavailable or withheld value. A person has a home phone but does not want it to be
listed, so it is withheld and represented as NULL in the database.
▪ Each individual NULL value considered to be different from every other NULL value.
▪ SQL uses a three-valued logic: TRUE, FALSE, and UNKNOWN instead of the standard
two-valued (Boolean) logic with values TRUE or FALSE.
▪ When a record with NULL in one of its attributes is involved in a comparison operation, the
result is considered to be UNKNOWN (it may be TRUE or it may be FALSE).
▪ It is therefore necessary to define the results (or truth values) of three-valued logical
expressions when the logical connectives AND, OR, and NOT are used. Table 7.1 shows the
resulting values.
▪ In Tables 7.1(a) and 7.1(b), the rows and columns represent the values of the results of
comparison conditions, which would typically appear in the WHERE clause of an SQL query.
▪ The result of combining the two values using the AND logical connective is shown by the
entries in Table 7.1(a). Table 7.1(b) shows the result of using the OR logical connective.
▪ For example, the result of (FALSE AND UNKNOWN) is FALSE, whereas the result of
(FALSE OR UNKNOWN) is UNKNOWN.
➢ SQL allows queries that check whether an attribute value is NULL. Rather than using = or <>
to compare an attribute value to NULL, SQL uses the comparison operators IS or IS NOT.
➢ Query 18 illustrates NULL com parison by retrieving any employees who do not have a
supervisor.
Query 18. Retrieve the names of all employees who do not have supervisors
▪ Some queries require that existing values in the database be fetched and then used in a
comparison condition. Such queries can be formulated by using nested queries, which are
complete select-from-where blocks within another SQL query. That other query is called
the outer query.
▪ Q4A introduces the comparison operator IN, which compares a value v with a set (or multiset)
of values V and evaluates to TRUE if v is one of the elements in V.
▪ In Q4A, the first nested query selects the project numbers of projects that have an employee
with last name ‘Smith’ involved as manager, whereas the second nested query selects the
project numbers of projects that have an employee with last name ‘Smith’ involved as worker.
▪ In the outer query, we use the OR logical connective to retrieve a PROJECT tuple if the
PNUMBER value of that tuple is in the result of either nested query.
Make a list of all project numbers for projects that involve employee Smith either as worker
or as a manager of the department that controls the project:
▪ SQL allows the use of tuples of values in comparisons by placing them within parentheses.
To illustrate this, consider the following query.
▪ This query will select the Essns of all employees who work the same (project, hours)
combination on some project that employee ‘John Smith’ (whose Ssn = ‘123456789’) works
on.
▪ In addition to the IN operator, other comparison operators can be used to compare a single
value v (typically an attribute name) to a set or multiset v (typically a nested query). The =
ANY (or = SOME) operator returns TRUE if the value v is equal to some value in the set V
and is hence equivalent to IN.
▪ The two keywords ANY and SOME have the same effect. Other operators that can be
combined with ANY (or SOME) include >, >=, <=, and <>.
▪ The keyword ALL can also be combined with each of these operators. An example is the
following query, which returns the names of employees whose salary is greater than the
salary of all the employees in department 5:
To illustrate the potential ambiguity of attribute names in nested queries, consider Query 16.
Query 16. Retrieve the name of each employee who has a dependent with the same first name
and is the same sex as the employee.
Note: It is generally advisable to create tuple variables (aliases) for all the tables referenced
in an SQL query to avoid potential errors and ambiguities.
Whenever a condition in the WHERE clause of a nested query references some attribute of a
relation declared in the outer query, the two queries are said to be correlated.
A nested query is a query inside another query. The inner query is executed first, and its result is
passed to the outer query.
• Inner query runs independently of the outer query.
• Inner query is executed once.
• Outer query uses the result to filter or process.
A correlated query is a subquery that depends on the outer query. The inner query is executed
once per row from the outer query.
• Inner query refers to a column of the outer query.
• Executed multiple times, once for each row from the outer query.
• Often slower than simple nested queries due to repeated execution.
• In general, a query written with nested select-from-where blocks and using the = or IN
comparison operators can always be expressed as a single block query. For example, Q16 may
be written as in Q16A:
• The EXISTS function in SQL is used to check whether the result of a nested query is empty (contains no tuples)
or not.
• The result of EXISTS is a Boolean value TRUE if the nested query result contains at least one tuple, or FALSE
if the nested query result contains no tuples.
• EXISTS and NOT EXISTS : Typically used in conjunction with a correlated nested query.
• EXISTS(Q) returns TRUE if there is at least one tuple in the result of the nested query Q, and returns FALSE
otherwise.
• On the other hand, NOT EXISTS(Q) returns TRUE if there are no tuples in the result of nested query Q, and
returns FALSE otherwise.
• In Q6, the correlated nested query retrieves all DEPENDENT tuples related to a particular EMPLOYEE tuple. If
none exist, the EMPLOYEE tuple is selected because the WHERE-clause condition will evaluate to TRUE in
this case.
• For each EMPLOYEE tuple, the correlated nested query selects all DEPENDENT tuples whose Essn value
matches the EMPLOYEE Ssn; if the result is empty, no dependents are related to the employee, so we select that
EMPLOYEE tuple and retrieve its Fname and Lname.
• In Q7, where we specify two nested correlated queries; the first selects all DEPENDENT tuples related to an
EMPLOYEE, and the second selects all DEPARTMENT tuples managed by the EMPLOYEE. If at least one of
the first and at least one of the second exists, we select the EMPLOYEE.
• The query Q3: Retrieve the name of each employee who works on all the projects controlled by department
number 5 can be written using EXISTS and NOT EXISTS in SQL systems.
• Query 17. Retrieve the Social Security numbers of all employees who work on project numbers 1, 2, or 3.
• In SQL, it is possible to rename any attribute that appears in the result of a query by adding the qualifier AS
followed by the desired new name.
• For example, consider query Q1, which retrieves the name and address of every employee who works for the
‘Research’ department.
• The concept of a joined table also allows the user to specify different types of join, such as NATURAL JOIN
and various types of OUTER JOIN. In a NATURAL JOIN on two relations R and S, no join condition is
specified; an implicit EQUIJOIN condition for each pair of attributes with the same name from R and S is created.
• If the names of the join attributes are not the same in the base relations, it is possible to rename the attributes so
that they match, and then to apply NATURAL JOIN. In this case, the AS construct can be used to rename a
relation and all its attributes in the FROM clause.
• The default type of join in a joined table is called an inner join, where a tuple is included in the result only if a
matching tuple exists in the other relation.
• If the user requires that all employees be included, a different type of join called OUTER JOIN must be used
explicitly.
• It is also possible to nest join specifications; that is, one of the tables in a join may itself be a joined table. This
allows the specification of the join of three or more tables as a single joined table, which is called a multiway join.
• A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG.
• The COUNT function returns the number of tuples or values as specified in a query.
• The functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and return,
respectively, the sum, maximum value, minimum value, and average (mean) of those values.
Query 19. Find the sum of the salaries of all employees, the maximum salary, the minimum salary, and the
average salary.
• This query returns a single-row summary of all the rows in the EMPLOYEE table. We could use AS to rename
the column names in the resulting single-row table; for example, as in Q19A.
Query 20. Find the sum of the salaries of all employees of the ‘Research’ department, as well as the maximum
salary, the minimum salary, and the average salary in this department.
Queries 21 and 22. Retrieve the total number of employees in the company (Q21) and the number of
employees in the ‘Research’ department (Q22).
Note :Here the asterisk (*) refers to the rows (tuples), so COUNT (*) returns the number of rows in the result of the
query. We may also use the COUNT function to count values in a column rather than tuples, as in the next example.
Correlated nested query can also be specified with an aggregate function, and then use the nested query in the
WHERE clause of an outer query.
For example, to retrieve the names of all employees who have two or more dependents (Query 5), we can write the
following.
The correlated nested query counts the number of dependents that each employee has; if this is greater than or equal to
two, the employee tuple is selected.
• If we want to apply the aggregate functions to subgroups of tuples in a relation, where the
subgroups are based on some attribute values, SQL has GROUP BY clause for this purpose.
• For example, we may want to find the average salary of employees in each department or the
number of employees who work on each project.
• In these cases, we need to partition the relation into nonoverlapping subsets (or groups) of
tuples. Each group (partition) will consist of the tuples that have the same value of some
attribute(s), called the grouping attribute(s).
• The GROUP BY clause specifies the grouping attributes, which should also appear in the
SELECT clause, so that the value resulting from applying each aggregate function to a group of
tuples appears along with the value of the grouping attribute(s).
Figure 7.1(a) illustrates how grouping works and shows the result of Q24.
Note: If NULLs exist in the grouping attribute, then a separate group is created for all tuples with a
NULL value in the grouping attribute.
Query 25. For each project, retrieve the project number, the project name, and the number
of employees who work on that project.
Suppose, that we want to modify Query 25 so that only projects with more than two employees
appear in the result. SQL provides a HAVING clause, which can appear in conjunction with a
GROUP BY clause, for this purpose. This is illustrated by Query 26.
Query 26. For each project on which more than two employees work, retrieve the project
number, the project name, and the number of employees who work on the project.
Query 27. For each project, retrieve the project number, the project name, and the number of
employees from department 5 who work on the project.
Query 28. For each department that has more than five employees, retrieve the department
number and the number of its employees who are making more than $40,000.
WITH Constructs:
• Creates a temporary named result (like a view), used only in that specific query.
• Simplifies complex queries, especially with multiple subqueries or recursive logic.
• For example, we can rewrite Q28 as Q28′:
In Q28′, we defined in the WITH clause a temporary table BIG_DEPTS whose result holds the Dno’s of
departments with more than five employees, then used this table in the subsequent query. Once this query
is executed, the temporary table BIGDEPTS is discarded.
CASE Construct
For example: we want to give employees different raise amounts depending on which department
they work for; for example, employees in department 5 get a $2,000 raise, those in department 4 get
$1,500 and those in department 1 get $3,000.
• In Q29, we are defining a view SUP_EMP that will hold the result of the recursive query.
• The view is initially empty.
• It is first loaded with the first level (supervisor, supervisee) Ssn combinations via the first part
(SELECT SupervisorSss, Ssn FROM EMPLOYEE), which is called the base query.
• This will be combined via UNION with each successive level of supervisees through the second
part, where the view contents are joined again with the base values to get the second level
combinations, which are UNIONed with the first level.
• This is repeated with successive levels until a fixed point is reached, where no more tuples are
added to the view. At this point, the result of the recursive query is in the view SUP_EMP.
• A retrieval query in SQL can consist of up to six clauses, but only the first two— SELECT and
FROM—are mandatory.
• Query terms are separated by spaces, and parentheses can be used to group relevant parts of a
query in the standard way.
• The clauses are specified in the following order, with the clauses between square brackets [ … ]
being optional:
• The FROM clause specifies all relations (tables) needed in the query, including joined relations,
but not those in nested queries.
• The WHERE clause specifies the conditions for selecting the tuples from these relations,
including join conditions if needed.
• HAVING specifies a condition on the groups being selected rather than on the individual
tuples.
• The built-in aggregate functions COUNT, SUM, MIN, MAX, and AVG are used in
conjunction with grouping, but they can also be applied to all the selected tuples in a query
without a GROUP BY clause.
➢ CREATE ASSERTION
Specify additional types of constraints outside scope of built-in relational model
constraints.
➢ CREATE TRIGGER
Specify automatic actions that database system will perform when certain events and
conditions occur.
• Specify a query that selects any tuples that violate the desired condition.
• Use only in cases where it goes beyond a simple CHECK which applies to individual attributes and domains
• Each assertion is given a constraint name and is specified via a condition similar to the WHERE clause of an
SQL query.
• For example, to specify the constraint that the salary of an employee must not be greater than the salary of the
manager of the department that the employee works for in SQL, we can write the following assertion:
• It may be useful to specify a condition that, if violated, causes some user to be informed of the
violation.
• For Example ,A manager may want to be informed if an employee’s travel expenses exceed a certain
limit by receiving a message whenever this occurs.
• The action that the DBMS must take in this case is to send an appropriate message to that user. The
condition is thus used to monitor the database.
• Other actions may be specified, such as executing a specific stored procedure or triggering other
updates.
• Suppose we want to check whenever an employee’s salary is greater than the salary of his or her direct
supervisor in the COMPANY database .
• Several events can trigger this rule: inserting a new employee record, changing an employee’s salary,
or changing an employee’s supervisor.
• Suppose that the action to take would be to call an external stored procedure
SALARY_VIOLATION, which will notify the supervisor.
A typical trigger which is regarded as an ECA (Event, Condition, Action) rule has three components:
1) The event(s): These are usually database update operations that are explicitly applied to the
database.
• In this example the events are: inserting a new employee record, changing an employee’s
salary, or changing an employee’s supervisor.
• The person who writes the trigger must make sure that all possible events are accounted
for.
• These events are specified after the keyword BEFORE in our example, which means that
the trigger should be executed before the triggering operation is executed.
• An alternative is to use the keyword AFTER, which specifies that the trigger should be
executed after the operation specified in the event is completed.
2) The condition :That determines whether the rule action should be executed: Once the triggering
event has occurred, an optional condition may be evaluated.
• If no condition is specified, the action will be executed once the event occurs.
• If a condition is specified, it is first evaluated, and only if it evaluates to true will the rule
action be executed.
• The condition is specified in the WHEN clause of the trigger.
3) The action to be taken: The action is usually a sequence of SQL statements, but it could also be a
database transaction or an external program that will be automatically executed.
• In this example, the action is to execute the stored procedure INFORM_SUPERVISOR.
• A view in SQL terminology is a single table that is derived from other tables. These other
tables can be base tables or previously defined views.
• A view does not necessarily exist in physical form; it is considered to be a virtual table, in
contrast to base tables, whose tuples are always physically stored in the database.
• The view is given a (virtual) table name (or view name), a list of attribute names, and a query to
specify the contents of the view.
• In V1, attributes retain the names from base tables. In V2, attributes are assigned names.
• For example, to retrieve the last name and first name of all employees who work on the
‘ProductX’ project, we can utilize the WORKS_ON1 view and specify the query as in QV1.
• A view is supposed to be always up-to-date; if we modify the tuples in the base tables on
which the view is defined, the view must automatically reflect these changes. Hence, the
view does not have to be realized or materialized at the time of view definition but rather at
the time when we specify a query on the view. It is the responsibility of the DBMS and not
the user to make sure that the view is kept up to-date.
• If we do not need a view anymore, we can use the DROP VIEW command to dispose of it.
For example, to get rid of the view V1, we can use the SQL statement in V1A:
• Disadvantage: inefficient for views defined via complex queries that are time
consuming to execute.
• Physically create a temporary view table when the view is first queried.
• Keep that table on the assumption that other queries on the view will follow.
• Requires efficient strategy for automatically updating the view table when the base tables
are updated.
• Techniques using the concept of incremental update have been developed for this
purpose, where the DBMS can determine what new tuples must be inserted, deleted, or
modified in a materialized view table when a database update is applied to one of the
defining base tables.
• Immediate update strategy updates a view as soon as the base tables are changed.
• Lazy update strategy updates the view when needed by a view query.
• Periodic update strategy updates the view periodically (in the latter strategy, a view
query may get a result that is not up-to-date). This is commonly used in Banks, Retail
store operations, etc.
View Update
• This cannot be processed because Total_sal is a computed value in the view definition.
■ A view with a single defining table is updatable if the view attributes contain the primary key of
the base relation, as well as all attributes with the NOT NULL constraint that do not have default
values specified.
■Views defined on multiple tables using joins are generally not updatable.
■ Views defined using grouping and aggregate functions are not updatable.
In SQL, the clause WITH CHECK OPTION should be added at the end of the view definition if a
view is to be updated by INSERT, DELETE, or UPDATE statements.
It is also possible to define a view table in the FROM clause of an SQL query. This is known as an in-
line view. In this case, the view is defined within the query itself.
• Views can be used to hide certain attributes or tuples from unauthorized users.
• E.g., For a user who is only allowed to see employee information for those who work for department
5, he may only access the view :
• A view can restrict a user to only see certain columns; for example, only the first name, last name,
and address of an employee may be visible as follows:
CHAPTER 2
TRANSACTION PROCESSING
Introduction
The concept of transaction provides a mechanism for describing logical units of database processing.
Transaction processing systems are systems with large databases and hundreds of concurrent users
executing database transactions. Examples: Airline reservations banking, credit card processing,
online retail purchasing, Stock markets, supermarket checkouts etc.
These systems require high availability and fast response time for hundreds of concurrent users. A
transaction is typically implemented by a computer program, which includes database commands
such as retrievals, insertions, deletions, and updates.
One criterion for classifying a database system is according to the number of users who can
use the system concurrently.
Multiuser means- many users can use the system and hence access the database concurrently
- Eg: Airline reservation database
Multiprogramming operating systems execute some commands from one process, then suspend that
process and execute some commands from the next process, and so on.
A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU
again. Hence, concurrent execution of processes is actually interleaved, as illustrated in Figure 20.1.
• Figure 20.1, shows two processes, A and B, executing concurrently in an interleaved fashion
Interleaving keeps the CPU busy when a process requires an input or output (I/O) operation,
such as reading a block from disk.
• The CPU is switched to execute another process rather than remaining idle during I/O time
Interleaving also prevents a long process from delaying other processes.
• If the computer system has multiple hardware processors (CPUs), parallel processing of
multiple processes is possible, as illustrated by processes C and D in Figure 20.1 Most of the
theory concerning concurrency control in databases is developed in terms of interleaved
concurrency.
• In a multiuser DBMS, the stored data items are the primary resources that may be accessed
concurrently by interactive users or application programs, which are constantly retrieving
information from and modifying the database.
20.1.2 Transactions, Database Items, Read and Write Operations, and DBMS
Buffers
• It can be either embedded within an application program using begin transaction and end
transaction statements or specified interactively via a high-level query language such as SQL.
• Transaction which does not update database are known as read only transactions.
• Transaction which does update database are known as read write transactions.
• A database is basically represented as a collection of named data items .The size of a data
item is called its granularity.
\ Dept. of CSE, SVIT
Page 26
MODULE –4 NOTES DBMS- BCS403
• A data item can be a database record, but it can also be a larger unit such as a whole disk block,
or even a smaller unit such as an individual field (attribute) value of some record in the
database.
• Decision of when to store a modified disk block is handled by recovery manager of the DBMS
in cooperation with operating system.
• When the buffers are all occupied a buffer replacement policy is used to choose one of the
buffers to be replaced. EG: LRU.
• A transaction includes read_item and write_item operations to access and update DB.
• The read-set of a transaction is the set of all items that the transaction reads.
• The write-set is the set of all items that the transaction writes.
• For example, the read-set of T1 in Figure 20.2 is {X, Y} and its write-set is also {X, Y}.
• Several problems can occur when concurrent transactions execute in an uncontrolled manner
Each record is stored for an airline flight which includes Number of reserved seats among other
information.
• Transaction T1
Transfers N reservations from one flight whose number of reserved seats is stored in the
database item named X to another flight whose number of reserved seats is stored in the database
item named Y.
• Transaction T2
Reserves M seats on the first flight (X).
• Occurs when two transactions that access the same DB items have their operations interleaved
in a way that makes the value of some DB item incorrect.
• Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose
that their operations are interleaved as shown in Figure below.
• Final value of item X is incorrect because T2 reads the value of X before T1 changes it in the
database, and hence the updated value resulting from T1 is lost.
• For example:
• The interleaving of operations shown in Figure is X = 84 because the update in T1 that removed
the five seats from X was lost.
• Occurs when one transaction updates a database item and then the transaction fails for some
reason.
• Meanwhile the updated item is accessed by another transaction before it is changed back to its
original value.
• Transaction T reads the same item twice and gets different values on each read, since the
item was modified by another transaction T` between the two reads.
• for example, if during an airline reservation transaction, a customer inquiry about seat
availability on several flights.
• When the customer decides on a particular flight, the transaction then reads the number of
seats on that flight a second time before completing the reservation, and it may end up
reading a different value for the item.
• Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either
1. All the operations in the transaction are completed successfully and their effect is
recorded permanently in the database or
2.The transaction does not have any effect on the database or any other transactions.
• In the first case, the transaction is said to be committed, whereas in the second case, the
transaction is aborted.
• If a transaction fails after executing some of its operations but before executing all of them, the
operations already executed must be undone and have no lasting effect.
Types of failures
1. A computer failure (system crash):
• A hardware, software, or network error occurs in the computer system during transaction
execution.
• Hardware crashes are usually media failures for example, main memory failure.
5. Disk failure:
• Some disk blocks may lose their data because of a read or write malfunction or because
of a disk read/write head crash.
• Whenever a failure of type 1 through 4 occurs, the system must keep sufficient
information to quickly recover from the failure.
• Disk failure or other catastrophic failures of type 5 or 6 do not happen frequently; if they
do occur, recovery is a major task.
READ or WRITE: specify read or write operations on the database items that are executed
as part of a transaction.
ROLLBACK: signals that the transaction has ended unsuccessfully, so that any changes or
effects that the transaction may have applied to the database must be undone.
• A transaction goes into active state immediately after it starts execution and can execute read
and write operations.
• At this end additional checks are done to see if the transaction can be committed or not.
If these checks are successful the transaction is said to have reached commit point and
enters committed state. All the changes are recorded permanently in the db.
• A transaction can go to the failed state if one of the checks fails or if the transaction is
aborted during its active state.
• The transaction may then have to be rolled back to undo the effect of its write operation.
• Terminated state corresponds to the transaction leaving the system. All the information about
the transaction is removed from system tables.
• Log or Journal keeps track of all transaction operations that affect the values of database
items.
• The log is kept on disk, so it is not affected by any type of failure except for disk or
catastrophic failure
• one (or more) main memory buffers hold the last part of the log file, so that log entries are
first added to the main memory buffer .
• When the log buffer is filled, or when certain other conditions occur, the log buffer is
appended to the end of the log file on disk.
• In addition, the log is periodically backed up to archival storage (tape) to guard against such
catastrophic failures.
• The following are the types of entries called log records that are written to the log file and the
corresponding action for each log record.
3. [read_item, T, X]. Indicates that transaction T has read the value of database item X.
4. [commit, T]. Indicates that transaction T has completed successfully, and affirms that its
effect can be committed (recorded permanently) to the database.
➢ A transaction T reaches its commit point when all its operations that access the database
have been executed successfully and the effect of all the transaction operations on the
database has been recorded in the log.
➢ Beyond the commit point, the transaction is said to be committed, and its effect is
assumed to be permanently recorded in the database. The transaction then writes an entry
[commit,T] into the log.
• Roll Back of transactions: Needed for transactions that have a [start_transaction,T] entry
into the log but no commit entry [commit,T] into the log.
• DBMS cache is divided into separate domains, each handle one type of disk pages
and replacements within each domain are handled via basic LRU page replacement.
• LRU is a static algorithm and does not adopt to dynamically changing loads because
the number of available buffers for each domain is predetermined.
• Group LRU adds dynamically load balancing feature since it gives each domain a
priority and selects pages from lower priority level domain first for replacement.
D Durability (or Permanency): if a transaction changes the database and is committed, the
changes must never be lost because of any failure.
• The isolation property is enforced by the concurrency control subsystem of the DBMS. If
every transaction does not make its updates (write operations) visible to other transactions
until it is committed, one form of isolation is enforced that solves the temporary update
problem and eliminates cascading rollbacks .
Schedule (or history): the order of execution of operations from all the various transactions.
• The transactions are interleaved, for each transaction Ti that participates in the schedule S, the
operations of Ti in S must appear in the same order in which they occur in Ti.
• The order of operations in S is considered to be a total ordering, meaning that for any two
operations in the schedule, one must occur before the other. It is possible theoretically to deal
with schedules whose operations form partial orders, but we will assume for now total ordering
of the operations in a schedule.
• A shorthand notation for describing a schedule uses the symbols b, r, w, e, c, and a for the
operations begin_transaction, read_item, write_item, end_transaction, commit, and abort,
respectively.
• The schedule in Figure lost update problem, which we shall call Sa, can be written as follows
in this notation:
• Similarly, the schedule for Figure temporary update problem, which we call Sb, can be written
as follows, if we assume that transaction T1 aborted after its read_item(Y) operation:
• Two operations in a schedule are said to conflict if they satisfy all three of the following
conditions:
Conflicting operations:
1. The operations in S are exactly those operations in T1, T2, ... , Tn, including a commit or abort
operation as the last operation for each transaction in the schedule.
2. For any pair of operations from the same transaction Ti, their relative order of appearance in S is
the same as their order of appearance in Ti.
3. For any two conflicting operations, one of the two must occur before the other in the schedule.
• Committed projection C(S) of a schedule S, which includes only the operations in S that belong
to committed transactions that is, transactions Ti whose commit operation ci is in S.
Example:
o Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;
o Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2;
• Cascadeless schedule:
---- One where every transaction reads only the items that are written by committed transactions.
• Strict Schedules:
------A schedule in which a transaction can neither read or write an item X until the last
transaction that wrote X has committed.
For example, consider schedule Sf : Sf : w1(X, 5); w2(X, 8); a1; Suppose that the value of X was
originally 9, which is the before image stored in the system log along with the w1(X, 5) operation.
\ Dept. of CSE, SVIT
Page 37
MODULE –4 NOTES DBMS- BCS403
If T1 aborts, as in Sf, the recovery procedure that restores the before image of an aborted write
operation will restore the value of X to 9, even though it has already been changed to 8 by
transaction T2, thus leading to potentially incorrect results. Although schedule Sf is cascade less, it is
not a strict schedule, since it permits T2 to write item X even though the transaction T1that last
wrote X had not yet committed (or aborted).
Schedules that are always considered to be correct when concurrent transactions are executing are
known as serializable schedules.
Suppose that two users for example, two airline reservations agents submit to the DBMS
transactions T1 and T2 at approximately the same time. If no interleaving of operations is permitted,
there are only two possible outcomes:
1. Execute all the operations of transaction T1 (in sequence) followed by all the operations of
transaction T2 (in sequence).
2. Execute all the operations of transaction T2 (in sequence) followed by all the operations of
transaction T1 (in sequence).
Serializable schedule:
A schedule S is serializable if it is equivalent to some serial schedule of the same n transactions.
Result equivalent:
Two schedules are called result equivalent if they produce the same final state of the database.
• For example, in Figure 20.6, schedules S1 and S2 will produce the same final database state if they
execute on a database with an initial value of X = 100; however, for other initial values of X, the
schedules are not result equivalent.
Conflict equivalent:
Two schedules are said to be conflict equivalent if the order of any two conflicting operations is the
same in both schedules.
Conflict serializable:
A schedule S is said to be conflict serializable if it is conflict equivalent to some serial schedule S’.
1. For each transaction Ti participating in schedule S, create a node labeled Ti in the precedence
graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X), create an
edge (Ti → Tj) in the precedence graph.
3. For each case in S where Tj executes a write_item(X) after Ti executes a read_item(X), create an
edge (Ti → Tj) in the precedence graph.
4. For each case in S where Tj executes a write_item(X) after Ti executes a write_item(X), create an
edge (Ti → Tj) in the precedence graph.
5. The schedule S is serializable if and only if the precedence graph has no cycles.
• The algorithm looks at only the read_item and write_item operations in a schedule to construct
a precedence graph (or serialization graph), which is a directed graph G = (N, E) that consists of
a set of nodes N = {T1, T2, ... , Tn } and a set of directed edges E = {e1, e2, ... , em }.
• Each edge ei in the graph is of the form (Tj → Tk ), 1 ≤ j ≤ n, 1 ≤ k ≤ n, where Tj is the starting
node of ei and Tk is the ending node of ei.
• If there is a cycle in the precedence graph, schedule S is not (conflict) serializable; if there is
no cycle, S is serializable.
• A cycle in a directed graph is a sequence of edges C = ((Tj → Tk), (Tk → Tp), ... , (Ti → Tj))
with the property that the starting node of each edge except the first edge is the same as the
ending node of the previous edge, and the starting node of the first edge is the same as the ending
node of the last edge.
• The edges (Ti → Tj) in a precedence graph can optionally be labeled by the name(s) of the data
item(s) that led to creating the edge. Figure shows such labels on the edges.
Fig: Constructing the precedence graphs for schedules A and D from fig 20.5 to test for conflict
serializability.
• Another example of serializability testing. (a) The READ and WRITE operations of three
transactions T1, T2, and T3.
• Schedule E is not serializable because the corresponding precedence graph has cycles. Schedule F
is serializable, and the serial schedule equivalent to F is shown in Figure 20.8(e).
1. The same set of transactions participates in S and S′, and S and S′ include the same operations
of those transactions.
2. For any operation ri(X) of Ti in S, if the value of X read by the operation has been written by
an operation wj(X) of Tj (or if it is the original value of X before the schedule started), the same
condition must hold for the value of X read by operation ri(X) of Ti in S′.
3. If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk must
also be the last operation to write item Y in S′.
• The idea behind view equivalence is that, as long as each read operation of a transaction reads the
result of the same write operation in both schedules, the write operations of each transaction must
produce the same results.
• The read operations are hence said to see the same view in both schedules.
• Condition 3 ensures that the final write operation on each data item is the same in both schedules,
so the database state should be the same at the end of both schedules.
• The definitions of conflict serializability and view serializability are similar if a condition known as
the constrained write assumption (or no blind writes) holds on all transactions in the schedule.
This condition states that any write operation wi(X) in Ti is preceded by a ri(X) in Ti and that the
value written by wi(X) in Ti depends only on the value of X read by ri(X).
• A blind write is a write operation in a transaction T on an item X that is not dependent on the old
value of X, so it is not preceded by a read of X in the transaction T.
• The definition of view serializability is less restrictive than that of conflict serializability under the
unconstrained write assumption, where the value written by an operation wi(X) in Ti can be
independent of its old value.
• This is possible when blind writes are allowed, and it is illustrated by the following schedule Sg of
three transactions T1: r1(X); w1(X); T2: w2(X); and T3: w3(X):
Sg: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;
• In Sg the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value of X.
The schedule Sg is view serializable, since it is view equivalent to the serial schedule T1, T2, T3.
However, Sg is not conflict serializable.
• An example is the type of transactions known as debit-credit transactions, those that apply
deposits and withdrawals to a data item whose value is the current balance of a bank account.
• The semantics of debit-credit operations is that they update the value of a data item X by
either subtracting from or adding to the value of the data item. Because addition and
subtraction operations are commutative that is, they can be applied in any order it is possible
to produce correct schedules that are not serializable.
• For example, consider the following transactions, each of which may be used to transfer an
amount of money between two bank accounts:
T1: r1(X); X :{equal} X − 10; w1(X); r1(Y); Y :{equal} Y + 10; w1(Y);
T2: r2(Y); Y :{equal} Y − 20; w2(Y); r2(X); X :{equal} X + 20; w2(X);
• Consider the following non serializable schedule Sh for the two transactions:
Sh: r1(X); w1(X); r2(Y); w2(Y); r1(Y); w1(Y); r2(X); w2(X);
• With the additional knowledge, or semantics, that the operations between each ri(I) and wi(I)
are commutative, the order of executing the sequences consisting of (read, update, write) is
not important as long as each (read, update, write) sequence by a particular transaction Ti on a
particular item I is not interrupted by conflicting operations.
• A single SQL statement is always considered to be atomic either it completes execution without
an error or it fails and leaves the database unchanged.
• Every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK.
• Every transaction has certain characteristics attributed to it and are specified by a SET
TRANSACTION statement in SQL
1. Dirty read. A transaction T1 may read the update of a transaction T2, which has not
yet committed. If T2 fails and is aborted, then T1 would have read a value that does not
exist and is incorrect.
3. Phantoms. A transaction T1 may read a set of rows from a table, perhaps based on
some condition specified in the SQL WHERE-clause. Now suppose that a transaction T2
inserts a new row that also satisfies the WHERE-clause condition used in T1, into the
table used by T1.
If T1 is repeated, then T1 will see a phantom, a row that previously did not exist.
The transaction consists of first inserting a new row in the EMPLOYEE table and then updating the
salary of all employees who work in department 2 . If an error occurs on any of the SQL statements,
the entire transaction is rolled back This implies that any updated salary (by this transaction) would
be restored to its previous value and that the newly inserted row would be remove.