Module2-DBMS
Module2-DBMS
System (DBMS)
Dr. Amit K. (AI Researcher)
It represents collection of entities and describes It represent data in the form of tables and describes
relationship between them. relationship between them
It consists of components like Entity, Entity Type, It consists of components like domain, attributes,
Entity Set tuples.
It is easy to understand the relationship between It is less easy to derive the relationship between
entities different tables.
Key
• Set of attribute by which each tuple of a relational table can be
uniquely identified.
• Types of Keys:
❑Super Key
❑Candidate Key
❑Primary Key
❑Alternate Key
Keys
• SUPER KEYS:
Any set of attributes that allows us to identify unique rows (tuples) in a
given relation are known as super keys. Or Superset of any candidate
key is a super key.
• Candidate Key: A super key whose proper subset is not a super key is the
candidate key. Also we can say, Minimal Super key is the candidate Key.
• Primary Key: A Candidate Key whose value can not be null is the primary
key.
• Alternate Key: Keys which are candidate key and not primary key are the
alternate keys.
Super Key and Candidate Key
X A B C
1 2 2 5
2 2 3 3
3 1 4 4
4 1 1 5
5 3 6 7
6 3 6 6
7 4 Null 8
Referenced Relation (Employee) Referencing Relation (works on) Referenced Relation (Department)
Employee Works
Department
On
EMP _ID and DEPT ID in the “works on” relation are the foreign keys.
Constraints in Relational Model
• While designing Relational Model, we define some conditions which
must hold for data present in database are called Constraints. These
constraints are checked before performing any operation (insertion,
deletion and updation) in database. If there is a violation in any of
constrains, operation will fail.
• Domain Constraints: These are attribute level constraints. An
attribute can only take values which lie inside the domain range. e.g,;
If a constrains AGE>0 is applied on STUDENT relation, inserting
negative value of AGE will result in failure.
• Key Integrity: Every relation in the database should have atleast one
set of attributes which defines a tuple uniquely. Those set of
attributes is called key. e.g.; ROLL_NO in STUDENT is a key. No two
students can have same roll number. So a key has two properties:
• It should be unique for all tuples.
• It can’t have NULL values.
Referential Integrity:
• When one attribute of a relation can only take values from other
attribute of same relation or any other relation, it is called referential
integrity. Let us suppose we have 2 relations.
• EMP ID of Works ON can only take the values which are present in
EMP ID of Employee which is called referential integrity constraint.
The relation which is referencing to other relation is called
REFERENCING RELATION (“Works On” in this case) and the relation to
which other relations refer is called REFERENCED RELATION
(Employee in this case).
Foreign Key
Set of attribute that references to primary key of the Same or some other relation
Referenced Relation Employee) Referencing Relation “works on” Referenced Relation “ Department”
Employee Works
Department
On
EMP _ID and DEPT ID in the “works on” relation are the foreign keys.
ANOMALIES
• An anomaly is an irregularity, or something which deviates from the
expected or normal state. When designing databases, we identify
three types of anomalies: Insert, Update and Delete.
• Insertion Anomaly in Referencing Relation:
We can’t insert a row in REFERENCING RELATION if referencing
attribute’s value is not present in referenced attribute value.
EMployee
M Department
We can not merge all tables into a single table as all attribute will have null values thus we can not make ant attribute a
primary key
Minimum 2 Tables required when both entities
are partially participated in 1:1 relationship
Department + M
EMPLOYEE
Dept DEPt Name SInce EMP ID
EMP_ID EMP Name ID
CSE Computer 2001 1
1 ram Scince
2 shyam Engineering
3 mohan ME Machanical 2002 2
ENgineering
4 rohan
CIVIL Civil 2003 3
ENgineering
EC Electronics Null Null
and
Communicatio
n
We can merge either of the participated entity with the relation. Here the department is merged with
the relation.
Minimum 2 tables are required
Employee + M
Department
EMP_ID EMP SInce Dept ID
Name Dept ID DEPt Name
1 ram 2001 CSE CSE Computer Scince Engineering
2 shyam 2002 ME ME Machanical ENgineering
3 mohan 2003 CIVIL CIVIL Civil ENgineering
4 rohan NULL EC Electronics and
Communication
Case 2:One to One relationship with at least
one total participation
EMP Since DEPT
Dept_ID Name
EMP_ID Name
EMployee
M Department
Employee
M Department
EMP_I EMP
D Name DEPT ID SInce DEPT ID DEPT Number of
EM NAme
PID Emplyee
1 ram
1 CSE 2001 CSE
2 shyam
2 ME 2002 ME
3 mohan
3 CIVIL 2003 CIVIL
4 rohan
4 ME 2000 Architectur
5 sohan
e
Minimum 2 tables are required in M:1 relationship
If we merge the side of “many “ with the relationship then repetition will not occur, therefore we can merge the
side of ‘many” with the relationship in case of many to one.
Repetition will occur if we merge side of “one” with a relationship,
therefore we can’t merge the side of “many” with the relationship.
For Student(SID, Name), SID is the primary key. For Course(CID, C_name ), CID is the primary key
Now the question is, what should be the primary key for Enroll? Should it be SID or CID or both
combined into one.
Question
Now the question is, what should be the primary key for Enroll? Should it be SID or CID or both
combined into one.
Question
N
Employee Dependency Dependent
EMP_ID
Dependent Name
Works
Phone Number On
N Dependent Address
Since
Department
Department ID
The basic rule for converting the ER diagrams into tables is
Referencing Relation
Referenced Relation
Emp ID Phone
Emp ID Name Age DOB Numbe
1 SOhan 34 1986 r
1 463245
2 Mohan 40 1982
1 769313
3 ROhan 36 1983
1 421456
2 674217
2 43245
Example of Weak Entity:
Dependency Dependent (weak Entity)
Employee
Emp Name Age DOB Phone Emp ID Dependent Dependent Age Workin
ID Number Name Name g
1 John 34 198 463245 1 SOhan Status
6 2 SOhan SOhan 34 Workin
2 Tom 34 198 769313 g
1 Mohan
6 SOhan 30 Not
3 Alice 34 198 421456 working
6 Mohan 36 Not
working
Employee Dependency
Employee-Phn No Dependent
EMPID Emp ID ( foreign Key) Dependent Name
Emp ID ( foreign Key) Emp Name Dependent Name Age
Age Working status
Phn No Street
City
Works On
Emp Table
ID (foreign Key) Department
New
Dept ID (foreign key)
Since Dept _ID
Number of Employee
Project
Project ID
Project Start Date
Employee Dependent
Employee-project
Allotment Dependent Name
Employee-Phn No Emp ID ( foreign Key)
EMPID Age
Emp ID (foreign Key) Working Status
Emp Name
Age Phone Number
Street
City
Project ID (foreign Key)
Works On Department
Emp ID (foreign Key) Dept _ID
Dept ID (foreign key) Number of Employee
Since
Relation Algebra
• The basic set of operations for the formal relational model is the relational
algebra.
• Two types:
• Selection (σ)
• Projection (π)
Selection (σ)
• choose a subset of the tuples from a relation that satisfies a selection condition.
• For example, to select the EMPLOYEE tuples whose department is 4, or those whose
salary is greater than $30,000, we can individually specify each of these two conditions
with a SELECT operation as follows:
σDno=4(EMPLOYEE)
σSalary>30000(EMPLOYEE)
• In general, the SELECT operation is denoted by
σ<selection condition>(R)
• where the symbol σ (sigma) is used to denote the SELECT operator and the selection
condition is a Boolean expression (condition) specified on the attributes of relation R.
Notice that R is generally a relational algebra expression whose result is a relation—the
simplest such expression is just the name of a database relation. The relation resulting from
the SELECT operation has the same attributes as R.
Selection (σ)
• The Boolean expression specified in <selection condition> is made up of a number of
clauses of the form
<attribute name> <comparison op> <constant value>
or
<attribute name> <comparison op> <attribute name
• where <attribute name> is the name of an attribute of R, <comparison op> is normally
one of the operators {=, <, ≤, >, ≥, ≠}, and <constant value> is a constant value from the
attribute domain. Clauses can be connected by the standard Boolean operators and, or,
and not to form a general selection condition. For example, to select the tuples for all
employees who either work in department 4 and make over $25,000 per year, or work in
department 5 and make over $30,000, we can specify the following SELECT operation:
• where π (pi) is the symbol used to represent the PROJECT operation, and
<attribute list> is the desired sublist of attributes from the attributes of relation R.
• The result of the PROJECT operation has only the attributes specified in
<attribute list> in the same order as they appear in the list. Hence, its degree is
equal to the number of attributes in <attribute list>.
Projection (π)
• If the attribute list includes only nonkey attributes of R, duplicate tuples are
likely to occur.
• The PROJECT operation removes any duplicate tuples, so the result of the
PROJECT operation is a set of distinct tuples, and hence a valid relation. This is
known as duplicate elimination.
1. DEP5_EMPS ← σDno=5(EMPLOYEE)
where,
The symbol ρ (rho) is used to denote the RENAME operator, S is the new relation
name, and B1, B2, … , Bn are the new attribute names.
The first expression renames both the relation and its attributes, the second renames
the relation only, and the third renames the attributes only.
Relational Algebra Operations from Set
Theory
• UNION, INTERSECTION, and MINUS Operations.
• For example, to retrieve the Social Security numbers of all employees who
either work in department 5 or directly supervise an employee who works in
department 5, we can use the UNION operation as follows:
1. DEP5_EMPS ← σDno=5(EMPLOYEE)
2. RESULT1 ← πSsn(DEP5_EMPS)
3. RESULT2(Ssn) ← πSuper_ssn(DEP5_EMPS)
• Two relations R(A1, A2, … , An) and S(B1, B2, … , Bn) are said to be union
compatible (or type compatible) if they have the same degree n and if dom(Ai)
= dom(Bi) for 1 ≤ i ≤ n.
• This means that the two relations have the same number of attributes and each
corresponding pair of attributes has the same domain.
UNION, INTERSECTION, and MINUS
Operations
• UNION: The result of this operation, denoted by R ∪ S, is a relation that
includes all tuples that are either in R or in S or in both R and S. Duplicate
tuples are eliminated.
• This set operation produces a new element by combining every member (tuple)
from one relation (set) with every member (tuple) from the other relation (set).
• The result of R(A1, A2, … , An) × S(B1, B2, … , Bm) is a relation Q with degree n +
m attributes Q(A1, A2, … , An, B1, B2, … , Bm), in that order.
• The resulting relation Q has one tuple for each combination of tuples—one from R
• and one from S. Hence, if R has n R tuples (denoted as |R| = nR), and S has nS tuples,
then R × S will have nR * nS tuples.
The CARTESIAN PRODUCT (CROSS PRODUCT)
Operation
1. FEMALE_EMPS ← σSex=‘F’(EMPLOYEE)
4. ACTUAL_DEPENDENTS ← σSsn=Essn(EMP_DEPENDENTS)
• Generalized Projection:
• Extends the projection operation by allowing functions of attributes to be
included in the projection list.
• The generalized form can be expressed as:
• πF1, F2, ..., Fn (R)
Salary (EMPLOYEE))
Aggregate Functions (ℑ) and Grouping
<grouping attributes> ℑ <function list> (R)
• where <grouping attributes> is a list of attributes of the relation specified in R,
and <function list> is a list of (<function> <attribute>) pairs.
• <function> is one of the allowed functions—such as SUM, AVERAGE,
MAXIMUM, MINIMUM, COUNT
• <attribute> is an attribute of the relation specified by R.
For example, to retrieve each department number, the number of employees in the
department, and their average salary, while renaming the resulting attributes as:
ρR(Dno, No_of_employees, Average_sal) (Dno ℑ COUNT Ssn, AVERAGE Salary (EMPLOYEE))
Dno ℑ COUNT Ssn, AVERAGE Salary(EMPLOYEE)
ℑ COUNT Ssn, AVERAGE Salary(EMPLOYEE)
Relational Calculus
• Another formal query language for the relational model
• Tuple relational calculus
• Domain relational calculus
• We write one declarative expression to specify a retrieval request; hence, there
is no description of how, or in what order, to evaluate a query.
• Nonprocedural language
• Expressive power of the languages is identical.
• Relational query language L is considered relationally complete if we can
express in L any query that can be expressed in relational calculus.
The Tuple Relational Calculus
Tuple Variables and Range Relations
• A simple tuple relational calculus query is of the form:
{t | COND(t)}
• where t is a tuple variable and COND(t) is a conditional (Boolean) expression
involving t that evaluates to either TRUE or FALSE for different assignments of
tuples to the variable t.
• To find all employees whose salary is above $50,000, we can write the
following tuple calculus expression:
{t | EMPLOYEE(t) AND t.Salary>50000}
The condition EMPLOYEE(t) specifies that the range relation of tuple variable t is
EMPLOYEE.
• To retrieve only some of the attributes—say, the first and last names:
t.Fname, t.Lname | EMPLOYEE(t) AND t.Salary>50000}
The Existential (∀) and Universal (∃)
Quantifiers
• ∃ t ∈ R (Q(t)) = ”there exists” a tuple in t in relation R such that predicate Q(t)
is true.
• List the birth date and address of the employee whose name is ‘John
B. Smith’.