2.relational Database
2.relational Database
2.relational Database
Relational Model
• Tables, Columns and Rows
• Relationships and Keys
• Data Integrity
• Normalization
Relational Model
• Salient features of a relational table
• Values are atomic. A special null value is used to represent
values that are unknown or inapplicable to certain tuples.
• Column values are of the same kind (Domain)
• Each Row is unique
• Sequence of columns is insignificant
• Sequence of rows is insignificant
• Each column must have a unique name
• Relationships and Keys
• Keys - Fundamental to the concept of relational databases
• Relationship - An association between two or more tables
defined by means of keys
CHARACTERISTICS OF RELATIONS
• Notation:
– We refer to component values of a tuple t by
t[Ai] = vi (the value of attribute Ai for tuple t).
– Similarly, t[Au, Av, ..., Aw] refers to the subtuple
of t containing the values of attributes Au,
Av, ..., Aw, respectively.
Relational Integrity Constraints
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
Keys
• Let K R
• K is a superkey of R if values for K are sufficient to identify a unique tuple
of each possible relation r(R)
– by “possible r ” we mean a relation r that could exist in the enterprise
we are modeling.
– Example: {customer_name, customer_street} and
{customer_name}
are both superkeys of Customer, if no two customers can possibly
have the same name
• In real life, an attribute such as customer_id would be used
instead of customer_name to uniquely identify customers, but we
omit it to keep our examples small, and instead assume customer
names are unique.
Keys (Cont.)
• K is a candidate key if K is minimal
Example: {customer_name} is a candidate key for Customer, since
it is a superkey and no subset of it is a superkey.
• Primary key: a candidate key chosen as the principal means of
identifying tuples within a relation
– Should choose an attribute whose value never, or very rarely,
changes.
– E.g. email address is unique, but may change
Key Constraints
Entity Integrity
• Relational Database Schema: A set S of relation schemas
that belong to the same database. S is the name of the
database.
S = {R1, R2, ..., Rn}
• Entity Integrity: The primary key attributes PK of each relation
schema R in S cannot have null values in any tuple of r(R). This
is because primary key values are used to identify the
individual tuples.
t[PK] null for any tuple t in r(R)
• Difference
• Product
• Division
DIFFERENCE
Query Languages
• Language in which user requests information from the database.
• Categories of languages
– Procedural
– Non-procedural, or declarative
• “Pure” languages:
– Relational algebra
– Tuple relational calculus
– Domain relational calculus
• Pure languages form underlying basis of query languages that
people use.
Relational Algebra
• The basic set of operations for the relational model is
known as the relational algebra. These operations enable a
user to specify basic retrieval requests.
• Procedural language
• Six basic operators
– select:
– project:
– union:
– set difference: –
– Cartesian product: x
– rename:
• The operators take one or two relations as inputs and produce
a new relation as a result. Hence, it is closed.
Select Operation – Example
• Selection
• SELECT operation is used to select a subset of the tuples from a
relation that satisfy a selection condition. It is a filter that keeps only
those tuples that satisfy a qualifying condition – those satisfying the
condition are selected while others are discarded.
A B C D E
1 A 212 Y 2
2 C 45 N 84
3 B 8656 N 4
4 D 324 N 56
5 C 5656 Y 34
6 A 445 N 4
7 B 546 Y 55
Select
Relation r
Operation – Example
A B C D
1 7
5 7
12 3
23 10
1 7
23 10
Select Operation
• Notation: p(r)
• p is called the selection predicate
• Defined as:
• Example of selection:
branch_name=“Perryridge”(account)
Select Operation
A B C D E
1 A 212 Y 2
2 C 45 N 84
3 B 8656 N 4
4 D 324 N 56
5 C 5656 Y 34
6 A 445 N 4
7 B 546 Y 55
Project Operation
• Example: To eliminate the branch_name attribute of account
account_number, balance (account)
• Example: To list each employee’s first and last name and salary, the
following is used: lname, fname, salary (employee)
• The general form of the project operation is <attribute list>(R) where
(pi) is the symbol used to represent the project operation and <attribute
list> is the desired list of attributes from the attributes of relation R.
10 1
20 1
30 1
40 2
A,C (r) A C A C
1 1
1 = 1
1 2
2
Relational Algebra
Unary Relational Operations (cont.)
Unary Relational Operations (cont.)
–
<list2> R)<list1> Ras long
<list1>
as<list2>contains theattributes in<list2>
Rename Operation
• Rename Operation
We may want to apply several relational algebra operations one after the
other. Either we can write the operations as a single relational algebra
expression by nesting the operations, or we can apply one operation at a time
and create intermediate result relations. In the latter case, we must give
names to the relations that hold the intermediate results.
Example: To retrieve the first name, last name, and salary of all employees
who work in department number 5, we must apply a select and a project
operation. We can write a single relational algebra expression as follows:
S (B1, B2, …, Bn ) ( R) is a renamed relationS based on R with column names B1, B1, …..Bn
S ( R) is a renamed relationS based on R (which does not specify column names).
(B1, B2, …, Bn ) ( R) is a renamed relationwith column names B1, B1, …..Bn which does
• Type Compatibility
– The operand relations R1(A1, A2, ..., An) and R2(B1,
B2, ..., Bn) must have the same number of attributes, and
the domains of corresponding attributes must be
compatible; that is, dom(Ai)=dom(Bi) for i=1, 2, ..., n.
1 3
r s: 1
r s
2
1
3
STUDENTINSTRUCTOR
Intersection Operation
• INTERSECTION OPERATION
The result of this operation, denoted by R S, is a relation that includes all
tuples that are in both R and S. The two operands must be "type compatible"
Defined as r s = { t | t r and t s }
• Assume:
– r, s have the same attribute
– attributes of r and s are compatible
• Note: r s = r – (r – s)
Intersection Operation – Example
• INTERSECTION Example
STUDENT INSTRUCTOR
• Relation r, s:
A B A B
1 2
2 3
1
r s
• rs A B
2
• Notation r – s
• Defined as:
r – s = {t | t r and t s}
• Set differences must be taken between compatible relations.
– r and s must have the same arity
– attribute domains of r and s must be compatible
Set
• Relations r, s:
Difference – Example
A B A B
1 2
2 3
1 s
r
r – s:
A B
1
1
Set Difference – Example
STUDENT-INSTRUCTOR
INSTRUCTOR-STUDENT
Cartesian Operation
• CARTESIAN (or cross product) Operation
– This operation is used to combine tuples from two relations in a
combinatorial fashion. In general, the result of R(A1, A2, . . ., An) x S(B1,
B2, . . ., Bm) is a relation Q with degree n + m attributes Q(A1, A2, . . .,
An, B1, B2, . . ., Bm), in that order. The resulting relation Q has one tuple
for each combination of tuples—one from R and one from S.
– Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS tuples,
then
| R x S | will have nR * nS tuples.
– The two operands do NOT have to be "type compatible”
– Assume that attributes of r(R) and s(S) are disjoint. (That is, R S =
).
– If attributes of r(R) and s(S) are not disjoint, then renaming must be
used.
Cartesian-Product – Example
Relations r, s:
A B C D E
1 10 a
10 a
2 20 b
r 10 b
s
r x s:
A B C D E
1 10 a
1 10 a
1 20 b
1 10 b
2 10 a
2 10 a
2 20 b
2 10 b
Example:
FEMALE_EMPS SEX=’F’(EMPLOYEE)
EMPNAMES FNAME, LNAME, SSN (FEMALE_EMPS)
The most common use of join involves join conditions with equality
comparisons only. Such a join, where the only comparison operator used is =,
is called an EQUIJOIN. In the result of an EQUIJOIN we always have one or
more pairs of attributes (whose names need not be identical) that have
identical values in every tuple.
The JOIN seen in the previous example was EQUIJOIN.
A B C D B D E
1 a 1 a
2 a 3 a
4 b 1 a
1 a 2 b
2 b 3 b
r s
r s
A B C D E
1 a
1 a
1 a
1 a
2 b
Additional Relational Operations
• The OUTER JOIN Operation
– In NATURAL JOIN tuples without a matching (or related) tuple are
eliminated from the join result. Tuples with null in the join attributes are
also eliminated. This amounts to loss of information.
– A set of operations, called outer joins, can be used when we want to
keep all the tuples in R, or all those in S, or all those in both relations in
the result of the join, regardless of whether or not they have matching
tuples in the other relation.
– The left outer join operation keeps every tuple in the first or left relation R
in R S; if no matching tuple is found in S, then the attributes of S
in the join result are filled or “padded” with null values.
– A similar operation, right outer join, keeps every tuple in the second or
right relation S in the result of R S.
– A third operation, full outer join, denoted by keeps all tuples in
both the left and the right relations when no matching tuples are found,
padding them with null values as needed.
OUTER JOIN (cont.)
• The outer union operation was developed to take the union of tuples from
two relations if the relations are not union compatible.
• This operation will take the union of tuples in two relations R(X, Y) and
S(X, Z) that are partially compatible, meaning that only some of their
attributes, say X, are union compatible.
• The attributes that are union compatible are represented only once in the
result, and those attributes that are not union compatible from either
relation are also kept in the result relation T(X, Y, Z).
• Example: An outer union to two schemas STUDENT(Name, SSN,
Department, Advisor) and INSTRUCTOR(Name, SSN, Department,
Rank).
– Tuples from the two relations are matched based on having the same
combination of values of the shared attributes—Name, SSN, Department. If a
student is also an instructor, both Advisor and Rank will have a value;
otherwise, one of these two attributes will be null.
STUDENT_OR_INSTRUCTOR (Name, SSN, Department, Advisor,
Rank)
Outer Join – Example
• Relation loan
Relation borrower
customer_name loan_number
Jones L-170
Smith L-230
Hayes L-155
Outer Join – Example
• Join
loan borrower
loan borrower
loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
Outer Join – Example
Right Outer Join
loan borrower
loan borrower
loan_number branch_name amount customer_name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
L-155 null null Hayes
Null Values
• It is possible for tuples to have a null value, denoted by null, for
some of their attributes
• null signifies an unknown value or that a value does not exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values (as in SQL)
• For duplicate elimination and grouping, null is treated like any
other value, and two nulls are assumed to be the same (as in
SQL)
Null Values
• Comparisons with null values return the special truth value: unknown
– If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
– OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
– AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
– NOT: (not unknown) = unknown
– In SQL “P is unknown” evaluates to true if predicate P evaluates to
unknown
• Result of select predicate is treated as false if it evaluates to unknown
Division Operation
• Notation: r s
• Suited to queries that include the phrase “for all”.
• Let r and s be relations on schemas R and S
respectively where
– R = (A1, …, Am , B1, …, Bn )
– S = (B1, …, Bn)
The result of r s is a relation on schema
R – S = (A1, …, Am)
r s = { t | t R-S (r) u s ( tu r ) }
Where tu means the concatenation of tuples t and u to
produce a single tuple
Division Operation – Example
Relations r, s:
A B B
1
1
2
3 2
1 s
1
1
3
4
6
1
2
r s: A r
Another Division Example
Relations r, s:
A B C D E D E
a a 1 a 1
a a 1 b 1
a b 1 s
a a 1
a b 3
a a 1
a b 1
a b 1
r
r s:
A B C
a
a
Division Operation (Cont.)
• Property
– Let q = r s
– Then q is the largest relation satisfying q x s r
• Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S R
To see why
R-S,S (r) simply reorders attributes of r
A=C(r x s) A B C D E
1 10 a
2 10 a
2 20 b
Banking Example
branch (branch_name, branch_city, assets)
• Each Fi is either
– the I th attribute of r, if the I th attribute is not updated, or,
– if the attribute is to be updated Fi is an expression, involving
only constants and the attributes of r, which gives the new
value for the attribute
Update Examples
• Make interest payments by increasing all balances by 5 percent.