Relational Algebra Class Note

Download as pdf or txt
Download as pdf or txt
You are on page 1of 49

Relational Algebra

Unary Relational Operations


SELECT and PROJECT
SELECT:
• The SELECT operation is used to choose a subset of the tuples
from a relation that satisfies a selection condition
• A filter that keeps only those tuples that satisfy a qualifying
condition
• Restrict the tuples in a relation to only those tuples that satisfy
the condition
• A horizontal partition of the relation into two sets of tuples
• The symbol σ (sigma) is used to define the SELECT operation
Unary Relational Operations
SELECT

SELECT:

• σ<selection condition>(R)
• Sigma is the select statement
• Selection condition :
• Boolean expression (condition)
• Specified on the attributes of relation R
Unary Relational Operations
SELECT
SELECT:

• Boolean expression (condition)


• <attribute name> <comparison op> <constant value>
• <attribute name> <comparison op> <attribute name>
• comparison op: {=, <, ≤, >, ≥, ≠},
• Clauses can be connected by the standard Boolean oper-
ators and, or, and not to form a general selection condition
Unary Relational Operations
SELECT

SELECT * σDno=4(EMPLOYEE)
FROM EMPLOYEE

WHERE Dno=5
Unary Relational Operations
SELECT

SELECT * σSalary>=5000(EMPLOYEE)
FROM EMPLOYEE

WHERE Salary>=5000
Unary Relational Operations
SELECT
Select the tuples for all employees who either work in department 4 and
make over $25,000 per year, or work in department 5 and make over
$30,000,

SELECT *

FROM EMPLOYEE

WHERE (Dno=4 AND Salary >25000) OR (Dno=5 AND Salary >30000)


Unary Relational Operations
SELECT
Select the tuples for all employees who either work in department 4 and
make over $25,000 per year, or work in department 5 and make over
$30,000,

σ(Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000)


(EMPLOYEE)
Unary Relational Operations
PROJECT
• Selects certain columns from the table and discards the other columns.

• If interested in only certain attributes of a relation,

• Use the PROJECT operation to project the relation over these


attributes only

• The result of the PROJECT operation:

• A vertical partition of the relation into two relations:

• One has the needed columns (attributes) and contains the result of the
operation, and

• The other contains the discarded columns.


Unary Relational Operations
PROJECT

• The general form of the PROJECT operation is:

• π<attribute list>(R)

• π (pi) is the symbol used to represent the PROJECT

• The result of the PROJECT operation:


• Only the attributes specified in <attribute list>
• In the same order as they appear in the list.
• Its degree is equal to the number of attributes in <attribute list>
Unary Relational Operations
PROJECT

• List each employee’s first and last name and salary:

πLname, Fname, Salary(EMPLOYEE)


Unary Relational Operations
PROJECT

• If the attribute list includes only nonkey attributes of R:


• Duplicate tuples are likely to occur.

• The PROJECT operation removes any duplicate tuples,

• The result of the PROJECT operation is a set of distinct tuples

• This is known as duplicate elimination


Unary Relational Operations
PROJECT

• Example:

•πSex, Salary(EMPLOYEE)
• <‘F’, 25000> appears only once
• Even though this combination of values
appears twice in the EMPLOYEE relation. s
Unary Relational Operations
PROJECT

Example:

πSex, Salary(EMPLOYEE)
IN SQL it is:
SELECT DISTINCT Sex,Salary
FROM EMPLOYEE
Unary Relational Operations
PROJECT

• The number of tuples in a relation resulting from a PROJECT


operation is:
• Always less than or equal to the number of tuples in R.
• If the projection list is a superkey of R:
• The resulting relation has the same number of tuples as R
Sequences of Operations and the
RENAME Operation

• For most queries:


• We need to apply several relational algebra operations one after
the other.
• We can write the operations as a single relational algebra
expression by nesting the operations, or
• We can apply one operation at a time and create intermediate
result relations.
• Intermediate results need to be renamed
Sequences of Operations and the
RENAME Operation

• Write the operations as a single relational algebra expression by


nesting the operations

•π Fname, Lname, Salary( σDno=5(EMPLOYEE))


• IS equivalent in SQL to:
SELECT DISTINCT Fname, Lname, Salary
FROM EMPLOYEE
WHERE Dno=5
Sequences of Operations and the
RENAME Operation
!

• Apply one operation at a time and create intermediate


result relations.

TEMP ← σDno=5(EMPLOYEE)
R(First_name, Last_name, Salary) ← πFname, Lname,
Salary(TEMP)
Sequences of Operations and the
RENAME Operation

ρS(B1, B2, ..., Bn)(R) Rename relation and attributes


or
ρS(R) Rename relation only
or
ρ(B1, B2, ..., Bn)(R) Rename attributes only
IF R(A1,A2,…)
S(B1,B2,…) At the same order
Sequences of Operations and the
RENAME Operation
!
• Apply one operation at a time and create intermediate result relations.

σDno=5(EMPLOYEE)
TEMP ←

ρ S(First_name, Last_name, Salary) ← πFname, Lname,

Salary(TEMP)
!
!
Sequences of Operations and the
RENAME Operation
!
• Apply one operation at a time and create intermediate result relations.

SELECT E.Fname AS First_name, E.Lname AS Last_name, E.Salary AS


Salary
FROM EMPLOYEE AS E
WHERE E.Dno=5
The UNION, INTERSECTION, and
MINUS Operations
!
• UNION:
• Denoted by R ∪ S,
• The results is a relation that includes all tuples that are either in R
or in S or in both R and S.
• Duplicate tuples are eliminated.
• Example:
• Retrieve the Social Security numbers of all employees who :
• Either work in department 5 or
• Directly supervise an employee who works in department 5
!
!
DEP5_EMPS ← σDno=5(EMPLOYEE)
RESULT1 ← πSsn(DEP5_EMPS)
RESULT2(Ssn) ← πSuper_ssn(DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
SELECT ssn from employee where dno=5
UNION SELECT superssn from employee where dno=5
The UNION, INTERSECTION, and
MINUS Operations
!
• INTERSECTION: (INTERSECT in SQL)
• Denoted by R ∩ S,
• The result of this operation is a relation that
includes all tuples that are in both R and S.
• SET DIFFERENCE (or MINUS):
• Denoted by R – S,
• The result of this operation is a relation that
includes all tuples that are in R but not in S.
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!

• known as CROSS PRODUCT or CROSS JOIN


• Denoted by ×

• A binary set operation
• Produces a new element by:
• Combining every member (tuple) from one relation (set) with
• Every member (tuple) from the other relation (set).
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!

• The result of R(A , A , ..., A ) × S(B , B , ..., B ) is


1 2 n 1 2 m

• A relation Q with degree n + m attributes Q(A , A , ..., A , B ,


1 2 n 1

B , ..., B ), in that order.


2 m

• The resulting relation Q has one tuple for each combination of


tuples
• One from R and one from S.
• If R has nR tuples (denoted as |R| = nR), and
• S has nS tuples, then

• R × S will have nR * nS tuples.


The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!

• The CARTESIAN PRODUCT operation applied by itself is generally


meaningless.
• It is mostly useful when followed by a selection that matches
values of attributes coming from the component relations.
• For example, suppose that we want to retrieve a list of names of
each female employee’s dependents.
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!

• FEMALE_EMPS ← σSex=‘F’(EMPLOYEE)
EMPNAMES ← πFname, Lname, Ssn(FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPENDENTS ← σSsn=Essn(EMP_DEPENDENTS)
RESULT ← πFname, Lname, Dependent_name(ACTUAL_DEPENDENTS)
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!
The CARTESIAN PRODUCT (CROSS
PRODUCT) Operation
!
!

• In SQL:
• CARTESIAN PRODUCT can be realized by using the CROSS
JOIN option in joined tables
• OR,
• If there are two tables in the WHERE clause and
• There is no corresponding join condition in the query,
• The result will also be the CARTESIAN PRODUCT of the two
tables
The JOIN operation

• Denoted by
• Used to combine related tuples from two relations into single
“longer” tuples.
• This operation is very important for any relational database with
more than a single relation because
• It allows us to process relationships among relations.
!
The JOIN operation
• Suppose that we want to retrieve the name of the manager of
each department.
• To get the manager’s name:
• Combine each department tuple with the employee tuple
whose Ssn value matches the Mgr_ssn value in the
department tuple.
• We do this by:
• Using the JOIN operation and then
• Projecting the result over the necessary attributes
The JOIN operation

• DEPT_MGR ← DEPARTMENT Mgr_ssn=Ssn EMPLOYEE


RESULT ← πDname, Lname, Fname(DEPT_MGR)
The JOIN operation

• The JOIN operation can be specified as a CARTESIAN


PRODUCT operation followed by a SELECT operation.
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPENDENTS ← σSsn=Essn(EMP_DEPENDENTS)
The EQUIJOIN & NATURAL JOIN

• The most common use of JOIN involves join conditions with


equality comparisons only.
• Such a JOIN, where the only comparison operator used is =, is
called an EQUIJOIN.
The EQUIJOIN & NATURAL JOIN

• NATURAL JOIN

• denoted by *

• Was created to get rid of the second (superfluous) attribute in an EQUIJOIN


con- dition.

• The standard definition of NATURAL JOIN requires that the two join attrib-
utes (or each pair of join attributes) have the same name in both relations.

• If this is not the case, a renaming operation is applied first.


The EQUIJOIN & NATURAL JOIN
• Suppose we want to combine each PROJECT tuple with the DEPARTMENT tuple
that controls the project

• First we rename the Dnumber attribute of DEPARTMENT to Dnum

• So that it has the same name as the Dnum attribute in PROJECT

• Then we apply NATURAL JOIN:

PROJ_DEPT ← PROJECT * ρ(Dname, Dnum, Mgr_ssn, Mgr_start_date)(DEPARTMENT)

• The attribute Dnum is called the join attribute for the NATURAL JOIN operation,

• because it is the only attribute with the same name in both relations.

• The resulting relation is illustrated in Figure(a).

• In the PROJ_DEPT relation, each tuple combines a PROJECT tuple with the
DEPARTMENT tuple for the department that controls the project, but only one join
attribute value is kept.
The EQUIJOIN & NATURAL JOIN
• If the attributes on which the natural join is specified already have the same
names in both relations, renaming is unnecessary.

• For example, to apply a natural join on the Dnumber attributes of


DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write

DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS



• The resulting relation is shown in Figure(b)

• Combines each department with its locations and has one tuple for each location.

• In general, the join condition for NATURAL JOIN is constructed by equating each
pair of join attributes that have the same name in the two relations and combining
these conditions with AND.

• There can be a list of join attributes from each relation, and each corresponding
pair must have the same name.
The DIVISION Operation

• Denoted by ÷,

• Is useful for a special kind of query that sometimes occurs in database


applications.

• Example: Retrieve the names of employees who work on all the projects that
‘John Smith’ works on.

1. Retrieve the list of project numbers that ‘John Smith’ works on in the
intermediate relation SMITH_PNOS:
SMITH ← σFname=‘John’ AND Lname=‘Smith’(EMPLOYEE)

SMITH_PNOS ← πPno(WORKS_ON Essn=SsnSMITH)



The DIVISION Operation

2. create a relation that includes a tuple <Pno, Essn>

1. whenever the employee whose Ssn is Essn works on the project whose
number is Pno in the intermediate relation SSN_PNOS:

SSN_PNOS ← πEssn, Pno(WORKS_ON)

3. Finally, apply the DIVISION operation to the two relations, which gives the
desired employees’ Social Security numbers:

SSNS(Ssn) ← SSN_PNOS ÷ SMITH_PNOS

RESULT ← πFname, Lname(SSNS * EMPLOYEE)

You might also like