0% found this document useful (0 votes)
258 views67 pages

Realtional Model - Relational Algebra

The document discusses relational algebra, which provides a formal foundation for operations in the relational model of databases. Relational algebra includes unary operations like select and project that operate on single relations, as well as binary operations like join that combine tuples from two relations. Common operations are selection, projection, join, set operations from mathematics like union and difference, and aggregate functions. Relational algebra expressions can be used to formally specify database queries and are important for implementing and optimizing queries in database systems.

Uploaded by

neetunarwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
258 views67 pages

Realtional Model - Relational Algebra

The document discusses relational algebra, which provides a formal foundation for operations in the relational model of databases. Relational algebra includes unary operations like select and project that operate on single relations, as well as binary operations like join that combine tuples from two relations. Common operations are selection, projection, join, set operations from mathematics like union and difference, and aggregate functions. Relational algebra expressions can be used to formally specify database queries and are important for implementing and optimizing queries in database systems.

Uploaded by

neetunarwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 67

Relational Algebra

Relational Algebra

 The two formal languages for the relational model: the


relational algebra and the relational calculus.
 The basic set of operations for the formal relational
model is the relational algebra.
 These operations enable a user to specify basic retrieval
requests as relational algebra expressions.
 The result of a retrieval query is a new relation. The
algebra operations thus produce new relations, which
can be further manipulated using operations of the
same algebra.
 The relational algebra is very important for several
reasons.
 First, it provides a formal foundation for relational model
operations.
 Second, and perhaps more important, it is used as a
basis for implementing and optimizing queries in the
query processing and optimization modules that are
integral parts of relational database management
systems (RDBMSs).
 Third, some of its concepts are incorporated into the SQL
standard query language for RDBMSs
The relational algebra is often considered to be
an integral part of the relational data model.
Its operations can be divided into two groups.
One group includes set operations from
mathematical set theory; these are applicable
because each relation is defined to be a set of
tuples in the formal relational model.
 Set operations include UNION, INTERSECTION,
SET DIFFERENCE, and CARTESIAN PRODUCT (also
known as CROSS PRODUCT).
The other group consists of operations
developed specifically for relational
databases—these include SELECT, PROJECT, and
JOIN, among others.
The SELECT and PROJECT operations are unary
operations that operate on single relations.
JOIN and other complex binary operations,
which operate on two tables by combining
related tuples (records) based on join
conditions.
Unary Relational Operations:

 The SELECT Operation


 The SELECT operation is used to choose a subset of the
tuples from a relation that satisfies a selection condition.

 The SELECT operation can also be visualized as a


horizontal partition of the relation into two sets of
tuples—those tuples that satisfy the condition and are
selected, and those tuples that do not satisfy the
condition and are filtered out.
 Examples:
Select the EMPLOYEE tuples whose department
number is 4:
 DNO = 4 (EMPLOYEE)

Select the employee tuples whose salary is greater


than $30,000:
 SALARY > 30,000 (EMPLOYEE)
In general, the select operation is denoted by
 <selection condition>(R) where
the symbol  (sigma) is used to denote the select operator
the selection condition is a Boolean (conditional) expression
specified on the attributes of relation R
tuples that make the condition true are selected
 appear in the result of the operation
tuples that make the condition false are filtered out
 discarded from the result of the operation
 For example,
To select the tuples for all employees who either work
in department 4 and make over $25,000 per year, OR
work in department 5 and make over $30,000, we can
specify the following SELECT operation:
Projection

The PROJECT operation, selects certain columns from


the table and discards the other columns.

Therefore, the result of the PROJECT operation can be


visualized as a vertical partition of the relation.
 For example,
 To list each employee’s first and last name and salary, we can
use the PROJECT operation as follows:

 where π (pi) is the symbol used to represent the PROJECT


operation, and <attribute list> is the desired sublist of attributes
from the attributes of relation R.
Sequences of Operations

 For most queries, we need to apply several relational


algebra operations one after the other.

 Either we can write the operations as a single relational


algebra expression by nesting the operations,

 Or we can apply one operation at a time and create


intermediate result relations.
For example,
to retrieve the first name, last name, and salary of all
employees who work in department number 5, we
must apply a SELECT and a PROJECT operation.

We can write a single relational algebra expression,


also known as an in-line expression, as follows:
Alternatively,
We can explicitly show the sequence of
operations, giving a name to each
intermediate relation, and using the assignment
operation, denoted by ← (left arrow), as
follows:
RENAME Operation
It is sometimes simpler to break down a complex
sequence of operations by specifying intermediate
result relations than to write a single relational
algebra expression.
We can also use this technique to rename the
attributes in the intermediate and result relations.
where the symbol ρ (rho) is used to denote the RENAME
operator, S is the new relation name, and B1, B2, … , Bn
are the new attribute names.
Relational Algebra Operations
from Set Theory: The UNION, INTERSECTION,
and MINUS Operations
UNION: The result of this operation, denoted by R ∪ S,
is a relation that includes all tuples that are either in R
or in S or in both R and S. Duplicate tuples are
eliminated.
INTERSECTION: The result of this operation, denoted
by R ∩ S, is a relation that includes all tuples that are
in both R and S.
SET DIFFERENCE (or MINUS): The result of this
operation, denoted by R – S, is a relation that
includes all tuples that are in R but not in S.
The CARTESIAN PRODUCT (CROSS PRODUCT)
Operation

 The CARTESIAN PRODUCT operation—also known as CROSS PRODUCT or


CROSS JOIN—which is denoted by ×. This is also a binary set operation, but
the relations on which it is applied do not have to be union compatible.
 In general, the result of R(A1, A2, … , An) × S(B1, B2, … , Bm) is a relation
 Q with degree n + m attributes Q(A1, A2, … , An, B1, B2, … , Bm), in that
order.
 The resulting relation Q has one tuple for each combination of tuples—
one from R and one from S.
 Hence, if R has nR tuples , and S has nS tuples,
 then R × S will have nR * nS tuples.
For example,
Suppose that we want to retrieve a list of names
of each female employee’s dependents.
The JOIN Operation

 The JOIN operation, denoted by ⋈ , is used to combine


related tuples from two relations into single “longer” tuples.
 To get the manager’s name, we need to combine each
department tuple with the employee tuple whose Ssn value
matches the Mgr_ssn value in the department tuple.
 Note that Mgr_ssn is a foreign key of the DEPARTMENT
relation that references Ssn, the primary key of the
EMPLOYEE relation.
 This referential integrity constraint plays a role in having
matching tuples in the referenced relation EMPLOYEE.
In JOIN, only combinations of tuples satisfying
the join condition appear in the result.

whereas in the CARTESIAN PRODUCT all


combinations of tuples are included in the result.
 A general join condition is of the form
 <condition> AND <condition> AND … AND <condition>
 where each <condition> is of the form Ai θ Bj, Ai is an
attribute of R, Bj is an attribute of S, Ai and Bj have the same
domain, and θ (theta) is one of the comparison operators
{=, <, ≤, >, ≥, ≠}.
 A JOIN operation with such a general join condition is called
a THETA JOIN. Tuples whose join attributes are NULL or for
which the join condition is FALSE do not appear in the result.
Variations of JOIN: The EQUIJOIN and
NATURAL JOIN
 The most common use of JOIN involves join conditions with equality
comparisons only. Such a JOIN, where the only comparison
operator used is =, is called an EQUIJOIN.
 Because one of each pair of attributes with identical values is
superfluous, a new operation called NATURAL JOIN—denoted by *—
 was created to get rid of the second (superfluous) attribute in an
EQUIJOIN condition.
 The standard definition of NATURAL JOIN requires that the two join
attributes (or each pair of join attributes) have the same name in
both relations. If this is not the case, a renaming operation is applied
first.
 Suppose we want to combine each PROJECT tuple with
the DEPARTMENT tuple that controls the project.
 In the following example, first we rename the Dnumber
attribute of DEPARTMENT to Dnum—so that it has the
same name as the Dnum attribute in PROJECT—and
then we apply NATURAL JOIN:
The DIVISION Operation

The DIVISION operation, denoted by ÷, is useful for a


special kind of query that sometimes occurs in
database applications.

An example is Retrieve the names of


employees who work on all the projects that
‘John Smith’ works on.
 To express this query using the DIVISION operation,
proceed as follows.
 First, retrieve the list of project numbers that ‘John Smith’
works on in the intermediate relation SMITH_PNOS:
 Next, create a relation that includes a tuple <Pno, Essn>
whenever the employee whose Ssn is Essn works on the
project whose number is Pno in the intermediate relation
SSN_PNOS:
Finally, apply the DIVISION operation to the two
relations, which gives the desired employees’
Social Security numbers:
Aggregate Functions and Grouping

 Another type of request that cannot be expressed in the basic


relational algebra is to specify mathematical aggregate
functions on collections of values from the database.
 These functions are used in simple statistical queries that
summarize information from the database tuples.
 Common functions applied to collections of numeric values
include SUM, AVERAGE, MAXIMUM, and MINIMUM. The COUNT
function is used for counting tuples or values.
 where <grouping attributes> is a list of attributes of the
relation specified in R, and <function list> is a list of
(<function> <attribute>) pairs. In each such pair,
<function> is one of the allowed functions—such as SUM,
AVERAGE, MAXIMUM, MINIMUM, COUNT—and
<attribute> is an attribute of the relation specified by R.
For example, to retrieve each department
number, the number of employees in the
department, and their average salary, while
renaming the resulting attributes as indicated
below, we write:
OUTER JOIN Operations

A set of operations, called outer joins, were


developed for the case where the user wants to
keep all the tuples in R, or all those in S, or all
those in both relations in the result of the JOIN,
regardless of whether or not they have
matching tuples in the other relation.
 Answer a
 Answer c
 Answer b

You might also like