Sqlrealtion Albra
Sqlrealtion Albra
Sqlrealtion Albra
Relational Algebra came in 1970 and was given by Edgar F. Codd (Father of DBMS). It is also
known as Procedural Query Language(PQL) as in PQL, a programmer/user has to mention two
things, "What to Do" and "How to Do".
Suppose our data is stored in a database, then relational algebra is used to access the data from
the database.
The First thing is we have to access the data, this needs to be specified in the query as "What to
Do", but we have to also specify the method/procedure in the query that is "How to Do" or
how to access the data from the database.
Basic Operations
Derived Operations
Applying these operations over relations/tables will give us new relations as output.
Basic Operations
Six fundamental operations are mentioned below. The majority of data retrieval operations are
carried out by these. Let's know them one by one.
But, before moving into detail, let's have two tables or we can say relations
STUDENT
EMPLOYEE
Notation : σ p(R)
Project (∏)
NAME
Aman
Atul
Baljeet
Harsh
Prateek
ROLL NAME
1 Aman
2 Atul
3 Baljeet
4 Harsh
5 Prateek
6 Prateek
Union (∪)
Union operation is done by Union Operator which is represented by "union"(∪). It is the same as
the union operator from set theory,
i.e., it selects all tuples from both relations but with the exception that for the union of two
relations/tables both relations must have the same set of Attributes.
Notation: R ∪ S
If relations don't have the same set of attributes, then the union of such relations will result
in NULL.
Set Difference as its name indicates is the difference between two relations (R-S). It is denoted
by a "Hyphen"(-) and it returns all the tuples(rows) which are in relation R but not in relation S.
It is also a binary operator.
Notation : R - S
Just like union, the set difference also comes with the exception of the same set of attributes in
both relations.
the names of students who are in STUDENT Relation but not in EMPLOYEE Relation.
Cartesian product is denoted by the "X" symbol. Let's say we have two relations R and S.
Cartesian product will combine every tuple(row) from R with all the tuples from S. I know it
sounds complicated, but once we look at an example, you'll see what I mean.
Notation: R X S
Rename operation is denoted by "Rho"(ρ). As its name suggests it is used to rename the output
relation. Rename operator too is a binary operator.
Notation: ρ(R,S)
STUDENT_NAME
NAME
Aman
Atul
Baljeet
Harsh
Prateek
As you can see, this output relation is named "STUDENT_NAME"
Takeaway
Intersection (∩)
Notation : R ∩ S
NAME
Baljeet
Harsh
Division (÷)
Notation : R(X,Y)/S(Y)
Here,
R is the first relation from which data is retrieved.
S is the second relation that will help to retrieve the data.
X and Y are the attributes/columns present in relation. We can have multiple attributes in
relation, but keep in mind that attributes of S must be a proper subset of attributes of R.
For each corresponding value of Y, the above notation will return us the value of X from
tuple<X,Y> which exists everywhere.
It's a bit difficult to understand this in a theoretical way, but you will understand this with an
example.
ENROLLED consist of two attributes STUDENT_ID and COURSE_ID. It denotes the map of
students who are enrolled in given courses.
ENROLLED
STUDENT_ID COURSE_ID
Student_1 DBMS
Student_2 DBMS
Student_1 OS
Student_3 OS
COURSE
COURSE_ID
DBMS
OS
Now the query is to return the STUDENT_ID of students who are enrolled in every course.
ENROLLED(STUDENT_ID, COURSE_ID)/COURSE(COURSE_ID)
STUDENT_ID
Student_1
Derived Operations
Also known as extended operations, these operations can be derived from basic operations and
hence named Derived Operations. These include three operations: Join Operations, Intersection
operations, and Division operations.
Join Operations
Join Operation in DBMS are binary operations that allow us to combine two or more relations.
They are further classified into two types: Inner Join, and Outer Join.
EMPLOYEE table contains employee's information such as id, name, city, and experience of
employee(In Years).
EMPLOYEE
DEPARTMENT
Inner Join
When we perform Inner Join, only those tuples returned that satisfy the certain condition. It is
also classified into three types:
Theta Join,
Natural Join.
Theta Join (θ)
Theta Join combines two relations using a condition. This condition is represented by the
symbol "theta"(θ). Here conditions can be inequality conditions such as >,<,>=,<=, etc.
Notation : R ⋈θ S
Check the Cartesian Product, if in any tuple/row EXPERIENCE >= MIN_EXPERIENCE then insert
this tuple/row in output relation.
Equi Join
Equi Join is a special case of theta join where the condition can only
contain **equality(=)** comparisons.
A non-equijoin is the inverse of an equi join, which occurs when you join on a condition other
than "=".
to join EMPLOYEE and DEPARTMENT relation where E_NO from EMPLOYEE = E_NO
from DEPARTMENT.
Check Cartesian Product, if the tuple contains same E_NO, insert that tuple in the output
relation
A comparison operator is not used in a natural join. It does not concatenate like a Cartesian
product. A Natural Join can be performed only if two relations share at least one common
attribute. Furthermore, the attributes must share the same name and domain.
Natural join operates on matching attributes where the values of the attributes in both relations
are the same and remove the duplicate ones.
Preferably Natural Join is performed on the foreign key.
Notation : R ⋈ S
Let's say we want to join EMPLOYEE and DEPARTMENT relation with E_NO as a common
attribute.
Notice, here E_NO has the same name in both the relations and also consists of the same
domain, i.e., in both relations E_NO is a string.
Outer Join
Unlike Inner Join which includes the tuple that satisfies the given condition, Outer Join also
includes some/all the tuples which don't satisfy the given condition. It is also of three
types: Left Outer Join, Right Outer Join, and Full Outer Join.
As we can see from the diagram, Left Outer Join returns the matching tuples(tuples present in
both relations) and the tuples which are only present in Left Relation, here R.
However, if the matching tuples are NULL, then attributes/columns of Right Relation, here S
are made NULL in the output relation.
Here we are combining EMPLOYEE and DEPARTMENT relation with the constraint that
EMPLOYEE's E_NO must be equal to DEPARTMENT's E_NO.
As you can see here, all the tuples from left, i.e., EMPLOYEE relation are present. But E-4 is
not satisfying the given condition, i.e., E_NO from EMPLOYEE must be equal to E_NO from
DEPARTMENT, still it is included in the output relation. This is because Outer Join also
includes some/all the tuples which don't satisfy the condition. That's why Outer Join marked E-
4's corresponding tuple/row from DEPARTMENT as NULL.
Right Outer Join returns the matching tuples and the tuples which are only present in
Right Relation here S.
The same happens with the Right Outer Join, if the matching tuples are NULL, then the
attributes of Left Relation, here R are made NULL in the output relation.
We will combine EMPLOYEE and DEPARTMENT relations with the same constraint as
above.
As all the tuples from DEPARTMENT relation have a corresponding E_NO in EMPLOYEE
relation, therefore no tuple from EMPLOYEE relation contains a NULL.
Full Outer Join
Full Outer Join returns all the tuples from both relations. However, if there are no matching
tuples then, their respective attributes are made NULL in output relation.
Again, combine the EMPLOYEE and DEPARTMENT relation with the same constraint.
Takeaway