Unit 2
Unit 2
Unit 2
Data models: Data models define how the logical structure of a database is modeled. Data
Models are fundamental entities to introduce abstraction in a DBMS. Data Model gives us an
idea that how the final system will look like after its complete implementation. It defines the
data elements and the relationships between the data elements. Data Models are used to show
how data is stored, connected, accessed and updated in the database management system.
Though there are many data models being used nowadays but the Relational model is the most
widely used model. Some of the Data Models in DBMS are:
1. Hierarchical Model
Hierarchical Model was the first DBMS model. This model organizes the data in the
hierarchical tree structure. The hierarchy starts from the root which has root data and then it
expands in the form of a tree adding child node to the parent node. This model easily
represents some of the real-world relationships like food recipes, sitemap of a website
etc. Example: We can represent the relationship between the shoes present on a shopping
website in the following way:
One-to-many relationship: The data here is organized in a tree-like structure where the one-
to-many relationship is between the data types. Also, there can be only one path from parent
to any node. Example: In the above example, if we want to go to the node sneakers we only
have one path to reach there i.e. through men's shoes node.
Parent-Child Relationship: Each child node has a parent node but a parent node can have
more than one child node. Multiple parents are not allowed.
Deletion Problem: If a parent node is deleted then the child node is automatically deleted.
Pointers: Pointers are used to link the parent node with the child node and are used to
navigate between the stored data. Example: In the above example the 'shoes' node points to
the two other nodes 'women shoes' node and 'men's shoes' node.
As it does not support more than one parent of the child node so if we have some complex
relationship where a child node needs to have two parent node then that can't be represented
using this model.
2. Network Model
This model is an extension of the hierarchical model. It was the most popular model before
the relational model. This model is the same as the hierarchical model, the only difference is
that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that node student has two parents i.e. CSE
Department and Library. This was earlier not possible in the hierarchical model.
Ability to merge more Relationships: In this model, as there are more relationships so data is
more related. This model has the ability to manage one-to-one relationships as well as many-
to-many relationships.
Many paths: As there are more relationships so there can be more than one path to the same
record. This makes data access fast and simple.
Circular Linked List: The operations on the network model are done with the help of the
circular linked list. The current position is maintained with the help of a program and this
position navigates through the records according to the relationship.
The data can be accessed faster as compared to the hierarchical model. This is because the
data is more related in the network model and there can be more than one path to reach a
particular node. So the data can be accessed in many ways.
As there is a parent-child relationship so data integrity is present. Any change in parent record
is reflected in the child record.
As more and more relationships need to be handled the system might get complex. So, a user
must be having detailed knowledge of the model to work with the model.
Attributes: An entity contains a real-world property called attribute. This is the characteristics
of that attribute. Example: The entity teacher has the property like teacher id, salary, age, etc.
Relationship: Relationship tells how two attributes are related. Example: Teacher works for a
department.
Example:
In the above diagram, the entities are Teacher and Department. The attributes
of Teacher entity are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The
attributes of entity Department entity are Dept_id, Dept_name. The two entities are connected
using the relationship. Here, each teacher works for a department.
Features of ER Model
Graphical Representation for Better Understanding: It is very easy and simple to understand
so it can be used by the developers to communicate with the stakeholders.
Database Design: This model helps the database designers to build the database and is widely
used in database design.
Advantages of ER Model
Simple: Conceptually ER Model is very easy to build. If we know the relationship between
the attributes and the entities we can easily build the ER Diagram for the model.
Effective Communication Tool: This model is used widely by the database designers for
communicating their ideas.
Easy Conversion to any Model: This model maps well to the relational model and can be
easily converted relational model by converting the ER model to the table. This model can
also be converted to any other model like network model, hierarchical model etc.
Disadvantages of ER Model
4. Relational Model
Relational Model is the most widely used model. In this model, the data is maintained in the
form of a two-dimensional table. All the information is stored in the form of row and
columns. The basic structure of a relational model is tables. So, the tables are also
called relations in the relational model. Example: In this example, we have an Employee
table.
Features of Relational Model
Tuples: Each row in the table is called tuple. A row contains all the information about any
instance of the object. In the above example, each row has all the information about any specific
individual like the first row has information about John.
Attribute or field: Attributes are the property which defines the table or relation. The values
of the attribute should be from the same domain. In the above example, we have different
attributes of the employee like Salary, Mobile no, etc.
Simple: This model is more simple as compared to the network and hierarchical model.
Scalable: This model can be easily scaled as we can add as many rows and columns we want.
Structural Independence: We can make changes in database structure without changing the
way to access the data. When we can make changes to the database structure without affecting
the capability to DBMS to access the data we can say that structural independence has been
achieved.
Hardware Overheads: For hiding the complexities and making things easier for the user this
model requires more powerful hardware computers and data storage devices.
Bad Design: As the relational model is very easy to design and use. So the users don't need to
know how the data is stored in order to access it. This ease of design can lead to the
development of a poor database which would slow down if the database grows.
But all these disadvantages are minor as compared to the advantages of the relational model.
These problems can be avoided with the help of proper implementation and organization.
A simple ER Diagram:
In the following diagram we have two entities Student and College and their relationship. The
relationship between Student and College is many to one as a college can have many students
however a student cannot study in multiple colleges at the same time. Student entity has
attributes such as Stu_Id, Stu_Name & Stu_Addr and College entity has attributes such as
Col_ID & Col_Name.
Components of an ER Diagram
Entity:
An entity is a real-world thing which can be distinctly identified like a person, place or a
concept. It is an object which is distinguishable from others. If we cannot distinguish it from
others then it is an object but not an entity. An entity can be of two types:
Tangible Entity: Tangible Entities are those entities which exist in the real world
physically. Example: Person, car, etc.
Intangible Entity: Intangible Entities are those entities which exist only logically and have no
physical existence. Example: Bank Account, etc.
Example: If we have a table of a Student (Roll_no, Student_name, Age, Mobile_no) then each
student in that table is an entity and can be uniquely identified by their Roll Number i.e Roll_no.
Note: In E-R model we don't represent the data but we represent the structure or schema. When
we convert E-R model to relational model then data can be stored in tuple or row and hence,
represented as an entity.
Entity Type:
The entity type is a collection of the entity having similar attributes. In the above Student
table example, we have each row as an entity and they are having common attributes i.e each
row has its own value for attributes Roll_no, Age, Student_name and Mobile_no. So, we can
define the above STUDENT table as an entity type because it is a collection of entities having
the same attributes. So, an entity type in an ER diagram is defined by a name(here,
STUDENT) and a set of attributes(here, Roll_no, Student_name, Age, Mobile_no). The table
below shows how the data of different entities( different students) are stored.
The E-R representation of the above Student Entity Type is done below.
Note: We use a rectangle to represent an entity type in the E-R diagram, not entity.
Weak Entity Type: The weak entity in DBMS does not have a primary key and are dependent
on the parent entity. Weak entity is represented by double rectangle.
Weak entity
Weak entity type doesn't have a key attribute. Weak entity type can't be identified on its own.
It depends upon some other strong entity for its distinct identity. This can be understood with
a real-life example. There can be children only if the parent exits. There can be no
independent existence of children. There can be a room only if building exits. There can be no
independent existence of a room. A weak entity is represented by a double outlined rectangle.
The relationship between a weak entity type and strong entity type is called an identifying
relationship and shown with a double outlined diamond instead of a single outlined diamond.
This representation can be seen in the diagram below.
Example: If we have two tables of Customer(Customer_id, Name, Mobile_no, Age, Gender)
and Address(Locality, Town, State, Customer_id). Here we cannot identify the address
uniquely as there can be many customers from the same locality. So, for this, we need an
attribute of Strong Entity Type i.e. ‘Customer’ here to uniquely identify entities of 'Address'
Entity Type.
Entity Set:
Entity Set is a collection of entities of the same entity type. In the above example of
STUDENT entity type, a collection of entities from the Student entity type would form an
entity set. We can say that entity type is a superset of the entity set as all the entities are
included in the entity type. Let's try to understand this with the help of an example.
Example 1: In the below example, two entities E1 (2, Angel, 19, 8709054568) and E2(4,
Analisa, 21, 9847852156) form an entity set.
Example 2: We can form another entity set by taking three entities from the table. (2, Angel,
19, 8709054568) , (3, Priya, 20, 9864257315) and (4, Analisa, 21, 9847852156) can also
form a entity set. Similary, we can form any combination of the entity set using any of the
entities from the entity type 'STUDENT'. Also, we can understand that if we take all the
records to the entity set we get the entity type 'STUDENT'. So, we can say that the entity type
is the superset of the entity set.
ATTRIBUTE:
An attribute is a property or characteristic of an entity. An entity may contain any number of
attributes. One of the attributes is considered as the primary key. In an Entity-Relation model,
attribute is represented in eclipse shape. Attributes are the properties which describe an entity.
There are four types of attributes:
1. Key attributes
2. Composite attribute
3. Multivalued attribute
4. Derived attributes
1. Key attribute:
A key attribute can uniquely identify an entity from the entity set. For example a
student roll number can uniquely identify a student from a set of students. Key attribute is
represented by eclipse same as attribute however the text key attribute is underlined.
2. Composite attribute :
An attribute that can be split into components is a composite attribute.
Example: The address can be further split into house number, street number, city, state,
country, and pin code, the name can also be split into first name middle name, and last name.
Example : Name
3. Multi-valued Attribute:
An attribute which can hold multiple values is known as multivalued attribute. It is
represented with double ovals in an ER diagram. For example – A person can have more than
one phone number or email id so the phone number attribute is multivalued.
When a single instance of an entity is associated with a single instance of another entity then it is
called one to one relationship. For example, a person has only one passport and a passport is
given to one person.
2. One to Many Relationship
When a single instance of an entity is associated with more than one instances of another entity
then it is called one to many relationship. For example – a customer can place many orders but a
order cannot be placed by many customers.
3. Many to One Relationship
When more than one instances of an entity is associated with a single instance of another entity
then it is called many to one relationship. For example – many students can study in a single
college but a student cannot study in many colleges at the same time.
4. Many to Many Relationship
When more than one instances of an entity is associated with more than one instances of another
entity then it is called many to many relationship. For example, a can be assigned to many
projects and a project can be assigned to many students.
What is Keys?
Key is used to uniquely identify any record or row of data from the table. It is also used
to establish and identify relationships between tables. Keys play an important role in the
relational database.
For example: In Student table, ID is used as a key because it is unique for each student. In
PERSON table, passport_number, license_number, SSN are keys since they are unique for each
person.
Types of key:
• Primary Key
• Candidate Key
• Super Key
• Foreign Key
1. Primary key:
It is the first key which is used to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys. The key which is most suitable from those lists become a
primary key.
In the EMPLOYEE table, ID can be primary key since it is unique for each employee. In
the EMPLOYEE table, we can even select License_Number and Passport_Number as primary
key since they are also unique.
2. Candidate key:
A candidate key is an attribute or set of an attribute which can uniquely identify a tuple.
The remaining attributes except for primary key are considered as a candidate key. The candidate
keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the
attributes like SSN, Passport_Number, and License_Number, etc. are considered as a candidate
key.
3. Super Key:
Super key is a set of an attribute which can uniquely identify a tuple. Super key is a superset of a
candidate key.
For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME) the
name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this
combination can also be a key.
4. Foreign key:
Foreign keys are the column of the table which is used to point to the primary key of another
table.
In a company, every employee works in a specific department, and employee and department are
two different entities. So we can't store the information of the department in the employee table.
That's why we link these two tables through the primary key of one table.
We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.
Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
RDBMS stands for Relational Database Management Systems. It is basically a program that
allows us to create, delete, and update a relational database. Relational Database is a database
system that stores and retrieves data in a tabular format organized in the form of rows and
columns. It is a smaller subset of DBMS which was designed by E.F Codd in the 1970s.
Relational DBMS owes its foundation to the fact that the values of each table are related to
others. It has the capability to handle larger magnitudes of data and simulate queries easily.
Relational Database Management Systems maintains data integrity by simulating the following
features:
Entity Integrity: No two records of the database table can be completely duplicate.
Referential Integrity: Only the rows of those tables can be deleted which are not used by other
tables. Otherwise, it may lead to data inconsistency.
User-defined Integrity: Rules defined by the users based on confidentiality and access.
Domain integrity: The columns of the database tables are enclosed within some structured
limits, based on default values, type of data or ranges.
Characteristics:
Data must be stored in tabular form in DB file, that is, it should be organized in the form of rows
and columns.
Each row of table is called record/tuple. Collection of such records is known as the cardinality of
the table
Each column of the table is called an attribute/field. Collection of such columns is called the
arity of the table.
No two records of the DB table can be same. Data duplicity is therefore avoided by using a
candidate key. Candidate Key is a minimum set of attributes required to identify each record
uniquely.
Tables are related to each other with the help for foreign keys.
Database tables also allow NULL values, that is if the values of any of the element of the table
are not filled or are missing, it becomes a NULL value, which is not equivalent to zero. (NOTE:
Primary key cannot have a NULL value).
Example:
The following table STUDENT consists of three columns Roll Number, Name, Section and four
records of students 1, 2, 3 and 4 respectively. The records can’t be completely same, the Roll
Number acts as a candidate key which separates records.
Advantages
Easy to manage: Each table can be independently manipulated without affecting others.
Security: It is more secure consisting of multiple levels of security. Access of data shared can be
limited.
Flexible: Updating of data can be done at a single point without making amendments at multiple
files. Databases can easily be extended to incorporate more records, thus providing greater
scalability. Also, facilitates easy application of SQL queries.
Users: RDBMS supports client-side architecture storing multiple users together.
Facilitates storage and retrieval of large amount of data.
Easy Data Handling:
Data fetching is faster because of relational architecture.
Data redundancy or duplicity is avoided due to keys, indexes, and normalization principles.
Data consistency is ensured because RDBMS is based on ACID properties for data transactions
(Atomicity Consistency Isolation Durability).
Fault Tolerance: Replication of databases provides simultaneous access and helps the system
recover in case of disasters, such as power failures or sudden shutdowns
Disadvantages
High Cost and Extensive Hardware and Software Support: Huge costs and setups are required to
make these systems functional.
Scalability: In case of addition of more data, servers along with additional power, and memory
are required.
Complexity: Voluminous data creates complexity in understanding of relations and may lower
down the performance.
Structured Limits: The fields or columns of a relational database system is enclosed within
various limits, which may lead to loss of data.