UNIT1_DBMS-1

Download as pdf or txt
Download as pdf or txt
You are on page 1of 26

23CS1303 DATABASE MANAGEMENT SYSTEMS UNIT I

UNIT I DATABASE FUNDAMENTALS

Introduction: Database System Applications, Purpose of Database Systems, View of Data, Components
and Structure, Database Users and Administrator, History of Database Systems. Data models: ER
model, Entities, Attributes and Entity sets, Relationships and Relationship sets, Additional features of
ER Model, Conceptual Design with the ER Model, Enhanced E-R Model- Mapping ER Model to
Relational Model.

1. INTRODUCTION TO DATA BASE MANAGEMENT:


Database is a collection of data and Management System is a set of programs to store and retrieve
those data.
DBMS is a collection of inter- related data and set of programs to store & access those data in an
easy and effective manner.
What is the need of DBMS?
Database systems are basically developed for large amount of data. When dealing with huge
amount of data, there are two things that require optimization: Storage of data and retrieval of
data.
1.1 Database System Applications
There are wide range of applications that make use of database systems. Some of them are:-
1.Accounting
2. Manufacturing
3. Banking
4. Universities

5. Reservation systems
6.Telecommunication
1.2 PURPOSE OF DATABASE SYSTEM
In the early days, database applications were built directly on top of file systems. For
example, consider a savings-bank enterprise that keeps information about all customers and savings
accounts in files. The users can manipulate the information in the files using application programs.
Some of the programs are,
 A program to debit or credit an account
 A program to add an account
 A program to find the balance of an account
 A program to generate monthly statements
System programmers wrote these applications to meet the needs of the bank. New
application programs are written as the need arises. For example ,if the savings bank decides to
offer checking accounts, the bank creates new permanent files that contain information about all the
checking accounts, and system programmers has to write new application programs to deal with
situations that do not arise in savings accounts, such as overdrafts. Thus as time goes by, the system
DEPTOFCSE 1
23CS1303 DATABASE MANAGEMENT SYSTEMS UNIT I

requires more files and more application programs. This leads to several disadvantages as given
below:
Disadvantages in file systems
The file processing system has a number of disadvantages:
1) Data redundancy and inconsistency
Since different programmers create the files and application programs, the files will have a
different format and the programs may be written in several programming languages. Also the
same information may be duplicated in several files. For example the address and telephone
number of a particular customer may appear in a file that consists of saving-account records and
in a file that consist of checking account records. This redundancy leads to higher storage and
access cost.
Data inconsistency occurs, if the various copies of the same data may no longer agree. For
example, a changed customer address may be reflected in savings account records but not
elsewhere in the system.
2) Difficulty in accessing data
Suppose that a bank officer needs to find the names of all customers who live in a particular
postal-code area. The officer insists the data processing department to generate such a list. The
original system has no application program to meet the request. However, the system has an
application program to generate the list of all customers. The officer has only two choices:
either can obtain the list of all customers and extract the needed information manually or ask the
system programmer to write the necessary application program. Both are unsatisfactory.
Thus the conventional file-processing environments do not allow needed data to be retrieved
in a convenient and efficient manner.
3) Data isolation
Because data are scattered in various files, and files maybe in ,writing new application
programs to retrieve the appropriate data is difficult.
4) Integrity problems
The data values stored in a database must satisfy some consistency constraints. For example,
the balance of a bank account may never fall below a prescribed amount (say, $50). Developers
enforce these constraints in the system by adding appropriate code in the various application
programs. However, when new constraints are added it is difficult to change the programs to
enforce them.
5) Atomicity Problems
A computer system like any other mechanical or electrical device is subject to failure. In
many applications, if a failure occurs, it is difficult to restore the data to a consistent state as it
existed prior to the failure. For example consider a program to transfer $50 from account A to
account B. if a system failure occurs during the execution of the program, it is possible that $50
was removed from account A but was not credited to account Resulting in inconsistent

DEPTOFCSE 2
23CS1303 DATABASE MANAGEMENT SYSTEMS UNIT I

database state. The funds transfer must be atomic. It must happen in its entirety or not at all.
Thus it is difficult to ensure atomicity in conventional file processing system.
6) Concurrent access anomalies
To improve the overall performance of the system, many systems allow multiple users to
update the data simultaneously. In such an environment, interaction of concurrent updates may
result in inconsistent data. For example consider bank account A, containing $500. If two
customers withdraw funds (say $50 and $100 respectively) from account at about the same
time, the result of the concurrent executions may leave the account in an incorrect state. If the
two programs run concurrently, they may both read the value $500, and write the result back
$450 and $400 respectively. Depending on which one writes the value last, the account may
contain either $450 or $400, rather than the correct value of $350.
7) Security Problems
Every user of the database system should not be able to access all the data. For example, in
a banking system, payroll personnel need to see only that part of the database that has
information about the various bank employees. They do not need to access information about
customer accounts. The file processing systems do not enforce security constraints.
Areas of Database Applications:
o Banking: all transactions Airlines: reservations, schedules
o Universities: registration, grades
o Sales: customers, products, purchases
o Online retailers: order tracking, customized recommendations
o Manufacturing: production, inventory, orders, supply
chain
o Human resources: employee records, salaries, tax deductions
o Examples of Database Management Systems
IBM’sDB2, Microsoft’s Access and SQL Server, Oracle
Advantages of Using the Database Approach:
o Controlling redundancy in data storage and in development and maintenance
efforts.

o Sharing of data among multiple users.


o Restricting unauthorized access to data.
o Providing persistent storage for program
Objects Object-oriented DBMSs
o Providing Storage Structures(e.g.indexes)for efficient
QueryProcessing
o Providing backup and recovery services.

DEPTOFCSE 3
23CS1303 DATABASE MANAGEMENT SYSTEMS UNIT I

o Enforcing integrity constraints on the database.

1.2 DATABASE SYSTEM TERMINOLOGIES:


Data & Information:
The term data can be defined as a set of isolated and unrelated raw facts with an implicit
meaning. Data can be anything such as, name of a person, a number, images, sound, etc. For
example, ‘Monica,’ ‘25,’ ‘student,’ etc., is a data. When the data is processed and converted into a
meaningful and useful form, it is known as information.
Database:
A database can be defined as a collection of related data from which users can efficiently
retrieve the desired information.
A database can be anything from a simple collection of roll numbers, names, addresses, and
phone numbers of students to a complex collection of sound, images, and even video or film
clippings.
Database Management System (DBMS):
A Database Management System (DBMS) is an integrated set of programs used to create
and maintain a database. The main objective of a DBMS is to provide a convenient and effective
method of defining, storing, retrieving, and manipulating the data contained in the database. In
addition, the DBMS must ensure the security of the database from unauthorized access and
recovery of the data during system failures. It must also provide techniques for data sharing among
several users.
Examples of database management systems
• IBM’sDB2,Microsoft’s Access and SQL Server, Oracle
Database System:
The database and the DBMS software are collectively known as database system.
1.3 DATABASE CHARACTERISTICS:
Self-describing nature of a database system:
 ADBMS catalog stores the description of a particular database (Example: data structures,
types, and constraints)
 The description is called meta-data.
 This allows the DBMS software to work with different
 This allows the DBMS software to work with different database applications.
Insulation between programs and data:
 Called program-data independence.
 Allows changing data structures and storage organization without having to change the
DBMS access programs.
Data Abstraction:
 A data model is used to hide storage details and present the users with a conceptual view of
the database.

DEPTOFCSE 4
CS8492 DATABASE MANAGEMENT SYSTEMS UNITI

 Programs refer to the data model constructs rather than data storage details
Support of multiple views of the data:
 Each user may see a different view of the database, which describes only the data of interest
to that user.
Sharing of data and multi-user transaction processing:
 Allowing a set of concurrent users to retrieve from and to update the database.
 Concurrency control within the DBMS guarantees that each transaction is correctly executed
or aborted
 Recovery subsystem ensures each completed transaction has its effect permanently recorded
in the database
 OLTP(Online Transaction Processing) is a major part of database applications. This allows
hundreds of concurrent transactions to execute per second.
• Programs are isolated from data through abstraction
• It does not expose details of how(or where) data is stored or how operations are
implemented

1.4 VIEWS OF DATA


Database contains three levels of abstraction namely,
Internal or Physical level: describes the physical structure of the database. The internal schema
uses a physical data model and describes the complete details of data storage and access paths for
the database.
Conceptual or Logical level: describes the structure of the whole database for a community of
users. The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations and constraints, data stored in database,
and the relationships among the data.
External or View level: includes number of external schemas or user views. Each external schema
describes the part of the database that a particular user group is interested in and hides the rest of
the database from that user group.
The below figure shows the levels of abstraction.
users users

External Schema
External View External View

Logical Schema Conceptual or Logical Level

Physical Schema Internal or Physical Level

Database

DEPTOFCSE 5
CS8492 DATABASE MANAGEMENT SYSTEMS UNITI

Mapping:
The transformation of requests and results from one level to another is called as
mapping.
Schema
It is the logical structure of the database. The two types of schema are:
 Physical schema represents database design at the physical level
 Logical schema represents database design at the logical level
Instance– the actual content of the database at a particular point in time is called the instance of the
schema
Data Independence
The ability to change a schema at one level without affecting the higher level schemas is
called as data independence. There are two types of data independence:
o Physical Data Independence– is the ability to modify the physical or internal
schema without changing the logical or external schemas.
o Logical Data Independence– is the ability to modify logical or conceptual schema
without changing the external schemas or application programs.

1.5 DATABASE SYSTEM ARCHITECTURE OR COMPONENTS OF DBMS


Figure 1.5 illustrates the typical DBMS components. The figure is divided into two parts.
The top part of the figure refers to the various users of the database environment and their
interfaces. The lower part shows the internals of the DBMS responsible for storage of data and
processing of transactions.
Upper Module:
Casual users:
They work with interactive interfaces to formulate queries.
Application programmers:
They create programs using some host programming languages
Parametric users:
They do data entry work by supplying parameters to predefined transactions.

DEPTOFCSE 6
CS8492 DATABASE MANAGEMENT SYSTEMS UNITI

DBA staff:
They work on defining the database and tuning it by making changes to its definition using
the DDL and other privileged commands.
DDL Compiler:
The DDL compiler processes schema definitions, specified in the DDL, and stores
descriptions of the schemas (meta-data) in the DBMS catalog. The catalog includes information
such as the names and sizes of files, names and data types of data items, storage details of each file,
mapping information among schemas, and constraints. In addition, the catalog stores many other
types of information that are needed by the DBMS modules, which can then look up the catalog
information as needed.
Query Compiler & Optimizer:
Query compiler compiles the queries of casual users and translates the min to an internal
form. This internal query is subjected to query optimization. The query optimizer is concerned
with the rearrangement and possible reordering of operations, elimination of redundancies, and use
of correct algorithms and indexes during execution. It consults the system catalog for statistical and
other physical information about the stored data and generates executable code that performs the
necessary operations for the query and makes calls on the runtime processor.
Precompiler, DML Compiler & Host Language Compiler:
Application programmers write programs in host languages such as Java, C, or C++ that
aresubmittedtoaprecompiler.TheprecompilerextractsDMLcommandsfromanapplicationprogram
written in a host programming language. These commands are sent to the DML compiler for
compilation in to object code for database access.
The rest of the program is sent to the host language compiler. The object codes for the DML
commands and the rest of the program are linked, forming a canned transaction whose executable
code includes calls to the runtime database processor. Canned transactions are executed repeatedly
by parametric users, who simply supply the parameters to the transactions. Each execution is
considered to be a separate transaction.
Lower Module:
Runtime database processor:
It executes the privileged commands, the executable query plans, and the canned
transactions with runtime parameters. It works with the system catalog and may update it with
statistics. It also works with the stored data manager, which in turn uses basic operating system
services for carrying out low-level input/output (read/write) operations between the disk and main
memory.
The runtime database processor handles other aspects of data transfer, such as management
of buffers in the main memory. Some DBMSs have their own buffer management module while
others depend on the OS for buffer management. We have shown concurrency control and
backup and recovery systems separately as a module in this figure. They are integrated into the
working of the runtime database processor for purposes of transaction management.

DEPTOFCSE 7
CS8492 DATABASE MANAGEMENT SYSTEMS UNITI

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema at one level
of the database system without altering the schema at the next higher level.

There are two types of data independence:

DEPTOFCSE 8
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

1. Logical Data Independence

 Logical data independence refers characteristic of being able to change the conceptual
schema without having to change the external schema.
 Logical data independence is used to separate the external level from the conceptual
view.
 If we do any changes in the conceptual view of the data, then the user view of the data
would not be affected.
 Logical data independence occurs at the user interface level.

2. Physical Data Independence

 Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
 If we do any changes in the storage size of the database system server, then the
Conceptual structure of the database will not be affected.
 Physical data independence is used to separate conceptual levels from the internal levels.
 Physical data independence occurs at the logical interface level.

Fig: Data Independence

DEPTOFCSE 9
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

1.6 HISTORY OF DATABASE SYSTEMS.

1950s and early 1960s:


 –Data processing using magnetic tapes for storage Tapes provide only sequential access
 –Punched cards for input
Late 1960s and 1970s:
 –Hard disks allow direct access to data
 –Network and hierarchical data models in widespread use
 –Ted Codd defines the relational data model
Would win the ACM Turing Award for this work IBM Research begins System R prototype
UC Berkeley begins Ingres prototype
 –High-performance (for the era) transaction processing 1980s:
 –Research relational prototypes evolve into commercial systems SQL becomes industry standard
 –Parallel and distributed database systems
 –Object-oriented database systems 1990s:
 –Large decision support and data-mining applications
 –Large multi-terabyte data warehouses
 –Emergence of Web commerce 2000s:
 –XML and XQuery standards
 –Automated database administration
 –Increasing use of highly parallel database systems

 –Web-scale distributed data storage systems

1.7 DATAMODELS:
A data model is a collection of concepts that can be used to describe the structure of a
database. By structure of a database we mean the data types, relationships and constraints that
apply to the data.

Types of Data Models:


 Relational model
 Entity-Relationship data model(mainly for database design)

1. Relational Model
The relational model uses a collection of tables to represent both data and relationships
among the data. Each table has multiple columns, and each column has a unique name. The
below table called customer table, shows, for example, that the customer identified by customer-
id 100 is named john and lives at 12 anna st. in Chennai and also shows his account number.

DEPTOFCSE 10
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Customer_id Customer_name Cutomer_street Customer_city Account_no


100 John 12annast Chennai A-101
101 Karthik 3mainst Chennai A201
103 Lilly 4northst Chennai A-204

The relational model is an example of a record-based model. The relational model is at a


lower level abstraction than the E-R model. Database designs are often carried out in the E-
Rmodel, and then translated to the relational model.
2.Entity-Relationship data model
The entity-relationship(E-R) data model, models an enterprise as a collection of entities and
relationships.
Entity: is a “thing” or“ object” in the enterprise that is distinguishable from other objects.
They are described by a set of attributes
Relationship: is an association among several entities
The E-R model was developed to facilitate database design. The E-R model is very useful in
mapping the meanings and interactions of real world enterprises onto a conceptual schema. The
E-R model is represented diagrammatically by an entity-relationship diagram as shown below.
In the figure customer and account represents entities, the ellipses represent attributes and
deposit or represents relationship among the entities.

1.7.1 Attributes and its Types:-


Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties possessed
by each member of an entity set. For example possible attributes of the customer entity set are
customer-id, customer-name, customer-street, and customer-city. Each entity has a value for each
attribute. For instance, a particular customer entity may have the value 100-00-20 for customer-id,
the value john for customer-name etc.
Types of attributes:
Simple attribute Vs Composite attributes:
The attributes that cannot be divided into sub parts is called as simple attribute. Composite
attributes are attributes that can be divided into sub parts. Composite attributes can also be used as
a hierarchy. The divided subparts are called as component attributes.
DEPTOFCSE 11
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Example for Simple Attribute:


A person’s age cannot be divided into subparts.
Example for Composite Attribute:
An attribute name could be structured as a composite attribute consisting of first-name,
middle-initial, last-name.
Single-valued attributes Vs Multivalued attributes:
The attribute that has only one value for a particular entity is called as single-valued attributes.
The attribute that has zero or one or more values is called as multivalued attribute.
Example for Single-valued:
A person can have only one date of birth.
Example for Multi-valued:
A person can have zero or one or more phone numbers.

Derived attribute Vs Stored attribute:


The attribute that derives its value from another attribute (stored attribute) is called as derived
attribute.
Example:
A person’s age can be derived from the person’s dob. So age is the derived attribute and dob is
stored attribute.
Null Attribute:
An attribute that takes a null value when an entity does not have a value for it, is called as null
attribute. Null can also designate that an attribute value is unknown. An unknown value may be
either missing or not known.
Example:
If the name value is null for a particular customer, then it is missing. A null value for the
apartment number could mean that it is either missing or not known. That is either the address
does not include the apartment number or the apartment number exists but it is not known.
1.7.2 Relationship Sets:
A relationship is an association among several entities. A relationship set is the set of
relationships of the same type. A relationship set is a mathematical relation among n  2 entities,
each taken from entity sets {(e1, e2, … en) | e1  E1, e2  E2, …, en 
Example:
Consider two entity sets customer and loan in figure below. The association between these
two entity sets can be expressed using the relationship set borrower. Thefigure depicts this
association.

DEPTOFCSE 12
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Figure: relationship set borrower


The association between entity sets is referred to as participation. The function that an
entity plays in a relationship is called that entity’s role.
A relationship may also have attributes called descriptive attributes. For example
consider a relationship set depositor with entity sets customer and account. It can be associated
with the attribute access-date to specify the most recent date on which a customer accessed an
account. Figure depicts the example.

Figure : Access-date as descriptive attribute of the deposit or relationship set


Most of the relationship sets in a database system are binary. Occasionally, however,
relationship sets involve more than two entity sets. The number of entity sets that participate in a
relationship set is called the degree of the relationship set. A binary relationship set is of degree 2;
a ternary relationship set is of degree3.
1.7.2. Constraints:
An E_R enterprise schema may define certain constraints to which the contents of a
database must conform. Two important constraints are: mapping cardinalities and
participation constraints.
Mapping Cardinalities:
Mapping cardinalities or cardinality ratios express the number of entities to which another
entity can be associated via a relationship set. They are most useful in describing binary
DEPTOFCSE 13
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

relationship sets.
For a binary relationship set R between entity sets A and B, the mapping cardinality must
be one of the following:
One to one: An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A. The next figure, shows one to one relation.
A B

a1 b1

B
a2 2

a3 b3

One to many: An entity in A is associated with any number of entities in B. An entity in B ,


however, can be associated with at most one entity in A. The below figure, shows one to many
relation.
A B

b1
a1
b2
DEPTOFCSE a2 3
b3
a3

Many to one:An entity in A is associated with at most one entity in B. An entity in B,


however can be associated with any number of entities in A. the below figure shows Many to one
relationship
A B

a2 a

a3 a

a4 a

a5

Many to many : An entity in A is associated with any number of entities in B, and an entity in

DEPTOFCSE 14
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

B is associated with any number of entities in A. the below figure shows many to many relation.
Participation Constraints:
Total Participation:
TheparticipationofanentitysetEinarelationshipsetRissaidtobetotalifeveryentityinEparticipate
sinatleastonerelationshipinR.
A B
a1 b1

a2 b2

a3 b3

a4 b4

Example: every loan entity should be related to atleast one customer through the borrower
relationship. Therefore the participation of loan in the relationship set borrower is total.
Partial Participation:
The participation of an entity set E in a relationship set R is said to be partial if only some
entities in E participate in relationships in R.
Example:
An individual can be a bank customer whether or not the person has a loan with the bank.
Hence it is possible that only some of the customer entities are related to the loan entity set
through the borrower relationship. Hence the participation of customer in the borrower
relationship set is partial.

1.8. ADDITIONAL FEATURES OF ER- MODEL


Entity relationship diagram:
Major components:

An ER diagram can express the overall logical structure of a database graphically.ER


diagram consists of the following major components:
 Rectangles represent entity sets.
 Diamonds represent relationship sets.
 Lines link attributes to entity sets and entity sets to relationship sets.
 Ellipses represent attributes
o Double ellipses represent multivalued attributes.
o Dashed ellipses denote derived attributes.
 Underline indicates primary key attributes

DEPTOFCSE 15
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

 Double rectangles represent weak entity sets.


 Double lines represent total participation
The below figure shows the entity relationship diagram which consists of two entity sets,

Customer and loan, related through a binary relationship set borrower.


The attributes associated with customer are customer-id, customer-name, customer-street,
and customer-city. The attributes associated with loan are loan-number and amount. In the figure
the attributes of an entity set that are members of the primary key are underlined.

Cardinality constraints:
Cardinality constraints can be expressed by drawing either a directed line(), signifying
“one,” or an undirected line(—),signifying “many” between the relationship set and the entity set.
One-to-one relationship:
 A customer is associated with atmost one loan via the relationship borrower
 A loan is associated with atmost one customer via borrower as shown below.

One-to-Many:
In the one-to-many relationship a loan is associated with atmost one customer via
borrower, and a customer is associated with several(including)loans via borrower as shown below.

DEPTOFCSE 16
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Many-to-one:
In a many-to-one relationship a loan is associated with several(including)customers via
borrower, and a customer is associated with atmost one loan via borrower as shown below.

Many-to-Many:
 A customer is associated with several(possibly)loans via borrower
 A loan is associated with several(possibly)customers via borrower as shown below.

Relationship sets with attributes:


If the relationship set has also some attributes associated with it ,then those attributes
are linked with the relationship set as below.

The figure above shows the descriptive attribute attached to the relationship set
deposit or to specify the most recent date on which a customer accessed that account.
E-R diagram with composite, Multivalued and derived attributes:

DEPTOFCSE 17
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Composite attribute:
In the above figure, the composite attribute name, is represented with the
component attributes first-name, middle-initial, and last-name. Also a composite attributes
address is represented with it’s component attributes street, city, state and zip-code. The attribute
street is itself a composite attribute with the component attributes street-number, street-name and
apartment-number.
Multivalued attribute and Derived attribute:
The previous figure also illustrates a multivalued attribute phone-number, depicted
by a double ellipse, and a derived attribute age, depicted by a dashed ellipse.
Roles: Roles are indicated in E-R diagrams by labeling the lines that connect diamond
store Rectangles. The below figure shows the role indicators manager and worker between the
employee entity set and the works-for relationship set.

In the above figure the labels “manager” and “worker” are called roles; they specify
how employee entities interact via the works for relationship set.
Role labels are optional, and are used to clarify semantics of the relationship.

E-R diagram with ternary relationship:


Non binary relationship sets can be specified as shown in the below figure.

The above figure consists of three entity sets employee, job, and branch, related through
the relationship set works-on.
Participation of an entity set in a relationship set:
Total participation (indicated by double line):

DEPT OF CSE 18
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Every entity in the entity set Participates in atleast one relationship in the relationship set.
Double lines are used in E-R diagram to indicate the participation of an entity set in a
relationship set is total.
Example:
Participation of loan in borrower is total. Every loan must have a customer associated to it
via borrower
Partial participation:
Some entities may not participate in any relationship in the relationship set
Example:
Participation of customer in borrower is partial
The below figure indicates the participation of entity sets.

Alternative notations for cardinality limits:

Cardinality limits can also express participation constraints an edge between an entity
set and a binary relationship set can have an associated minimum and maximum cardinality, in the
forml...h, where l is the minimum and h is the maximum cardinality. A minimum value of 1
indicates total participation of the entity set in the relationship set. A maximum value of 1
indicates that the entity participates in at most one relationship, while a maximum value *
indicates no limit. The label 1..* on an edge is equivalent to a double line.
Example:

In the previous figure the edge between loan and borrower has a cardinality constraint of
1...1,meaning the minimum and the maximum cardinality are both 1. That is, each loan must have
exactly one associated customer. The limit 0...* on the edge from customer to borrower indicates
that a customer can have zero or more loans. Thus, the relationship borrower is one to many from
customer to loan, and further the participation of loan in borrower is total.

DEPT OF CSE 19
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Weak Entity Set:


An entity set that does not have a primary key is referred to as a weak entity set. An entity
set that has a primary key is termed as a strong entity set.
Example:
Let payment be an entity set which has three attributes: payment-number, payment-date,
and payment-amt.
Payment numbers are sequential numbers starting from 1, generated separately for each
loan. Thus although each payment entity is distinct, payments for different loans may share the
same payment number. Thus this entity set does not have a primary key and it is a weak entity set.
The existence of a weak entity set depends on the existence of an identifying entity set or
Owner entity set.
Every weak entity set must be associated with an identifying entity. That is the weak entity
set is said to be existence dependent on the identifying entity set.
The relationship associating the weak entity set with the identifying entity set is called the
identifying relationship.
In the example, the identifying entity set for payment is loan and a relationship loan-payment
that associates payment entities with their corresponding loan entities is the identifying
relationship.
The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes
among all the entities of a weak entity set.
The primary key of a weak entity set is formed by the primary key of the strong entity set
on which the weak entity set is existence dependent, plus the weak entity set’s discriminator.

In the case of entity set payment its primary key is {loan-number, payment-number},
where loan-number is the primary key of the identifying entity set, namely loan,and payment-
number distinguishes payment entities within the same loan.
The weak entity set is depicted by double rectangles in the E-R diagram. The
discriminator of a weak entity set is underlined with a dashed line as shown below.

In the previous figure,


Payment number – discriminator of the payment entity set Primary key for

payment– (loan number, payment number)


DEPT OF CSE 20
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

1.8 ENHANCED E-RFEATURES


The extended E-R features are specialization, generalization, aggregation and attribute
inheritance .

Specialization:
An entity set may include sub-groupings of entities that are distinct in some way from
other entities in the set. The E-R model provides a means for representing these distinctive entity
groupings.
For example consider an entity set person with attributes name, street and city. A person
may be further classified as one of the following:
 Customer
 Employee
Each of these person types is described by a set of attributes that includes all the attributes
of entity set person plus possibly additional attributes. For example, customer entity may be
described further by the attribute customer-id, and employee entity may be described further by
the attributes employee-id and salary.
The process of designating sub groupings within an entity set is called specialization.
The specialization of person allows distinguishing among persons according to whether they are
employees or customers.
Specialization can be applied repeatedly to refine a design scheme. For example, the bank
employees may be classified as one of the following:
 Officer
 Teller
 Secretary

Each of these types has its own attributes. In E-R diagram, specialization is depicted by a
triangle component labeled ISA. The label ISA stands for “is a” and represents, for example,
that a customer “is a” person. The ISA relationship may also be referred as a superclass-subclass
relationship. Higher and lower level entity sets are depicted as rectangles containing the name of
the entity set.
Generalization:
Generalization is a bottom-up design process which combines a number of entity sets that
share the same features into a higher-level entity set. For example employee and customer both
share the attributes name, street, and city which are combined into higher level entity set person.
Specialization and generalization are simple inversions of each other; they are represented in
an E-R diagram in the same way. The terms specialization and generalization are used
interchangeably.

DEPT OF CSE 21
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

The next figure depicts the specialization and generalization.

Attribute inheritance:
Acrucialpropertyofthehigherandlowerlevelentitiescreatedbyspecializationandgeneralizatio
n is attribute inheritance. The attributes of higher-level entity sets are said to be inherited by the
lower-level entity sets. For example, customer and employee inherit the attributes of person. Thus,
customer is described by its name, street, and city attributes and additionally customer-id attribute.
Employee is described by its name, street, and city attributes, and additionally employee-id and
salary attributes.
In the above figure employee, is the lower-level entity set of person and higher level
entity set of the officer, teller, and secretary entity sets. The lower-level entity set with only one
ISA relationship indicates single inheritance. If an entity set is a lower- level entity set with more
than one ISA relationship, then the entity set has multiple inheritance, and the resulting structure
is said to be lattice.

DEPT OF CSE 22
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

Aggregation:

Aggregation is an abstraction through which relationships are treated as higher-level entities.

There is a redundancy in the above figure, since every employee, branch, job
combination in manages is also in works-on. The redundancy can be avoided by aggregation as
shown in the below figure.

.
1.9 CONCEPTUAL DESIGN WITH THE ER MODEL

1.9.1 What does Conceptual Design include?

• Entity-Relationship model is used in the conceptual design of a database (☞


conceptual level, conceptual schema)

• Design is independent of all physical considerations


• (DBMS, OS, . . ).
Questions that are addressed during conceptual design:

 What are the entities and relationships of interest (mini- world )?

DEPT OF CSE 23
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

 What information about entities and relationships among entities


needs to bestored in the database?
 What are the constraints (or business rules) that (must) hold for the
entitiesand relationships?

• A database schema in the ER model can be represented


pictorially(Entity-Relationship diagram)

1.9.2 Entity Types, Entity Sets, Attributes and Keys


1.9.3 Key (super and candidate key) attributes of an Entity Type
1.9.4 Relationships, Relationship Types, and Relationship Sets
1.9.5 Constraints on Relationship Types (mapping cardinalities)
1.9.6 Enhanced ER Modeling Concepts
1.9.7 Steps in Designing an Entity-Relationship Schema

 [Step 1] Identify entity types (entity type vs. attribute)


 [Step 2] Identify relationship types
 [Step 3] Identify and associate attributes with entity and relationship types
 [Step 4] Determine attribute domains
 [Step 5] Determine primary key attributes for entity types
 [Step 6] Associate (refined) cardinality ratio(s) with relationship types
 [Step 7] Design generalization/specialization hierarchies including constraints
(includesnatural language statements as well)

1.9.8 Translation of ER Schema into Relational Mapping(Tables)

1.9.9 Summary of Conceptual Design


 Conceptual design follows requirements analysis, yields a high level description of
data to bestored (conceptual level).
 ER model is a popular model for conceptual design, constructs are expressive, close to
the waypeople think about applications; supported by many CASE tools.
 Basic constructs are entities, relationships, and attributes
 Some additional constructs: ISA hierarchies, cardinality ratios, .
 There are many variations on ER model constructs .
 Several kinds of integrity constraints can be expressed in the ER model: key
constraints,structural constraints, constraints on specializations
 Some of them can be expressed in SQL when translating entity and
relationship typesinto tables.
 Not all constraints can be expressed in the ER model
 Constraints play an important role in determining a good database
design for anapplication domain.
 ER design is subjective: There are many ways to model a given scenario! Analyzing
alternative schemas is important! Entity type vs. attribute, entity type vs. relationship
type, binary vs. n-aryrelationship type, use of IS-A, generalization and specialization, . .
.
 Ensuring a good database design includes analyzing and further refining relational schema
obtainedthrough translating ER schema.
DEPT OF CSE 24
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

1.10 MAPPING ER MODEL TO RELATIONAL MODEL.

The database can be represented using the notations, and these notations can be reduced to a
collection of tables.

In the database, every entity set or relationship set can be represented in tabular form.

The ER diagram is given below:

Converting the ER diagram to the table:


o Entity type becomes a table.

In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE forms individual tables.

o All single-valued attribute becomes a column for the table.

In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the column of STUDENT table.
Similarly, COURSE_NAME and COURSE_ID form the column of COURSE table and so on.

o A key attribute of the entity type represented by the primary key.

In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the key
attribute of the entity.
DEPT OF CSE 25
CS8492-DATABASE MANAGEMENT SYSTEMS UNITI

o The multivalued attribute is represented by a separate table.

In the student table, a hobby is a multivalued attribute. So it is not possible to represent multiple values in a
single column of STUDENT table. Hence we create a table STUD_HOBBY with column name
STUDENT_ID and HOBBY. Using both the column, we create a composite key.

o Composite attribute represented by components.

In the given ER diagram, student address is a composite attribute. It contains CITY, PIN, DOOR#, STREET,
and STATE. In the STUDENT table, these attributes can merge as an individual column.

o Derived attributes are not considered in the table.

In the STUDENT table, Age is the derived attribute. It can be calculated at any point of time by calculating
the difference between current date and Date of Birth.

Using these rules, you can convert the ER diagram to tables and columns and assign the mapping
between the tables. Table structure for the given ER diagram is as below:

DEPT OF CSE 26

You might also like