Design Database Main Module Simplified Tvet

Download as pdf or txt
Download as pdf or txt
You are on page 1of 238

Worabe Polly Technic college

Department ICT
Database Administration level III
Unit of competence: Design a database
LEARNING OUTCOMES:
• LO1 Determine database requirements
• LO2 Develop logical data model
• LO3 Design data structures
• LO4 Design queries, screens and reports
• LO5 Design access and security systems
• LO6 Confirm database design compiled by Aliyan A.
1
Introduction
• Data: What is data?
• Facts concerning people, objects, events or other entities.
• Can be in the form of text, graphics, sound and video segments
• They are difficult to interpret or make decisions based on
• Unprocessed, raw facts and can be stored in database

• Information: What is Information?


• Data presented in a form suitable for interpretation.
• Data processed to be useful in decision making.
• Processed data
• Can’t be stored in database

2
Introduction … cont’d
What is database?
• Database is a collection of related data and data is a collection of
facts and figures that can be processed to produce information.
• An organized collection of logically related data.
• A shared collection of logically interrelated data designed to meet
the varied information needs of an organization
• A shared collection – can be used simultaneously by many
department and user
• Logically related - comprise the important objects and the
relationships between these objects
• A computerized means of record keeping system
3
Introduction …. cont’d
A database has the following implicit properties:
• A database represents some aspect of the real world, sometimes called
the mini world or the Universe of Discourse (UoD).
• Changes to the mini world are reflected in the database.
• A database is a logically coherent collection of data with some inherent
meaning.
• A random assortment of data cannot correctly be referred to as a
database.
• A database is designed, built, and populated with data for a specific
purpose.

4
Introduction …. cont’d

Data
Course Section Semester Name Rank

MIS 3353 100 Su 01 Kemp Instr


MIS 3353 200 Su 01 Schwarzkopf Assoc P
MIS 3373 200 Su 01 Kemp Instr
MIS 4663 900 Fa 01 Schwarzkopf Assoc P
MIS 4663 901 Fa 01 Van Horn Prof

5
Introduction … cont’d
Meta Data: What do we mean by meta data?
Descriptions of the properties or characteristics of the data,
including data types, field sizes, allowable values, and
documentation
Data that describes data
Data about data
Description of fields
Display and format instructions
Structure of files and tables
Security and access rules
Triggers and operational rules

6
Introduction …. cont’d
Metadata
Data Item Value
Name Type Length Min Max Description
Course Char 7 Three digit depart-
ment reference
and 4 digit
Section Integer 3 001 900 Section number

Semester Char 10 Semester and year


Name Char 30 Instructor name
Rank Char 10 Instructor rank

7
Data management approaches
• Data management : keeping your data records
We have three approaches
• Manual Approach
• File-Based Approach
• Database Approach
Manual File Handling Systems
• The primitive and traditional way of information handling
• This may work well if the number of items to be stored is small.
• Includes intensive human labor
• Events and objects are written on files (paper)
• Each of the files containing various kinds of information is labeled and stored in one
or more cabinets
• The cabinets could be kept in safe places for security
8
Limitations of Manual File Handling
• Problem of Data Organization
• Problem of Efficiency
• Prone to error
• Difficult to update, retrieve, integrate
• You have the data but it is difficult to compile the
information
• Significant amount of duplication of data
• Cross referencing is difficult

9
File based Approach
• File based systems were an early attempt to
computerize the manual filing system.
• It is a decentralized computerized data handling method
i.e. to develop a program or a number of programs for
each different application.
• Since every application defines and manages its own
data, the system is subjected to serious data duplication
problem.
• File, in traditional file based approach, is a collection of
records which contains logically related data.
10
Limitations of File-Based systems
• Data Redundancy (Duplication of data)
• Same data is held by different programs
• Staffsalary(staffno, name, sex, salary)
• Staff(staffno,name,position,sex,dateofb,salary)
• Wasted space (Uncontrolled duplication of data)
• Separation and isolation of data: Each program maintains its own set of
data. Users of one program may be unaware of potentially useful data held
by other programs.
• Limited data sharing- No centralized control of data
• Data Inconsistency and confusion
• Data dependence: File structure is defined in the program code and is
dependent on the application programming language.
By Aliyan Aman 11
Limitations of File-Based systems .. Cont’d

• Incompatible file formats - Lack of data sharing and availability)


• Programs are written in different languages, and so cannot easily access
each others files.
E.g. Personnel write in C, Payroll writes in COBOL
• Poor Security and administration
• Update Anomalies
• Modification Anomalies
• Deletion Anomalies
• Insertion Anomalies

12
Database Approach
• Codd proposed that database systems should present the user with a view
of data organized as tables called relations.
• Database is just a computerized record keeping system or a kind of
electronic filing cabinet.
• Database is a repository for collection of computerized data files.
• Database is a shared collection of logically related data designed to meet the
information needs of an organization. Since it is a shared corporate resource,
the database is integrated with minimum amount of or no duplication.
• Database is a collection of logically related data where these logically related
data comprises entities, attributes, relationships, and business rules of an
organization's information.
• Database contains a description of the data which called as “Metadata” or
“Data Dictionary” or “Systems Catalogue” or “Data about Data”.
13
Cont..
• Since a database contains information about the data
(metadata), it is called a self descriptive collection on integrated
records.
• The purpose of a database is to store information and to allow
users to retrieve and update that information on demand.
• Database is deigned once and used simultaneously by many
users.
• That is the separation of the data definition from the
application. Thus the application is not affected by changes
made in the data structure and file organization.
• Each database application will perform the combination of:
Creating database, Reading, Updating and Deleting data.

14
Cont…
• a single repository of data is maintained.
• What emerged were the database and database management systems
• Basic Database terminologies
• Enterprise: an organization like library, bank, university, etc.
• Entity: Person, place, thing, or event about which we wish to keep data
• Attribute (Field): Property of an entity. E.g. Name, age, telephone, grade, sex,
etc.
• Record: A logically connected set of one or more Attributes that describe a person,
place or thing. (Logically related data)
• File: A collection of related records. E.g. Student file
• Relationship: an association among entities (entity records)
• Query: question asked for database

15
Benefits of Database systems
• Data can be shared: two or more users can access and use.
• Improved data accessibility: By using structured query languages, the
users can easily access data without programming experience.
• Redundancy can be reduced: Isolated data is integrated in database.
• Quality data can be maintained: the different integrity constraints in
the database approach will maintain the quality leading to better
decision making.
• Inconsistency can be avoided: controlled data redundancy will avoid
inconsistency of the data in the database to some extent.
• Transaction support can be provided: basic demands of any
transaction support systems are implanted in a full scale DBMS.

16
Benefits of Database systems … cont’d
• Integrity can be maintained: Data at different applications will be integrated
together with additional constraints.
• Security measures can be enforced: The shared data can be secured by data
security mechanisms.
• Improved decision support: the database will provide information useful for
decision making
• Standards can be enforced: ways of using different data by users
• Less Labor: data maintenance will not demand much resource
• Centralized information control: Since relevant data in the organization will
be stored at one repository, it can be controlled and managed at the central
level.
• Data Independence - Applications insulated from how data is structured and
stored

17
Limitations and risk of database approach
• Introduction of new professional and specialized personnel
• High cost to be incurred to develop and maintain the system
• Complex backup and recovery services from the users
perspective
• High impact on the system when failure occurs to the central
system
• Complex backup and recover services from the user’s
perspective.
• Reduced performance due to centralization and data
independency.
• High impact on the system when failure occurs to the central
system.
18
DBMS

• DBMS is a software package used for providing efficient, convenient


and safe multi-user storage of and access to massive amounts of
persistent data.
• Provides a systematic method for creating, updating, storing, retrieving
data in a database.
• DBMS also provides the service of controlling data access, enforcing
data integrity, managing concurrency control, and recovery.
• A full scale DBMS should at least have the following services to provide
to the user.
• Data storage, retrieval and update in the database
• A user accessible catalogue
• Transaction support service

19
DBMS … Cont’d
• Concurrency Control Services: access of database by different users
simultaneously
• Recovery Services: a mechanism for recovering from failure
• Authorization Services (Security): support the access authorization
• Support for Data Communication: support data transfer
• Integrity Services: rules about data and the change that took place on the
data, correctness and consistency of stored data
• Services to promote data independency between the data and the
application
• Utility services: sets of utility service facilities like
• Importing data
• Statistical analysis support
• Index reorganization
• Garbage collection
20
Some general functions of a DBMS:

• Designed to allow the definition, creation, querying, update, and


administration of databases
• Define rules to validate the data and relieve users of framing programs
for data maintenance
• Convert an existing database, or archive a large and growing one
• Run business applications, which perform the tasks of managing business
processes, interacting with end-users and other applications, to capture
and analyze data
• Some well-known DBMSs are Microsoft SQL Server, Microsoft Access,
Oracle, SAP, and others.

21
DBMS Language

1. Data Definition Language (DDL)


• Language used to define each data element required by the organization
• Commands for setting up schema of the database
• Used to set up a database, create, delete and alter table with the facility of handling
constraints
• Is used to define the internal and external schema
2. Data Manipulation Language (DML)
• Used for data manipulation
• Typical manipulations include retrieval, insertion, deletion, and modification of the
data.
• Since the required data or query by the user will be extracted using this type of
language, it is also called “Query Language”

22
DBMS Language … Cont’d
We have two types of DMLs:-
 Non-Procedural Manipulation Languages
 That allows the user to state what data is needed rather than how it is
to be retrieved.
E.g. SQL
 Procedural Data Manipulation Languages
 That allows the user to tell the system what data is needed and
exactly how to retrieve the data;

23
DBMS Language … Cont’d

How the Programmer Sees the DBMS


 Start with DDL to create tables
CREATE TABLE Students (
Name CHAR (30)
ID CHAR (9) PRIMARY KEY NOT NULL,
Category CHAR (20)) . . .
 Continue with DML to populate tables:
INSERT INTO Students
VALUES (‘Rahel’, ‘ICT 123’, ‘undergraduate’)
3. Data dictionary
• The data dictionary contains definitions of objects in the system
such as tables and table relationships and rules defined on
objects.

24
Components of DBMS
• DBMS have several components, each performing very significant tasks
in the database management system environment.
1. Software
• This is the set of programs used to control and manage the overall
database. This includes the DBMS software itself, the Operating
System, the network software being used to share the data among
users, and the application programs used to access data in the DBMS.
2. Hardware
• Consists of a set of physical electronic devices such as computers, I/O
devices, storage devices, etc., this provides the interface between
computers and the real world systems.
3. Data
• DBMS exists to collect, store, process and access data, the most
important component. The database contains both the actual or
operational data and the metadata.
25
Components of DBMS
4. Procedures
• These are the instructions and rules that assist on how to use the
DBMS, and in designing and running the database, using
documented procedures, to guide the users that operate and
manage it.
5. People
• This component includes all DBMS users. On the basis of primary
job functions, five types of users can be identified in a database
system:
• Systen administrator, database administrator, database designers,
system analysts and programmers, and end users.

26
Database Development Life Cycle

• The major steps in database design are;


1. Planning: That is identifying information gap in an organization and propose a
database solution to solve the problem.
2. Analysis: That concentrates more on fact finding about the problem or the
opportunity.
• Feasibility analysis, requirement determination and structuring, and selection
of best design method are also performed at this phase.
3. Design: in database designing more emphasis is given to this phase. The phase is
further divided into three sub-phases.
1. Conceptual Design: concise description of the data, data type,
relationship between data and constraints on the data.
• Used to elicit and structure all information requirements

27
Database Development Life Cycle … Cont’d

2. Logical Design: a higher level conceptual abstraction with selected


specific data model to implement the data structure.
• It is particular DBMS independent and with no other physical
considerations.
3. Physical Design: physical implementation of the upper level design of
the database with respect to internal storage and file structure to develop all
technology and organizational specification.
4. Implementation: coding, testing and deployment of the designed database for
use.
5. Operation and Support: administering and maintaining the operation of the
database system and providing support to users.

28
Users and Actors of Database System
• Actors on the scene: The people whose jobs involve the day-to-day use of a large
database
• Workers behind the scene: Those who work to maintain the database system
environment, but who are not actively interested in the database itself.

Actors on the Scene


• Database Administrators
• Database Designers
• End Users
• System Analysts and Application Programmers (Software Engineers)

29
Database Administrators
• In a database environment, the primary resource is the database itself
and the secondary resource is the DBMS and related software.
• Administering these resources is the responsibility of the Database
Administrator (DBA).
• The DBA is responsible for authorizing access to the database, for
coordinating and monitoring its use, and for acquiring software and
hardware resources as needed.
• The DBA is accountable for problems such as breach of security or
poor system response time.
• Involves in all steps of database development

30
• We can have further classifications of this role in big organizations having
huge amount of data and user requirement.
1. Data Administrator (DA): is responsible on management of data
resources.
• It involves in database planning, development, maintenance of
standards policies and procedures at the conceptual and logical design
phases.
2. Database Administrator (DBA): is more technically oriented role.
He/she is Responsible for the physical realization of the database.
It involves in physical design, implementation, security and integrity
control of the database.

31
Database Designer
• Database designers are responsible for identifying the data to be stored in
the database and for choosing appropriate structures to represent and store
this data.
• It is the responsibility of database designers to communicate with all
prospective database users, in order to understand their requirements, and to
come up with a design that meets these requirements.
• In many cases, the designers are on the staff of the DBA and may be assigned
other staff responsibilities after the database design is completed.
• The final database design must be capable of supporting the requirements of
all user groups.
• They identify and design the whole set of entities, relations, constraints, and
views.

32
End Users

• End users are the people whose jobs require access to the database for querying,
updating, and generating reports;
• The database primarily exists for their use. There are several categories of end
users:
• Casual end users:- occasionally access the database. They are typically middle
or high-level managers or other occasional browsers.
• Naive or parametric end users:- Their main job revolves around constantly
querying and updating the database, using standard types of queries and updates
called canned transactions that have been carefully programmed and tested.
• Bank tellers check account balances and post withdrawals and deposits
• Reservation clerks for airlines, hotels, and car rental companies check
availability for a given request and make reservations

33
End Users … Cont’d
• Sophisticated end users: Include engineers, scientists,
business analysts, and others who thoroughly familiarize
themselves with the facilities of the DBMS so as to
implement their applications to meet their complex
requirements.
• Stand-alone users: Maintain personal databases by using
ready made program packages that provide easy to use
menu or graphics based interfaces.
• An example is the user of a tax package that stores a
variety of personal financial data for tax purposes.

34
System Analysts and Application Programmers (Software Engineers)
• System analysts: Determine the requirements of end users,
especially naive and parametric end users, and develop
specifications for canned transactions that meet these
requirements.
• Application programmers implement these specifications as
programs; then they test, debug, document, and maintain these
canned transactions.
• Such analysts and programmers (nowadays called software
engineers) should be familiar with the full range of capabilities
provided by the DBMS to accomplish their tasks.

35
Workers behind the Scene
• These persons are typically not interested in the database itself.
These include:
• DBMS system designers and implementers:-are persons who
design and implement the DBMS modules and interfaces as a
software package.
• A DBMS is a complex software system that consists of many
components or modules, including modules for implementing the
catalog, query language, interface processors, data access,
concurrency control, recovery, and security.
• The DBMS must interface with other system software, such as
the operating system and compilers for various programming
languages

36
Workers behind the Scene … cont’d
• Tool developers: Include persons who design and implement
tools
• Tools are software packages that facilitate database system
design and use, and help improve performance.
• Tools are optional packages that are often purchased separately.
• They include packages for database design, performance
monitoring, natural language or graphical interfaces,
prototyping, simulation, and test data generation.
• Operators and maintenance personnel: are the system
administration personnel who are responsible for the actual
running and maintenance of the hardware and software
environment for the database system.

37
Some Common uses of Databases
In a university
• Containing information about a student, the course she/he is enrolled in, the
dormitory she/he has been given.
• Containing details of Staff who work at the university at personnel, payroll,
etc.
In a library
• There may be a database containing details of the books in the library and
details of the users,
• The database system handles activities such as
• Allowing a user to reserve a book
• Notifying when materials are overdue

38
Some Common uses of Databases … Cont’d
In travel agencies
• When you make inquiries about a travel, the travel agent may access databases
containing flight details
• Flight no., date, time of departure, time of arrival
Insurance
• When you wish to take out insurance, there is database containing
• Your personal details: name, address, age
• information on whether you drink or smoke,
• Your medical records to determine the cost of the insurance
Supermarkets
• When you buy goods from some supermarkets, a database will be accessed.
• The checkout assistant will run a barcode reader over the purchases.

39
LO2:Develop logical data model
Logical Database Design
•Is the process of constructing a model of the information used in
enterprise based on a specific data model. but independent of a particular
DBMS and other physical considerations.
•Logical Data Base Design is a process of modeling and capturing the end-
user views of an application domain and synthesis them into a data base
structure.
•Normalization is a logical data base design method.
•The basis for normalization is the functional dependencies among
attributes in a table.

40
Database Models
• A data model is a collection of tools or concepts for describing data, the
meaning of data, data relationships, and data constraints.
• is a collection of conceptual tools for describing data, data relationships,
data semantics, and consistency constraints.
• A model is a representation of real world objects and events and their
associations.
• A data model is independent of hardware or software constraints. Rather than
try to represent the data as a database would see it, the data model focuses on
representing the data as the user sees it in the "real world".
Uses of data model
The main purpose of Data Model is to represent the data in understandable
way.
Helps in the visual representation of data and enforces business rules,
regulatory compliances, and government policies on the data.
It serves as a bridge between the concepts that make up real-world events
and processes and the physical representation of those concepts in a
database.
Ensure consistency in naming conventions, default values, semantics,
security while ensuring quality of the data.
emphasizes on what data is needed and how it should be organized instead
of what operations need to be performed on the data.
Data Model is like architect's building plan which helps to build a
conceptual model and set the relationship between data items.
The primary goal of using data model are:
Ensures that all data objects required by the database are accurately
represented. Omission of data will lead to creation of faulty reports and
produce incorrect results.
A data model helps design the database at the conceptual, physical and
logical levels.
Data Model structure helps to define the relational tables, primary and
foreign keys and stored procedures.
It provides a clear picture of the base data and can be used by database
developers to create a physical database.
It is also helpful to identify missing and redundant data.
Though the initial creation of data model is labor and time consuming, in
the long run, it makes your IT infrastructure upgrade and maintenance
cheaper and faster.
Database Models … Cont’d
• Data model can be divided into four:
1. Hierarchical Model
2. Network Model
3. Object oriented Model
4. Relational data Model
Hierarchical Model
• Consists of an ordered set of trees in a parent child mode.
• A parent node can have more than one child node and a child node should
have only one parent
• Connection between child and its parent is called a Link.
• The simplest data model
• Record type is referred to as node or segment
• The top node is the root node
• The relationship between parent and child can be either 1-1 or 1-M
• To add new record type or relationship, the database must be redefined and
then stored in a new form.
Hierarchical Model… Cont’d
Hierarchical Model… Cont’d
Advantage of Hierarchical Model
• Good for tree type problem (e.g. Family Tree Problem)
• Language is simple; uses constructs like GET, GET UNIQUE, GET
NEXT, GET NEXT WITHIN PARENT etc.
Disadvantages of Hierarchical Model
• Addition, deletion, and search operations are very difficult.
• There is duplication of data.
• Navigational and procedural nature of processing
• Database is visualized as a linear arrangement of records
• Little scope for "query optimization"
Network Model

• Allows record types to have more that one parent unlike


hierarchical model
• A network data models sees records as set members
• Each set has an owner and one or more members
• Allow no many to many relationship between entities
• Like hierarchical model network model is a collection of
physically linked records.
• Allow member records to have more than one owner
Network Model … Cont’d
Network Model … Cont’d
Advantage of Network Data Model:
• Network Model is able to model complex relationships and represents
semantics of add/delete on the relationships.
• Can handle most situations for modeling using record types and
relationship types.
• Language is navigational; uses constructs like FIND, FIND member, FIND
owner, FIND NEXT within set, GET etc.
Disadvantages of Network Data Model:
• Navigational and procedural nature of processing
• Database contains a complex array of pointers that thread through a set of
records.
• Little scope for automated "query optimization”
Object Oriented Model
• The OO approach of defining objects that can be used in many
programs is now also being applied to database systems.
• An object can have properties (or attributes) but also behaviour,
which is modelled in methods (functions) in the object.
• In an OO database , each type of object in the database’s mini-
world is modelled by a class i.e. (Customer class, Account class
) like tables in the relational model.
• A class has properties (attributes).
• A class also has methods that are stored with the class definition.
The Object Oriented Model
• Modeled both data and their relationships in a single structure
known as an object
• Object-oriented data model (OODM) is the basis for the object-
oriented database management system (OODBMS)
• OODM is said to be a semantic data model
• Object is an abstraction of a real-world entity
• Attributes describe the properties of an object
• Objects that share similar characteristics are grouped in classes
• Classes are organized in a class hierarchy
• Inheritance is the ability of an object within the class hierarchy
to inherit the attributes and methods of classes above it
Compiled by Aliyan A. 52
Object Oriented Model … Cont’d

• One advantage of the OO model is sub-classes. As there are


different types of account, they can be modelled as sub-classes
of the Account class.
• SavingsAccount and CurrentAccount.
• This makes sense because the different account types have
some different behaviour e.g. gaining interest in a savings
account but some behaviour the same e.g. lodging or
withdrawing cash.
• This is the inheritance concept of OO programming.
Object Oriented Model … Cont’d
Diagram – class name at the top, properties in the middle, methods at the bottom.
The Object Oriented Model (continued)

Compiled by Aliyan A. 55
Relation Model
• Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational
Model for Large Shared Data Banks')
• Terminologies originates from the branch of mathematics called set theory
and relation
• Can define more flexible and complex relationship
• Viewed as a collection of tables called “Relations” equivalent to collection
of record types
• Relation: Two dimensional table
• Stores information or data in the form of tables ( rows and columns)
• A row of the table is called tuple which is equivalent to record
• A column of a table is called attribute which is equivalent to fields
• Data value is the value of the Attribute
Relational Model … Cont’d
• Records are related by the data stored jointly in the fields of records in two
tables or files. The related tables contain information that creates the
relation
• The tables seem to be independent but are related some how.
• No physical consideration of the storage is required by the user
• Many tables are merged together to come up with a new virtual view of the
relationship
Relational Model … Cont’d
Relational Data model (also called the second generation data
model), describes entities and their relationships in the form of table

Entity : Student Entity: course


Id Name Courseno Course no. Course_title Credit_
hours
123 Abebe K. INST 321
INST 321 Database Systems 3
234 Almaz M. INST 205
INST 205 Introduction to ICT 3
456 Abebe W. INST 321
Relational Model Concepts
• The relational model represents the database
as a collection of relations.
• Informally, each relation resembles a table
of values or, to some extent, a flat file of
records.
• It is called a flat file because each record has
a simple linear or flat structure.
Relational Data Model
Properties of Relational Databases
• Each row of a table is uniquely identified by a PRIMARY KEY composed
of one or more columns
• Each tuple in a relation must be unique
• Group of columns, that uniquely identifies a row in a table is called a
CANDIDATE KEY
• ENTITY INTEGRITY RULE of the model states that no component of
the primary key may contain a NULL value.
• A column or combination of columns that matches the primary key of
another table is called a FOREIGN KEY.
• FOREIGN KEY is Used to cross-reference tables.
Compiled by Aliyan A. 60
Properties of Relational Databases … Cont’d
• The REFERENTIAL INTEGRITY RULE of the model states
that, for every foreign key value in a table there must be a
corresponding primary key value in another table in the database
or it should be NULL.
• All tables are LOGICAL ENTITIES
• A table is either a BASE TABLES (Named Relations) or VIEWS
(Unnamed Relations)
• Only Base Tables are physically stores
• VIEWS are derived from BASE TABLES with SQL instructions
like: [SELECT .. FROM .. WHERE .. ORDER BY]

Compiled by Aliyan A. 61
Properties of Relational Databases … Cont’d
• Is the collection of tables
• Each entity in one table
• Attributes are fields (columns) in table
• Order of rows and columns is immaterial
• Entries with repeating groups are said to be un-
normalized
• Entries are single-valued
• Each column (field or attribute) has a distinct name
• All values in a column represent the same attribute and
have the same data format
Compiled by Aliyan A. 62
Building Blocks of the Relational Data Model
The building blocks of the relational data model are:
• Entities: real world physical or logical object
• Attributes: properties used to describe each Entity
or real world object.
• Relationship: the association between Entities
• Constraints: rules that should be obeyed while
manipulating the data.

Compiled by Aliyan A. 63
Entity
An Entity is a thing in real-world (an object with a physical existence or an
object with a conceptual existence).
person/employee, car, book, house, employee, etc are example of objects
with physical existence
A project, course, payment, department, etc are examples of objects with
object with a conceptual existence.
Entity can be roles, events, tangible things or concepts.
Each entity must have its own identity that distinguishes it from every other
entity, called unique identifier.
• Example: Each Student has a unique ID that distinguishes one from others.
• Generally in database terms, an entity is a table which is responsible for
storing records in the database.
• In E-R Diagram, an entity is represented by a rectangle.
Compiled by Aliyan A. 64
ENTITIES
• The ENTITIES (persons, places, things etc.) which the
organization has to deal with.
• The name given to an entity should always be a singular noun
descriptive of each item to be stored in it.
E.g. student not students.
• Every relation has a schema, which describes the columns, or
fields
• The relation itself corresponds to our familiar notion of a table.
• A relation is a collection of tuples, each of which contains
values for a fixed number of attributes.
Compiled by Aliyan A. 65
Types of entities
• Entity types that do not have key attributes of their own are called weak
entity types.
• In contrast, regular entity types that do have a key attribute are called
strong entity types.
• Entities belonging to a weak entity type are identified by being related to
specific entities from another entity type in combination with one of their
attribute values.
• A weak entity type normally has a partial key, which is the attribute that
can uniquely identify weak entities that are related to the same owner
entity.
• We call this other entity type the identifying or owner entity type, and we
call the relationship type that relates a weak entity type to its owner the
identifying relationship of the weak entity type.
Compiled by Aliyan A. 66
Types of entities cont…
• A weak entity type always has a total participation constraint
(existence dependency) with respect to its identifying
relationship because a weak entity cannot be identified
without an owner entity.
• However, not every existence dependency results in a weak
entity type.
• For example, a DRIVER_LICENSE entity cannot exist unless it is
related to a PERSON entity, even though it has its own key
(License_number) and hence is not a weak entity.
• In ER diagrams, both a weak entity type and its identifying
relationship are distinguished by surrounding their boxes and
diamonds with double lines.
• The partial key attribute is underlined with a dashed or dotted
line.
Compiled by Aliyan A. 67
ATTRIBUTES
• The ATTRIBUTES - the items of information which
characterize and describe these entities.

• Attributes are pieces of information about entities.

• The analysis must of course identify those which are actually


relevant to the proposed application.

• Attributes will give rise to recorded items of data in the


database
Compiled by Aliyan A. 68
ATTRIBUTES … Cont’d
At this level we need to know such things as:
• Attribute name (be explanatory words or phrases)
• Domain: is a set of values from which attribute values may be taken.
• Each attribute has values taken from a domain.
• For example, the domain of Name is string and that for salary is
real
• Whether the attribute is part of the entity identifier (attributes which
just describe an entity and those which help to identify it uniquely)
• Whether it is permanent or time-varying (which attributes may
change their values over time)
• Whether it is required or optional for the entity (whose values will
sometimes be unknown or irrelevant)

Compiled by Aliyan A. 69
Types of Attributes
1. Simple (atomic) Vs Composite Attributes
• Simple : contains a single value (not divided into sub parts)
• E.g. Age, gender
• Composite: Divided into sub parts (composed of other attributes)
• E.g. Name, address
2. Single-valued Vs multi-valued Attributes
• Single-valued : have only single value(the value may change but has only
one value at one time)
• E.g. Name, Sex, Id. No. color_of_eyes
• Multi-Valued: have more than one value
• E.g. Address, dependent-name
• Person may have several college degrees
Compiled by Aliyan A. 70
Types of Attributes … Cont’d
3. Stored Vs. Derived Attribute
• Stored : not possible to derive or compute E.g. Name, Address
• Derived: The value may be derived (computed) from the values of other
attributes.
• E.g. Age (current year – year of birth), Length of employment (current
date- start date), Profit (earning-cost) , G.P.A (grade point/credit hours)
4. Null Values
• NULL applies to attributes which are not applicable or which do not have
values.
• You may enter the value NA (meaning not applicable)
• Value of a key attribute can not be null.
Default value - assumed value if no explicit value
Compiled by Aliyan A. 71
Understanding relationships
• is an association that exist b/n two or more participating entities.
• Related entities require setting of LINKS from one part of the database to
another.
• A relationship should be named by a word or phrase which explains its function
• Role names are different from the names of entities forming the relationship: one
entity may take on many roles, the same role may be played by different entities.
• An important point about a relationship is how many entities participate in it.

Compiled by Aliyan A. 72
DEGREE of the relationship
• The number of entities participating in a relationship is called the DEGREE of
the relationship.
• The number of entity sets that are connected through the relationship in question
is called the degree of relationship.
• Unary/Recursive Relationship: Tuples/records of a Single entity are related
withy each other.
• Binary Relationships: Tuples/records of two entities are associated in a
relationship.
• Ternary Relationship: Tuples/records of three different entities are
associated.
• N-Nary Relationship: Tuples from arbitrary number of entity sets are
participating in a relationship.

Compiled by Aliyan A. 73
DEGREE of the relationship cont..
manager

Unary relationship Employee works for

worker

Binary relationship date

Client orders Book

Compiled by Aliyan A. 74
Ternary Relationship Name
proj_name

Project Hires Contractor


location
Address

budget
end_date
start_date
end_date
Contract

number start_date

value

Compiled by Aliyan A. 75
N-Nary Relationship

Compiled by Aliyan A. 76
CARDINALITY of the relationship
• Is the number of instances participating or associated with
a single instance from another entity in a relationship.
• ONE-TO-ONE:one tuple of an entity is associated with only one
other entity tuple. , e.g. Building – Location.
• ONE-TO-MANY:one tuple can be associated with many other tuples,
but not the reverse. e.g. hospital – patient.
• MANY-TO-ONE: many tuples are associated with one
tuple but not the reverse. e.g. Employee - Department
• MANY-TO-MANY: one tuple is associated with many other tuples
and from the other side, with a different role name one tuple will be
associated with many tuples . e.g. Author - Book.
Compiled by Aliyan A. 77
CARDINALITY of the relationship cont…

Compiled by Aliyan A. 78
Relational Integrity
• Domain Integrity: No value of the attribute should be beyond
the allowable limits
• Entity Integrity: In a base relation, no attribute of a primary key
can be null
• Referential Integrity: If a foreign key exists in a relation, either
the foreign key value must match a candidate key in its home
relation or the foreign key value must be null foreign key to
primary key match-ups
• Enterprise Integrity: Additional rules specified by the users or
database administrators of a database are incorporated

Compiled by Aliyan A. 79
Key constraints
• If tuples are need to be unique in the database, and then we need to make
each tuple distinct. To do this we need to have relational keys.
• Super Key: an attribute or set of attributes that uniquely identifies a tuple
within a relation. An attribute or a combination of attribute that is used to
identify the records uniquely is known as Super Key. A table can have many
Super Keys. Examples of Super Keys:
• ID
• ID, Name
• ID, Address
• ID, Department_ID
• ID, Salary
• Name, Address
• Name, Address, Department_ID and so on.
• As any combination which can identify the records uniquely will be a Super
Key. Compiled by Aliyan A. 80
Candidate Key
• Is a super key such that no proper subset of that collection is a Super Key within
the relation. It can be defined as minimal Super Key or irreducible Super Key. In
other words an attribute or a combination of attribute that identifies the record
uniquely but none of its proper subsets can identify the records uniquely.
• Examples of Candidate Key: Code , Name, Address
• For above table we have only two Candidate Keys (i.e. Irreducible Super Key)
used to identify the records from the table uniquely.
• Code Key can identify the record uniquely and similarly combination of Name
and Address can identify the record uniquely, but neither Name nor Address can
be used to identify the records uniquely as it might be possible that we have two
employees with similar name or two employees from the same house.
• A candidate key has two properties:
1. Uniqueness
2. Irreducibility
• If a candidate key consists of more than one attribute it is called composite
key.

Compiled by Aliyan A. 81
Relational Constraints/Integrity Rules .. Cont’d
• Primary Key: the candidate key that is selected to identify tuples
uniquely within the relation.
• The entire set of attributes in a relation can be considered as a
primary case in a worst case.

• Foreign Key: an attribute, or set of attributes, within one relation that


matches the candidate key of some relation.
• A foreign key is a link between different relations to create the view
or the unnamed relation
Compiled by Aliyan A. 82
Database Design
Database design consists of several tasks:
• Requirements Analysis,
• Conceptual Design, and Schema Refinement,
• Logical Design,
• Physical Design and Tuning
• In general, one has to go back and forth between these tasks to refine a
database design, and decisions in one task can influence the choices in
another task.
• In developing a good design, one should ask:
• What are the important queries and updates?
• What attributes/relations are involved?
Compiled by Aliyan A. 83
The Three levels of Database Design
• Constructing a model independent of any physical considerations.
• After the completion of Conceptual Design one has to go for refinement
Conceptual
Design
of the schema, which is verification of Entities, Attributes, and
Relationships
• Constructing a model of the information used in an enterprise
based on a specific data model (e.g. relational, hierarchical or
Logical
Design
network or object), but independent of a particular DBMS and
other physical considerations.
• Producing a description of the implementation of the database on
secondary storage.
• Describes the storage structures and access methods used to achieve
Physical
efficient access to the data.
Design • Tailored to a specific DBMS system
• Characteristics are function of DBMS and operating systems
• Includes estimate of storage
Compiledspace
by Aliyan A. 84
Conceptual Database Design
• Conceptual design revolves around discovering and analyzing
organizational and user data requirements
• The important activities are to identify
• Entities
• Attributes
• Relationships
• Constraints
• And based on these components develop the ER model using
• ER diagrams
• Entity-Relationship modeling is used to represent conceptual view
of the database
• ER uses four main components of ER Modeling.
Compiled by Aliyan A. 85
The Entity Relationship (E-R) Model … Cont’d
Before working on the conceptual design of the database, one has to know and answer
the following basic questions.

• What are the entities and relationships in the enterprise?

• What information about these entities and relationships should we store in the
database?

• What are the integrity constraints that hold?

• Constraints on each data with respect to update, retrieval and store.

• Represent this information pictorially in ER diagrams, then map ER diagram into a


relational schema Compiled by Aliyan A. 86
Developing an E-R Diagram
• Designing conceptual model for the database is not a one linear process
but an iterative activity where the design is refined again and again.
• To identify the entities, attributes, relationships, and constraints on the
data, there are different set of methods used during the analysis phase.
• These include information gathered by…
• Interviewing end users individually and in a group
• Questionnaire survey
• Direct observation
• Examining different documents
Compiled by Aliyan A. 87
Developing an E-R Diagram … Cont’d
• The basic E-R model is graphically depicted and presented for review.
• The process is repeated until the end users and designers agree that the E-R
diagram is a fair representation of the organization’s activities and
functions.
• Checking for Redundant Relationships in the ER Diagram.
• Relationships between entities indicate access from one entity to another.
• It is therefore possible to access one entity occurrence from another entity
occurrence even if there are other entities and relationships that separate
them.
• This is often referred to as Navigation' of the ER diagram
• The last phase in ER modeling is validating an ER Model against
requirement of the user.
Compiled by Aliyan A. 88
Graphical Representations in ER Diagramming
• Entity is represented by a RECTANGLE containing the name of the
entity.

• Connected entities are called relationship participants


• Attributes are represented by OVALS and are connected to the entity
by a line.

Compiled by Aliyan A. 89
Graphical Representations in ER Diagramming .. Cont’d
• A derived attribute is indicated by a DOTTED LINE.
(……..)

• PRIMARY KEYS are underlined.

• Relationships are represented by DIAMOND shaped symbols

Compiled by Aliyan A. 90
Graphical Representations in ER Diagramming .. Cont’d

Example 1 : Build an ER Diagram for the following information:


• Students
• Have an Id, Name, Dept, Age, Gpa
• Courses
• Have an Id, Name, Credit Hours
• Students enroll in courses and receive a grade

Compiled by Aliyan A. 91
Graphical Representations in ER Diagramming .. Cont’d

Compiled by Aliyan A. 92
ER diagramming
• EXAMPLE2: Design an ER diagram for the following information.illustrate
the major rules in mapping ER to relational schema:
• Employee ,Department and Project information.
• Employee: Eid, Name,Salary,Tel.
• Department: Did, Dname, Dloc.
• Project: Pid, Pname, Pfund
Business rule: Employee works for department;
• where an employee might be assigned to manage a department.
• Employee might participate on different projects with in the
organization.
• An employee might as well be assigned to lead a project where the
starting and ending date of his/her project leadership and bonus will be
registered.
Compiled by Aliyan A. 93
Solution for example 2

Compiled by Aliyan A. 94
Assignment
• Build an ER Diagram for the following information:
A Personnel record management system will have the
following two basic data object categories with their
own features or properties. Employee will have an Id,
Name, DoB, Age, Tel and Department will have an Id,
Name, Location
Whenever an Employee is assigned in one
Department, the duration of his stay in the
respective department should be registered.
Compiled by Aliyan A. 95
Entity versus Attributes
• Consider designing a database of employees for an organization:
• Should address be an attribute of Employees or an entity (connected to
Employees by a relationship)?
• If we have several addresses per employee, address must be an
entity (attributes cannot be set-valued/multi valued)
• If the structure (city, Woreda, Kebele, etc) is important.
E.g. want to retrieve employees in a given city, address must be modeled
as an entity (attribute values are atomic)
• Cardinality on Relationship expresses the number of entity
occurrences/tuples associated with one occurrence/tuple of related entity.

Compiled by Aliyan A. 96
More on ER diagramming
• Example 3: Create a simple ER diagram for a STUDENT
database. What is the requirement of this database?
• ‘Student attends class. Each class is divided into one or more
sections. Each class will have its own specified subjects.
Students have to attend all the subjects of the class that he
attends’.
• Identify what are the entities?
• Identify corresponding attributes of each entity and their
types.

Compiled by Aliyan A. 97
Solution
STUDENT, CLASS, SECTION, SUBJECT are the entities.
Attributes of these entities are not specified here. But we know what
could be the entities of each of the entities. We can list them as below at
this point of time.
STUDENT CLASS SECTION SUBJECT
STUDENT_ID CLASS_ID SECTION_ID SUBJECT_ID
STUDENT_NAME CLASS_NAME SECTION_NAME SUBJECT_NAME
ADDRESS
DOB
AGE
CLASS_ID
SECTION_ID
Compiled by Aliyan A. 98
What are the relationships we have? ‘Attends’, ‘has section’, ‘have subjects’
and ‘studies subjects’ are the relations here.
With this knowledge of requirement, we can draw the ER diagram as below.

Compiled by Aliyan A. 99
Cont…
• Observe the diagram carefully. Did we miss or drew it correctly? Are we missing
anything on the diagram? Is it inferring correct requirement? What are our
observations?
• Age attribute can be derived from DOB. Hence, we have to draw dashed oval.
• Address is a composite attribute. We have to draw its sub attributes too. So that
we will be very clear about his address details.
• If we see the SECTION entity, by section id, will we be able get the section that
student attends? There is no relation mentioned between Student and Section.
But Section is mapped only with Class. What do we understand from this?
Section is a weak entity. Hence we have to represent it properly.
• If we look at ‘attends’ relationship between STUDENT and CLASS, we can have
‘Joining Date’ and ‘Total Number of Hours’ attributes. But it is an attribute of
relation. We have to show them in the diagram.
• Since each class is having different subjects and Students attends those subjects,
we can modify the relation ‘studies’ to ‘has’ relation on the relation ‘attends’.
Compiled by Aliyan A. 100
Now the diagram will change to reflect all above points.

Fig: Example ER Diagram with Strong Entities, Weak Entities, and Simple attribute, Composite Attribute,
Keys and Relationships etc. Is the above diagram is a complete diagram? No, we have not added the
cardinality and participation in the diagram. Compiled by Aliyan A. 101
Participation Constraints
• From the above diagram all the Students attend any one of the
class, but class can have only certain group of students. Hence
total participation of Students and partial participation of class in
‘Attends’ relation.
• All the class has section and all the section has class. Hence both
are total participation.
• All the Students study some of the subject’s specific for their class
and each class has only some group of subjects.
• Hence partial participation of both STUDENT and CLASS. Each
subject will be studied by some students and it will be part of
some class. Hence this also partial participation.
Compiled by Aliyan A. 102
Cardinalities of All the Relationship
• Each Student attends only one class at a time. Hence it is
a 1: 1 relation.
• Each class has one or more sections. Hence it can be
considered as 1: N relation.
• Each student attends many subjects and each class has
many subjects. Hence it is a 1: N relation.
• Note: If you look at STUDENT and CLASS relationship as
many Students attend one class, then it would be an M: 1
relation.
Compiled by Aliyan A. 103
Cont…

Compiled by Aliyan A. 104


Participation Constraints
• Participation constraints define the least number of
relationship instances in which an entity must compulsorily
participate.
Two Types of Participation Constraints-
• Total participation
• Partial participation

Compiled by Aliyan A. 105


Total Participation
• It specifies that each entity in the entity set must compulsorily participate
in at least one relationship instance in that relationship set. Every entity in
the entity set participates in at least one relationship in the relationship set.
• That is why, it is also called as mandatory participation.
• Total participation is represented using a double line between the entity
set and relationship set.

Example

•Double line between the entity set “Student” and relationship set “Enrolled in” signifies
total participation.
•It specifies that each student must be enrolled inA. at least one course.
Compiled by Aliyan 106
Total participation cont…
Participation of EMPLOYEE in “belongs to” relationship with
DEPARTMENT is total since every employee should belong to a department.

Participation of EMPLOYEE in “manages” relationship with DEPARTMENT,


DEPARTMENT will have total participation but not EMPLOYEE

Compiled by Aliyan A. 107


Partial Participation
It specifies that each entity in the entity set may or may not
participate in the relationship instance in that relationship set.
•That is why, it is also called as optional participation.
•Partial participation is represented using a single line
between the entity set and relationship set.

Example

•Single line between the entity set “Course” and relationship set “Enrolled in”
signifies partial participation.
•It specifies that there might exist some courses for which no enrollments are
made.
Compiled by Aliyan A. 108
Partial participation cont…

Some entities may not participate in any relationship in the relationship


set.
E.g.: Participation of EMPLOYEE in “manages” relationship with
DEPARTMENT, EMPLOYEE will have partial participation since not
all employees are managers.

Compiled by Aliyan A. 109


Relationship between Cardinality and Participation Constraints

• Minimum cardinality tells whether the participation is partial


or total.
• If minimum cardinality = 0, then it signifies partial
participation.
• If minimum cardinality = 1, then it signifies total
participation.
• Maximum cardinality tells the maximum number of entities
that participates in a relationship set.

Compiled by Aliyan A. 110


Entity versus Attributes … Cont’d

• Existence Dependency: the dependence of an entity on the


existence of one or more entities.
• Weak entity: an entity that can not exist without the entity
with which it has a relationship
• Participating entity in a relationship is either optional or
mandatory.

Compiled by Aliyan A. 111


Problem in ER Modeling
• The Entity-Relationship Model is a conceptual data model that
views the real world as consisting of entities and
relationships.
• The model visually represents these concepts by the Entity-
Relationship diagram.
• While designing the ER model one could face a problem on
the design which is called a connection traps.
• Connection traps are problems arising from misinterpreting
certain relationships

Compiled by Aliyan A. 112


Enhanced E-R (EER) Models
• Object-oriented extensions to E-R model
• EER is important when we have a relationship between two
entities and the participation is partial between entity
occurrences. In such cases EER is used to reduce the complexity
in participation and relationship complexity.
• ER diagrams consider entity types to be primitive objects
• EER diagrams allow refinements within the structures of entity
types
EER Concepts
• Generalization
• Specialization
Compiled by Aliyan A. 113
Generalization
• Generalization occurs when two or more entities represent categories of
the same real-world object.
• Generalization is the process of defining a more general entity type from a
set of more specialized entity types.
• A generalization hierarchy is a form of abstraction that specifies that two
or more entities that share common attributes can be generalized into a
higher level entity type.
• Is considered as bottom-up definition of entities.
• Generalization hierarchy depicts relationship between higher level
superclass and lower level subclass.
• Generalization hierarchies can be nested. That is, a subtype of one
hierarchy can be a super type of another. The level of nesting is limited
only by the constraint of simplicity.
Compiled by Aliyan A. 114
Generalization … Cont’d
• Example: Account is a generalized form for Saving and Current
Accounts

Compiled by Aliyan A. 115


Specialization
• Is the result of subset of a higher level entity set to form a lower level
entity set.
• The specialized entities will have additional set of attributes
(distinguishing characteristics) that distinguish them from the
generalized entity.
• Is considered as Top-Down definition of entities.
• Specialization process is the inverse of the generalization process.
• Identify the distinguishing features of some entity occurrences, and
specialize them into different subclasses.
• Reasons for Specialization
• Attributes only partially applying to super classes
• Relationship types only partially applicable to the superclass 116
Compiled by Aliyan A.
Specialization … Cont’d
• In many cases, an entity type has numerous sub-groupings of its
entities that are meaningful and need to be represented explicitly.
• This need requires the representation of each subgroup in the ER
model.
• The generalized entity is a superclass and the set of specialized
entities will be subclasses for that specific Superclass.
• Example: Saving Accounts and Current Accounts are
Specialized entities for the generalized entity Accounts.
Manager, Sales, Secretary: are specialized employees.
Compiled by Aliyan A. 117
Database Design
• Database design is the process of coming up with different kinds of specification for
the data to be stored in the database.
• Describe how data is stored in the computer system.
• Defining its structure, characteristics and contents of data.
Database Design Process
Step 1:- Requirements collection and Analysis
• Prospective users are interviewed to collect information.
• This step result in a concise set of users requirement.
• The functional requirement should be specified as well as data requirements.
• Functional requirement can be documented using diagrams such as sequence diagrams,
DFD scenarios.

Compiled by Aliyan A. 118


Database Design … Cont’d
Step 2:- Conceptual Design
• Create conceptual schema.
• Conceptual schema:- concise description of data requirement of the user, and include a
detailed description of the entity types, relationships, constraint.
• End user must understand it.
Step 3:- Database implementation (Logical Design)
• Use one of DBMS for implementation.
• The conceptual schema is transformed from the high level data model into
implementation model.
Step 4:- Physical Design
• International storage structure, indexes, access paths and file organizations are specified.
• Application programs are designed and implemented.
Compiled by Aliyan A. 119
Database Design … Cont’d
• Generally the design part has divided into three sub phases
• Conceptual Design
• Logical Design
• Physical Design
Design strategies:
1. Top Down
• Start with high level abstraction and refine it.
• High level entities
• Add sub classes
• Attributes
2. Bottom up
• Start with basic abstraction and then combine them.
• Attributes
• Group in to entities
Compiled by Aliyan A. 120
Database Design … Cont’d
3. Inside out
• Special case of bottom up.
• Focus on central set of concepts and work out wards.
• No burned on initial designer.
4. Mixed
• Start with top down then use inside out or bottom up.
• Divide and conquer.

Compiled by Aliyan A. 121


Conceptual Design
• Is the process of constructing a model of the information used in an enterprise,
independent of any physical consideration.
• Is the source of information for logical design.
• Is high level and understand by non technical user.
• Conceptual model of enterprise, independent of implementation detail such as target
DBMS, application programs, programming language, hardware platform, performance
issues etc.
Tasks to be performed:-
• Identity entity types and relationships.
• Associate attributes with entities.
• Determine attribute domains.
• Determine unique identifier (Key) attributes.
• Use entity relationship model (ER).
Compiled by Aliyan A. 122
Conceptual Design … Cont’d
Why conceptual model:
• Independent of DBMS.
• Allow easy common b/n user and developer.
• Is permanent description of the database requirements.
Database requirements
• We must convert written database requirement in to an E-R diagram.
• Need to determine the entities, attributes and relationships.
• Nouns = entities
• Adjectives = attributes
• Verbs = relationships
Compiled by Aliyan A. 123
Converting ER Diagram to Relations
Logical Design
• Is the process of constructing model of data used in an organization.

• Constructing a model based on a specific data model. (E.g. relation, oo)

• Independent of a particular DBMS and other physical consideration.

• Conceptual schema – Logical schema.

• ER Diagram converts to relations.

Compiled by Aliyan A. 124


Converting ER Diagram to Relations
• Three basic rules to convert ER into tables.
• For a relation with one to one cardinality
• All the attributes are merged into a single table.
• i.e. primary key or candidate key of one relation is foreign key for the
other.
• For a relation with one to many cardinality
• Post the primary key or candidate key for the “one” side as a foreign key
attribute to the “many side”.
• For a relationship with many to many
• Create a new table (which is the associative entity) and post primary key
or candidate key from each entity as attributes in the new table along with
some additional attributed (if applicable)
Compiled by Aliyan A. 125
Transform ER Diagram into Tables
• Since ER diagram gives us the good knowledge about the
requirement and the mapping of the entities in it, we can easily
convert them as tables and columns. i.e.; using ER diagrams
one can easily create relational data model, which is nothing
but the logical view of the database.
• There are various steps involved in converting it into tables and
columns.
• Each type of entity, attribute and relationship in the diagram
takes their own depiction here.
• Consider the ER diagram below and will see how it is converted
into tables, columns and mappings.

Compiled by Aliyan A. 126


Compiled by Aliyan A. 127
The basic rules for converting the ER diagrams into tables is
are as follows
1. Convert all the Entities in the diagram to tables.
• All the entities represented in the rectangular box in the ER diagram become
independent tables in the database. In the below diagram, STUDENT, COURSE,
LECTURER and SUBJECTS forms individual tables.
2. Concert all single valued attributes of an entity is converted to a column of the
table
• All the attributes, whose value at any instance of time is unique, are considered as
columns of that table. In the STUDENT Entity,
• STUDENT_ID, STUDENT_NAME form the columns of STUDENT table. Similarly,
LECTURER_ID, LECTURER_NAME form the columns of LECTURER table. And so on.
3. Convert the Key attributes in the ER diagram to the Primary keys of the table.
• In diagram above, STUDENT_ID, LECTURER_ID, COURSE_ID and SUB_ID are the key
attributes of the entities.
• Hence we consider them as the primary keys of respective table.

Compiled by Aliyan A. 128


Cont…
4. Declare the foreign key column, if applicable.
• In the diagram, attribute COURSE_ID in the STUDENT entity is from COURSE
entity. Hence add COURSE_ID in the STUDENT table and assign it foreign key
constraint. COURSE_ID and SUBJECT_ID in LECTURER table forms the foreign key
column. Hence by declaring the foreign key constraints, mapping between the
tables are established.
5. Any multi-valued attributes are converted into new table. A hobby in the Student
table is a multi valued attribute.
Any student can have any number of hobbies.
So we cannot represent multiple values in a single column of STUDENT table.
We need to store it separately, so that we can store any number of hobbies,
adding/ removing / deleting hobbies should not create any redundancy or
anomalies in the system.
Hence we create a separate table STUD_HOBBY with STUDENT_ID and HOBBY as its
columns.
• We create a composite key using both the columns.
Compiled by Aliyan A. 129
Cont…
• Any composite attributes are merged into same table as different
columns.
• In the diagram above, Student Address is a composite attribute. It has
Door#, Street, City, State and Pin. These attributes are merged into
STUDENT table as individual columns.
• One can ignore derived attribute, since it can be calculated at any
time.
• In the STUDENT table, Age can be derived at any point of time by
calculating the difference between Date of Birth and current date.
Hence we need not create a column for this attribute. It reduces the
duplicity in the database.
• These are the very basic rules of converting ER diagram into tables
and columns, and assigning the mapping between the tables.
Compiled by Aliyan A. 130
Cont…
Table structure at this stage would be as below:

Compiled by Aliyan A. 131


Converting Weak Entity
• Weak entity is also represented as table. All the attributes of the weak
entity forms the column of the table.
• But the key attribute represented in the diagram cannot form the
primary key of this table.
• We have to add a foreign key column, which would be the primary key
column of its strong entity.
• This foreign key column along with its key attribute column forms the
primary key of the table.
• In our example above, SUBJECTS is the weak entity.
• Hence, we create a table for it. Its attributes SUBJECT_ID and
SUBJECT_NAME forms the column of this table.
• Although SUBJECT_ID is represented as key attribute in the diagram, it
cannot be considered as primary key. In order to add primary key to the
column, we have to find the foreign key first.
Compiled by Aliyan A. 132
Cont…
COURSE is the strong entity related to SUBJECT.
Hence the primary key COURSE_ID of COURSE is added to
SUBJECT table as foreign key.
Now we can create a composite primary key out of COURSE_ID and
SUBJECT_ID.

Creating a composite primary key out of COURSE_ID and SUBJECT_ID.

Compiled by Aliyan A. 133


Cont…
• Representing 1:1 relationship
• Imagine SUBJECT is not a weak entity, and we have LECTURER teaches
SUBJECT relation. It is a 1:1 relation. i.e.; one lecturer teaches only one
subject. We can represent this case in two ways
• Create table for both LECTURER and SUBJECT.
• Add the primary key of LECTURER in SUBJECT table as foreign key.
• It implies the lecturer name for that particular subject.
• Create table for both LECTURER and SUBJECT. Add the primary key of
SUBJECT in LECTURER table as foreign key.
• It implies the subject taught by the lecturer.

Compiled by Aliyan A. 134


Cont…
In both the case, meaning is same.
Foreign key column can be added in either of the table,
depending on the developer’s choice.

FRepresenting 1:1 relationship

Compiled by Aliyan A. 135


Cont…
Representing 1: N relationship
Consider SUBJECT and LECTURER relation, where each Lecturer teaches
multiple subjects.
This is a 1: N relation. In this case, primary key of LECTURER table is added
to the SUBJECT table.
i.e.; the primary key at 1 cardinality entity is added as foreign key to N
cardinality entity

Fig: Representing 1: N relationship


Compiled by Aliyan A. 136
Representing M: N relationship
Consider the above example, multiple students enrolled for multiple courses,
which is M: N relation. In this case, we create STUDENT and COURSE tables for
the entities. Create one more table for the relation ‘Enrolment’ and name it as
STUD_COURSE. Add the primary keys of COURSE and STUDENT into it, which
forms the composite primary key of the new table.
That is, in this case both the participating entities are converted into tables, and a
new table is created for the relation between them. Primary keys of entity tables
are added into new table to form the composite primary key. We can add any
additional columns, if present as attribute of the relation in ER diagram.

Compiled by Aliyan A. 137


Cont…
Self-Referencing 1: N relation
Consider the example of HOD and Lecturers. Here one of the Lecturers is a
HOD of the department. i.e.; one HOD has multiple lecturers working with him.
In this case, we create LECTURER table for the Lecturer entity.
Create the columns and primary keys as usual. In order to represent HOD, we
add one more column to LECTURER table which is same column as primary
key, but acts as a foreign key. i.e.; LECTURER_ID is the primary key of
LECTURER table. We add one more column HOD, which will have
LECTURER_ID of the HOD.
Hence LECTURER table will show HOD’s Lecturer ID for each Lecturer. In this
case, primary key column acts as a foreign key in the same table.

Compiled by Aliyan A. 138


Logical Design … Cont’d

Compiled by Aliyan A. 139


Logical Design … Cont’d

Compiled by Aliyan A. 140


Logical Design … Cont’d

Compiled by Aliyan A. 141


Logical Design … Cont’d
Mapping Regular Entities to relation
• Simple attributes: ER Attributes map directly on to the relation.
• Composite attribute: Use only their simple, component attributes
• Multi-Valued Attribute: Becomes a separate relation with a foreign key taken from
the super entity.

Compiled by Aliyan A. 142


Logical Design … Cont’d

Compiled by Aliyan A. 143


Normalization
• A relational database is merely a collection of data,
organized in a particular manner. The father of the
relational database approach, Codd created a series of rules
called normal forms that help define that organization.
• One of the best ways to determine what information
should be stored in a database is to clarify what questions
will be asked of it and what data would be included in the
answers.
• Database normalization is a series of steps followed to
obtain a database design that allows for consistent storage
and efficient access of data in a relational database.
Compiled by Aliyan A. 144
Cont…
• These steps reduce data redundancy and the risk of data
becoming inconsistent.
• NORMALIZATION is the process of identifying the
logical associations between data items and designing a
database that will represent such associations but without
suffering the update anomalies which are; Insertion,
Deletion and Modification Anomalies

Compiled by Aliyan A. 145


Cont…
• It is a technique of organizing he data into multiple related tables
to minimize data redundancy. Data redundancy needs extra space
and data anomaly.
• Issues due to redundancy:
• Insertion anomaly: to insert redundant data for every new row is a
data insertion problem or anomaly.
Reason for data repetition:
• Two different but related data is stored in the same table.
• Deletion anomaly: loss of related data set when some other
dataset is deleted.
• Updating anomaly
Compiled by Aliyan A. 146
Compiled by Aliyan A. 147
Insertion anomaly

Compiled by Aliyan A. 148


Updation anomaly

Compiled by Aliyan A. 149


Compiled by Aliyan A. 150
Compiled by Aliyan A. 151
Compiled by Aliyan A. 152
Compiled by Aliyan A. 153
Compiled by Aliyan A. 154
Branch information deleted along with student
data.
deletion anomaly is a loss of related dataset when
some other dataset is deleted.

Compiled by Aliyan A. 155


Normalization
Divide the table into two tables such as
student table and branch table

Compiled by Aliyan A. 156


Cont..

But it is minimizing redundancy


Compiled by Aliyan A. 157
Normalization … Cont’d
• Normalization may reduce system performance since data will be cross
referenced from many tables.
• Thus DE normalization is sometimes used to improve performance, at the
cost of reduced consistency guarantees.
• Normalization normally is considered as good if it is lossless
decomposition.
• Mnemonic for remembering the rationale for normalization could be the
following:
• No Repeating or Redundancy: no repeating fields in the table
• The Fields Depend Upon the Key: the table should solely depend on the
key
• The Whole Key: no partial key dependency
• And Nothing But The Key: noCompiled
inter data dependency
by Aliyan A. 158
Normalization … Cont’d
• All the normalization rules will eventually remove the update anomalies
that may exist during data manipulation after the implementation.

• Pitfalls of Normalization
• Requires data to see the problems
• May reduce performance of the system
• Is time consuming,
• Difficult to design and apply and
• Prone to human error

Compiled by Aliyan A. 159


Normalization … Cont’d
• The underlying ideas in normalization are simple enough. Through
normalization we want to design for our relational database a set of tables
that;
1. Contain all the data necessary for the purposes that the database is to
serve,
2. Have as little redundancy as possible,
3. Accommodate multiple values for types of data that require them,
4. Permit efficient updates of the data in the database, and
5. Avoid the danger of losing data unknowingly
• The type of problems that could occur in insufficiently normalized table is
called update anomalies which includes;

Compiled by Aliyan A. 160


Normalization … Cont’d
1. Insertion anomalies
• An "insertion anomaly" is a failure to place information about a new database entry into
all the places in the database where information about that new entry needs to be
stored.
• In a properly normalized database, information about a new entry needs to be inserted
into only one place in the database.
• In an inadequately normalized database, information about a new entry may need to be
inserted into more than one place and, human fallibility being what it is, some of the
needed additional insertions may be missed.
2. Deletion anomalies
• A "deletion anomaly" is a failure to remove information about an existing database
entry when it is time to remove that entry.
• In a properly normalized database, information about an old, to-be-gotten-rid-of entry
needs to be deleted from only one place in the database.
Compiled by Aliyan A. 161
Normalization … Cont’d
• In an inadequately normalized database, information about that
old entry may need to be deleted from more than one place, and,
human fallibility being what it is, some of the needed additional
deletions may be missed.
3. Modification anomalies
• A modification of a database involves changing some value of the
attribute of a table.
• In a properly normalized database table, what ever information is
modified by the user, the change will be effected and used
accordingly.
• The purpose of normalization is to reduce the chances for anomalies to
occur in a database.

Compiled by Aliyan A. 162


Cont…

Compiled by Aliyan A. 163


Normalization … Cont’d
• Deletion Anomalies: If employee with ID 16 is deleted then ever
information about skill C++ and the type of skill is deleted from the
database. Then we will not have any information about C++ and its skill
type.
• Insertion Anomalies: What if we have a new employee with a skill called
Pascal? We can not decide weather Pascal is allowed as a value for skill and
we have no clue about the type of skill that Pascal should be categorized as.
• Modification Anomalies: What if the address for Helico is changed from
Piazza to Mexico? We need to look for every occurrence of Helico and
change the value of School_Add from Piazza to Mexico, which is prone to
error.
• Database-management system can work only with the information that we
put explicitly into its tables for a given database and into its rules for
working with those tables, where such rules are appropriate and possible.
Compiled by Aliyan A. 164
Functional Dependency (FD)
Data Dependency
• The logical association between data items that point the
database designer in the direction of a good database design are
referred to as determinant or dependent relationships.
• Two data items A and B are said to be in a determinant or
dependent relationship if certain values of data item B always
appears with certain values of data item A.
• If the data item A is the determinant data item and B the
dependent data item then the direction of the association is from
A to B and not vice versa.

Compiled by Aliyan A. 165


Functional Dependency (FD) … Cont’d
• The essence of this idea is that if the existence of something, call it A, implies that B
must exist and have a certain value, then we say that "B is functionally dependent
on A."
• We also often express this idea by saying that "A determines B," or that "B is a
function of A," or that "A functionally governs B." Often, the notions of
functionality and functional dependency are expressed briefly by the statement, "If A,
then B.“
• It is important to note that the value B must be unique for a given value of A, i.e., any
given value of A must imply just one and only one value of B, in order for the
relationship to qualify for the name "function." (However, this does not necessarily
prevent different values of A from implying the same value of B.)
• X Y holds if whenever two tuples have the same value for X, they must have the
same value for Y

Compiled by Aliyan A. 166


Functional Dependency (FD) … Cont’d
• The notation is: A B which is read as; B is functionally dependent on A
• In general, a functional dependency is a relationship among attributes. In relational
databases, we can have a determinant that governs one other attribute or several other
attributes.
• FDs are derived from the real-world constraints on
the attributes

Since the type of Wine served depends on the type of Dinner, we say Wine is functionally
dependent on Dinner. Dinner Wine
Since both Wine type and Fork type are
determined by the Dinner type, we say Wine
is functionally dependent on Dinner and Fork
is functionally dependent on Dinner.
Dinner Wine
Dinner Fork
Compiled by Aliyan A. 167
Partial Dependency
• If an attribute which is not a member of the primary key is
dependent on some part of the primary key (if we have
composite primary key) then that attribute is partially
functionally dependent on the primary key.
• Let {A,B} is the Primary Key and C is no key attribute.
Then if {A,B} C and B C
Then C is partially functionally dependent on {A,B}

Compiled by Aliyan A. 168


Full Dependency
• If an attribute which is not a member of the primary key is not
dependent on some part of the primary key but the whole key (if we
have composite primary key) then that attribute is fully functionally
dependent on the primary key.

• Let {A,B} is the Primary Key and C is no key attribute

Compiled by Aliyan A. 169


Transitive Dependency
• In mathematics and logic, a transitive relationship is a relationship of
the following form: "If A implies B, and if also B implies C, then A
implies C."

Compiled by Aliyan A. 170


Steps of Normalization
• We have various levels or steps in normalization called Normal Forms.
• The level of complexity, strength of the rule and decomposition increases as we move
from one lower level Normal Form to the higher.
• A table in a relational database is said to be in a certain normal form if it satisfies certain
constraints.
• Normal form below represents a stronger condition than the previous one.
• Normalization towards a logical design consists of the following steps:
• Un Normalized Form: Identify all data elements
• First Normal Form: Find the key with which you can find all data
• Second Normal Form: Remove part-key dependencies. Make all data dependent on the
whole key.
• Third Normal Form: Remove non-key dependencies. Make all data dependent on
nothing but the key.
• For most practical purposes, databases are considered normalized if they adhere to third
normal form.
Compiled by Aliyan A. 171
UNNORMALIZED FORM (UNF)
• A table that contains one or more repeating groups.
• A repeating group is a field or group of fields that hold multiple values for a single occurrence of
a field.

Repeating group= (Skill, SkillType, School, SchoolAdd, SkillLevel)


Compiled by Aliyan A. 172
First Normal Form (1NF)
• Requires that all column values in a table are atomic (e.g., a number is an
atomic value, while a list or a set is not).
• We have tow ways of achieving this:
• 1. Putting each repeating group into a separate table and connecting
them with a primary key-foreign key relationship
• 2. Moving this repeating groups to a new row by repeating the
common attributes. If so then Find the key with which you can find all
data
• Definition of a table (relation) in 1NF if:
• There are no duplicated rows in the table. Unique identifier
• Each cell is single-valued (i.e., there are no repeating groups).
• Entries in a column (attribute, field) are of the same kind.
Compiled by Aliyan A. 173
Dependencies: Definitions
• Multivalued Attributes (or repeating groups): non-key
attributes or groups of non-key attributes the values of
which are not uniquely identified by (directly or
indirectly) (not functionally dependent on) the value of
the Primary Key (or its part).
STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Dependencies: Definitions
• Partial Dependency – when an non-key attribute is
determined by a part, but not the whole, of a
COMPOSITE primary key.
Partial Dependency
CUSTOMER

Cust_ID Na me Orde r_ID


101 AT&T 1234
101 AT&T 156
125 Cisco 1250
Dependencies: Definitions
• Transitive Dependency – when a non-key attribute
determines another non-key attribute.

Transitive
Dependency

EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Normal Forms: Review
• Unnormalized – There are multivalued
attributes or repeating groups
• 1 NF – No multivalued attributes or repeating
groups.
• 2 NF – 1 NF plus no partial dependencies
• 3 NF – 2 NF plus no transitive dependencies
Example 1: Determine NF
•ISBN  Title
All attributes are directly or
•ISBN  Publisher indirectly determined by the
primary key; therefore, the
•Publisher  Address relation is at least in 1 NF

BOOK

ISBN Title Publisher Address


Example 1: Determine NF
•ISBN  Title The relation is at least in 1NF.
There is no COMPOSITE
•ISBN  Publisher primary key, therefore there
•Publisher  Address can’t be partial
dependencies. Therefore,
the relation is at least in 2NF
BOOK

ISBN Title Publisher Address


Example 1: Determine NF
•ISBN  Title
Publisher is a non-key attribute, and
•ISBN  Publisher
it determines Address, another non-
•Publisher  Address key attribute. Therefore, there is a
transitive dependency, which means
that the relation is NOT in 3 NF.

BOOK

ISBN Title Publisher Address


Example 1: Determine NF
We know that the relation is at
•ISBN  Title
least in 2NF, and it is not in 3
•ISBN  Publisher NF. Therefore, we conclude that
•Publisher  Address the relation is in 2NF.

BOOK

ISBN Title Publisher Address


Example 1: Determine NF
•ISBN  Title In your solution you will write the following
•ISBN  Publisher justification:
1) No M/V attributes, therefore at least 1NF
•Publisher  Address
2) No partial dependencies, therefore at
least 2NF
3) There is a transitive dependency
(Publisher  Address), therefore, not
3NF
Conclusion: The relation is in 2NF
BOOK

ISBN Title Publisher Address


Example 2: Determine NF
• Product_ID  Description
All attributes are directly or
indirectly determined by the
primary key; therefore, the
relation is at least in 1 NF
ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description
The relation is at least in 1NF.
There is a COMPOSITE Primary Key (PK) (Order_No,
Product_ID), therefore there can be partial
dependencies. Product_ID, which is a part of PK,
determines Description; hence, there is a partial
dependency. Therefore, the relation is not 2NF. No
sense to check for transitive dependencies!

ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description We know that the relation is
at least in 1NF, and it is not in
2 NF. Therefore, we conclude
that the relation is in 1 NF.
ORDER

Order_No Product_ID Description


Example 2: Determine NF
• Product_ID  Description

In your solution you will write the following


justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID
 Description), therefore not in 2NF
Conclusion: The relation is in 1NF
ORDER

Order_No Product_ID Description


Example 3: Determine NFComp_ID and No are not
determined by the primary key;
•Part_ID  Description therefore, the relation is NOT in
•Part_ID  Price 1 NF. No sense in looking at
•Part_ID, Comp_ID  No partial or transitive
dependencies.

PART

Part_ID Descr Price Comp_ID No


Example 3: Determine NF
In your solution you will write
•Part_ID  Description the following justification:
1) There are M/V attributes;
•Part_ID  Price therefore, not 1NF
Conclusion: The relation is
•Part_ID, Comp_ID  No not normalized.

PART

Part_ID Descr Price Comp_ID No


Bringing a Relation to 1NF

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
• Option 1: Make a determinant of the repeating group
(or the multivalued attribute) a part of the primary key.
Composite
Primary Key
STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
• Option 2: Remove the entire repeating group from the relation.
• Create another relation which would contain all the attributes of the
repeating group, plus the primary key from the first relation.
• In this new relation, the primary key from the original relation and the
determinant of the repeating group will comprise a primary key.
STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 1NF
STUDENT

Stud_ID Na me
101 Lennon
125 Jons on
STUDENT_COURSE

Stud_ID Course Units


101 MSI 250 3
101 MSI 415 3
125 MSI 331 3
Bringing a Relation to 2NF

Composite
Primary Key

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 2NF
• Goal: Remove Partial Dependencies
Composite Partial
Primary Key Dependencies

STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 2NF
• Remove attributes that are dependent from the part but not
the whole of the primary key from the original relation.
• For each partial dependency, create a new relation, with the
corresponding part of the primary key from the original as the
primary key.
STUDENT

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
CUSTOMER
Bringing a Relation to 2NF STUDENT_COURSE

Stud_ID Name Course_ID Units


Stud_ID Course_ID
101 Lennon MSI 250 3.00 101 MSI 250
101 Lennon MSI 415 3.00
101 MSI 415
125 Johnson MSI 331 3.00
125 MSI 331
STUDENT COURSE

Stud_ID Name Course_ID Units


101 Lennon MSI 250 3.00
101 Lennon MSI 415 3.00
125 Johnson MSI 331 3.00
Bringing a Relation to 3NF

• Goal: Get rid of transitive dependencies.


Transitive
EMPLOYEE Dependency

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Bringing a Relation to 3NF
• Remove the attributes, which are dependent on a non-key attribute, from
the original relation.
• For each transitive dependency, create a new relation with the non-key
attribute which is a determinant in the transitive dependency as a
primary key, and the dependent non-key attribute as a dependent.
EMPLOYEE

Emp_ID F_Name L_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct
122 Sarah Smith 2 Mktg
Bringing a Relation to 3NF
EMPLOYEE DEPARTMENT

Emp_ID F_Name L_Name Dept_ID Dept_Name Dept_ID Dept_Name


111 Mary Jones 1 Acct 1 Acct
122 Sarah Smith 2 Mktg 2 Mktg
EMPLOYEE

Emp_ID F_Name L_Name Dept_ID


111 Mary Jones 1
122 Sarah Smith 2
First Normal Form (1NF) … Cont’d
• FIRST NORMAL FORM (1NF): Remove all repeating groups.

• Distribute the multi-valued attributes into different rows and identify a unique
identifier for the relation so that is can be said is a relation in relational database.

Compiled by Aliyan A. 200


First Normal Form (1NF) … Cont’d
• Example 2: Consider the following UNF relation.

Here , Tele and fax fields are multi-valued


- To change in to 1NF relation, we need to split the table in to three
- The following tables are equivalent 1st Normal form of the above
employee table:

Compiled by Aliyan A. 201


Second Normal form 2NF
• No partial dependency of a non key attribute on part of the primary key.
• Any table that is in 1NF and has a single-attribute (i.e., a non-composite)
key is automatically also in 2NF.
• Definition of a table (relation) in 2NF
• It is in 1NF and
• If all non-key attributes are dependent on all of the key. i.e. no partial
dependency.
• Since a partial dependency occurs when a non-key attribute is dependent
on only a part of the (composite) key, the definition of 2NF is sometimes
phrased as, "A table is in 2NF if it is in 1NF and if it has no partial
dependencies."

Compiled by Aliyan A. 202


Second Normal form 2NF … Cont’d
• Example for 2NF:

• This schema is in its 1NF since we don’t have any repeating groups or attributes
with multi-valued property. To convert it to a 2NF we need to remove all partial
dependencies of non key attributes on part of the primary key.
• {EmpID, ProjNo} EmpName, ProjName, ProjLoc, ProjFund, ProjMangID
• But in addition to this we have the following dependencies
EmpID EmpName
ProjNo ProjName, ProjLoc, ProjFund, ProjMangID
Compiled by Aliyan A. 203
Compiled by Aliyan A. 204
Second Normal form 2NF … Cont’d
• As we can see some non key attributes are partially dependent on some
part of the primary key.
• Thus these collections of attributes should be moved to a new relation.

Compiled by Aliyan A. 205


Second Normal form 2NF … Cont’d
• Example 2: Normalize the following relation.

• The primary key for this table is the composite key (PatientId, RelativeId).
Compiled by Aliyan A. 206
Second Normal form 2NF … Cont’d
• So, to determine if it satisfies 2NF, you have to find out if all other
fields in it depend fully on both PatientId and RelativeId; that
is,you need to decide whether the following conditions are true:
• (PatientId, RelativeId) Relationship and
• (PatientId, RelativeId) Patient_tel.
• However, on the dependencies in the patient table, only the
following are true:
• (PatientId, RelativeId) Relationship and
• (PatientId) Patient_tel.
Therefore; based on the above dependency the normalized relation will be divided into to
tables.
Compiled by Aliyan A. 207
Second Normal form 2NF … Cont’d

Compiled by Aliyan A. 208


Third Normal Form (3NF )
• Eliminate Columns Not Dependent On Key - If attributes
do not contribute to a description of the key, remove them
to a separate table.
• This level avoids update and delete anomalies.
• Definition of a Table (Relation) in 3NF
• It is in 2NF and
• There are no transitive dependencies between
attributes.

Compiled by Aliyan A. 209


Third Normal Form (3NF )
• Example for (3NF): Assumption: Students of same batch (same year) live in one
building or dormitory

• This schema is in its 2NF since the primary key is a single attribute.
Compiled by Aliyan A. 210
Third Normal Form (3NF ) … Cont’d

Compiled by Aliyan A. 211


Third Normal Form (3NF ) … Cont’d
• Consider the following example:

Now, PK = empid

• We have functional dependencies:


• Empid → depid
• Depid → depname
• Or Depid → depbudjet
• Therefore, the above table is not is 3NF. To normalize it, we can
use the functional dependencies:
• Depid → depname
• Depid → depbudjet And
• Empid → depid Compiled by Aliyan A. 212
Third Normal Form (3NF ) … Cont’d
• So that the resulting tables are the following:

Compiled by Aliyan A. 213


Other Normal Forms
• Boyce-Codd Normal Form (BCNF): Isolate Independent Multiple
Relationships - No table may contain two or more 1:n or N:M relationships
that are not directly related.
• The correct solution, to cause the model to be in 4th normal form, is to
ensure that all M:M relationships are resolved independently if they are
indeed independent.
• Def.: A table is in BCNF if it is in 3NF and if every determinant is a
candidate key.
• Forth Normal form (4NF): Isolate Semantically Related Multiple
Relationships - There may be practical constrains on information that justify
separating logically related many-to-many relationships.
• Def.: A table is in 4NF if it is in BCNF and if it has no multi-valued
dependencies.
Compiled by Aliyan A. 214
Other Normal Forms … Cont’d

• Fifth Normal Form (5NF): A model limited to only simple


(elemental) facts.
• Def.: A table is in 5NF, also called "Projection-Join Normal
Form" (PJNF), if it is in 4NF and if every join dependency in
the table is a consequence of the candidate keys of the table.
• Domain-Key Normal Form (DKNF): A model free from all
modification anomalies.
• Def.: A table is in DKNF if every constraint on the table is a
logical consequence of the definition of keys and domains.
Compiled by Aliyan A. 215
Physical Database Design … Cont’d
• The Logical database design is concerned with the what;
• The Physical database design is concerned with the how.
• Physical database design is the process of producing a description of the
implementation of the database on secondary storage.
• It describes the base relations, file organization, and indexes used to
achieve effective access to the data along with any associated integrity
constraints and security measures.
• Sources of information for the physical design process include global
logical data model and documentation that describes model.
• Knowledge of the DBMS that is selected to host the database systems, with
all its functionalities, is required since functionalities of current DBMS
vary widely.
Compiled by Aliyan A. 216
Steps in physical database design
1. Translate logical data model for target DBMS
• To determine the file organizations and access methods that will be used
to store the base relations; i.e. the way in which relations and tuples will
be held on secondary storage
• Design enterprise constraints for target DBMS
• This phase is the translation of the global logical data model to produce a
relational database schema in the target DBMS.
• This includes creating the data dictionary based on the logical model and
information gathered.
• After the creation of the data dictionary, the next activity is to understand
the functionality of the target DBMS so that all necessary requirements are
fulfilled for the database intended to be developed.

Compiled by Aliyan A. 217


Steps in physical database design … Cont’d
• Knowledge of the DBMS includes:
• how to create base relations
• whether the system supports:
• definition of Primary key
• definition of Foreign key
• definition of Alternate key
• definition of Domains
• Referential integrity constraints
• definition of enterprise level constraints
• Some tasks to be done:
• 1.1. Design base relation
• 1.2. Design representation of derived data
• 1.3. Design enterprise constraint

Compiled by Aliyan A. 218


Steps in physical database design … Cont’d
1.1. Design base relation
• Designing base relation involves identification of all necessary requirements about a
relation starting from the name up to the referential integrity constraints.
• The implementation of the physical model is dependent on the target DBMS since
some has more facilities than the other in defining database definitions.
• The base relation design along with every justifiable reason should be fully
documented.
1.2. Design representation of derived data
• While analyzing the requirement of users, we may encounter that there are some
attributes holding data that will be derived from existing or other attributes. A decision
on how to represent such data should be devised.
• Most of the time derived attributes are not expressed in the logical model but will be
included in the data dictionary.
• Whether to store stored attributes in a base relation or calculate them when required is a
decision to be made by the designer considering the performance impact.
Compiled by Aliyan A. 219
Steps in physical database design … Cont’d
1.3. Design enterprise constraint
• Data in the database is not only subjected to constraints on the
database and the data model used but also with some enterprise
dependent constraints.
• This constraint definition is also dependent on the DBMS
selected and enterprise level requirements.
• All the enterprise level constraints and the definition method in
the target DBMS should be fully documented.

Compiled by Aliyan A. 220


Steps in physical database design … Cont’d
2. Design physical representation
This phase is the level for determining the optimal file organizations to store the base
relations and indexes that are required to achieve acceptable performance, that is, the way
in which relations and tuples will be held on the secondary storage.
• 2.1. Analyze transactions
To understand the functionality of the transactions that will run on the database
and to analyze the important transactions
• 2.2. Choose file organization
To determine an efficient file organization for each base relation
• 2.3. Choose indexes
Used for quick access
• 2.4. Estimate disk space and system requirement
To estimate the amount of disk space that will be required by the database.

Compiled by Aliyan A. 221


Steps in physical database design … Cont’d
3. Design user view
To design the user views that were identified in the conceptual
database design methodology
4. Design security mechanisms
5. Consider controlled redundancy
• To determine whether introducing redundancy in a controlled manner
by relaxing the normalization rules will improve the performance of the
system.
6. Monitor and tune the operational system
• To design the access rules to the base relations and user views

Compiled by Aliyan A. 222


Database System Architecture
• In a standard terminology and general architecture for database systems
was produced in 1971 by the DBTG (Data Base Task Group) appointed by
the Conference on Data Systems and Languages (CODASYL, 1971).
• The DBTG recognized the need for a two-level approach with a system
view called the schema and user views called subschema.
• The American National Standards Institute (ANSI) Standards Planning
and Requirements Committee (SPARC) produced a similar terminology
mid architecture in 1975 (ANSI 1975).
• ANSI-SPARC recognized the need for a three-level approach with a
system catalog.

Compiled by Aliyan A. 223


Three-level ANSI-SPARC Architecture of a
Database

Compiled by Aliyan A. 224


ANSI-SPARC Architecture and Database Design
Phases
• External Level: Users' view of the database. It describes that part of
database that is relevant to a particular user. Different users have their own
customized view of the database independent of other users.
• Conceptual Level: Community view of the database. It describes what data
is stored in database and relationships among the data.
• Internal Level: Physical representation of the database on the computer. It
describes how the data is stored in the database.
• The following example can be taken as an illustration for the difference
between the three levels in the ANSI-SPARC database Architecture. Where:
• The first level is concerned about the group of users and their
respective data requirement independent of the other.
The second level is describing the whole content of the database where
one piece of information will be represented once.

Compiled by Aliyan A. 225


Database Schema
• A database schema is the skeleton structure that represents the logical
view of the entire database.
• It defines how the data is organized and how the relations among them are
associated.
• It formulates all the constraints that are to be applied on the data.
• A database schema defines its entities and the relationship among them. It
contains a descriptive detail of the database, which can be depicted by
means of schema diagrams.
• It’s the database designers who design the schema to help programmers
understand the database and make it useful.

Compiled by Aliyan A. 226


Database schema cont…
• 1. External schema: at the external level to describe the various user
views. It usually uses the same data model as the conceptual level.
• The external or view level includes a number of external schemas or
user views.
• Each external schema describes the part of the database that a
particular user group is interested in and hides the rest of the
database from that user group.
• A high-level data model or an implementation data model can be
used at this level.

Compiled by Aliyan A. 227


Database schema cont…
• 2. Conceptual schema: at the conceptual level to describe the
structure and constraints for the whole database for a
community of users.
• It uses a conceptual or an implementation data model.
• The conceptual level has a conceptual schema, which describes
the structure of the whole database for a community of users.
• The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types,
relationships, user operations, and constraints.
• A high-level data model or an implementation data model can be
used at this level.
Compiled by Aliyan A. 228
DBMS schemas at three levels Cont…
• 3. The internal level has an internal schema, which
describes the physical storage structure of the database.
• The internal schema uses a physical data model and
describes the complete details of data storage and access
paths for the database.
• Internal schema: at the internal level to describe physical
storage structures and access paths.
• Typically uses a physical data model.
Compiled by Aliyan A. 229
Three schema database architecture

Compiled by Aliyan A. 230


The purpose and origin of the Three-Level database architecture
• It is a convenient tool for the user to visualize the schema levels in a
database system.
• All users should be able to access same data.
• A user's view is unaffected or immune to changes made in other views.
• Users should not need to know physical database storage details.
• DBA should be able to change database storage structures without
affecting the users' views.
• Internal structure of database should be unaffected by changes to
physical aspects of storage.
• DBA should be able to change conceptual structure of database
without affecting all users.
Compiled by Aliyan A. 231
Data Independence
• A database system normally contains a lot of data in addition to users’
data.
• For example, it stores data about data, known as metadata, to locate
and retrieve data easily.
• It is rather difficult to modify or update a set of metadata once it is
stored in the database.
• But as a DBMS expands, it needs to change over time to satisfy the
requirements of the users.
• If the entire data is dependent, it would become a tedious and highly
complex job.

Compiled by Aliyan A. 232


Logical Data Independence:
• Refers to immunity of external schemas to changes in
conceptual schema.
• Conceptual schema changes e.g. addition/removal of entities
should not require changes to external schema or rewrites of
application programs.
• The capacity to change the conceptual schema without
having to change the external schemas and their application
programs.
• Logical data independence is a kind of mechanism, which
liberalizes itself from actual data stored on the disk.
• If we do some changes on table format, it should not change
the data residing on the disk.
Compiled by Aliyan A. 233
Physical Data Independence
• The ability to modify the physical schema without changing the logical schema
• Applications depend on the logical schema.
• In general, the interfaces between the various levels and components should be
well defined so that changes in some parts do not seriously influence others.
• The capacity to change the internal schema without having to change the
conceptual schema.
• Refers to immunity of conceptual schema to changes in the internal schema.
• Internal schema changes e.g. using different file organizations, storage
structures/devices should not require change to conceptual or external
schemas.
• For example, in case we want to change or upgrade the storage system itself −
suppose we want to replace hard-disks with SSD − it should not have any
impact on the logical data or schemas.
Compiled by Aliyan A. 234
Data Independence and the ANSI-SPARC
Three-level Architecture

Compiled by Aliyan A. 235


Schemas
• Schema describes how data is to be structured, defined at setup/Design time (also
called "metadata").
• Since it is used during the database development phase, there is rare tendency of
changing the schema unless there is a need for system maintenance which demands
change to the definition of a relation.
• Database Schema (Intension): specifies name of relation and the collection of the
attributes (specifically the Name of attributes).
• Refer to a description of database (or intention).
• Specified during database design.
• Should not be changed unless during maintenance.
• Schema Diagrams
• Convention to display some aspect of a schema visually.
• Schema Construct
• Refers to each object in the schema (e.g. STUDENT).
• E.g. STUNEDT (FName, LName, Id, Year, Dept, Sex)
Compiled by Aliyan A. 236
Instance
• Is the collection of data in the database at a particular point of time (snap-
shot).
• Also called State or Snap Shot or Extension of the database.
• Refers to the actual data in the database at a specific point in time.
• State of database is changed any time we add, delete or update an item.
• Valid state: the state that satisfies the structure and constraints specified
in the schema and is enforced by DBMS.
• Since Instance is actual data of database at some point in time, changes
rapidly.
• To define a new database, we specify its database schema to the DBMS
(database is empty).
• Database is initialized when we first load it with data.
Compiled by Aliyan A. 237
•End

Thank you!!!!!

Compiled by Aliyan A. 238

You might also like