Chapter 2 - Database System Concepts and Architecture
Chapter 2 - Database System Concepts and Architecture
1
Outlines
2
Data Models
Is A set of concepts to describe the structure of a database, the
operations for manipulating these structures, and certain
constraints that the database should obey.
Constraints specify some restrictions on valid data; these
constraints must be enforced at all times
A data model is a conceptual framework for organizing and
structuring data within a database.
It defines how data is connected, stored, and retrieved.
Slide 2- 3
Data Models … Cont’d
A data model is a collection of concepts that can be used to
describe the structure of a database—provides the necessary
means to achieve the abstraction.
A database model is a type of data model that determines the
logical structure of a database and fundamentally determines in
which manner data can be stored, organized, and manipulated.
4
Key Elements of a Basic Data Model
1.Entities: These are objects or things in the real world that can be
identified distinctly. E.g : A student, a course, or a product.
2.Attributes: These are properties or details about an entity.
E.g: For a student entity, attributes are Name, ID, Age, Grade…
3.Relationships: These define how entities are related to one another.
E.g: A student enrolls in a course (relationship between student
and course).
4.Constraints: These are rules that data must follow in the database.
E.g: A student ID must be unique.
5
Data Models … Cont’d
Basic Data model can be divided into four:
1) Hierarchical Model
2) Network Model
3) Object Oriented Model
4) Relational Model
6
1. Hierarchical Model
Data is organized in a tree-like structure(parent child mode).
Parent-child relationships, where one parent can have many
children, but each child has one parent.
Connection between child and its parent is called a Link.
Is the simplest data model
The top node is the root node
The relationship between parent and child can be either 1-1 or 1-M
7
Hierarchical Model… Cont’d
8
Hierarchical Model… Cont’d
Advantage of Hierarchical Model
Good for tree type problem (e.g. Family Tree Problem)
Language is simple; uses constructs like GET, GET UNIQUE,
GET NEXT, GET NEXT WITHIN PARENT etc.
Disadvantages of Hierarchical Model
Addition, deletion, and search operations are very difficult.
There is duplication of data.
Navigational and procedural nature of processing not good.
Database is visualized as a linear arrangement of records
9
2. Network Model
10
Network Model … Cont’d
11
Network Model … Cont’d
Advantage of Network Data Model:
Network Model is able to model complex relationships and
represents semantics of add/delete on the relationships.
Can handle most situations for modeling using record types and
relationship types.
Language is navigational; uses constructs like FIND, FIND
member, FIND owner, FIND NEXT within set, GET etc.
Disadvantages of Network Data Model:
Navigational and procedural nature of processing if it is
complex.
Database contains a complex array of pointers that thread
through a set of records.
Little scope for automated "query optimization”
12
3. Object Oriented Model
The OO approach of defining objects that can be used in many
programs is now also being applied to database systems.
An object can have properties (or attributes) but also behaviour,
which is modelled in methods (functions) in the object.
In an OO database , each type of object in the database’s mini-
world is modelled by a class i.e. (Customer class, Account
class ) like tables in the relational model.
A class has properties (attributes).
13
Object Oriented Model … Cont’d
A class also has methods that are stored with the class definition.
One advantage of the OO model is sub-classes. E.g D/t types of
account, can be modelled as sub-classes of the Account class.
SavingsAccount and CurrentAccount.
This makes sense because the different account types have
some different behaviour e.g. gaining interest in a savings
account but some behaviour the same e.g. withdrawing cash.
This is the inheritance concept of OO programming.
14
Object Oriented Model … Cont’d
Diagram – class name at the top, properties in the middle, methods at
the bottom.
15
4. Relational Model
Terminologies originates from the branch of mathematics called
set theory and relation.
Can define more flexible and complex relationship
Viewed as a collection of tables called “Relations” equivalent to
collection of record types
Relation: Two dimensional table
Store data in the form of tables ( rows and columns)
A row of the table is called tuple which is equivalent to record
A column of a table is called attribute which is equivalent to
fields
Data value is the value of the Attribute
16
Relational Model … Cont’d
Records are related by the data stored jointly in the fields of
records in two tables or files. The related tables contain
information that creates the relation
The tables seem to be independent but are related some how.
No physical consideration of the storage is required by the user
17
Relational Model … Cont’d
Relational Data model (also called the second generation data
model), describes entities and their relationships in the form of table
Entity : Student Entity: course
18
Relational Data Model
Properties of Relational Databases
Each row of a table is uniquely identified by a PRIMARY KEY
composed of one or more columns
Each tuple in a relation must be unique
Group of columns, that uniquely identifies a row in a table is
called a CANDIDATE KEY
ENTITY INTEGRITY RULE of the model states that no
component of the primary key may contain a NULL value.
A column or combination of columns that matches the primary
key of another table is called a FOREIGN KEY.
FOREIGN KEY is used to cross-reference tables.
19
Properties of Relational Databases … Cont’d
Database is the collection of tables
Each entity in one table
Attributes are fields (columns) in table
Order of rows and columns is immaterial
Entries are single-valued
Each column (field or attribute) has a distinct name
All values in a column represent the same attribute and have the
same data format
20
Schemas, Instances and Database State
Schema :is a description of a particular collection of data, using a
given data model.
It is a definition of database
22
Schemas, Instances, and Database State … Cont’d
23
Schemas, Instances, and Database State … Cont’d
Database State
The data in the database at a particular moment in time is called a
database state or snapshot.
It is also called as the current set of occurrences or instance in the
database.
In a given database state, each schema construct has its own current
set of instances: for example, the STUDENT construct will contain
the set of individual student entities (records) as its instances.
Every time we insert or delete a record, or change the value of a
data item in a record, we change one state of the database into
another state.
24
Schemas, Instances, and Database State … Cont’d
25
Schemas, Instances, and Database State … Cont’d
When we define a new database, we specify its database schema
only to the DBMS.
At this point, the corresponding database state is the empty state
with no data.
We get the initial state of the database when the database is first
populated or loaded with the initial data.
From then on, every time an update operation is applied to the
database, we get another database state. At any point in time, the
database has a current state .
26
Schemas, Instances, and Database State … Cont’d
The DBMS is partly responsible for ensuring that every state of
the database is a valid state
i.e, a state that satisfies the structure and constraints specified
in the schema.
27
DBMS
What is DBMS?
is a software package used for providing efficient, convenient
and safe multi-user storage of and access to massive amounts of
persistent data.
It provides a systematic method for creating, updating, storing,
retrieving data in a database.
A full scale DBMS should at least have the following services to
provide to the user.
Data storage, retrieval and update in the database
A user accessible catalogue
Transaction support service
28
DBMS … Cont’d
Concurrency Control Services: access of database by different
users simultaneously.
Recovery Services: a mechanism for recovering from failure.
Authorization Services (Security): support the access authorization.
Support for Data Communication: support data transfer.
Integrity Services: rules about data and the change that took place
on the data, correctness and consistency of stored data.
29
DBMS … Cont’d
Utility services: sets of utility service facilities like
Importing data
Statistical analysis support
Index reorganization
Garbage collection, etc.
30
DBMS Language
1. Data Definition Language (DDL)
used to define each data element required by the organization
Commands for setting up schema of the database
Used to set up a database, create, delete and alter table with the
facility of handling constraints
Is used to define the internal and external schema
2. Data Manipulation Language (DML)
Used for data manipulation
Typical manipulations include retrieval, insertion, deletion, and
modification of the data.
Since the required data or query by the user will be extracted using this
type of language, DML is also called “Query Language”
31
DBMS Language … Cont’d
We have two types of DMLs:-
Procedural Data Manipulation Languages
That allows the user to tell the system what data is needed
and exactly how to retrieve the data;
Non-Procedural Manipulation Languages
That allows the user to state what data is needed rather than
how it is to be retrieved.
E.g. SQL
32
DBMS Language … Cont’d
How the Programmer Sees the DBMS
Start with DDL to create tables
33
DBMS Architecture and Data Independence
A major aim of a database system is to provide users with an abstract
view of data, hiding certain details of how data is stored and
manipulated.
Since a database is a shared resource, each user may require a
different view of the data held in the database.
Accordingly there are several types of architectures of database
systems.
The American National Standards Institute/Standards Planning and
Requirements Committee (ANSI-SPARC) also introduced the three
level architecture of the database based on their degree of abstraction.
The architecture is consists of the three levels:
internal level
conceptual
external level.
34
DBMS Architecture … Cont’d
In this architecture, schemas can be defined at three levels
The goal of the three-schema architecture is to separate the user applications and
the physical database.
1. The Internal level:
Has an internal schema, which describes the physical storage
structure of the database.
The internal schema uses a physical data model and describes the
complete details of data storage and access paths for the database.
It describes the physical representation of the database on the
computer.
This level describes how the data is stored in the database.
35
DBMS Architecture … Cont’d
The internal level is concerned with such things as:
Storage space allocation for data
Record description for storage
Record placement
36
DBMS Architecture ... Cont’d
2. The conceptual level: Has a conceptual schema, which describes
the structure of the whole database for a community of users.
The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types,
relationships, user operations, constraints, security and
integrity information.
A high-level data model or an implementation data model can be
used at this level.
The community view of the database.
This level describes what data is stored in the database and the
relationships among the data.
It is a complete view of the data requirements of the organization.
37
DBMS Architecture … Cont’d
3. The External Level: Includes a number of external schemas or
user views.
Each external schema describes the part of the database that a
particular user group is interested in and hides the rest of the
database from that user group.
The users’ view of the database.
Describe part of the database that is relevant to each user.
Each user has a view of the real world in different way.
For example: dates may be viewed in (day, month, year) or (year,
month, day)
Entities, attributes or relationships that are not of interest to the
users may still be represented in the database, but the users will be
unaware of them.
38
DBMS Architecture … Cont’d
ANSI-SPARC Architecture and Database Design Phases
39
DBMS Architecture … Cont’d
Example of data abstraction
40
Summary of the three schema level
41
Data Independence
The three-schema architecture can be used to explain the concept
of data independence.
Data independence is defined as the capacity to change the
schema at one level of a database system without having to
change the schema at the next higher level.
We can define two types of data independence:
Physical data independence
Logical data independence
42
Data Independence … Cont’d
1. Logical data independence :Is the capacity to change the
conceptual schema without having to change external schemas or
application programs.
We may change the conceptual schema to expand the database (by
adding a record type or data item), or to reduce the database (by
removing a record type or data item).
Only the view definition and the mappings need be changed in a
DBMS that supports logical data independence.
43
Data Independence … Cont’d
Modifications at the logical level are necessary whenever the
logical structure of the database is altered (for example, when
money market accounts are added to a banking system).
2. Physical data independence
Is the capacity to change the internal schema without having to
change the conceptual (or external) schemas.
Changes to the internal schema may be needed because some
physical files had to be reorganized. Modifications at the physical
level are occasionally necessary to improve performance.
44
The End
45