CS3492-DBMS Study Material - Unit I

Download as pdf or txt
Download as pdf or txt
You are on page 1of 11

CS2255 – DATABASE MANAGEMENT SYSTEMS

CS 2255 DATABASE MANAGEMENT SYSTEMS LTPC


3003

UNIT I INTRODUCTION 9
Purpose of Database System - Views of data - Data Models - Database Languages - Database
System Architecture - Database users and Administrator - Entity-Relationship model (E-R model
) - E-R Diagrams - Introduction to relational databases

UNIT II RELATIONAL MODEL 9


The relational Model - The catalog - Types - Keys - Relational Algebra - Domain Relational
Calculus - Tuple Relational Calculus - Fundamental operations - Additional Operations - SQL
fundamentals - Integrity - Triggers - Security - Advanced SQL features –Embedded SQL -
Dynamic SQL - Missing Information - Views - Introduction to Distributed Databases and
Client/Server Databases

UNIT III DATABASE DESIGN 9


Functional Dependencies - Non-loss Decomposition - Functional Dependencies – 1 , 2nd, 3rd
st

Normal Forms, Dependency Preservation - Boyce/Codd Normal Form - Multi-valued


Dependencies and Fourth Normal Form - Join Dependencies and 5th Normal Form

UNIT IV TRANSACTIONS 9
Transaction Concepts - Transaction Recovery - ACID Properties - System Recovery - Media
Recovery - Two Phase Commit - Save Points - SQL Facilities for recovery - Concurrency - Need
for Concurrency - Locking Protocols - Two Phase Locking - Intent Locking - Deadlock -
Serializability - Recovery Isolation Levels - SQL Facilities for Concurrency.

UNIT V IMPLEMENTATION TECHNIQUES 9


Overview of Physical Storage Media - Magnetic Disks - RAID - Tertiary storage - File
Organization - Organization of Records in Files - Indexing and Hashing - Ordered Indices - B+
tree Index Files - B tree Index Files - Static Hashing - Dynamic Hashing - Query Processing
Overview - Catalog Information for Cost Estimation - Selection Operation - Sorting - Join
Operation - Database Tuning.
TOTAL: 45 PERIODS
TEXT BOOKS:
1. Abraham Silberschatz, Henry F. Korth, S. Sudharshan, “Database System Concepts”, 5th
Edition, Tata McGraw Hill, 2006. (Unit I and Unit-V)
2. C.J.Date, A.Kannan, S.Swamynathan, “An Introduction to Database Systems”, 8 th Edition,
Pearson Education, 2006. (Unit II, III and IV)

REFERENCES:
1. RamezElmasri, Shamkant B. Navathe, “Fundamentals of Database Systems”, 4thEdition,
Pearson / Addisionwesley, 2007.
2. Raghu Ramakrishnan, “Database Management Systems”, Third Edition, McGraw Hill,
2003.
3. S.K.Singh, “Database Systems Concepts, Design and Applications”, 1st Edition, Pearson
Education, 2006.

2
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

UNIT – I
INTRODUCTION

1.1 Introduction

1.1.1 Database Management System (DBMS)


Database is a collection of data that contains information relevant to an enterprise.
Databases are actually designed to manage large bodies of information. A database management
system (DBMS) is a collection of interrelated and persistent data and a set of programs to access
those data.
Management of data involves both defining structures for storage of information and
providing mechanisms for manipulation of information. The primary goal of a DBMS is to
provide an environment to store and retrieve database information that is both convenient and
efficient.

1.1.2 Why DBMS?


 A database system provides the organization of large information that interrelated
with centralized control on it.
 A DBMS provides a secure and survivable medium for the storage and retrieval of
data.
 If that data is shared, if it is persistent, if the users want it to be secure and easy to
access and manipulate then a DBMS is the perfect solution.

1.1.3 Functions of a DBMS


 Data storage, retrieval and update
 Concurrency control and recovery services
 Support for data communication
 Integrity and utility services
 Services to promote data independence

1.1.4 Characteristics of data in a database


 Persistence – Data in a database exist permanently
 Validity / Integrity / Correctness – Data should be correct with respect to the
real world entity that they represent.
 Security – Data should be protected from an unauthorized access
 Consistency – The values should be consistent with respect to the relationship.
 Non-redundancy –No two data items in a database should represent the same real
world entity.
 Independence – the schema at one level should not affect the other levels.

3
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

1.1.5 Advantages and disadvantages of DBMS


Advantages:
1. Improved security
Database security is the protection of the database from an unauthorized access.
2. Improved data integrity
Database integrity refers to the validity and consistency of stored data. Integrity is
expressed in terms of constraints, which are consistency rules that the database is
not permitted to violate.
3. Data consistency
If a data item is stored more than once and the system is aware of this, the system
can ensure that all copies of the item are kept consistent.
4. Improved data accessibility and responsiveness
Many DBMSs provide query language that allows users to enquire questions and
to obtain the required information at their terminals, without any need of separate
software.
5. Increased concurrency
Many DBMSs manage concurrent database access and ensure the data in the
database is consistent and valid.
6. Improved backup and recovery services
Modern DBMSs provide facilities to minimize the amount of processing that is
lost following a failure.

Disadvantages:
1. Cost of DBMSs
The cost of DBMSs varies significantly, depending on the environment and
functionality provided
2. Complexity and Size
The provision of the functionality makes DBMS an extremely complex piece of
software. Failure to understand the system can lead to bad design decisions.
3. Higher impact of a failure
The centralization of resources increases the vulnerability of the system. Since all
users and applications rely on the availability of the DBMS, the failure of any
component can bring operations to a halt.
4. Cost of conversion
The cost of converting existing applications to run on the new DBMS and
hardware includes the cost of training staff to use these new systems and running
of the system.
5. Performance
The DBMS is written to be more general in order to support applications in all
domains. The effect is that some applications may not run as they used to.

4
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

1.1.6 Database-system applications


Databases are widely used in all application domains such as,
 Banking: For customer information, accounts, loans and banking transactions.
 Airlines: For reservations and schedule information.
 Universities: For student information, course registrations and mark systems.
 Credit card transactions: For purchases on credit cards and generation of
monthly statements
 Sales and Online retailers: For customer, product and purchase information.
For online order tracking, generation of recommendation lists and maintenance
of on-line product evaluations.
 Telecommunications: for keeping records of calls made, generating monthly
bills, maintaining balances on prepaid calling cards and storing information
about the communication networks.
 Finance: For storing information about holdings, sales and purchases of
financial instruments such as stocks and bonds.

1.2 Purpose of DBMS


In traditional file processing, each user defines and implements the files needed
for a specific software application as part of programming the application. This typical
file processing system is supported by a conventional operating system. The system store
permanent records in various files and it needs different application programs to extract
records/ information from the file system.

Keeping organizational information in a file-processing system has a number of major


disadvantages:
 Data redundancy and inconsistency
In file processing, every user group maintains its own files for handling its
data-processing applications. Storing the same data multiple times is called
data redundancy. This redundancy leads to several problems such as storage
space is wasted; need to perform a single logical operation like insertion,
updation and deletion more than one times; Files thatrepresent the same data
may become inconsistent.
 Difficulty in accessing data
File processing environments do not allow needed data to be retrieved in a
convenient and efficient manner.
 Data isolation
Because data are scattered in various files and files may be in different
formats, writing new application programs to retrieve the appropriate data is
difficult.

5
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

 Integrity problems
The data values stored in database must satisfy certain types of consistency
constraints. Developers enforce those constraints in the system by adding
appropriate code in the various application programs.
When new constraints are added, it is difficult to change the programs to
enforce them. The problem is compounded when constraints involve several
data items from different files.
 Atomicity problems
Atomic means the transaction must happen in it’s entirely or not at all. It is
difficult to ensure atomicity in a conventional file-processing system.
A computer system is subject to failure. It is crucial that if a failure occurs,
the data be restored to the consistent state that existed prior to the failure.
 Concurrent-access anomalies
Many systems allow multiple users to update the data simultaneously to
provide overall performance of the system and faster response. The result of
the concurrent access may leave the account in an account in an incorrect
state.
 Security problems
Enforcing security constraints in an ad hoc manner to the file processing
system is difficult. Not every user of the database system should be able to
access all the data.

1.3 View of Data


A database system is a collection of interrelated data and a set of programs that
allow users to access and modify these data. A major purpose of a database system is to
provide users with an abstract view of the data i.e. the system hides certain details of how
the data stored and maintained.

Views have several benefits as follows:


 Views provide a level of security. Views can be setup to exclude data that
some users should not see.

 Views provide a mechanism to customize the appearance of the database.

 A view can present a consistent, unchanging picture of the structure of the


database, even if the underlying database is changed.

1.3.1 Data Abstraction

For the system to be usable, it must retrieve data efficiently. The need for
efficiency has led designers to use complex data structures to represent data in the
database. Since many users are not computer trained, developers hide the complexity from
users through several levels of abstraction to simplify users’ interactions with the system.

6
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

View level
View 1 View 2 …… View n

Logical
Level

Physical
Level

Figure 1.1The three levels of data abstraction

The objective of the three-level architecture is to separate each user’s view of the
database from the way the database is physically represented.
 Physical / Internal Level:The lowest level of abstraction describes how
the data are actually stored. This level describes complex low-level data
structures in details.
 Logical Level: The next-higher level of abstraction describes what data
are stored in the database and what relationships exist among those data.
Although implementation of the simple structures at the logical level does
not need to be aware of this complexity.
Database administrators, who must decide what information to
keep in the database, use the logical level of abstraction.
 View Level:The highest level of abstraction describes only part of the
entire database. Even though the logical level uses simpler structures,
complexity remains because of the variety of information stored in a large
database.

1.3.2 Schemas, Mappings and Instances


The collection of information stored in the database at a particular moment is
called an instance of the database. The overall description of the database is called the
database schema. A database schema corresponds to the variables declarations in a
program.
Database systems have several schemas, partitioned according to the levels of
abstraction.There are three types of schema exist such as,
1. External schema – It corresponds to different views of the data.
2. Conceptual schema – Conceptual schema describes all the entities,
attributes and relationships together with integrity constraints.
3. Internal schema – The internal schema is a complete description of the
internal model, containing the definitions of the stored records, the
methods of representation, the data fields and the indexes and hashing
schemas used.

7
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

A database may have several schemas at the view level is called subschemas,
that describe different views of the database. The physical schema is hidden beneath the
logical schema.

1.3.3 Data Models


The data model is a collection of conceptual tools for describing data, data
relationships, data semantics and consistency constraints. A data model provides a way to
describe of a database at the physical, logical and view level.
The purpose of a data model is to represent data and to make the data
understandable. There are a number of different data models exist that are classified in
four different categories:
1. Relational Model
2. The Entity-Relationship Model
3. Object-Based Data Model
4. Semi-Structured Data Model

According to the types of concepts used to describe the database structure, there
are three data models:
1. An External data model: to represent each user’s view of the
organization.
2. A Conceptual data model: to represent the logical view that is DBMS
independent.
3. An Internal data model: to represent the conceptual schema in such a
way that it can be understood by the DBMS.

Hierarchical Data model

Network Data Model

Relational Data Model

Entity-Relational Data Model

Semantic Data Model

Extended
Object-Oriented
Relational
Data Model
Data Model

Figure 1.2 The development of data model

8
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

Relational Model
The relational model uses a collection of tables to represent both data and the
relationships among those data. It is based on the concept of mathematical relations.
Relational model stores data in the form of a table. Each table corresponds to en
entity and each row represents an instance of that entity. Table are also called relations,
related to each other through the sharing of a common entity characteristic.
The relational data model is widely used data model and a vast majority of current
database systems are based on the relational model e.g.Relational DBMS - DB2, MS SQL
Server.
Advantages:
 Structural independence is promoted by the use of independence of
tables.
 Changes in a table’s structure do not affect data access.
 Ad hoc query capability is based on SQL.
Disadvantages:
 The RDBMS requires substantial hardware and software overhead.
 Isolation of Information will prevent the information integration.
 Ease of design may lead to bad database schema design.

Entity-Relational Model
The ER Model is based on two component namely entity and relationships. An
entity is a collection of basic real time objects and relationship describe associations
among these objects/ data. There are three types of relationships exist such as One-to-
One, One-to-Many and Many-to-Many.An entity is described by a set of attributes that
describes associations among data.
Advantages:
 Visual presentation makes it an effective communication tools.
 It is integrated with dominant relation model.
 Visual modelling produces exceptional conceptual simplicity.
Disadvantages:
 Limited constraint representation.
 Limited relationship representation.
 Lose of information content occurs when attributes are removed from
entities.

Object-Based Data Model


In the object-oriented data model (OODM) both data and their relationships are
contained in a single structure known as an object. The object-oriented data model can be
seen extending the ER model with notions of encapsulation, methods and object identity.
It combines features of the object-oriented data model and relational data model. The
OODM is said to be a semantic data model because semantic indicates meaning.
9
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

The OODM is based on the following components:


 An object is an abstraction of a real-world entity.
 Attributes describe the properties of an object.
 A class is a collection of similar objects with shared structure and
behaviour. Classes are organized in a class hierarchy.
 Inheritance is the ability of an object within the class hierarchy.
Advantages:
 Semantic content is added.
 Inheritance promotes data integrity.
Disadvantages:
 High system overhead which may slow down the transaction processing.
 Difficult to maintain the model structure.

Semi-structured Data Model


This model permits the specification of data where individual data items of the
same type may have different sets of attributes. This is in contrast to all above data
models. The eXtensibleMarkup Language (XML) is widely used to represent semi-
structured data.

1.4 Database Languages


Once the design of a database is completed and a DBMS is chosen to implement
the database, there are two languages are used to specify conceptual and internal schemas
for the database. A database system provides a data-definition language to specify the
database schema and a data-manipulation language to express database queries and
updates.
The data definition language (DDL) is used by the Database Administrator (DBA)
to define the conceptual and internal schema of the particular database.
The data manipulation language (DML) is used to perform a set of operations
which includes insertion, retrieval, modification and deletion of the data.

1.4.1 Data Definition Language (DDL)


A database schema is specified by a set of definitions expressed by a special
language called a data definition language (DDL). This DDL is also used to specify
additional properties of the data.
One DBA can specify the storage structure and access methods used by the
database system by a set of statements in a special type of DDL called a data storage and
definition language (SDL).
The data values stored in the database must satisfy certain consistency constraints
such as domain constraints, referential integrity, assertions and authorization. The
database system checks these constraints every time the database is updated.

10
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

DDL gets as input some instructions, generates some output and it will be stored
in the data dictionary. A database system consults the data dictionary before reading or
modifying actual data.
 Domain Constraints
A domain of possible values must be associated with every attributes e.g. integer
type, number type, date/time type. Declaring an attribute to be a particular domain acts as
a constraint on the values.
Domain constraints are tested by the system whenever a new data item is entered
into the database.

 Referential Integrity
There are some cases a value that appears in one relation for a given set of
attributes needs to appear also in some other relation. Database modifications can cause
violations of referential integrity and when it is violated, the normal procedure is to reject
the action that caused the violation.

 Assertions
An assertion is any condition that the database must always satisfy. Domain
constraints and referential integrity are the special forms of assertions.
When an assertion is created, the system tests it for validity and if it valid then any
future modification to the database is allowed only if it does not cause that assertion to be
violated.
 Authorization
In order to provide different types of privileges to database users, authorization is
the only key term which is used to provide the required privileges. The most common
authorizations are as follows:
1. READ Authorization: which allows reading but modification of data is not
permitted.
2. INSERT Authorization: which allows reading of existing data and insertion of
new data but modification of existing data is not allowed.
3. UPDATE Authorization: which allows insertion, reading and updation of
existing data but deletion is not permitted.
4. DELETE Authorization:which allows modification of existing data including
the deletion operation.

1.4.2 Data Manipulation Language (DML)


A data manipulation language (DML) is a language that enables users to access or
manipulate data. It provides a set of operations to support the basic data manipulation
operations on the data held in the database. Some of data manipulation operations are,
 Insertion of new data into the database
 Modification of data stored in the database
 Retrieval of data stored in the database
 Deletion of data from the database.
11
(8138) – SARANATHAN COLLEGE OF ENGINEERING
CS2255 – DATABASE MANAGEMENT SYSTEMS

There are two types of DML exist namely.


 Procedural DMLsrequire a user to specify whatdata are needed and how
to get those data.
 Declarative / Non-Procedural DMLsrequire a user to specify what data
are needed without specifying how to get those data. It is easier to learn
and use than procedural DMLs.
A query is a statement requesting the retrieval of information. The portion of a
DML that involves information retrieval is called a query language.
Even though there are many of database query languages are in use, SQL is the
most commonly used query language.
The levels of abstraction apply not only to defining of structuring data, but also to
manipulating data. The DML queries are processed by the query processor into sequences
of actions at the physical level of the database system.

1.10 MODEL QUESTIONS


1.10.1 PART-A – 2 Marks
1. Define database management system?
2. List any five applications of DBMS.
3. What are the disadvantages of file processing system?
4. What are the advantages of using a DBMS?
5. List any four significant differences between a file processing system and a DBMS.
6. Give the levels of data abstraction?
7. Define instance and schema?
8. Define the terms 1) physical schema 2) logical schema.
9. What is conceptual schema?
10. Define data model? List the types of data models used?
11. What is storage manager?What are the components of storage manager?
12. What is the purpose of storage manager?
13. What are five main functions/ responsibilities of a database administrator?
14. List the data structures implemented by the storage manager.
15. What is a data dictionary?
16. What is an entity relationship model?
17. What are attributes? Give examples.
18. What is the architectural model?What is the fundamental model?
19. What are the responsibilities of database system?
20. What is relationship? Give examples
21. Define the terms i) Entity set ii) Relationship set
22. Define single valued and multi valued attributes.
12
(8138) – SARANATHAN COLLEGE OF ENGINEERING

You might also like