Database System Concepts Unit 1
Database System Concepts Unit 1
Database System Concepts Unit 1
Unit 1
MUHAMMED MASHAHIL P
ASSISTANT PROFESSOR
CAS VAZHAKKAD
Database
• The database is a collection of inter-related data which is used to retrieve,
insert and delete the data efficiently. It is also used to organize the data in the
form of a table, schema, views, and reports, etc.
• For example: The college Database organizes the data about the admin, staff,
students and faculty etc.
• Using the database, you can easily retrieve, insert, and delete the information.
Database System
• Database management system and database together is called database
system
Database Management System
Database management system is a software which is used to manage the
database. For example: MySQL, Oracle, etc are a very popular commercial
database which is used in different applications.
• DBMS provides an interface to perform various operations like database
creation, storing data in it, updating data, creating a table in the database and
a lot more.
• It provides protection and security to the database. In the case of multiple
users, it also maintains data consistency.
DBMS allows users the following tasks:
• Data Definition: It is used for creation, modification, and removal of definition
that defines the organization of data in the database.
• Data Updation: It is used for the insertion, modification, and deletion of the
actual data in the database.
• Data Retrieval: It is used to retrieve the data from the database which can be
used by applications for various purposes.
• User Administration: It is used for registering and monitoring users, maintain
data integrity, enforcing data security, dealing with concurrency control,
monitoring performance and recovering information corrupted by unexpected
failure.
Characteristics of DBMS
• It uses a digital repository established on a server to store and manage the
information.
• It can provide a clear and logical view of the process that manipulates
data.
• DBMS contains automatic backup and recovery procedures.
• It contains ACID properties which maintain data in a healthy state in case
of failure.
• It can reduce the complex relationship between data.
• It is used to support manipulation and processing of data.
• It is used to provide security of data.
• It can view the database from different viewpoints according to the
requirements of the user
Applications
Databases are widely used. Here are some representative applications:
Enterprise Information :-
– Sales: For customer, product, and purchase information.
– Accounting: For payments, receipts, account balances, assets and other accounting information.
– Human resources: For information about employees, salaries, payroll taxes, and benefits, and for
generation of paychecks.
– Manufacturing: For management of the supply chain and for tracking production of items in factories,
inventories of items in warehouses and stores, and orders for items.
Online retailers: For sales data noted above plus online order tracking, generation of recommendation
lists, and maintenance of online product evaluations.
Banking and Finance ◦
– Banking: For customer information, accounts, loans, and banking transactions. ◦
– Credit card transactions: For purchases on credit cards and generation of monthly statements.
– Finance: For storing information about holdings, sales, and purchases of financial instruments such
as stocks and bonds; also for storing real-time market data to enable online trading by customers and
automated trading by the firm.
Universities: For student information, course registrations, and grades (in addition to standard enterprise
information such as human resources and accounting).
Airlines: For reservations and schedule information. Airlines were among the first to use databases in a
geographically distributed manner.
Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on
prepaid calling cards, and storing information about the communication networks.
Advantages of DBMS Disadvantages of DBMS
• Controls database redundancy: It can control data redundancy • Cost of Hardware and
because it stores all the data in one single database file and that Software: It requires a high
recorded data is placed in the database. speed of data processor and
• Data sharing: In DBMS, the authorized users of an organization large memory size to run
can share the data among multiple users. DBMS software.
• Easily Maintenance: It can be easily maintainable due to the • Size: It occupies a large
centralized nature of the database system.
space of disks and large
• Reduce time: It reduces development time and maintenance memory to run them
need. efficiently.
• Backup: It provides backup and recovery subsystems which create
automatic backup of data from hardware and software failures
• Complexity: Database
and restores the data if required. system creates additional
• multiple user interface: It provides different types of user
complexity and
interfaces like graphical user interfaces, application program requirements.
interfaces • Higher impact of
• Improved data security: Having complete authority over the failure: Failure is highly
operational data, enables the DBA in ensuring that the only mean impacted the database
of access to the database is through proper channels. The DBA can because in most of the
define authorization checks to be carried out whenever access to organization, all the data
sensitive data is attempted. stored in a single database
• Data Independence :Ina database system, the database and if the database is
management system provides the interface between the damaged due to electric
application programs and the data. When changes are made to failure or database
the data representation, the meta data obtained by the DBMS is corruption then the data
changed but the DBMS is continues to provide the data to
may be lost forever.
application program in the previously used way
DBMS vs. File System
DBMS File System
• DBMS is a collection of data. In DBMS, the • File system is a collection of data. In this
user is not required to write the system, the user has to write the
procedures. procedures for managing the database.
• DBMS gives an abstract view of data that • File system provides the detail of the data
hides the details. representation and storage of data.
• DBMS provides a crash recovery • File system doesn't have a crash
mechanism, i.e., DBMS protects the user mechanism, i.e., if the system crashes
from the system failure. while entering some data, then the
• DBMS provides a good protection content of the file will lost.
mechanism. • It is very difficult to protect a file under
• DBMS contains a wide variety of the file system.
sophisticated techniques to store and • File system can't efficiently store and
retrieve the data. retrieve the data.
• DBMS takes care of Concurrent access of • In the File system, concurrent access has
data using some form of locking. many problems like redirecting the file
while other deleting some information or
updating some information
Data Models
• Data models define how the logical structure of a database is modelled.
Data Models are fundamental entities to introduce abstraction in a DBMS.
• Data models define how data is connected to each other and how they
are processed and stored inside the system.
– ER MODEL
– RELATIONAL DATA MODEL
– OBJECT ORIENTED DATA MODEL
– SEMI STRUCTURED DATA MODEL
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities
and relationships among them. While formulating real-world scenario into
the database model, the ER Model creates entity set, relationship set, general
attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
– Entity − An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of values called domain.
For example, in a school database, a student is considered as an entity.
Student has various attributes like name, age, class, etc.
– Relationship − The logical association among entities is called relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities
define the number of association between two entities.
Mapping cardinalities −
– one to one
– one to many
– many to one
– many to many
Relational Model
• The most popular data model in DBMS is the Relational Model. It is more scientific a
model than others. The relational model is an example of a record-based model.
Record-based models are so named because the database is structured in fixed-format
records of several types. Each table contains records of a particular type. Each record
type defines a fixed number of fields, or attributes. The columns of the table
correspond to the attributes of the record type
Schema
• A database schema is the skeleton structure that represents the logical view of
the entire database.
• It defines how the data is organized and how the relations among them are
associated. It formulates all the constraints that are to be applied on the data.
• A database schema defines its entities and the relationship among them.
• It contains a descriptive detail of the database, which can be depicted by means
of schema diagrams.
• It’s the database designers who design the schema to help programmers
understand the database and make it useful.
• schemas are changed infrequently
• Schemas partitioned according to the levels of abstraction
– Physical schema: database design at the physical level
– Logical schema: database design at the logical level
– external schema or subschema is the highest level of a schema which defines the views
for the end users.
Schema Instance
• A schema is the design • whereas instance is the
representation of a snapshot of a database at a
database. particular moment
• changes in schema occurs • Instance changes very
rarely. frequently, whenever data
is removed or added in the
database.
Data Abstraction
Database systems are made-up of complex data structures. To ease the user
interaction with database, the developers hide internal irrelevant details from
users. This process of hiding irrelevant details from user is called data
abstraction.
We have three levels of abstraction:
• Physical level: This is the lowest level of data abstraction. It describes how
data is actually stored in database. You can get the complex data structure
details at this level.
• Logical level: This is the middle level of 3-level data abstraction
architecture. It describes what data is stored in database.
• View level: Highest level of data abstraction. This level describes the user
interaction with database system.
Example: Let’s say we are storing customer
information in a customer table.
• At physical level these records can be
described as blocks of storage (bytes,
gigabytes, terabytes etc.) in memory. These
details are often hidden from the
programmers.
• At the logical level these records can be
described as fields and attributes along with
their data types, their relationship among
each other can be logically implemented. The
programmers generally work at this level
because they are aware of such things about
database systems.
• At view level, user just interact with system
with the help of GUI and enter the details at
the screen, they are not aware of how the
data is stored and what data is stored; such
details are hidden from them.
Three schema architecture
• Process of transforming request and results between three level it's called
mapping.
• There are the two types of mappings:
• Conceptual/Internal Mapping
• External/Conceptual Mapping
1. Conceptual/Internal Mapping:
• The conceptual/internal mapping defines the correspondence between
the conceptual view and the store database.
• It specifies how conceptual record and fields are represented at the
internal level.
• It relates conceptual schema with internal schema.
• If structure of the store database is changed.
• If changed is made to the storage structure definition-then the
conceptual/internal mapping must be changed accordingly, so that the
conceptual schema can remain invariant.
• There could be one mapping between conceptual and internal levels.
2. External/Conceptual Mapping:
• The external/conceptual mapping defines the
correspondence between a particular external view and
conceptual view.
• It relates each external schema with conceptual schema.
• The differences that can exist between these two levels are
analogous to those that can exist between the conceptual
view and the stored database.
• Example: fields can have different data types; fields and
record name can be changed; several conceptual fields can
be combined into a single external field.
• Any number of external views can exist at the same time;
any number of users can share a given external view:
different external views can overlap.
• There could be several mapping between external and
conceptual levels.
Data Independence
• The three schema architecture further explains the concept of data independence, the
capacity to change the schema at one level without having to change the schema at the
next higher level.
• Metadata itself follows a layered architecture, so that when we change data at one layer,
it does not affect the data at another level. This data is independent but mapped to each
other.
• There are two types of data independence:
– Logical data Independence
• Logical data independence refers characteristic of being able to change the conceptual
schema without having to change the external schema.
• Logical data independence is used to separate the external level from the conceptual view.
• If we do any changes in the conceptual view of the data, then the user view of the data
would not be affected.
• Logical data independence occurs at the user interface level.
– Physical data Independence
• Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
• If we do any changes in the storage size of the database system server, then the Conceptual
structure of the database will not be affected.
• Physical data independence is used to separate conceptual levels from the internal levels.
• Physical data independence occurs at the logical interface level.
Transaction management
• A transaction is a logical unit of processing in a DBMS which entails one or
more database access operation.
• In a nutshell, database transactions represent real-world events of any
enterprise.
• All types of database access operation which are held between the
beginning and end transaction statements are considered as a single
logical transaction.
• During the transaction the database is inconsistent. Only once the
database is committed the state is changed from one consistent state to
another.
Facts about Database Transactions
DDL stands for Data Definition Language. It is used to define database structure or
pattern.
It is used to create schema, tables, indexes, constraints, etc. in the database.
Using the DDL statements, you can create the skeleton of the database.
Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.
Here are some tasks that come under DDL:
Create: It is used to create objects in the database.
Alter: It is used to alter the structure of the database.
Drop: It is used to delete objects from the database.
Truncate: It is used to remove all records from a table.
Rename: It is used to rename an object.
Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema that's why they come
under Data definition language.
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and manipulating
data in a database. It handles user requests.
Here are some tasks that come under DML:
Select: It is used to retrieve data from a database.
Insert: It is used to insert data into a table.
Update: It is used to update existing data within a table.
Delete: It is used to delete all records from a table.
Merge: It performs UPSERT operation, i.e., insert or update operations.
Call: It is used to call a structured query language or a Java subprogram.
Explain Plan: It has the parameter of explaining data.
Lock Table: It controls concurrency.
3. Data Control Language
DCL stands for Data Control Language. It is used to retrieve the stored or saved data.
The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does not have the feature of rolling
back.)
Here are some tasks that come under DCL:
Grant: It is used to give user access privileges to a database.
Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of Revoke:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.