Data Base System Concepts and Architecture

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 29

Data Models

Data models define how the logical structure of a database is modeled. Data
Models are fundamental entities to introduce abstraction in a DBMS. Data
models define how data is connected to each other and how they are
processed and stored inside the system.
The very first data model could be flat data-models, where all the data used
are to be kept in the same plane. Earlier data models were not so scientific,
hence they were prone to introduce lots of duplication and update
anomalies.

Hierarchical Model
Hierarchical Model was the first DBMS model. This model
organises the data in the hierarchical tree structure. The
hierarchy starts from the root which has root data and then
it expands in the form of a tree adding child node to the
parent node. This model easily represents some of the real-
world relationships like food recipes, sitemap of a website
etc. Example: We can represent the relationship between
the shoes present on a shopping website in the following
way:
Features of a Hierarchical Model

1. One-to-many relationship: The data here is


organised in a tree-like structure where the one-to-
many relationship is between the datatypes. Also, there
can be only one path from parent to any
node. Example: In the above example, if we want to go
to the node sneakers we only have one path to reach
there i.e through men's shoes node.
2. Parent-Child Relationship: Each child node has a
parent node but a parent node can have more than one
child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then the
child node is automatically deleted.
4. Pointers: Pointers are used to link the parent node
with the child node and are used to navigate between
the stored data. Example: In the above example the
'shoes' node points to the two other nodes 'women
shoes' node and 'men's shoes' node.
Advantages of Hierarchical Model

 It is very simple and fast to traverse through a tree-like


structure.
 Any change in the parent node is automatically
reflected in the child node so, the integrity of data is
maintained.
Disadvantages of Hierarchical Model

 Complex relationships are not supported.


 As it does not support more than one parent of the child
node so if we have some complex relationship where a
child node needs to have two parent node then that
can't be represented using this model.
 If a parent node is deleted then the child node is
automatically deleted.

Network Model
This model is an extension of the hierarchical model. It was
the most popular model before the relational model. This
model is the same as the hierarchical model, the only
difference is that a record can have more than one parent. It
replaces the hierarchical tree with a graph. Example: In the
example below we can see that node student has two
parents i.e. CSE Department and Library. This was earlier not
possible in the hierarchical m

odel.

Features of a Network Model

1. Ability to Merge more Relationships: In this model,


as there are more relationships so data is more related.
This model has the ability to manage one-to-one
relationships as well as many-to-many relationships.
2. Many paths: As there are more relationships so there
can be more than one path to the same record. This
makes data access fast and simple.
3. Circular Linked List: The operations on the network
model are done with the help of the circular linked list.
The current position is maintained with the help of a
program and this position navigates through the records
according to the relationship.
Advantages of Network Model

 The data can be accessed faster as compared to the


hierarchical model. This is because the data is more
related in the network model and there can be more
than one path to reach a particular node. So the data
can be accessed in many ways.
 As there is a parent-child relationship so data integrity
is present. Any change in parent record is reflected in
the child record.
Disadvantages of Network Model

 As more and more relationships need to be handled the


system might get complex. So, a user must be having
detailed knowledge of the model to work with the
model.
 Any change like updation, deletion, insertion is very
complex.

Relational Model
Relational Model is the most widely used model. In this
model, the data is maintained in the form of a two-
dimensional table. All the information is stored in the form of
row and columns. The basic structure of a relational model is
tables. So, the tables are also called relations in the
relational model. Example: In this example, we have an
Employee table.

Advnatages of Relational Model

 Simple: This model is more simple as compared to the


network and hierarchical model.
 Scalable: This model can be easily scaled as we can
add as many rows and columns we want.
 Structural Independence: We can make changes in
database structure without changing the way to access
the data. When we can make changes to the database
structure without affecting the capability to DBMS to
access the data we can say that structural
independence has been achieved.
Disadvantages of Relatinal Model

 Hardware Overheads: For hiding the complexities


and making things easier for the user this model
requires more powerful hardware computers and data
storage devices.
 Bad Design: As the relational model is very easy to
design and use. So the users don't need to know how
the data is stored in order to access it. This ease of
design can lead to the development of a poor database
which would slow down if the database grows.
But all these disadvantages are minor as compared to the
advantages of the relational model. These problems can be
avoided with the help of proper implementation and
organisation.

Object-Oriented Data Model


The real-world problems are more closely represented
through the object-oriented data model. In this model, both
the data and relationship are present in a single structure
known as an object. We can store audio, video, images, etc
in the database which was not possible in the relational
model(although you can store audio and video in relational
database, it is adviced not to store in the relational
database). In this model, two are more objects are
connected through links. We use this link to relate one
object to other objects. This can be understood by the
example given below.
Object-Relational Model
As the name suggests it is a combination of both the
relational model and the object-oriented model. This model
was built to fill the gap between object-oriented model and
the relational model. We can have many advanced features
like we can make complex data types according to our
requirements using the existing data types. The problem
with this model is that this can get complex and difficult to
handle. So, proper understanding of this model is required.

Database Schema
A database schema is the skeleton structure that represents the
logical view of the entire database. It defines how the data is
organized and how the relations among them are associated. It
formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship
among them. It contains a descriptive detail of the database,
which can be depicted by means of schema diagrams. It’s the
database designers who design the schema to help programmers
understand the database and make it useful.
A database schema can be divided broadly into two categories −
 Physical Database Schema − This schema pertains to the
actual storage of data and its form of storage like files, indices,
etc. It defines how the data will be stored in a secondary storage.
 Logical Database Schema − This schema defines all the logical
constraints that need to be applied on the data stored. It defines
tables, views, and integrity constraints.
Database Instance
It is important that we distinguish these two terms individually. Database
schema is the skeleton of database. It is designed when the database
doesn't exist at all. Once the database is operational, it is very difficult to
make any changes to it. A database schema does not contain any data or
information.
A database instance is a state of operational database with data at any given
time. It contains a snapshot of the database. Database instances tend to
change with time. A DBMS ensures that its every instance (state) is in a valid
state, by diligently following all the validations, constraints, and conditions
that the database designers have imposed.

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema
at one level of the database system without altering the schema at the next
higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers characteristic of being able to change the
conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the
conceptual view.
o If we do any changes in the conceptual view of the data, then the user view
of the data would not be affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the
internal schema without having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then
the Conceptual structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the
internal levels.
o Physical data independence occurs at the logical interface level.
Three schema Architecture
o The three schema architecture is also called ANSI/SPARC architecture
or three-level architecture.
o This framework is used to describe the structure of a specific database
system.
o The three schema architecture is also used to separate the user
applications and physical database.
o The three schema architecture contains three-levels. It breaks the
database down into three different categories.

The three-schema architecture is as follows:


1. Internal Level
o The internal level has an internal schema which describes the physical
storage structure of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model. It is used to define that how the data
will be stored in a block.
o The physical level is used to describe complex low-level data
structures in detail.
2. Conceptual Level
o The conceptual schema describes the design of a database at the
conceptual level. Conceptual level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the
database and also describes what relationship exists among those
data.
o In the conceptual level, internal details such as an implementation of
the data structure are hidden.
o Programmers and database administrators work at this level.

3. External Level
o At the external level, a database contains several schemas that
sometimes called as subschema. The subschema is used to describe
the different view of the database.
o An external schema is also known as view schema.
o Each view schema describes the database part that a particular user
group is interested and hides the remaining database from that user
group.
o The view schema describes the end user interaction with database
systems.

Mapping between Views


The three levels of DBMS architecture don't exist independently of each
other. There must be correspondence between the three levels i.e. how they
actually correspond with each other. DBMS is responsible for correspondence
between the three types of schema. This correspondence is called Mapping.

There are basically two types of mapping in the database


architecture:

o Conceptual/ Internal Mapping


o External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the
internal level. Its role is to define the correspondence between the records
and fields of the conceptual level and files and data structures of the internal
level.

External/ Conceptual Mapping

The external/Conceptual Mapping lies between the external level and the
Conceptual level. Its role is to define the correspondence between a
particular external and the conceptual view.

Database Language
o A DBMS has appropriate languages and interfaces to express database
queries and updates.
o Database languages can be used to read, store and update the data in
the database.

Types of Database Language

1. Data Definition Language


o DDL stands for Data Definition Language. It is used to define database
structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.


o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema that's why they
come under Data definition language.

2. Data Manipulation Language


DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.


o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete all records from a table.
o Merge: It performs UPSERT operation, i.e., insert or update operations.
o Call: It is used to call a structured query language or a Java subprogram.
o Explain Plan: It has the parameter of explaining data.
o Lock Table: It controls concurrency.

3. Data Control Language


o DCL stands for Data Control Language. It is used to retrieve the stored or
saved data.
o The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does
not have the feature of rolling back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.


o Revoke: It is used to take back permissions from the user.

There are the following operations which have the authorization of Revoke:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language


TCL is used to run the changes made by the DML statement. TCL can be
grouped into a logical transaction.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.


o Rollback: It is used to restore the database to original since the last Commit.

Interfaces in DBMS
A database management system (DBMS) interface is a user interface that
allows for the ability to input queries to a database without using the query
language itself.
User-friendly interfaces provided by DBMS may include the following:

1. Menu-Based Interfaces for Web Clients or Browsing –


These interfaces present the user with lists of options (called menus)
that lead the user through the formation of a request. Basic advantage
of using menus is that they removes the tension of remembering
specific commands and syntax of any query language. The query is
basically composed step by step by collecting or picking options from
a menu that is shown by the system. Pull-down menus are a very
popular technique in Web based interfaces. They are also often used
in browsing interface which allow a user to look through the contents
of a database in an exploratory and unstructured manner.
2. Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill
out all of the form entries to insert new data, or they can fill out only
certain entries, in which case the DBMS will redeem same type of data
for other remaining entries. These types of forms are usually designed
or created and programmed for the users that have no expertise in
operating system. Many DBMSs have forms specification
languages which are special languages that help specify such forms.
Example: SQL* Forms is a form-based language that specifies
queries using a form designed in conjunction with the relational
database schema.

3. Graphical User Interface –


A GUI typically displays a schema to the user in diagrammatic
form.The user then can specify a query by manipulating the diagram.
In many cases, GUIs utilize both menus and forms. Most GUIs use a
pointing device such as mouse, to pick a certain part of the displayed
schema diagram.

4. Natural language Interfaces –


These interfaces accept request written in English or some other
language and attempt to understand them. A Natural language
interface has its own schema, which is similar to the database
conceptual schema as well as a dictionary of important words.
The natural language interface refers to the words in its schema as
well as to the set of standard words in a dictionary to interpret the
request.If the interpretation is successful, the interface generates a
high-level query corresponding to the natural language and submits it
to the DBMS for processing, otherwise a dialogue is started with the
user to clarify any provided condition or request. The main
disadvantage with this is that the capabilities of this type of interfaces
are not that much advance.

5. Speech Input and Output –


There is limited use of speech be it for a query or an answer to a
question or being a result of a request it is becoming commonplace.
Applications with limited vocabularies such as inquiries for telephone
directory, flight arrival/departure, and bank account information are
allowed speech for input and output to enable ordinary folks to access
this information.
The Speech input is detected using predefined words and used to set
up the parameters that are supplied to the queries. For output, a
similar conversion from text or numbers into speech takes place.

6. Interfaces for DBA –


Most database system contains privileged commands that can be
used only by the DBA’s staff. These include commands for creating
accounts, setting system parameters, granting account authorization,
changing a schema, reorganizing the storage structures of a
databases.

Database System Environment


The database system environment is comprised of the
components that are meant for defining and managing the data that
we collect, store, manage and use in the database environment. In
the section ahead, we will discuss all the components that constitute
the database environment and along with this, we will also discuss
the system utilities that are used to control and manage the
database.

Components of Database System


Environment
Every system environment is made up of certain components that
help the system to get organized and managed. Even the database
system environment is made up of the following components:
1. Hardware

The hardware component of the database system environment


includes all the physical devices that comprise the database system.
It includes storage devices, processors, input and output devices,
printers, network devices and many more.

2. Software

The software component of the database environment includes all


the software that we require to access, store and regulate the
database. Like operating systems, DBMS and application programs
and utilities. The operating system invokes computer hardware, and
let other software runs. DBMS software controls and regulates the
database. The application program and utilities access the database
and if required you can even manipulate the database.

3. People

If talk of the people component then it will include all the people
who are related to the database. There may be a group of people
who will access the database just to resolve their queries i.e. end-
user, there may be people that are involved in designing the
database i.e. database designer.

Some may be involved in designing the applications that will have


an interface through which data entry is possible i.e. database
programmer and analyst and some may also be there to monitor the
database i.e. database administrator.

4. Procedures

The procedure component of the database environment is nothing


but the function that regulates and controls the use of the database.

5. Data

Data component include a collection of related data which are the


known fact that can be recorded and it has an implicit meaning in
the database.

DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers,
database servers and other components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which
are connected via the network.
o DBMS architecture depends upon how users are connected to the database
to get their request done.

Types of DBMS Architecture

Database architecture can be seen as a single tier or multi-tier. But logically,


database architecture is of two types like: 2-tier architecture and 3-tier
architecture.

1-Tier Architecture
o In this architecture, the database is directly available to the user. It means
the user can directly sit on the DBMS and uses it.
o Any changes done here will directly be done on the database itself. It doesn't
provide a handy tool for end users.
o The 1-Tier architecture is used for development of the local application,
where programmers can directly communicate with the database for the
quick response.

2-Tier Architecture
o The 2-Tier architecture is same as basic client-server. In the two-tier
architecture, applications on the client end can directly communicate with
the database at the server side. For this interaction, API's
like: ODBC, JDBC are used.
o The user interfaces and application programs are run on the client-side.
o The server side is responsible to provide the functionalities like: query
processing and transaction management.
o To communicate with the DBMS, client-side application establishes a
connection with the server side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server.
In this architecture, client can't directly communicate with the server.
o The application on the client-end interacts with an application server which
further communicates with the database system.
o End user has no idea about the existence of the database beyond the
application server. The database also has no idea about any other user
beyond the application.
o The 3-Tier architecture is used in case of large web application.

Centralized Database Management System


A centralized database is stored at a single location such as a
mainframe computer. It is maintained and modified from that
location only and usually accessed using an internet connection
such as a LAN or WAN. The centralized database is used by
organisations such as colleges, companies, banks etc.

Distributed DBMS
A distributed database system allows applications to access data from local
and remote databases.

Client-server architecture of Distributed system.

 A client server architecture has a number of clients and a few servers


connected in a network.
 A client sends a query to one of the servers. The earliest available server
solves it and replies.
 A Client-server architecture is simple to implement and execute due to
centralized server system.
2. Collaborating
server
architecture.

 Collaborating server architecture is designed to run a single query on


multiple servers.
 Servers break single query into multiple small queries and the result is
sent to the client.
 Collaborating server architecture has a collection of database servers.
Each server is
capable for executing
the current
transactions across
the databases.
3. Middleware architecture.

 Middleware architectures are designed in such a way that single query is


executed on multiple servers.
 This system needs only one server which is capable of managing queries
and transactions from multiple servers.
 Middleware architecture uses local servers to handle local queries and
transactions.
 The softwares are used for execution of queries and transactions across
one or more independent database servers, this type of software is called
as middleware.
Classification of Database Management
System
Classification of database management system is based on
various parameters such as the kind of data model used to construct
the DBMS, the number of users that will be using the database
system, the way in which the database is distributed. In the section
ahead, we will be discussing all the criteria on which the database
system can be classified.

1. Based on Data Model


2. Based on Number of Users
3. Based on Database Distribution
4. Based on Cost of Database
5. Based on Usage
6. Based on Flow Control

1. Based on Data Model


The data model defines the physical and logical structure of a
database which involves the data types, the relationship among the
data, constraints applied on the data and even the basic operations
specifying retrieval and updation of data in the database. Depending
upon how the data is structured, data models are further classified
into:

a. Relational Data Model

In the relational data model, we use tables to represent data and


the relationship among that data. Each of the tables in the relational
data model has a unique name. A table has multiple columns where
each column name is unique. A table holds records which has value
for each column of the table.

We also refer to the relational database model as a record-based


data model as it holds records of fixed-format. The relational
database model is the most currently used data model.

b. Entity-Relationship Model

The Entity-Relationship model (E-R data model) represents data


using objects and the relationship among these objects. These
objects are referred to as entities that represent the real ‘thing’ or
‘object’ in the real world.

. Object-Based Data Model

Nowadays, object-oriented programming such as Java, C++, etc. is


widely used to develop most of the software. This motivated the
development of an object-based data model. The object-based data
model is an extension of the E-R model which also include notion for
encapsulation, methods. There is also an object-relational data
model which is a combination of the object-oriented data model and
relational data model.

d. Semistructured Data Model

The semistructured data model is different from what we have


studied above. In the semistructured data model, the data items or
objects of the same kind might have a different set of attributes.
The Extensible Markup Language represents the semistructured
data.

2. Classification Based on Number of Users


The database management system can also be classified on the
basis of its user. So, a DBMS can either be used by a single user or it
can be used by multiple users. The database system that can be
used by a single user at a time is referred to as a single-user
system and the database system that can be used by multiple
users at a time is referred to as a multiple user system.

3. Based on Database Distribution


Depending on the distribution of the database over numerous sites
we can classify the database as:

a. Centralized DBMS

In the centralized DBMS, the entire database is stored in a single


computer site. Though the centralized database support multiple
users still the DBMS software and the data both are stores on a
single computer site.

b. Distributed DBMS

In the distributed DBMS (DDBMS) the database and the DBMS


software are distributed over many computer sites. These computer
sites are connected via a computer network. The DDBMS is further
classified as homogeneous DDBMS and heterogeneous DDBMS.

 Homogeneous DDBMS: The homogeneous DDBMS has the


same DBMS software at all the distributed sites.
 Heterogeneous DDBMS: The heterogeneous DDBMS has
different DBMS software for different sites.

4. Based on Cost of Database


Well, it is quite difficult to classify the database on the basis of its
cost as nowadays you can have free open source DBMS products
such as MySQL and PostgreSQL. Although the personal version of
RDBMS can cost up to $100.

The large systems along with the components that can handle
distribution of database, replication of database, parallel processing,
mobile capability and so on can be sold in the form of licenses. The
site license allows unlimited use while another kind of license limits
the number of concurrent licenses.

5. Classification Based on Usage


On the basis of the access path that is used to store the files, the
database can be classified as general-purpose DBMS and special-
purpose DBMS. The special-purpose DBMS is the one that is
designed for a specific application and it can not be used for another
application without performing any major changes we refer to
this as online transaction processing (OLTP). The OLTP system
supports a large number of transactions concurrently without any
delay.
The general-purpose DBMS is the one that is designed to meet the
need of as many applications as possible.

6. Based on Flow Control


Based upon the flow of control from application to DBMS the
database management system is broadly classified into two types
active database management system and passive database
management system.

You might also like