Mca Dbms Notes Unit 1
Mca Dbms Notes Unit 1
Mca Dbms Notes Unit 1
UNIT-1
As the name suggests, the database management system consists of two parts. They are:
1. Database and
2. Management System
What is a Database?
To find out what database is, we have to start from data, which is the basic building block of
any DBMS.
Data: Facts, figures, statistics etc. having no particular meaning (e.g. 1, ABC, 19 etc).
Record: Collection of related data items, e.g. in the above example the three data items had
no meaning. But if we organize them in the following way, then they collectively represent
meaningful information.
1 ABC 19
1 ABC 19
2 DEF 22
3 XYZ 28
The columns of this relation are called Fields, Attributes or Domains. The rows
are called Tuples or Records.
DATA INFORMATION
Databases touch all aspects of our lives. Some of the major areas of application are as
follows:
Sales: For customer, product, and purchase information.
Accounting: For payments, receipts, account balances, assets and other accounting
information.
Human resources: For information about employees, salaries, payroll taxes, and
benefits, and for generation of paychecks.
Manufacturing: For management of the supply chain and for tracking production of
items in factories, inventories of items in warehouses and stores, and orders for items.
Online retailers: For sales data noted above plus online order tracking, generation of
recommendation lists, and maintenance of online product evaluations.
Credit card transactions: For purchases on credit cards and generation of monthly
statements.
Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds; also for storing real-time market data to enable
online trading by customers and automated trading by the firm.
Universities: For student information, course registrations, and grades (in addition to
standard enterprise information such as human resources and accounting).
Airlines: For reservations and schedule information. Airlines were among the first to
use databases in a geographically distributed manner.
Since files and application programs are created by different programmers over a long
period of time, the files are likely to be having different formats and the programs may be written
in several programming languages. Moreover, the same piece of information may be duplicated
in several places. This redundancy leads to higher storage and access cost. In addition, it may
lead to data inconsistency.
The conventional file processing environments do not allow needed data to be retrieved
in a convenient and efficient manner. Better data retrieval system must be developed for general
use.
Data Isolation:
Since data is scattered in various files, and files may be in different formats, it is difficult
to write new application programs to retrieve the appropriate data.
In order to improve the overall performance of the system and obtain a faster response
time, many systems allow multiple users to update the data simultaneously. In such an
environment, interaction of concurrent updates may result in inconsistent data.
Security Problems:
Not every user of the database system should be able to access all the data. For example,
in banking system, payroll personnel need only that part of the database that has information
about various bank employees. They do not need access to information about customer accounts.
It is difficult to enforce such security constraints.
Integrity Problems:
The data values stored in the database must satisfy certain types of consistency constraints. For
example, the balance of a bank account may never fall below a prescribed amount. These
constraints are enforced in the system by adding appropriate code in the various application
programs. When new constraints are added, it is difficult to change the programs to enforce
them. The problem is compounded when constraints involve several data items for different files.
Atomicity Problem:
A computer system like any other mechanical or electrical device is subject to failure. In
many applications, it is crucial to ensure that once a failure has occurred and has been detected,
the data are restored to the consistent state existed prior to the failure.
VIEW OF DATA
Database is a collection of large volumes of data. A user will not always require
complete data. The responsibility of the database system is to provide only the data that is
required by the user.
Data Abstraction:
Database Systems are made up of complex data structures. The data abstraction is the
process of showing only necessary data and hiding the unnecessary data is know as data
abstraction. In DBMS there are 3 levels of data abstraction. The goal of the abstraction in the
DBMS is to separate the users request and the physical storage of data in the database.
Storage Disk
3 Levels of Abstraction:
The lowest Level of Abstraction describes “How” the data are actually stored in DB.
The physical level describes complex low level data structures in detail.
It is also called as Storage level or Internal level
This next higher level of data Abstraction describes “What” data are to be stored in the
database and what relationships exist among those data.
Database Administrators use the logical level of abstraction.
It is also called as Conceptual level.
It is the highest level of data Abstracts that describes only part of entire database.
Different users require different types of data elements from each
database. The system may provide many views for the some database.
It is also called as External level / View level
Data Independence
o Data independence can be explained using the three-schema
architecture.
o Data independence refers characteristic of being able to modify the
schema at one level of the database system without altering the
schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
Logical data independence refers characteristic of being able to
change the conceptual schema without having to change the
external schema.
Logical data independence is used to separate the external level
from the conceptual view.
Instance:-
The data which is stored in the database at a particular moment of time is called an
instance of the database.
Database schema defines the variable declarations in tables that belong to a particular
database; the value of these variables at a moment of time is called the instance of
that database.
Student
In the above table 1201, 1202, Venkat etc are said to be instance of student table.
Schema:
Data Models
Underlying the structure of a database is the data model: a collection of conceptual tools
for describing data, data relationships, data semantics, and consistency constraints. A data
model provides a way to describe the design of a database at the physical, logical, and view
levels.
• Relational Model.
The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each column
has a unique name. Tables are also known as relations. The relational model is an
example of a record-based model.
Record-based models are so named because the database is structured in fixed-format
records of several types. Each table contains records of a particular type.
Each record type defines a fixed number of fields, or attributes. The columns of the
table correspond to the attributes of the record type.
The relational data model is the most widely used data model, and a vast majority of
current database systems are based on the relational model.
Entity-Relationship Model.
The entity-relationship (E-R) data model uses a collection of basic objects, called
entities, and relationships among these objects.
An entity is a “thing” or “object” in the real world that is distinguishable from other
objects. The entity-relationship model is widely used in database design.
The semi-structured data model permits the specification of data where individual
data items of the same type may have different sets of attributes.
This is in contrast to the data mode is mentioned earlier, where every data item of a
particular type must have the same set of attributes.
The Extensible Markup Language (XML) is widely used to represent semi-
structured data.
Historically, the network data model and the hierarchical data model preceded the
relational data model.
These models were tied closely to the underlying implementation, and complicated the
task of modeling data.
As a result they are used little now, except in old database code that is still in service in
some places.
Database Languages
A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and updates. In
practice, the data-definition and data-manipulation languages are not two separate
languages; instead they simply form parts of a single database language, such as the widely
used SQL language.
Data-Manipulation Language
We specify the storage structure and access methods used by the database system by a set of
statements in a special type of DDL called a data storage and definition language. These
statements define the implementation details of the database schemas, which are usually
hidden from the users. The data values stored in the database must satisfy certain consistency
constraints.
A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly divided
into the storage manager and the query processor components.
The storage manager is important because databases typically require a large amount of
storage space. Some Big organizations Database ranges from Giga bytes to Tera bytes. So the
main memory of computers cannot store this much information, the information is stored on
disks. Data are moved between disk storage and main memory as needed.
The query processor also very important because it helps the database system simplify and facilitate
access to data. So quick processing of updates and queries is important. It is the job of the database
system to translate updates and queries written in a nonprocedural language.
Naïve users:
Naive user is some one ,who don't know the underlying details of the software. For example in
case of Database Naive Users are unsophisticated users who interact with the system by using
permanent application programs (e.g. automated teller machine).They don't know the coding
details.
Application Programmers:
(RAD) tools are tools that enable an application programmer to construct forms and reports
without writing a program.
Sophisticated Users
Sophisticated users interact with the system without writing programs. Instead, they form
their requests in a database query language. They submit each such query to a query
processor, whose function is to break down DML statements into instructions that the storage
manager understands. Analysts who submit queries to explore data in the database fall in this
category.
Specialized Users:
Specialized users are sophisticated users who write specialized database applications that do
not fit into the traditional data-processing framework.
Among these applications are computer-aided design systems, knowledge base and expert
systems, systems that store data with complex data types (for example, graphics data and
audio data), and environment-modeling systems.
StorageManager:
A storage manager is a program module that provides the interface between the low level
data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the interaction with the file manager. The storage
manager translates the various DML statements into low-level file-system commands. Thus, the
storage manager is responsible for storing, retrieving, and updating data in the database.
Authorization and integrity manager which tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
Transaction manager which ensures that the database itself remains in a consistent
state despite system failures, and that concurrent transaction executions proceed without
conflicting.
File manager: which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer manager which is responsible for fetching data from disk storage into main
memory. Storage manager implements several data structures as part of the physical system
implementation. Data files are used to store the database itself. Data dictionary is used to stores
metadata about the structure of the database, in particular the schema of the database.
DDL interpreter: It interprets DDL statements and records the definitions in the data dictionary.
DML compiler: It translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands.
Query evaluation engine: It executes low-level instructions generated by the DML compiler.
Application Architectures:
Most users of a database system today are not present at the site of the database system,
but connect to it through a network. We can therefore differentiate between client machines, on
which remote database users’ work, and server machines, on which the database system runs.
Database applications are usually partitioned into two or three parts. They are:
Two-Tier Architecture:
The application is partitioned into a component that resides at the client machine,
which invokes database system functionality at the server machine through query language
statements. Application program interface standards like ODBC and JDBC are used for
interaction between the client and the server.
Three-Tier Architecture:
The client machine acts as merely a front end and does not contain any direct database
calls. Instead, the client end communicates with an application server, usually through forms
interface. The application server in turn communicates with a database system to access data.
The business logic of the application, which says what actions to carry out under what
conditions, is embedded in the application server, instead of being distributed across multiple
clients. Three-tier applications are more appropriate for large applications, and for applications
that run on the World Wide Web.
DATABASE USERS:
Database users are the one who really use and take the benefits of database. There will be
different types of users depending on their need and way of accessing the database.
1. Application Programmers - They are the developers who interact with the database
by means of DML queries. These DML queries are written in the application programs
like C, C++, JAVA, Pascal etc. These queries are converted into object code to
communicate with the database. For example, writing a C program to generate the
report of employees who are working in particular department will involve a query to
fetch the data from database. It will include a embedded SQL query in the C Program.
2. Sophisticated Users - They are database developers, who write SQL queries to
select/insert/delete/update data. They do not use any application or programs to request
the database. They directly interact with the database by means of query language like
SQL. These users will be scientists, engineers, analysts who thoroughly study SQL and
DBMS to apply the concepts in their requirement. In short, we can say this category
includes designers and developers of DBMS and SQL.
3. Specialized Users - These are also sophisticated users, but they write special database
application programs. They are the developers who develop the complex programs to
the requirement.
4. Stand-alone Users - These users will have stand –alone database for their personal
use. These kinds of database will have readymade database packages which will have
menus and graphical interfaces.
5. Native Users - these are the users who use the existing application to interact with the
database. For example, online library system, ticket booking systems, ATMs etc which
has existing application and users use them to interact with the database to fulfill their
requests
DATABASE ADMINISTRATORS:
A database administrator's responsibilities can include the following tasks.
Data processing tasks such as payroll were automated, with data stored on tapes.
Data could also be input from punched card decks, and output to printers.
Late 1960s and 1970s: The use of hard disks in the late 1960s changed the scenario
for data processing greatly, since hard disks allowed direct access to data.
With disks, network and hierarchical databases could be created that allowed data
structures such as lists and trees to be stored on disk. Programmers could construct and
manipulate these data structures.
With disks, network and hierarchical databases could be created that allowed data
structures such as lists and trees to be stored on disk. Programmers could construct and
manipulate these data structures.
In 1980s
Initial commercial relational database systems, such as IBM DB2, Oracle, Ingress, and
DEC Rdb, played a major role in advancing techniques for efficient processing of
declarative queries.
In the early 1980s, relational databases had become competitive with network and
hierarchical database systems even in the area of performance.
The 1980s also saw much research on parallel and distributed databases, as well as
initial work on object-oriented databases.
Early 1990s:
Decision support and querying re-emerged as a major application area for databases.
The major event was the explosive growth of the World Wide Web.
Databases were deployed much more extensively than ever before. Database systems
now had to support very high transaction processing rates, as well as very high
reliability and 24 * 7 availability (availability 24 hours a day, 7 days a week, meaning
no downtime for scheduled maintenance activities).
DATABASE DESIGN:
The database design process can be divided into six steps. The ER Model is most relevant
to the first three steps. Next three steps are beyond the ER Model.
1. Requirements Analysis:
The very first step in designing a database application is to understand what data is to be
stored in the database, what applications must be built on top of it, and what operations are most
frequent and subject to performance requirements. The database designers collect information of
the organization and analyzer, the information to identify the user’s requirements. The database
designers must find out what the users want from the database.
Once the information is gathered in the requirements analysis step a conceptual database design
is developed and is used to develop a high level description of the data to be stored in the
database, along with the constraints that are known to hold over this data. This step is often
carried out using the ER model, or a similar high-level data model.
In this step convert the conceptual database design into a database schema (Logical
Database Design) in the data model of the chosen DBMS. We will only consider relational
DBMSs, and therefore, the task in the logical design step is to convert an ER schema into a
relational database schema. The result is a conceptual schema, sometimes called the logical
schema, in the relational data model.
The first three steps are more relevant to the ER Model. Once the logical scheme is
defined designer consider the physical level implementation and finally provide certain security
measures. The remaining three steps of database design are briefly described below:
4. Schema Refinement:
The fourth step in database design is to analyze the collection of relations in our
relational database schema to identify potential problems, and to refine it. In contrast to the
requirements analysis and conceptual design steps, which are essentially subjective, schema
refinement can be guided by some elegant and powerful theory.
In this step we must consider typical expected workloads that our database must support
and further refine the database design to ensure that it meets desired performance
ER model /Diagrams
ER model stands for an Entity-Relationship model. It is a high-level data model. This
model is used to define the data elements and relationship for a specified system.
It develops a conceptual design for the database. It also develops a very simple and easy
to design view of data.
In ER modeling, the database structure is portrayed as a diagram called an entity-
relationship diagram.
For example, Suppose we design a school database. In this database, the student will be
an entity with attributes like address, name, id, age, etc. The address can be another entity
with attributes like city, street name, pin code, etc and there will be a relationship
between them.
Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as rectangles.
a. Weak Entity
An entity that depends on another entity called a weak entity. The weak entity doesn't contain
any key attribute of its own. The weak entity is represented by a double rectangle.
2. Attribute
The attribute is used to describe the property of an entity. Eclipse is used to represent an
attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary
key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute
An attribute that composed of many other attributes is known as a composite attribute. The
composite attribute is represented by an ellipse, and those ellipses are connected with an ellipse.
c. Multivalued Attribute
An attribute can have more than one value. These attributes are known as a multivalued attribute.
The double oval is used to represent multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from other attribute is known as a derived attribute. It can be
represented by a dashed ellipse.
For example, A person's age changes over time and can be derived from another attribute like
Date of birth.
3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is known as one to
one relationship.
For example, A female can marry to one male, and a male can marry to one female.
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of an entity on the
right associates with the relationship then this is known as a one-to-many relationship.
For example, Scientist can invent many inventions, but the invention is done by the only
specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of an entity on the
right associates with the relationship then it is known as a many-to-one relationship.
For example, Student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of an entity on
the right associates with the relationship then it is known as a many-to-many relationship.
For example, Employee can assign by many projects and project can have many employees.
1. PARTICIPATION CONSTRAINTS
1. Total participation
2. Partial participation
1. Total Participation-
It specifies that each entity in the entity set must compulsorily participate in at least one
relationship instance in that relationship set.
That is why, it is also called as mandatory participation.
Total participation is represented using a double line between the entity set and
relationship set.
Example-
2. WEAK ENTITIES
CLASS HIERARCHIES
Generalization
o Generalization is like a bottom-up approach in which two or more entities of lower level
combine to form a higher level entity if they have some attributes in common.
o In generalization, an entity of a higher level can also combine with the entities of the
lower level to form a further higher level entity.
o Generalization is more like subclass and superclass system, but the only difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e., subclasses
are combined to make a superclass.
For example, Faculty and Student entities can be generalized and create a higher level entity
Person.
Specialization
o Specialization is a top-down approach, and it is opposite to Generalization. In
specialization, one higher level entity can be broken down into two lower level entities.
o Specialization is used to identify the subset of an entity set that shares some
distinguishing characteristics.
o Normally, the superclass is defined first, the subclass and its related attributes are defined
next, and relationship set are then added.
Aggregation
In aggregation, the relation between two entities is treated as a single entity. In aggregation,
relationship with its corresponding entities is aggregated into a higher level entity.
For example: Center entity offers the Course entity act as a single entity in the relationship
which is in a relationship with another entity visitor. In the real world, if a visitor visits a
coaching center then he will never enquiry about the Course only or just about the Center instead
he will ask the enquiry about both.
Relational model can represent as a table with columns and rows. Each row is known as a tuple.
Each table of the column has a name or attribute.
Attribute: It contains the name of a column in a particular table. Each attribute Ai must have a
domain, dom(Ai)
Relational instance: In the relational database system, the relational instance is represented by a
finite set of tuples. Relation instances do not have duplicate tuples.
Relational schema: A relational schema contains the name of the relation and name of all
columns or attributes.
Relational key: In the relational key, each row has one or more attributes. It can identify the row
in the relation uniquely.
In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the
attributes.
The instance of schema STUDENT has 5 tuples.
t3 = <Laxman, 33289, 8583287182, Gurugram, 20>
Properties of Relations
Views in SQL
Views in SQL are considered as a virtual table. A view also contains rows and columns.
To create the view, we can select the fields from one or more tables present in the
database.
A view can either have specific rows based on certain condition or all the rows of a
table.
Sample table:
Student_Detail
Student_Marks
1. Creating view
A view can be created using the CREATE VIEW statement. We can create a view from a single
table or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
In this example, we create a View named DetailsView from the table Student_Detail.
Query:
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM Student_Details
WHERE STU_ID < 4;
Just like table query, we can query the view to view the data.
SELECT * FROM DetailsView;
Output:
NAME ADDRESS
Stephan Delhi
Kathrin Noida
David Ghaziabad
View from multiple tables can be created by simply include multiple tables in the SELECT
statement.
In the given example, a view is created named MarksView from two tables Student_Detail and
Student_Marks.
Query:
CREATE VIEW MarksView AS
SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS
FROM Student_Detail, Student_Mark
WHERE Student_Detail.NAME = Student_Marks.NAME;
SELECT * FROM MarksView;
4. Deleting View
Syntax
DROP VIEW view_name;
Example:
DROP VIEW MarksView;