Dbms Notes
Dbms Notes
Dbms Notes
Data is the collection of raw facts, figures, statistics which can be processed to produce
meaningful information.
For Ex: Name of student, marks obtained in the exam, rate, quantity, sales figures
Information is the processed data with some definite meaning.
For Ex: Student marks are processed to get the final result.
Payroll generation
Example for data and information
Student data maintenance- regno, name, class, comb fees
Payroll generation-empno,empname,empaddress,basicpay
Paper is used for storing processed data Electronic devices are used for storing
processed data
Initial setup cost is low Initial setup cost is high
Data processing
Data storage
Data input:
The data has to be collected i.e. gathering data, fact, figures and strategies.
The collected data should be given as input to the processing system using various input devices
such as keyboard, mouse, scanners, digital camera.
This stage converts the raw data into computer understandable form.
Data processing
This is a series of operations that converts the data into information
It includes various operations such as calculations, classification, sorting of data, data verification,
data manipulation, indexing, accessing, extracting.
Data storage
The processed data (information) are stored in suitable devices such as hard disk, CD and other
secondary storage devices permanently.
Output
The processed data is presented to the user in the required format such as hardcopy or softcopy
represented in the form of printouts, animated audio or video, picture.
Communication
The processed data (information) can be communicated to its users through various modern
communication modes using computers and other devices.
DATABASE
DEFINITION
A database is a collection of logically related data organized in a way that data can be easily
accessed, managed and updated
Applications of Database
Banking: For customer information, accounts, and loans, and banking transactions.
Water meter billing : The RR number and all the details are stored in the database and connected to
the server based works.
Rail and Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner terminals situated around the world accessed the
central database system through phone lines and other data networks.
Credit card transactions: For purchases on credit cards and generation of monthly statements.
Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
Finance: For storing information about holdings, sales, and purchases of financial instruments such
as stocks and bonds.
Manufacturing: For management of supply chain and for tracking production of items in factories,
inventories of items in warehouses/ stores, and orders for items.
Human resources: For information about employees recruitment, salaries, payroll taxes and
benefits, and for generation of paychecks
DOMAIN
DEFINITION
A DBMS is a software that allows users to create, define, manipulate and maintain a database.
In the database the data is stored at central location and is shared among multiple users.
Some examples of popular DBMS are: MySql, Oracle, Sybase, Microsoft Access and IBM DB2 etc.
Data sharing
Data stored in the database can be shared among multiple users or application programs. New
applications can be developed using same stored data.
Data security
As the data is stored at a centralized location, enforcing security constraints such as identifying key
attributes, providing passwords and user rights are easy to implement.
➢ Initial investments are high for procuring software, hardware and training.
➢ Additional processing is involved in implementing data integrity, data security and data
sharing.
➢ Backup and recovery operations are complex in DBMS environment.
DBMS USERS
DBA
RDBMS PROGRAMMER
DATABASE
END USERS
System analyst
System analysts determine the requirement of end users develop specifications for transactions
that meet these requirements.
Application programmers
Application programmers implement these parameters in programs.
End users
People who require access to the database for querying, updating and generating reports.
Database designers
Database designers are responsible to design appropriate structures to represent and store the data.
A major purpose of a database system is to provide users with an abstract view of the data i.e. the
system hides certain details of how the data are stored and maintained.
The database description can be define at three levels namely
1. Internal or physical level
2. Conceptual or logical level
3. External or view level
Schema: Description of data at some level or the overall design of the database
Each level has its own schema.
Schema objects are database objects that contain data or govern or perform operations on data.
INTERNAL LEVEL
This is the lowest level of data abstraction that
describes how the data are physically stored and organized on the storage medium.
The physical level describes complex low-level data structures in detail.
At this level various aspects are considered such as storage space allocation techniques for data and
indexes, access paths such as indexes, data compression and encryption techniques and record
placement.
CONCEPTUAL LEVEL
This level of abstraction deals with the logical structure of the entire database and thus, it is also
known as LOGICAL LEVEL
Conceptual level describes what data are stored in the database and what relationships exist among
those data.
That is, it hides the complexity of physical storage structures.
The conceptual level is the overall view of the database and it includes all the information that is
going to be represented in the database.
EXTERNAL LEVEL
This is the highest level of abstraction and it is close to the user’s and it is concerned with the way
in which data are viewed by individual users. It is also known as VIEW LEVEL
Most of the users of the database are not concerned with all the information contained in the
database. Instead they need only a part of the database relevant to them. Eg: bank account holder
DATA INDEPENDENCE
The major objective of 3 level architecture is to provide data independence, which means
upperlevels are unaffected by change in lowerlevels.
Definition
The ability to modify schema definition in one level without affecting a schema definition in the
next higher level is called data independence.
Data independence occurs because when one schema is changed at one level, the schema at next
level remains unchanged and only the mapping between the levels is changed.
What is mapping?
The process of transforming the requests and results between various levels of DBMS architecture
is known as mapping.
CLASSIFICATION OF DATA INDEPENDENCE
Data
Independence
File Entity
Data Models Normalization
organization relationship
Internal
Physical data
Independence mapping
mapping
mapping
Conceptual Logical data External
independence
Types of Keys
Primary key: Primary key is a key that uniquely identify each row/record in a database table.
• A table can have only one primary key
• It is also called as super key
• The value of the primary key should not be repeated and it should not be null/blank.
Super Key: An attribute or set of attributes that uniquely identifies a tuple within a relation/table.
Alternate Key or Secondary key: The alternate keys of any table are those candidate keys which are
not currently selected as a primary key
For eg: In student database regno is the primary key whereas name, class, comb serves as an
alternate key.
Candidate key: Any set of one or more columns whose combined values are unique among all
occurrences
For eg: In student database (Name, class) – whose combined values are unique
Foreign Key: A key used to link two tables together is called foreign key. It is also known as
External key/referencing key.
Composite Key: when multiple fields are used as a primary key they are called a composite key
FILE ORGANIZATION
What is file organization?
It refers to the way records are physically arranged on a storage device
There are three types of file organization used in DBMS data processing. They are
Sequential
Direct Access /Random Access
Indexed Sequential access method (ISAM)
HIERARCHICAL MODEL
• The hierarchical data model is the oldest type of data model, developed by IBM in 1968.
• In this model data is represented in a tree-like structure.
• A hierarchical database consists of collection of records which are connected to one another
through links.
• Each record is a collection of attributes, each of which contains only one data value.
• A link is an association between two records.
• To create a link between the records, the hierarchical model uses parent-child relationship.
• In hierarchical model data is represented as collection of records and the relationship are
represented by links.
ADVANTAGES
• It provides an easy way for addition and deletion of new records into the database.
• The data at the top level in the hierarchy is fast to access.
• This model work well to represent one-to-one and one-to-many relationship.
DISADVANTAGES
• Accessing the data at lower level in the hierarchy is time consuming.
• Searching a particular information requires dbms to run through entire model.
• The physical links make it difficult to expand or modify the database and the changes require
substantial redesigning efforts.
• This model does not support many-to-many relationship.
• This model was presented by Conference on Data Systems Languages (CODASYL) in 1969.
• In a network model the data is represented by a collection of records,
• Relationships among data are represented by links.
• The data is organized in the form of graphs and some entities can be accessed through
several paths.
• However, the link in a network data model represents an association between precisely two
records.
• Each record of a particular record type represents a node.
• All the nodes are linked to each other without any hierarchy.
ADVANTAGES
• The network data model supports one-to-one and many to many relation.
• The network model is simple and easy to design.
DISADVANTAGES
• The network data model is that it can be quite complicated to maintain all the links and a
single broken link can lead to problems in the database.
• In addition, since there are no restrictions on the number of relationships, the database
design can become complex.
RELATIONAL DATABASE MODEL
• The relational data model was developed by E. F. Codd in 1970.
• All data is maintained in the form of two-dimensional tables (generally, known as relations)
consisting of rows and columns.
• Each table consists of rows and columns.
• A row (record) is alos called as tuple.
• A column hearder is called an attribute
• A table is a relation.
• A domain is a set of atomic values drawn from the column.
• The relationship between the two tables is implemented through a common attribute in the
tables.
• This makes the querying much easier in a relational database system.
• Oracle, Sybase, DB2, Ingres, Informix, MS-SQL Server are few of the popular relational
DBMSs
DBMS ARCHITECTURE
ONE-TIER ARCHITECTURE
• The user directly operates on the database.
• Any changes done here will directly done on the DBMS itself.
• It does not provide any tools for end users
• Database designer and programmers use single-tier architecture
• Example: Client platform, hosting web browser
•
TWO-TIER ARCHITECTURE
• Two-tier Client / Server architecture is used for User Interface program and Application
Programs that runs on client side.
• An interface is provided to access the DBMS.
• As interface called ODBC(open database connectivity) provides an API, that allow client-side
program to call the DBMS
• A client program may connect to several DBMS's.
THREE-TIER ARCHITECTURE
• Three-tier Client / Server database architecture is commonly used architecture for many
web applications
APPLICATION SERVER:
• Intermediate layer called Application server stores the web connectivity software and the
business logic (constraints) part of application used to access the right amount of data from
the database server.
• This layer acts like medium for sending partially processed data between the database
server and the client.
DATABSE TIER
• In this tier only database, query processing languages, relations and their constraints
resides.
PRESENTATION TIER/CLIENT/USER TIER
• This tier is used by end users.
• It provides multiple views of database.
• All views are generated by applications which resides in application tier.
Entity Relationship (ER) diagram
Entity is an object, place, person or class. It is represented using rectangle.
Structure
ATTRIBUTE
RELATION
ENTITY ATTRIBUTE
ATTRIBUTE
Example
NAME
TEACH
TEACHER SUBJECT
IDNO
Data warehouse
A data ware house is a repository of an organization’s electronically stored data.
OR
A data warehouse is a structured repository of historic data.
Advantages
• Enhance end-user access to reports and analysis of information.
• Increases data consistency.
• Increases productivity and decreases computing costs.
• Able to combine data from different sources, in one place.
• Data warehouses provide an infrastructure that could support changes to data and
replication of the changed data back into the operational systems.
Disadvantages
• Extracting, cleaning and loading data could be time consuming.
• Data warehouses can get outdated relatively quickly.
• Providing training to end-users.
• Security could develop into a serious issue, especially if the data warehouses is internet
accessible.
• A data warehouses is usually not static and maintenance costs are high
DATA MINING
Some of the data mining software’s are SPSS, SAS, Think Analytics and G-Sat etc.