Chapter 1 - Introduction to Database Systems

The document introduces database systems, highlighting their evolution from early data storage methods to modern databases that organize and manage data efficiently. It discusses the structure and properties of databases, the role of Database Management Systems (DBMS), and the advantages of using databases over traditional file systems. Additionally, it outlines the various actors involved in database management, including Database Administrators, designers, and end users.

Uploaded by

Tesfalegn Yakob
Copyright
© All Rights Reserved

Chapter 1.

Introduction to Database Systems (G2 CS)

Introduction

In the early days of computing, data was collected and stored on magnetic tapes. Tapes are
sequential-access media: to read a particular record, the tape has to be scanned from the
beginning until that record is reached. They were slow and bulky, and computer scientists soon
realized that they needed a better solution to this problem. Databases and database technology
have had a major impact on the growing use of computers. It is fair to say that databases play a
critical role in almost all areas where computers are used, including business, e-commerce,
engineering, medicine, law, education, and library services.

Database

A database is a collection of related data organized so that the data can be easily accessed,
managed, and updated. By data, we mean known facts that can be recorded and that have implicit
meaning. For example, consider the names, telephone numbers, and addresses of the people you
know. You may have recorded this data in an indexed address book, or stored it on the hard disk
of a computer with the help of software such as MS Excel or MS Access. This collection of related
data with an implicit meaning is a database.

Generally, a database has the following properties:

• A database represents some aspect of the real world, sometimes called the mini world
or Universe of Discourse (UoD). Changes to the mini world are reflected in the database.
• A database is a logically coherent collection of data with some inherent meaning. A
random mixture/collection of data cannot be correctly referred to as a database.
• A database is designed, built, and populated with data for a specific purpose. It has an
intended group of users and some fixed applications in which these users are interested.

In other words, a database has some source from which data is derived, some degree of
interaction with events in the real world, and an audience that is actively interested in its
contents.

File Organization

The database is stored as a collection of files. A file is organized logically as a sequence of
records, and a record is a sequence of fields. File organization refers to the organization of the
data of a file into records, blocks, and access structures; this includes the way records and blocks
are placed on the storage medium and interlinked. The operating system (OS) decides how the
files are stored on a secondary storage medium; this part of the operating system is referred to
as the file system.
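
The layering described above (fields inside records, records inside blocks, blocks inside a file) can be sketched in Python. This is only a minimal illustration, not how any particular DBMS or operating system lays out data: the record layout, the 128-byte block size, the sample students, and all function names are assumptions made for the example.

```python
import struct

# Assumed fixed-length record layout: 4-byte id, 20-byte name, 12-byte phone.
RECORD_FORMAT = "<I20s12s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)    # 4 + 20 + 12 = 36 bytes
BLOCK_SIZE = 128                                # assumed; real blocks are larger
RECORDS_PER_BLOCK = BLOCK_SIZE // RECORD_SIZE   # 3 whole records fit in a block


def pack_record(student_id, name, phone):
    """Encode one record (a sequence of fields) as fixed-length bytes."""
    return struct.pack(RECORD_FORMAT, student_id,
                       name.encode("ascii"), phone.encode("ascii"))


def unpack_record(raw):
    """Decode the bytes of one record back into its fields."""
    student_id, name, phone = struct.unpack(RECORD_FORMAT, raw)
    return (student_id,
            name.rstrip(b"\x00").decode("ascii"),
            phone.rstrip(b"\x00").decode("ascii"))


def build_file(records):
    """Place records into blocks and pad every block to BLOCK_SIZE."""
    data = bytearray()
    for start in range(0, len(records), RECORDS_PER_BLOCK):
        chunk = records[start:start + RECORDS_PER_BLOCK]
        block = b"".join(pack_record(*rec) for rec in chunk)
        data += block.ljust(BLOCK_SIZE, b"\x00")
    return bytes(data)


def read_record(data, index):
    """Simple access structure: derive block number and slot from the index."""
    block_no, slot = divmod(index, RECORDS_PER_BLOCK)
    offset = block_no * BLOCK_SIZE + slot * RECORD_SIZE
    return unpack_record(data[offset:offset + RECORD_SIZE])


# Hypothetical sample data: four student records.
students = [(1, "Student A", "0911000001"), (2, "Student B", "0911000002"),
            (3, "Student C", "0911000003"), (4, "Student D", "0911000004")]
file_bytes = build_file(students)
print(len(file_bytes))             # four records need two 128-byte blocks
print(read_record(file_bytes, 3))  # fourth record: first slot of the second block
```

Because every record has the same length, the position of any record can be computed directly from its index. Variable-length records would need an extra access structure, such as a slot directory inside each block.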

A database can be of any size and complexity. For example, the list of names and addresses of the
students in a class may consist of only a few hundred records, each with a simple structure.
On the other hand, a large computerized library catalog may contain millions of entries organized
under different categories (by author's name, by subject, by book title, and so on), with each
category organized alphabetically. This huge amount of information must be organized and
managed so that users can easily search for, retrieve, and update the data as needed.

An example of a large commercial database is Amazon.com. It contains data for over 20 million
books, CDs, DVDs, videos, games, electronic items, and many other items. This database
occupies 2 TB of storage and is stored on 200 different server computers. About 15 million
visitors access Amazon.com each day and use the database to make purchases. The database is
continually updated as new books and other items are added to the inventory, and stock quantities
are updated as purchases are transacted. About 100 employees are responsible for keeping the
Amazon.com database up-to-date.

A database may be generated and maintained manually or it may be computerized. A
computerized database may be created and maintained either by a group of application programs
written specifically for that task or by a database management system.

Database Management System (DBMS)

A DBMS is a collection of programs that enables users to create and maintain a database. The
DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications. Defining
a database involves specifying the data types, structures, and constraints/restrictions of the data
to be stored in the database.

Constructing the database is the process of storing the data on some storage medium that is
controlled by the DBMS. Manipulating a database includes functions such as querying the
database to retrieve specific data, updating the database to reflect changes in the mini world, and
generating reports from the data. Sharing a database allows multiple users and programs to
access the database simultaneously.
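
The defining, constructing, and manipulating steps can be sketched with Python's built-in sqlite3 module. SQLite is used here only because it ships with Python; the `contact` table, its columns, and the sample rows are assumptions made for illustration, and sharing among simultaneous users is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Defining: specify the structure, data types, and constraints.
conn.execute("""
    CREATE TABLE contact (
        contact_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        phone      TEXT NOT NULL,
        city       TEXT
    )
""")

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO contact (name, phone, city) VALUES (?, ?, ?)",
    [("Contact A", "0911-000001", "Addis Ababa"),
     ("Contact B", "0911-000002", "Hawassa"),
     ("Contact C", "0911-000003", "Adama")],
)

# Manipulating: update the database to reflect a change in the mini world ...
conn.execute("UPDATE contact SET city = ? WHERE name = ?", ("Adama", "Contact B"))

# ... query it to retrieve specific data ...
in_adama = conn.execute(
    "SELECT name, phone FROM contact WHERE city = ? ORDER BY name", ("Adama",)
).fetchall()

# ... and generate a small report from the data.
per_city = conn.execute(
    "SELECT city, COUNT(*) FROM contact GROUP BY city ORDER BY city"
).fetchall()
conn.commit()

print(in_adama)
print(per_city)
```

The application code never states where or how the rows are stored; it only declares the structure once and then issues requests, which is the division of labor the definition above describes.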

Other important functions provided by the DBMS include protecting the database and
maintaining it over a long period of time. Protection includes system protection against
hardware or software malfunction (or crashes) and security protection against unauthorized or
malicious access. Both general-purpose and special-purpose DBMS software can be used
to implement a computerized database. Most DBMSs are very complex software
systems. The database and the DBMS software together are called a database system. Some
examples of popular DBMSs in use today are MySQL, SQL Server, PostgreSQL, IBM DB2,
Oracle, and Amazon SimpleDB (cloud-based).

The following diagram shows a simplified database system environment

Database system and File System

File system

Traditionally, data accessed through computers has been stored on different storage media in the
form of individual files. Files proved to be quite satisfactory as long as computerization was
limited to a few application areas. However, as the number of users increased, especially
with the advent of the internet and online transaction systems, file systems gave rise to many
serious problems. The discipline of database systems evolved in response to these problems.

Problems with file systems

Most data processing systems use files for storing, accessing, and manipulating data. Files are
typically stored on magnetic tapes and disks. Most of the problems with files arise from the fact
that files are specific to an application. As the number of applications increases, the total
number of computerized files grows considerably. A large number of files gives rise to the
following problems.

• Files involve a high level of redundancy in data, i.e., the same data item is stored in many
different places. Redundancy often results in inconsistency: the same item may be at
different stages of update in different places and may therefore have different values.
• Individual files are not adaptable to rapid changes, especially with respect to the way the
data items are structured within the file. Data stored in a file system depends heavily
on the physical medium used.
• In a file system, only pre-defined questions can be answered.
• Modifications to one program may require modifications to other programs that
interface with it.
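
The link between redundancy and inconsistency can be made concrete with a short Python sketch. It assumes two hypothetical applications (billing and shipping) that each keep their own copy of the customer data as CSV text; the file contents and function names are invented for the example.

```python
import csv
import io

FIELDS = ["customer_id", "name", "city"]

# Two application-specific files that redundantly store the same customers.
billing_file = "customer_id,name,city\n101,Customer A,Hawassa\n102,Customer B,Adama\n"
shipping_file = "customer_id,name,city\n101,Customer A,Hawassa\n102,Customer B,Adama\n"


def load(text):
    """Parse CSV text into a dict keyed by customer_id."""
    return {row["customer_id"]: row for row in csv.DictReader(io.StringIO(text))}


def update_city(text, customer_id, new_city):
    """Rewrite one application's file with a changed city."""
    rows = load(text)
    rows[customer_id]["city"] = new_city
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows.values())
    return out.getvalue()


def find_conflicts(text_a, text_b):
    """List the customers whose city differs between the two files."""
    rows_a, rows_b = load(text_a), load(text_b)
    shared = rows_a.keys() & rows_b.keys()
    return sorted(cid for cid in shared if rows_a[cid]["city"] != rows_b[cid]["city"])


# The billing application records a move; the shipping application is never told.
billing_file = update_city(billing_file, "101", "Addis Ababa")
conflicts = find_conflicts(billing_file, shipping_file)
print(conflicts)  # customer 101 now has two different cities
```

Nothing in either file is wrong on its own; the inconsistency only exists because the same fact is stored twice and updated once.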

Why do we need a database?

The main reasons for which we require databases are:

• Size of data: Storing a small amount of data in a spreadsheet is fine, but the spreadsheet
approach stops working as the amount of data grows. Once the number of records
reaches the millions, the data has to be split across multiple spreadsheets, which
creates a speed problem: finding one record across multiple spreadsheet files takes a
long time.
• Ease of updating data: With a file-based spreadsheet, multiple people typically cannot
edit the same file at the same time. Others must wait until the file becomes available,
which wastes time.
• To manage large chunks of data: You can store data in a spreadsheet, but adding large
chunks of data to a sheet quickly becomes impractical. For instance, once the data
grows to many thousands of records, the sheet becomes slow to work with.
• Accuracy: When data is entered into a spreadsheet, accuracy is difficult to manage
because no validations are enforced by default.
• Security of data: Data is less secure in spreadsheets. Anyone who can get access to the
file can make changes to it. With databases, you can set up security groups and
privileges to restrict access.
• Data integrity: Data integrity is also a concern when data is stored in spreadsheets.
In databases, built-in integrity checks and access controls help ensure the accuracy and
consistency of the data.
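
The accuracy and integrity points can be illustrated with constraints declared once in the database definition. The sketch below uses Python's built-in sqlite3 module; the `student` table, the age range, and the sample values are assumptions chosen for the example. User accounts and privileges are not shown, because SQLite itself has no user management.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,                        -- a name is required
        email      TEXT UNIQUE,                          -- no duplicate emails
        age        INTEGER CHECK (age BETWEEN 15 AND 100)
    )
""")


def try_insert(name, email, age):
    """Return True if the DBMS accepted the row, False if a constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO student (name, email, age) VALUES (?, ?, ?)",
                (name, email, age),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("Student A", "a@example.com", 20),   # valid row
    try_insert(None, "b@example.com", 21),          # missing name
    try_insert("Student C", "c@example.com", 7),    # age outside the allowed range
    try_insert("Student D", "a@example.com", 22),   # email already in use
]
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(results, row_count)
```

The checks live in the database definition rather than in each data-entry program, so every application that writes to the table is held to the same rules.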

Data Abstraction

A DBMS supports data abstraction. By data abstraction we mean that the DBMS provides users with
an abstract view of the system, i.e., the system hides certain details of how data is stored, created,
and maintained. A main purpose of a database system is to hide this complexity from database users.

Three Levels of Abstraction

• Physical level
• Conceptual (logical) level
• View level

Physical level:

Describes how a record (e.g., customer) is stored.
– The level that is closest to physical storage.
– Deals with how the data is stored physically.
– It is the lowest level of abstraction.

Logical (Conceptual) level:

Describes the data stored in the database and the relationships among the data.
– Database Administrator level.
– Describes what data is stored.
– Describes the relationships among data.
– Next-higher level of abstraction.
E.g.:
type customer = record
    customer_id : string;
    customer_name : string;
    customer_street : string;
    customer_city : string;
end;

View level:

The level that is closest to the users.
– It is concerned with the way the data is seen by individual users.
– Describes part of the database for a particular group of users.
– There can be many different views of a database.
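
The three levels can be approximated in a small Python sketch using the built-in sqlite3 module. The table mirrors the customer record shown earlier; treating an index as the physical-level detail, and the view name, index name, and sample rows, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data is stored (the customer record from the text).
conn.execute("""
    CREATE TABLE customer (
        customer_id     TEXT PRIMARY KEY,
        customer_name   TEXT,
        customer_street TEXT,
        customer_city   TEXT
    )
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [("C-1", "Customer A", "Street 1", "Adama"),
     ("C-2", "Customer B", "Street 2", "Hawassa"),
     ("C-3", "Customer C", "Street 3", "Adama")],
)

# Physical level: a storage detail (an index) that affects how rows are found.
conn.execute("CREATE INDEX idx_customer_city ON customer (customer_city)")

# View level: one user group sees only part of the database.
conn.execute("""
    CREATE VIEW adama_customers AS
    SELECT customer_id, customer_name
    FROM customer
    WHERE customer_city = 'Adama'
""")

query = "SELECT * FROM adama_customers ORDER BY customer_id"
with_index = conn.execute(query).fetchall()
view_columns = [col[0] for col in conn.execute(query).description]

# Changing the physical level does not change what the view's users see.
conn.execute("DROP INDEX idx_customer_city")
without_index = conn.execute(query).fetchall()

print(view_columns)
print(with_index == without_index)
```

Users of the view never see street addresses or customers from other cities, and they are unaffected when the index is added or removed.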

Characteristics of the database approach

In the database approach, a single repository of data is maintained. The data is defined once and
then accessed by various users. In file systems, each application is free to name data elements
independently. In contrast, in a database, the names and labels of data are defined once and used
repeatedly by queries, transactions, and applications.

The main characteristics of the database approach versus the file processing approach are the
following:

• Self-describing nature:
A fundamental characteristic of the database approach is that the database system contains
not only the database itself but also a complete definition or description of the database
structure and constraints. This definition is stored in the DBMS catalog, which contains
information such as the structure of each file, the type and storage format of each data item,
and various constraints on the data. The information stored in the catalog is called meta-data,
and it describes the structure of the primary database.
A general-purpose DBMS software package is not written for a specific database
application. Therefore, it must refer to the catalog to know the structure of the files in a
specific database, such as the type and format of the data it will access. The DBMS software
must work with any number of database applications (for example, a university database, a
banking database, or a company database) as long as the database definition is stored in the
catalog. In traditional file processing, data definition is typically part of the application
programs themselves. Hence, these programs are constrained to work with only one specific
database, whose structure is declared in the application programs.
• Insulation between Programs and Data, and Data Abstraction:
In a traditional file system, the structure of data files is embedded in the application programs,
so any change to the structure of a file may require changing all programs that access that
file. By contrast, DBMS access programs do not require such changes in most cases. The
structure of data files is stored in the DBMS catalog separately from the access programs.
This property is known as program-data independence.
A DBMS provides users with a conceptual representation of data that does not include
many of the details of how the data is stored or how the operations are implemented. This
characteristic is called data abstraction. Many other details of file storage organization,
such as the access paths specified on a file, can be hidden from database users by the DBMS.
• Support of Multiple Views of the Data:
A database typically has many users, each of whom may require a different perspective or
view of the database. A multiuser DBMS, as its name implies, must allow multiple users to
access the database at the same time. The users will have a variety of distinct applications,
which define multiple views.
• Sharing of Data and Multiuser Transaction Processing:
The DBMS must include concurrency control software to ensure that updates are controlled
and correct, since several users may try to update the same data in a multiuser DBMS. For
example, when several reservation clerks try to assign a seat on an airline flight, the DBMS
should ensure that each seat can be accessed by only one clerk at a time for assignment to a
passenger. These types of applications are generally called online transaction processing
(OLTP) applications. A fundamental role of multiuser DBMS software is to ensure that
concurrent transactions operate correctly and efficiently. A transaction is an executing
program or process that includes one or more database accesses, such as reading or updating
of database records.
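
Two of the characteristics above, the self-describing catalog and controlled updates of shared data, can be sketched with Python's built-in sqlite3 module. This is a simplified, single-connection illustration: the `seat` table, the passenger names, and the helper functions are assumptions, and a real multiuser DBMS would coordinate many simultaneous connections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (seat_no TEXT PRIMARY KEY, passenger TEXT)")
conn.executemany("INSERT INTO seat (seat_no) VALUES (?)", [("12A",), ("12B",)])
conn.commit()


def list_tables():
    """Read table names from SQLite's own catalog table, sqlite_master."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def describe(table):
    """Return (column name, declared type) pairs taken from the catalog.

    The table name is interpolated directly, so pass only trusted names.
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


def assign_seat(seat_no, passenger):
    """Assign a seat only if it is still free; return True on success."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE seat SET passenger = ? "
            "WHERE seat_no = ? AND passenger IS NULL",
            (passenger, seat_no),
        )
        return cur.rowcount == 1  # no row changed means the seat was taken


tables = list_tables()
columns = describe("seat")

first_clerk = assign_seat("12A", "Passenger X")
second_clerk = assign_seat("12A", "Passenger Y")  # same seat, arrives later
holder = conn.execute(
    "SELECT passenger FROM seat WHERE seat_no = ?", ("12A",)
).fetchone()[0]
print(tables, columns, first_clerk, second_clerk, holder)
```

A generic program can discover the structure of the database through `list_tables` and `describe` without having it hard-coded, and the conditional `UPDATE` lets the database decide which clerk's request wins, so the seat is never assigned twice.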

Actors on the Scene

In large organizations, many people are involved in the design, use, and maintenance of a large
database with hundreds of users. The people whose jobs involve the day-to-day use of a large
database are called the actors on the scene, and are classified as follows:

• Database Administrator (DBA)
In a database environment, the primary resource is the database itself, and the secondary
resource is the DBMS and related software. Administering these resources is the
responsibility of the DBA. The DBA is responsible for authorizing access to the database,
coordinating and monitoring its use, and acquiring software and hardware resources as
needed.
• Database designers
Database designers are responsible for identifying the data to be stored in the database and
for choosing appropriate structures to represent and store this data. It is the responsibility of
the database designers to communicate with all prospective database users in order to
understand their requirements and to create a design that meets these requirements. Database
designers typically interact with each potential group of users and develop views of the
database that meet the data and processing requirements of these groups. The final database
design must be capable of supporting the requirements of all user groups.
• End users
End users are the people whose jobs require access to the database for querying, updating,
and generating reports; the database primarily exists for their use. There are several
categories of end users:
Casual end users: access the database occasionally, but may need different
information each time.
Naive or parametric end users: their main job function revolves around constantly querying
and updating the database, using standard types of queries and updates (called canned
transactions) that have been carefully programmed and tested.
Sophisticated end users: include engineers, scientists, and business analysts who try to learn
most of the DBMS facilities in order to meet their complex requirements.
Standalone users: maintain personal databases by using ready-made program packages that
provide easy-to-use menu-based or graphics-based interfaces.
• System Analysts and Application Programmers (software engineers)
System analysts determine the requirements of end users, especially naive users, and develop
specifications for canned transactions that meet these requirements. Application programmers
implement these specifications as programs; then they test, debug, document, and maintain these
canned transactions.

Each category of end user is described in more detail below:

■ Casual end users occasionally access the database, but they may need different information
each time. They use a sophisticated database query language to specify their requests and are
typically middle- or high-level managers or other occasional browsers.
■ Naive or parametric end users make up a sizable portion of database end users. Their main
job function revolves around constantly querying and updating the database, using standard types
of queries and updates (called canned transactions) that have been carefully programmed and
tested. The tasks that such users perform are varied:
• Bank tellers check account balances and post withdrawals and deposits.
• Reservation agents for airlines, hotels, and car rental companies check availability for a
given request and make reservations.
• Employees at receiving stations for shipping companies enter package identifications via
bar codes and descriptive information through buttons to update a central database of
received and in-transit packages.
■ Sophisticated end users include engineers, scientists, business analysts, and others who
thoroughly familiarize themselves with the facilities of the DBMS in order to implement their
own applications to meet their complex requirements.
■ Standalone users maintain personal databases by using ready-made program packages that
provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax

A typical DBMS provides multiple facilities to access a database. Naive end users need to learn
very little about the facilities provided by the DBMS; they simply have to understand the user
interfaces of the standard transactions designed and implemented for their use. Casual users
learn only a few facilities that they may use repeatedly. Sophisticated users try to learn most of
the DBMS facilities in order to achieve their complex requirements. Standalone users typically
become very proficient in using a specific software package.
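
A canned transaction can be pictured as a pre-programmed operation whose SQL is fixed and whose only inputs are parameters. The sketch below, using Python's built-in sqlite3 module, shows two such operations for the bank teller case; the `account` table, the account number, the amounts, and the function names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_id TEXT PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES ('ACC-1', 500)")
conn.commit()


def check_balance(account_id):
    """Canned query: the teller supplies only the account number."""
    row = conn.execute(
        "SELECT balance FROM account WHERE account_id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]


def post_withdrawal(account_id, amount):
    """Canned update: returns False if the account is unknown or underfunded."""
    try:
        with conn:  # commit on success, roll back on error
            cur = conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, account_id),
            )
            return cur.rowcount == 1
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


accepted = post_withdrawal("ACC-1", 200)
rejected = post_withdrawal("ACC-1", 1000)
balance = check_balance("ACC-1")
print(accepted, rejected, balance)
```

The teller using these operations needs to know nothing about tables or SQL, which matches the description of naive users above; the analysts and programmers who wrote and tested the functions carry that knowledge instead.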
