ECEG-4191 Database Systems

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 28

ECEG-4191

Database Systems

Chapter 01
Introduction
Database Applications
• Businesses: Banking, Insurance, Retail, Transportation, Healthcare, Manufacturing

• Service industries: Financial, Real-estate, Legal, Electronic Commerce, Small


businesses

• Education : Resources for content and Delivery

• Recently:
• Social Networks, Environmental and Scientific Applications, Medicine and Genetics

• Personalized applications: based on smart mobile devices

…. The list goes on!


Basic Definitions
data.
- Database is a collection of related _______
- Data: known facts that can be recorded and have an implicit meaning.
______________________________________________________
A catalog that stores the description of a particular database
- Meta-data: ______________________________________________________
 Database Management System (DBMS):
 A software package/system to facilitate the creation and maintenance of a computerized

database.
 Database System:
 The DBMS software together with the data itself. Sometimes, the application programs

and interfaces are also included.


 A Database System models a real world enterprise (mini world):

 Entities
_____________ (e.g., students, courses)
 Relationships (e.g., John is takes ECEG 4191)
_____________
File System v.s. DBMS
File System is a collection of individual ____
files accessed by application programs.
Example: Store the data in comma-separated value (CSV) files.
 Use a separate file per entity.
 Write applications to access and manipulate this data as they are needed

File Approach Database Approach


Procedure FindStudent; SELECT
(SQL) *
begin FROM Students
open file Student.txt; WHERE
Line=ReadLine(); name = 'Bart'
While not EOF(Student)
Student = ParseStudent(Line);
If Student.name='Bart' then
display(Student);
EndIf
Read(Student);
EndWhile;
End;
Limitation of Traditional File Based Systems
- Data Redundancy and Inconsistency
data is stored in multiple file formats resulting in duplication of information in different files

- Difficulty in Accessing Data


Need to write a new program to carry out each new task

- Data Isolation
Multiple files and formats

- Integrity Problems
• Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly
• Hard to add new constraints or change existing ones

- Schema Change
• requires changing file formats
• need to rewrite virtually all applications
Limitation of Traditional File Based Systems (Cont.)
• Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried out
• Example: Transfer of funds from one account to another should either complete or not happen at all

• Concurrent access by multiple users


• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
• Ex: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the
same time

• Security problems: Problems of fine-grained, content-based access control


• Hard to provide user access to some, but not all, data.
• Password/file-based authorization insufficient
Why Use a DBMS(1)
 Insulation between programs and data(a.k.a. Data Independence)
 Allows changing data structures and data storage organization without having to change the
DBMS access programs.
 Efficient access
 Data Administration
 Data Integrity: Enforcing integrity constraints on the database.
 Recovery From Crashes: Providing backup and recovery services.
 Concurrent Access
 Allowing a set of user transactions to access and update the database concurrently.
 Security: Restricting unauthorized access to data.
Why Use a DBMS(2)
 Potential for Enforcing Standards:
 Crucial for the success of database applications in large organizations.

 Reduced application development time:


 The time needed to add each new application is reduced.

 Flexibility to change data storage structures:


 Storage structures may evolve to improve performance, or because of new

requirements.
 Availability of up-to-date information:
 Extremely important for on-line transaction systems such as airline, hotel, car

reservations.
 Economies of scale:
 Wasteful overlap of resources and personnel can be avoided by consolidating data and

applications across departments.

… The list goes on! Read the text book.


Typical DBMS Functionality
 data types _________,
Define a particular database in terms of its __________, constraints
structures and __________

 Construct or Load the initial database contents on a secondary storage medium (typically
hard disk)

 Manipulate the database:


 Retrieval
  Querying (selection)
___________________
 Modification
 Insertions, deletions and updates to its content
_________________________________________
 Accessing/changing the database through Web/Desktop applications

 Processing and Sharing by a set of concurrent users and application programs – yet,
keeping all data valid and consistent

… The list goes on! Read the text book.


Types of Database Applications
 Traditional Applications:
 Numeric and Textual Databases in Business Applications
 More Recent Applications:
 Multimedia Databases (images, videos, voice, etc.)
 Geographic Information Systems (GIS)
 store and analyze maps, weather data, and satellite images
 Data Warehouses
 extract and analyze useful business information from very large databases to support decision making
 Real-time and Active Databases
 control industrial and manufacturing processes.
 Many other applications
Simplified
Database System
Environment
Multi-User DBMS

Types of Database Users

Actors on the Scene:

Those who actually use and control the database content, and those who design,

develop and maintain database applications.

Workers Behind the Scene

Those who design and develop the DBMS software and related tools, and the
computer systems operators.
Database Users: DB Administrators (DBAs) and Designers

 Database Administrators:
 Responsible for authorizing/controlling access to the database;
coordinating and monitoring its use; acquiring software and hardware
resources; and monitoring efficiency of operations.

 Database Designers:
 Responsible for defining database structure, constraints, and transactions;
communicate with users to understand their needs.
Database Users: End-users
 End-users: use the database for queries, reports, and updating the database content. Can be
categorized into:

 Casual end-users
 Access database occasionally when needed
 Naïve (or Parametric) end-users
 Use previously implemented and tested programs (called “canned transactions”) to

access/update the database.


 Examples are bank-tellers or hotel reservation clerks or sales clerks.

 Sophisticated end-users:
 These include business analysts, scientists, engineers, others thoroughly familiar with the

system capabilities.
 Many use tools in the form of software packages that work closely with the stored database.

 Stand-alone end-users:
 Mostly maintain personal databases using ready-to-use packaged applications.

 An example is a tax program user that creates its own internal database.

 Another example is a user that maintains an address book .


A simplified
architecture for a
database system
Three-Schema Architecture
• Proposed to support DBMS characteristics of:
• Program-data independence.
• Support of multiple views of the data.
• Defines DBMS schemas at three levels:
Levels of Abstraction

• Internal schema at the internal level to describe physical storage structures and access paths (e.g
indexes).
• Typically uses a physical data model.
• Conceptual(logical) schema at the conceptual level to describe the structure and constraints for the
whole database for a community of users.
• Uses a conceptual or an implementation data model.
• External schemas at the external level to describe the various user views.
• Usually uses the same data model as the conceptual schema.

 This is also known as the ANSI/SPARC (American National Standards Institute/ Standards Planning And
Requirements Committee) architecture, after the committee that proposed it (Tsichritzis & Klug, 1978).
The three-schema(ANSI-SPARC) architecture

 Includes views used to control


data access and enforce security

 Contains the conceptual and


logical data models.
E.g., tables

 Includes the internal data


model.
E.g., Index
 Changes in one layer should have no to minimal impact on the others
 Physical data independence
 Logical data independence
Data Independence
• Logical Data Independence:
• The capacity to change the conceptual schema without having to change the external
schemas and their associated application programs.

• Physical Data Independence:


• The capacity to change the internal schema without having to change the conceptual schema.

In a DBMS that fully supports data independence.

 When a schema at a lower level is changed:


 only the mappings between the schema and higher-level schemas need to be changed
What:
Database
Systems
Then
What:
Database
Systems
Today
What:
Database
Systems
Today
What:
Database
Systems
Today
What:
Database
Systems
Today
What:
Database
Systems
Today
Recent Developments (1)
• Social Networks started capturing a lot of information about people and about
communications among people-posts, tweets, photos, videos in systems such
as:
- Facebook
- Twitter
- Linked-In
• All of the above constitutes data
• Search Engines, Google, Bing, Yahoo: collect their own repository of web pages
for searching purposes
Recent Developments (2)

• New technologies are emerging from the so-called non-SQL to manage vast
amounts of data generated on the web:
• Example: NOSQL (Non-SQL, Not Only SQL) systems (Chapter 24)

• A large amount of data now resides on the “cloud” which means it is in huge
data centers using thousands of machines.
Why is “big data” a “big deal”?

• Private Sector
• Walmart handles more than 1 million customer transactions every hour, which is imported
into databases estimated to contain more than 2.5 petabytes of data
• Facebook handles 40 billion photos from its user base
• Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts world-wide

• Science
• Large Synoptic Survey Telescope will generate 140 Terabyte of data every 5 days

• Biomedical computation like decoding human Genome and personalized medicine


When not to use a DBMS
 Cost of using a DBMS:
 High initial investment and possible need for additional hardware.
 Overhead for providing generality, security, concurrency control, recovery, and other functions.

 When a DBMS may be unnecessary:


 If the database and applications are simple, well defined, and not expected to change.
 If there are stringent real-time requirements that may not be met because of DBMS overhead.
 If access to data by multiple users is not required.

 When no DBMS may suffice:


 If the database system is not able to handle the complexity of data because of modeling limitations
 If the database users need special operations not supported by the DBMS
 When DBMS overhead makes it impossible to achieve the needed application performance

You might also like