0% found this document useful (0 votes)
5 views40 pages

CHPT 2

The document provides an overview of database concepts and architecture, focusing on the evolution of distributed database systems and modern DBMS environments in cloud computing. It covers essential topics such as data models, schemas, instances, the three-schema architecture, and various DBMS languages and interfaces. Additionally, it discusses data independence, client/server architectures, and the classification of different types of DBMSs.

Uploaded by

bandasolomon043
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views40 pages

CHPT 2

The document provides an overview of database concepts and architecture, focusing on the evolution of distributed database systems and modern DBMS environments in cloud computing. It covers essential topics such as data models, schemas, instances, the three-schema architecture, and various DBMS languages and interfaces. Additionally, it discusses data independence, client/server architectures, and the classification of different types of DBMSs.

Uploaded by

bandasolomon043
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 40

CSC 2702

Database
Concepts &
Architecture
Department of Computer Science
School of Natural Sciences
University of Zambia
Overview

• The growth in data storage needs has led to the development of distributed
database systems
• operate across thousands of computers,
• aligning with the shift from centralized mainframes to networked workstations
and servers.

• Modern DBMS environments, particularly in cloud computing, consist of


thousands of servers managing large-scale data (big data) for users over
the web.
Overview

• We will look at essential database concepts, including

• data models, schemas, instances, and the three-schema architecture

• We will also look at an overview of

• DBMS interfaces, languages, system software environments, various


client/server architectures, and a classification of DBMS types
Data Models, Schemas, and Instances

• The database approach emphasizes data abstraction, allowing users to


view data at varying levels of detail by suppressing the complexities of data
organization and storage

• A data model is central to achieving data abstraction, defining the structure


of the database, including data types, relationships, and constraints, as well
as providing basic operations for data retrieval and updates

• Modern data models increasingly include concepts to specify the dynamic


behavior of database applications, allowing user-defined operations specific
to database objects

• User defined operations, like COMPUTE_GPA for a STUDENT object,


complement the basic operations of data models (insert, delete, modify,
retrieve) by adding functionality specific to certain data objects
Categories of Data Models

1. High-level (conceptual) data models

• provide concepts that are close to the way many users perceive data

• These models, such as the entity-relationship model, use concepts like


entities, attributes, and relationships to represent data in a way that aligns with
how users perceive the real world

2. Low-level (physical) data models

• provide concepts that describe the details of how data is stored on the
computer storage media, typically magnetic disks - meant for computer
specialists
Categories of Data Models

3. Representational (implementation or record-based) data models

• provide concepts that may be easily understood by end users but that are not
too different from the way data is organized in computer storage

• frequently used in commercial DBMSs, these models balance user


understanding with how data is organized on storage media, often using record
structures.

• Examples include the relational model as well as old data models such as the
network and hierarchical models

• Object data model is a higher-level implementation data model that is closer to


a conceptual data model, frequently utilized in the software engineering domain
Categories of Data Models

4. Self-Describing Data Models


• In these models, data storage combines the data description with the data
values themselves
• Examples include XML and various NoSQL systems used for managing big
data
Database Schemas

• The description of a database is called the database schema, and is


defined during design phase.
• It is not expected to change frequently but it can

• A schema diagram visually represents certain aspects of a schema, such


as the names of record types and data items, but does not include actual
data instances or detailed constraints
• Each object within a schema, like a STUDENT or COURSE, is referred to as a
schema construct
• Some constraints, like complex rules for data relationships or conditions, are
difficult to represent in schema diagrams and may not be visually depicted
Database Schema - Example
Database Schemas

• The schema and constraints are stored in the DBMS catalog as meta-data,
allowing the DBMS to reference them as needed

• Although schemas are not meant to change often, schema evolution


occurs when application requirements change, necessitating modifications
to the schema

• Most modern DBMSs support operations that allow schema evolution to


take place while the database remains operational
Database Instances and States

• The actual data in a database at a specific moment, is known as the


database state or snapshot
• It changes frequently as new data is added or updated
• Schema vs. State: The database schema defines the structure of the
database, while the database state refers to the current data held in the
database at any given time

• The initial state of a database occurs when it is first populated with data,
and the current state is the database's state at any moment after updates

• The DBMS ensures that every database state is valid by adhering to the
structure and constraints defined in the schema
Three-Schema Architecture

• The three-schema architecture separates user applications from the


physical database by defining schemas at three distinct levels: internal,
conceptual, and external
1. Internal Level: The internal schema at this level describes the physical
storage structure of the database, including data storage details and access
paths, using a physical data model
2. Conceptual Level: The conceptual schema represents the entire database
structure for all users, focusing on entities, relationships, and constraints,
while hiding the details of physical storage
3. External Level: The external or view level consists of multiple external
schemas, each tailored to the needs of specific user groups, presenting only
the relevant part of the database and hiding the rest
Three-Schema Architecture

• The DBMS must transform requests between the external, conceptual, and
internal schemas through a process called schema mapping, which can be
complex and time-consuming, especially in systems with user-defined views

• Although many DBMSs do not fully separate these three levels, the three-
schema architecture remains important in database design, helping to clarify
the distinctions between user views, database structure, and physical
storage
Three-Schema Architecture
Data Independence

• Data independence refers to the ability to change the schema at one level
of a database system without affecting the schema at the next higher level,
ensuring stability in user applications and external views

1. Logical Data Independence: allows changes to the conceptual schema,


such as adding or removing data elements, without affecting external
schemas or application programs, requiring only changes to the view
definitions and mappings

2. Physical Data Independence: enables changes to the internal schema, like


reorganizing physical files or adding access paths, without altering the
conceptual schema or affecting external schemas and application programs
Data Independence

• Achieving logical data independence is more challenging than physical


data independence because it involves maintaining application functionality
despite structural changes at the conceptual level

• Mappings between different schema levels are stored in the catalog, and
data independence is maintained by adjusting these mappings when
schema changes occur, without modifying higher-level schemas or
applications
DBMS Languages

• These include:
1. The Data Definition Language (DDL) is used by database administrators
(DBAs) and designers to define both conceptual and internal schemas,
with its description stored in the DBMS catalog
2. The Storage Definition Language (SDL) used to specify the internal
schema
3. The View Definition Language (VDL) is used to define user views and
map them to the conceptual schema
4. The Data Manipulation Language (DML) allows users to manipulate data
through operations like retrieval, insertion, deletion, and modification

• Modern DBMSs often use a comprehensive integrated language that


integrates DDL, VDL, and DML functionalities, with storage definition usually
handled separately for performance tuning by the DBA staff
DBMS Languages

• Types of DMLs
1. High-level DMLs (set-oriented, declarative) allow for concise specification of
database operations
2. Low-Level DMLs (procedural) require embedding in a programming
language for record-by-record processing, with the host language serving as
the primary language and the DML as the data sublanguage

• Casual users typically interact with the database using a high-level query
language, while programmers may embed DML in their code;

• User-friendly interfaces are available for those who prefer not to learn the
details of the query language
DBMS Interfaces

• Quite a few:
1. Menu-based Interfaces: Guide users through queries by presenting
options in a step-by-step manner via menus, commonly used in web
clients
2. Mobile Apps Interfaces: Provide users with access to their data on mobile
devices, featuring programmed interfaces for tasks like paying bills, or
making reservations, often tailored to specific industries like banking or
reservations
3. Forms-based Interfaces: Allow users to input or retrieve data by filling
out forms, typically used for canned transactions
4. Graphical User Interfaces (GUIs): Visually represent database schemas in
a diagrammatic form, enabling users to specify queries by interacting with
the diagram, often combining menus and forms for enhanced usability
DBMS Interfaces

5. Natural Language Interfaces: Accept user requests in natural language,


interpreting them into high-level database queries, and engaging in dialogue
with users if clarification is needed
6. Keyword-based Search: Allow users to search databases using natural
language keywords, with results ranked by relevance
7. Speech Input and Output: Enable limited voice interaction for querying
and receiving database responses, useful in applications like directory
inquiries or flight information systems
8. Parametric User Interfaces: Streamline operations with single function
keys/minimal keystrokes, for users with repetitive tasks, e.g. bank tellers
9. DBA Interfaces: Provide DBAs which an interface for privileged tasks like
creating accounts, setting system parameters, and granting privileges, crucial
for database management and security
DBMS Component Modules

• Divided into
1. user interfaces for various
types of users (DBAs, casual
users, application
programmers, and
parametric users)
2. internal modules
responsible for data storage
and transaction processing
DBMS Component Modules

• DBA Staff work with DDL Statements and Privileged Commands


interfaces to formulate queries

• Casual Users work with Interactive Query interfaces to formulate queries

• Application Programmers write Application Programs that interact with


the DBMS

• Parametric Users do data entry work by supplying parameters to


predefined Compiled Transactions

• Application Programs are compiled by a Precompiler, a DML Compiler


which links them with DML commands, and a Host Programming Compiler
resulting in Compiled Transactions
DBMS Component Modules

• Queries from Interactive Query interfaces are processed by a Query


Compiler and optimized by a Query Optimizer
• The optimized queries together with the Compiled Transactions and
Privileged Commands are executed by the Runtime Database Processor
• The Runtime Database Processor executes DBA Commands, Queries,
and Transactions and also handles Concurrency Control, Backup and
Recovery by means of its subsystems

• The Stored Data Manager controls Input/Output access to information in


the Stored Database
• DDL Statements are compiled by the DDL Compiler which processes
schema definitions specified and stores the descriptions of the schemas
(meta-data) in the DBMS System Catalog
DBMS Component Modules

• The System Catalog also stores many other types of information that are
needed by other DBMS component modules such as the Query Optimizer,
the DML Compiler and the Runtime Database Processor, which can then
look up the catalog information when needed

• DBMS often operates in a client-server setup, where the client, application


server, and database server are on separate machines, communicating over
a network.
Database System Utilities

• DBMSs have database utilities that help the manage the database system
1. Loading: Used to load existing data files (e.g., text or sequential files) into the
database, automatically reformatting the data to match the target database
structure, and facilitating data transfer between different DBMSs

2. Backup: Creates a backup copy of the database, typically by dumping it onto


storage mediums, with options for full backups or more complex incremental
backups that save storage space by only recording changes

3. Storage Reorganization: Reorganizes the database files into different


organizations and creates new access paths, improving performance

4. Performance Monitoring: Monitors database, providing statistics that help


decide on actions like file reorganization or index management to enhance
performance
Database System Utilities

• Additional Utilities: May include tools for

1. sorting files

2. handling data compression

3. monitoring user access

4. interfacing with networks

5. other functions that assist in database management


Centralized DBMSs Architecture

• Centralized DBMS: all DBMS functions, application execution, and user


interface processing occurs on a single machine, maintaining a
centralized architecture
• Early DBMS architectures relied on mainframe computers to handle all system
functions, including DBMS functionality, application programs, and user
interfaces, as users accessed the system via terminals with only display
capabilities
• In these older systems, all processing was performed on the central computer,
with display terminals merely receiving and displaying information while being
connected through communications networks
• As hardware prices decreased, users began replacing terminals with PCs,
workstations, and eventually mobile devices, though initially, these devices
were used similarly to display terminals, with DBMS processing still centralized
Centralized DBMSs Architecture
Basic Client/Server Architectures

• Client/Server Architecture manages environments with many


interconnected PCs, workstations, mobile devices, servers, and other
equipment via a network, with specialized servers providing specific
functionalities that clients can access

• Clients are typically user machines (PCs, workstations, or mobile devices)


that offer user interface capabilities and local processing, connecting to
servers for additional functionalities like database access

• Servers are systems that provide services to clients, such as file access,
printing, archiving, and database access, typically running server software .

• Two main types of DBMS architectures—two-tier and three-tier


Two-Tier Client/Server Architectures for DBMSs
Two-Tier Client/Server Architectures for DBMSs

• In two-tier client/server architectures for RDBMSs


• the client handles user interfaces and application execution

• The server, often called a query server, transaction server, or SQL server,
handles SQL and transaction processing

• Open Database Connectivity (ODBC) is a standard that allows client


programs to communicate with the DBMS server, enabling queries and
transactions to be processed at the server and results sent back to the client
• Java Database Connectivity (JDBC) is a similar standard for Java programs

• Advantages of Two-Tier Architecture are that it is simple and compatible


with existing systems, making it easy to implement and maintain
Two-Tier Client/Server Architectures for DBMSs
Three-Tier and n-Tier Architectures for Web Apps

• Three-tier architecture has an intermediate layer, known as the application


server, between the client and database, enhancing functionality and security
• The application server runs application programs, stores business rules,
checks client credentials and processes requests from the client before
interacting with the database server
• Application server can also serve as a web server, retrieving and formatting query
results from the database server into dynamic web pages for client-side viewing

• The three tiers include


1. the user interface on the client side
2. business logic on the application server
3. data access on the database server
Three-Tier and n-Tier Architectures for Web Apps

• The three-tier architecture can be viewed as having

1. a presentation layer for user interaction

2. a business logic layer for processing rules and constraints

3. a data management layer for database services

• n-Tier Architectures
• Further division of layers into finer components creating e.g., four or five
tiers, allowing independent operation and optimization of each layer, often
used in ERP and CRM systems
Three-Tier and n-Tier Architectures for Web Apps
Classification of DBMSs

1 - Data Model Classification: DBMSs can be classified based on the data


model they use, including relational (SQL systems), object, object-relational,
NoSQL, key-value, hierarchical, network, and XML-based systems.
• Relational DBMSs: The most common DBMSs, based on the relational data
model, represent databases as tables and use SQL for querying and managing
data, with evolving features incorporating object database concepts
• NoSQL and Big Data Systems: These systems utilize various data models,
including key-value, document-based (e.g., JSON), graph-based, and
column-based models, optimized for handling large-scale, distributed data
• Legacy Data Models: Older DBMSs, based on the network and hierarchical
models, are known as legacy systems and were widely used historically,
particularly in the CODASYL DBTG network model and IMS hierarchical model
Classification of DBMSs

2 - Single-User vs. Multiuser Systems: DBMSs can be categorized by the


number of users they support, with
• Single-user Systems typically used on PCs and
• Multiuser Systems supporting concurrent access by multiple users

3 - Centralized vs. Distributed Systems: A DBMS can be


• Centralized, with all data stored at a single site, or
• Distributed, with data and DBMS software spread across multiple sites
connected by a network
Classification of DBMSs

4 - Homogeneous vs. Heterogeneous Distributed DBMSs:


• Homogeneous DDBMSs use the same DBMS software across all sites, while
• Heterogeneous DDBMSs may use different DBMS software, sometimes
connected through middleware in federated systems

5 - Cost Classification: DBMSs can vary widely in cost, from


• Open-source systems like MySQL and PostgreSQL to
• Expensive, Modular Systems with additional features for distribution,
replication, and data warehousing
Classification of DBMSs

6 - Access Path Options: Some DBMSs are classified based on their


access path options, such as those using Inverted File Structures to
optimize data retrieval

7 - General-Purpose vs. Special-Purpose DBMSs: DBMSs can be


• general-purpose, supporting a wide range of applications, or
• special-purpose, designed for specific tasks like online transaction processing
(OLTP) system

8 - XML-based DBMSs: Emerging DBMSs based on the XML model


represent data in Hierarchical Tree Structures, with XML becoming a standard
for data exchange on the web and integrated into many commercial DBMSs
Classification of DBMSs

9 - Federated DBMSs: These systems use middleware to connect and


manage multiple autonomous databases, often running on heterogeneous
DBMS software, allowing for loosely coupled, decentralized data management

You might also like