1.
Basic Database Concepts
Definition
A database is an organized collection of related data stored
electronically. It helps store, manage, and retrieve large amounts
of information easily and efficiently.
Why Are Databases Important?
In today’s world, data is everywhere. Organizing data properly
is important so it is not duplicated, stays secure, and multiple
users can access it at the same time without problems. Databases
help with all of this.
Key Terms
Data: Raw facts or figures that have no meaning on their
own.
Example: A student’s name, roll number, or address.
Information: Processed data that is meaningful and useful.
Example: A student’s details shown in a report card.
Database: A collection of related data organized in a way
that makes it easy to access.
DBMS (Database Management System): Software that
creates, manages, and controls access to the database.
Components of a Database System
1. Hardware: Physical devices like hard drives and servers
where data is stored.
2. Software: DBMS software that manages the database.
3. Data: The actual stored data inside the database.
4. Users: People who access or update the database.
5. Procedures: Rules and instructions followed to manage
and use the database.
Advantages of Using a Database
Reduces Data Redundancy: Data is not unnecessarily
repeated.
Ensures Data Consistency: Data stays accurate and
uniform across the system.
Improves Data Security: Prevents unauthorized access to
data.
Enables Data Sharing: Multiple users can access data
easily.
Supports Backup and Recovery: Data can be recovered if
lost.
Provides Efficient Data Access: Data can be retrieved
quickly.
Types of Users in a Database System
Database Administrator (DBA): Manages and secures the
database.
Database Designer: Designs the database structure.
End Users: Use the database to view or enter data.
Application Programmers: Develop applications that
interact with the database.
Summary
Understanding basic database concepts is very important
because databases organize and protect data. They help
businesses and organizations manage data efficiently and
securely.
2. Database Management System (DBMS)
Definition
A Database Management System (DBMS) is software that
helps users create, manage, and control access to databases. It
acts as a bridge between users and the database.
Purpose of DBMS
To store large amounts of data efficiently.
To provide easy access to data.
To maintain data security and integrity.
To allow multiple users to use the database at the same
time without conflicts.
To backup data and recover it in case of loss.
Functions of DBMS
Data Storage Management: DBMS stores data in an
organized way.
Data Retrieval: Allows users to get data easily using
queries.
Data Manipulation: Users can insert, update, delete data.
Security Management: Controls who can access or
modify data.
Backup and Recovery: Protects data from loss and
restores it if needed.
Data Integrity: Ensures data is accurate and consistent.
Examples of DBMS
MySQL
Oracle Database
Microsoft SQL Server
PostgreSQL
MongoDB (for NoSQL databases)
Advantages of DBMS
Reduces data redundancy and inconsistency.
Provides better data security than file-based systems.
Supports multi-user environments.
Provides backup and recovery facilities.
Improves data sharing and data integrity.
Summary
DBMS is the software that manages databases and makes sure
data is stored, accessed, and protected properly. It’s an essential
tool for handling large and complex data efficiently.
3. Database Approach vs File-Based System
What is File-Based System?
In a file-based system, data is stored in separate files. Each
application program has its own files to manage data.
What is Database Approach?
In the database approach, data is stored in a centralized
database managed by a DBMS. Multiple applications and users
access the same database.
Differences Between Database Approach and File-Based
System
Feature File-Based System Database Approach
High (data is
Data Low (data stored once
duplicated in many
Redundancy centrally)
files)
Data
Difficult to maintain Easier to maintain
Consistency
Data Sharing Difficult Easy for multiple users
Data Security Minimal Strong security features
Feature File-Based System Database Approach
Low (application High (applications
Data
tightly linked to independent of data
Independence
files) storage)
Data Access Slow and inefficient Fast and flexible
Backup and
Limited or manual Supported automatically
Recovery
Advantages of Database Approach over File-Based System
Eliminates data duplication.
Provides better control over data.
Allows concurrent access by multiple users.
Simplifies data management and updating.
Improves data security and backup.
Summary
Database approach is better than file-based system because it
organizes data centrally, reduces redundancy, and supports
multiple users while keeping data secure and consistent.
4. Database Architecture, Three-Level
Schema Architecture
What is Database Architecture?
Database architecture means the overall design and structure of
a database system. It defines how data is stored, organized, and
accessed in a way that makes it easy to manage, secure, and use
by different users and applications.
A well-designed database architecture helps in managing
complexity and makes sure the system runs efficiently.
Three-Level Schema Architecture
This is the most common and important architecture used in
database systems. It divides the database system into three
separate levels:
1. Internal Level (Physical Level)
This is the lowest level of the database.
It describes how data is physically stored on storage
devices like hard disks.
It involves details like file organization, indexing methods,
and storage paths.
Normal users or programmers don’t see or interact with this
level directly.
Example: Data stored as blocks, pages, or files on the disk.
2. Conceptual Level (Logical Level)
This is the middle level and it describes what data is
stored and the relationships between different data
elements.
It provides a logical view of the entire database without
showing physical details.
It hides how data is physically stored but shows the
structure of the data (tables, columns, relationships).
It is used by database designers and administrators to
define the database structure.
Example: Defining entities like “Students,” “Courses,” and
their relationships.
3. External Level (View Level)
This is the highest level and represents the user’s view of
the database.
Different users or applications may have different views
depending on their needs.
It hides the details of the conceptual and internal levels and
only shows relevant data.
Helps in data security by restricting user access to only
certain parts of the database.
Example: A student sees only his own grades, while a
teacher sees grades of all students.
Why Use Three-Level Architecture?
Data Abstraction:
Each level hides the complexity of the level below it. Users
and applications don’t need to know about physical storage
details.
Data Independence:
Changes at the internal level (like changing storage
devices) don’t affect the conceptual or external levels.
Similarly, changes in the conceptual level (like adding new
tables) don’t affect users’ views.
Improved Security:
Users get access only to the data they are authorized to see
through external views.
Simplifies Database Management:
Designers can change the database structure or storage
without affecting end users.
Diagram Explanation (If you want to visualize)
User/View Level (External) --> User’s specific views
of data
-----------------------------
Logical Level (Conceptual) --> Overall logical
structure of the database
-----------------------------
Physical Level (Internal) --> Physical storage details
on hardware
Summary
Database architecture organizes data management into
three levels: internal, conceptual, and external.
This separation allows flexibility, security, and ease of
management.
Users see only what they need; changes in storage don’t
affect users; and designers have a clear structure to work
with.
5. Data Independence
What is Data Independence?
Data Independence is the ability to change the database structure
at one level without affecting the other levels or users'
applications. It means changes in how data is stored or
organized do not affect how users see or use the data.
This makes the database flexible and easier to maintain.
Types of Data Independence
1. Physical Data Independence
o This means you can change the physical way data is
stored (like changing files or storage devices) without
changing the logical structure of the database.
o Users and programs do not notice these changes.
o For example, moving data from one hard disk to
another or changing file formats does not affect the
way users interact with the data.
2. Logical Data Independence
o This means you can change the logical structure of the
database (like adding or deleting tables, changing
columns) without affecting the users’ views or
applications.
o For example, adding a new field to a table or a new
entity does not affect existing programs that use the
database.
Why is Data Independence Important?
Easy Maintenance: Changes can be made to the database
without disturbing users or programs.
Flexibility: The database can evolve and grow over time
without major disruptions.
Reduced Errors: Less chance of errors in user applications
because they are shielded from structural changes.
Better Security: Different users can have their views
controlled, while changes happen behind the scenes.
Less Downtime: Applications keep running smoothly even
if the database structure changes.
Summary
Data Independence is a key feature of modern databases that
separates how data is physically stored, how it is logically
organized, and how it is viewed by users. It protects users and
applications from changes in the database, making the system
flexible and easier to manage.
6. Relational Data Model
What is the Relational Data Model?
The Relational Data Model is a way to organize data in
databases using tables (called relations). Each table stores data
in rows and columns, making it easy to understand and use.
This model was introduced by Edgar F. Codd in 1970 and is the
most popular model used in modern databases like MySQL,
Oracle, and SQL Server.
Key Concepts of Relational Data Model
1. Relation (Table)
o A relation is a table with rows and columns.
o Each table represents an entity or object (like
Students, Courses).
o Tables have a name and contain data about that entity.
2. Tuple (Row)
o A tuple is a single row in the table.
o It represents one record or instance of the entity.
o For example, one student’s information in the Students
table.
3. Attribute (Column)
o An attribute is a column in the table.
o Each attribute represents a property or characteristic of
the entity.
o For example, StudentName, Age, or CourseID.
4. Domain
o Domain is the set of allowable values for an attribute.
o For example, the domain of Age might be all numbers
between 1 and 120.
5. Schema
o Schema is the structure or design of the table.
o It defines the table name, attributes, and data types.
6. Relation Instance
o This is the actual data present in a relation at a
particular time.
o It is the set of tuples (rows) in the table.
Features of Relational Data Model
Data is stored in tables (relations).
Each table has a unique name.
Rows (tuples) are unique and do not have duplicates.
Columns (attributes) have specific data types and domains.
Relationships between tables can be created using keys.
Why is Relational Model Popular?
It is simple and easy to understand.
Supports powerful query languages like SQL.
Allows flexible data retrieval and management.
Supports data integrity and constraints.
Provides a clear and logical way to organize data.
Summary
The Relational Data Model organizes data in tables with rows
and columns. It uses clear concepts like relations, tuples, and
attributes to manage data efficiently and simply. It is the
foundation of most modern database systems.
7. Keys of Relations
What are Keys?
In a relational database, keys are special attributes (columns)
that help uniquely identify each record (row) in a table
(relation). They are very important because they ensure data
accuracy and help in searching and linking tables.
Types of Keys
1. Super Key
o A set of one or more attributes that can uniquely
identify a row in a table.
o It may have extra attributes that are not necessary for
uniqueness.
o Example: In a student table, {StudentID,
StudentName} can be a super key if StudentID alone
is unique.
2. Candidate Key
o A minimal super key, meaning it has no extra
attributes; only the necessary ones to identify rows
uniquely.
o A table can have multiple candidate keys.
o Example: StudentID or Email could both be candidate
keys if both are unique.
3. Primary Key
o The chosen candidate key to uniquely identify rows in
the table.
o It must be unique and cannot be NULL.
o Example: StudentID is often the primary key in a
student table.
4. Alternate Key
o Candidate keys which are not selected as the primary
key.
o Example: If StudentID is primary key, Email can be
an alternate key.
5. Foreign Key
o An attribute (or set of attributes) in one table that
refers to the primary key in another table.
oIt is used to create relationships between tables.
o Example: In an “Enrollments” table, StudentID can be
a foreign key referencing StudentID in the “Students”
table.
6. Composite Key (or Compound Key)
o A key made up of two or more attributes combined to
uniquely identify a row.
o Used when no single attribute can uniquely identify
the record.
o Example: In a table tracking student courses, the
combination of StudentID and CourseID can be a
composite key.
Importance of Keys
Uniqueness: Ensure each record can be uniquely
identified.
Data Integrity: Help prevent duplicate records.
Relationships: Foreign keys link tables together for
relational queries.
Searching: Keys speed up data retrieval.
Summary
Keys are essential parts of a relational database that uniquely
identify records and connect tables. Primary keys, candidate
keys, foreign keys, and composite keys all have important roles
to maintain accurate and organized data.
8. Attributes, Schemas, Tuples, Domains,
Relation Instances
8.1 Attributes
Attributes are the columns in a table (relation).
Each attribute represents a property or characteristic of
an entity.
Example: In a Student table, attributes could be StudentID,
Name, Age, and Address.
Each attribute has a name and a data type (like integer,
string, date).
8.2 Schema
A schema is the structure or design of a database or table.
For a relation (table), schema defines the table name and its
attributes with data types.
It is like a blueprint of the table, showing what kind of data
will be stored.
Example: Student(StudentID: integer, Name: string, Age:
integer)
8.3 Tuple
A tuple is a row in a table.
It represents a single record or instance of the entity.
Each tuple contains a value for every attribute in the
schema.
Example: (101, "Ali", 20, "Lahore") is a tuple in the
Student table.
8.4 Domain
A domain is the set of allowable values for an attribute.
It defines what kind of data is valid for a column.
Example: Age attribute can have domain as numbers from
1 to 100.
Domains help maintain data integrity by restricting invalid
data.
8.5 Relation Instance
A relation instance is the set of tuples (rows) present in a
relation (table) at a specific point in time.
It represents the actual data stored in the table.
Example: If the Student table has 5 rows, those 5 rows form
the current relation instance.
Summary
Attributes = Columns/properties of a table.
Schema = Structure/design of the table (attributes + data
types).
Tuple = Row/record in the table.
Domain = Allowed values for each attribute.
Relation Instance = Actual data present in the table at a
moment.
These terms form the basic language of the relational database
model and help us understand how data is organized and
managed.
9. Integrity Constraints and Relational
Algebra
9.1 Integrity Constraints
Integrity Constraints are rules that ensure the accuracy and
consistency of data in a database. They help prevent invalid or
inconsistent data from being entered.
Types of Integrity Constraints
1. Entity Integrity
o Ensures that the primary key of a table cannot have
NULL values.
o This guarantees each record is uniquely identifiable.
o Example: In a Student table, StudentID (primary key)
must always have a value.
2. Referential Integrity
o Ensures that a foreign key value always points to an
existing primary key in another table or is NULL.
o It prevents “orphan” records that refer to non-existing
entries.
o Example: In an Enrollment table, StudentID must
match a valid StudentID in the Student table.
3. Domain Constraints
o Ensures the values in a column are within a specified
domain (allowed set of values).
o Example: Age must be between 1 and 120.
4. Unique Constraint
o Ensures that all values in a column or group of
columns are unique across the table.
o Example: Email addresses must be unique for every
user.
9.2 Relational Algebra
Relational Algebra is a set of mathematical operations used to
manipulate and retrieve data from relations (tables). It is the
theoretical foundation of SQL queries.
Basic Operations in Relational Algebra
1. Selection (σ)
o Selects rows from a table that satisfy a condition.
o Example: Select all students where Age > 20.
2. Projection (π)
o Selects specific columns from a table.
o Example: Show only StudentID and Name columns.
3. Union (∪)
o Combines rows from two tables with the same
structure, removing duplicates.
o Example: Union of two student lists.
4. Set Difference (−)
o Returns rows in one table that are not in another.
o Example: Students enrolled last year but not this year.
5. Cartesian Product (×)
o Combines every row of one table with every row of
another table.
o Usually used as a step before joining tables.
6. Join (⨝)
o Combines rows from two tables based on a related
column (usually a foreign key).
o Example: Join Student and Enrollment tables on
StudentID.
Why are Integrity Constraints Important?
They keep the data accurate, consistent, and trustworthy.
They prevent mistakes like duplicate records or invalid
references.
They maintain data quality in multi-user environments.
Summary
Integrity constraints enforce rules on the data to maintain
correctness.
Relational algebra provides fundamental operations to
retrieve and manipulate data.
Both are essential for efficient and reliable database
management.
10. Concurrency Control in Databases
What is Concurrency Control?
Concurrency Control is a technique used in databases to manage
simultaneous access by multiple users or applications. It
ensures that when many users try to read or write data at the
same time, the database remains correct, consistent, and
reliable.
Why is Concurrency Control Important?
Without control, multiple users could change the same data
at the same time, causing errors.
It prevents problems like lost updates, inconsistent reads,
and data corruption.
It allows multiple transactions to happen safely at once.
Key Concepts of Concurrency Control
1. Transaction
o A transaction is a sequence of operations performed as
a single logical unit of work.
o Transactions must follow the ACID properties
(Atomicity, Consistency, Isolation, Durability).
2. Isolation
o One important ACID property.
o It ensures that the operations of one transaction are
invisible to others until the transaction is complete.
3. Problems Without Concurrency Control
o Lost Update: Two users update the same data; one
update gets lost.
o Dirty Read: A transaction reads uncommitted
changes from another.
o Unrepeatable Read: Data changes between two reads
in a transaction.
o Phantom Read: New rows appear in between queries
within a transaction.
Techniques for Concurrency Control
1. Locking
o Locks restrict access to data while it is being used.
o Types of locks:
Shared Lock (Read Lock): Multiple users can
read but not write.
Exclusive Lock (Write Lock): Only one user
can read/write.
2. Timestamp Ordering
o Each transaction gets a timestamp.
o Transactions are ordered and executed based on
timestamps to avoid conflicts.
3. Optimistic Concurrency Control
o Assumes conflicts are rare.
o Transactions proceed without locking, but check for
conflicts before commit.
4. Two-Phase Locking (2PL)
o Transactions lock resources in two phases:
Growing phase: Locks are acquired.
Shrinking phase: Locks are released.
Summary
Concurrency control keeps data safe and consistent when many
users access the database at the same time. It uses locks,
timestamps, and protocols to avoid conflicts and ensure smooth
transaction processing.
11. Normalization
What is Normalization?
Normalization is a process in database design that organizes data
to reduce redundancy (repeated data) and improve data
integrity (accuracy and consistency). It breaks down large
tables into smaller, related tables.
Why Normalize?
To avoid data duplication, which wastes space.
To prevent update anomalies (errors when updating data).
To make the database efficient and easy to maintain.
Key Concepts in Normalization
1. Functional Dependency
o A relationship where one attribute (or a set of
attributes) determines another attribute.
o Example: StudentID → StudentName (StudentID
determines StudentName).
2. Normal Forms
o Steps or rules that guide how to organize data. Each
form reduces redundancy and anomalies.
o Common normal forms are:
Normal Forms Explained
1. First Normal Form (1NF)
o Table must have atomic (indivisible) values. No
repeating groups or arrays.
o Each column contains only one value per row.
o Example: Instead of storing multiple phone numbers
in one column, create separate rows or a new table.
2. Second Normal Form (2NF)
o Must be in 1NF.
o All non-key attributes must depend fully on the
primary key (no partial dependency).
o Applies only if the primary key is composite (made of
multiple columns).
3. Third Normal Form (3NF)
o Must be in 2NF.
o No transitive dependency (non-key attribute depends
on another non-key attribute).
o Example: If Student → Department and Department
→ DepartmentHead, then DepartmentHead should be
in a separate table.
4. Boyce-Codd Normal Form (BCNF)
o Stronger version of 3NF.
o Every determinant must be a candidate key.
Benefits of Normalization
Data is stored logically and efficiently.
Eliminates duplicate data.
Helps maintain data integrity.
Makes updates, insertions, and deletions safer.
Summary
Normalization is essential to design good databases by
organizing data into tables that avoid duplication and
inconsistency. It uses functional dependencies and normal forms
to guide this process.
12. Functional Dependencies and Normal
Forms
12.1 Functional Dependencies (FDs)
A Functional Dependency is a relationship between
attributes in a database.
It means if you know the value of one attribute (or set of
attributes), you can determine the value of another attribute.
Notation: X → Y means attribute(s) X functionally
determine attribute(s) Y.
Example: In a Student table, StudentID → StudentName
means if you know StudentID, you can find the
StudentName.
Why Functional Dependencies Matter
They help identify how data relates inside tables.
Important for database design and normalization.
They show which attributes rely on others.
12.2 Types of Functional Dependencies
1. Trivial Functional Dependency
o When the dependent attribute is part of the
determinant.
o Example: {StudentID, Name} → StudentID (always
true).
2. Non-trivial Functional Dependency
o When the dependent attribute is not part of the
determinant.
o Example: StudentID → StudentName.
3. Transitive Dependency
o If A → B and B → C, then A → C is transitive.
o Causes redundancy and should be removed in
normalization.
12.3 Normal Forms (Brief Recap)
Normal forms are rules based on functional dependencies
that help design tables without redundancy.
Common Normal Forms:
o 1NF: No repeating groups.
o 2NF: No partial dependency.
o 3NF: No transitive dependency.
o BCNF: Every determinant is a candidate key.
How FDs relate to Normal Forms
1NF ensures data is atomic.
2NF requires full functional dependency on the primary
key.
3NF requires removal of transitive dependencies.
BCNF is stricter, ensuring all determinants are candidate
keys.
Summary
Functional dependencies describe relationships between
attributes and are key to understanding how to organize tables
correctly. They guide the normalization process to keep
databases efficient and consistent.
13. Entity-Relationship Model (ER Model)
What is an ER Model?
The Entity-Relationship (ER) Model is a diagrammatic way to
represent the structure of a database. It shows how data is related
using entities, attributes, and relationships.
Key Components of ER Model
1. Entity
o An object or thing in the real world with an
independent existence.
o Example: Student, Teacher, Car.
o Represented by a rectangle.
2. Attributes
o Properties or details that describe an entity.
o Example: For Student entity, attributes can be
StudentID, Name, Age.
o Represented by ovals connected to their entity.
3. Entity Set
o A collection of similar entities.
o Example: All students in a university form the Student
entity set.
4. Relationship
o Shows how entities are connected or related to each
other.
o Example: Student enrolls in Course.
o Represented by a diamond.
5. Relationship Set
o A set of relationships of the same type.
o Example: All enrollments in a semester.
Types of Attributes
Simple (Atomic) Attribute: Cannot be divided further
(e.g., Age).
Composite Attribute: Can be divided into smaller parts
(e.g., FullName into FirstName and LastName).
Derived Attribute: Can be derived from other attributes
(e.g., Age from Date of Birth).
Multi-valued Attribute: Can have multiple values (e.g.,
Phone Numbers).
Types of Relationships
1. One-to-One (1:1)
o One entity from A relates to only one entity from B,
and vice versa.
2. One-to-Many (1:N)
o One entity from A relates to many entities in B, but
each entity in B relates to only one in A.
3. Many-to-Many (M:N)
o Entities from A relate to many in B and vice versa.
Why use ER Model?
Helps design the database structure visually before
implementation.
Clarifies the relationships and constraints between data.
Makes it easier to translate to a relational database.
Summary
ER Model is a powerful tool to represent the database logically
using entities, attributes, and relationships. It provides a clear
blueprint for building databases.
14. Keys of Relations
What is a Key?
A key is an attribute or a set of attributes in a table (relation) that
uniquely identifies a record (tuple) in that table.
Why are Keys Important?
To uniquely identify each row in a table.
To avoid duplicate records.
To establish relationships between tables.
Types of Keys
1. Candidate Key
o A minimal set of attributes that can uniquely identify a
tuple.
o There can be more than one candidate key in a table.
o Example: In a Student table, both StudentID and
Email might uniquely identify a student.
2. Primary Key
o A candidate key chosen by the database designer to
uniquely identify tuples.
o It must be unique and not null.
o Example: StudentID as primary key.
3. Alternate Key
o Candidate keys that are not chosen as the primary key.
o Example: If StudentID is primary, then Email could be
an alternate key.
4. Super Key
o A set of one or more attributes that uniquely identify
tuples.
o A super key may contain extra attributes beyond what
is necessary.
5. Foreign Key
o An attribute (or set) in one table that refers to the
primary key of another table.
o Used to maintain relationships between tables.
o Example: In a CourseEnrollment table, StudentID
might be a foreign key referring to Student table.
Characteristics of Keys
Uniqueness: No two rows can have the same key value.
Minimality: No extra attribute should be in the key (only
necessary attributes).
Non-null: Primary keys cannot have null values.
Example
StudentID (PK) Name Email
101 Ali ali@example.com
102 Sara sara@example.com
Here, StudentID is the Primary Key.
Email could be a Candidate Key or Alternate Key if it is
unique.
Summary
Keys are essential in relational databases to uniquely identify
records, maintain data integrity, and define relationships
between tables.
15. Attributes, Schemas, Tuples, Domains,
Relation Instances
15.1 Attributes
Attributes are the columns or fields in a table that describe
the properties of an entity.
Example: In a Student table, attributes can be StudentID,
Name, Age.
Each attribute has a name and a data type (like integer,
string).
15.2 Schema
The schema is the structure or design of a database or a
relation (table).
It defines the table name, attributes, and the type of data
each attribute holds.
Think of it as a blueprint for the table.
Example: Student(StudentID: int, Name: varchar, Age: int)
15.3 Tuple
A tuple is a single row in a table, which contains data
corresponding to all attributes.
It represents one record or instance of the entity.
Example: (101, "Ali", 20) is a tuple in the Student table.
15.4 Domain
A domain is the set of allowable values for an attribute.
It defines the type and range of data an attribute can have.
Example: Domain of Age could be integers between 0 and
150.
15.5 Relation Instance
A relation instance is the actual set of tuples (rows)
present in a table at any given time.
It changes as data is inserted, updated, or deleted.
Example: The Student table with 3 rows is a relation
instance.
Summary
Attributes: Describe the data fields (columns).
Schema: Defines the structure of the table.
Tuples: Individual records (rows).
Domains: Allowed values for attributes.
Relation Instance: Actual data present in the table at a
moment.
16. Integrity Constraints
What are Integrity Constraints?
Integrity constraints are rules that ensure the accuracy and
consistency of data in a database. They prevent incorrect,
incomplete, or inconsistent data from being entered.
Types of Integrity Constraints
1. Entity Integrity
oEnsures that primary key values are unique and not
null.
o This means every row in a table must have a unique
identifier.
2. Referential Integrity
o Ensures that foreign key values must either be null
or match primary key values in the referenced
table.
o This maintains valid relationships between tables.
3. Domain Integrity
o Ensures that attribute values fall within a defined
domain or range.
o For example, age must be between 0 and 150.
4. User-defined Integrity
o Custom rules defined by users to meet specific
business requirements.
o Example: A student’s grade must be between A and F
only.
Why are Integrity Constraints Important?
Maintain data accuracy and trustworthiness.
Prevent data anomalies and errors.
Help in enforcing business rules.
Examples
StudentID (PK) Name Age DepartmentID (FK)
101 Ali 20 10
StudentID (PK) Name Age DepartmentID (FK)
102 Sara 22 11
Entity Integrity: StudentID must be unique and not null.
Referential Integrity: DepartmentID in Student must exist
in Department table or be null.
Domain Integrity: Age must be within a realistic range,
e.g., 0-150.
Summary
Integrity constraints are essential rules that ensure data stays
valid, consistent, and reliable in a database system.
17. Relational Algebra
What is Relational Algebra?
Relational Algebra is a formal set of operations used to
manipulate and retrieve data from relational databases.
It provides the foundation for querying databases and is used
to express queries in a mathematical way.
Why is Relational Algebra Important?
It helps understand how database queries work internally.
Provides a way to manipulate relations (tables) to get
desired results.
Basis for SQL query operations.
Basic Operations in Relational Algebra
1. Selection (σ)
o Selects rows from a table that satisfy a condition.
o Example: Select students with Age > 20.
o Symbol: σ_condition(Relation)
o Example: σ_Age>20(Student)
2. Projection (π)
o Selects specific columns (attributes) from a table.
o Example: Get only Student Names and IDs.
o Symbol: π_attribute-list(Relation)
o Example: π_StudentID, Name(Student)
3. Union (∪)
o Combines tuples from two relations with the same
schema, removing duplicates.
o Example: Students from two different classes.
4. Set Difference (-)
o Returns tuples in one relation but not in the other.
o Example: Students in Class A but not in Class B.
5. Cartesian Product (×)
o Combines every tuple of one relation with every tuple
of another.
o Used to prepare for joins.
6. Rename (ρ)
o Renames the relation or attributes for convenience.
7. Join
o Combines related tuples from two relations based on a
condition.
o Types:
Theta Join: Join based on any condition.
Equi-Join: Join where condition is equality.
Natural Join: Join using all common attributes.
Example
Suppose we have two tables:
Student(StudentID, Name, Age)
Enrollment(StudentID, CourseID)
To find names of students enrolled in CourseID = 101:
1. Select enrollments with CourseID = 101:
σ_CourseID=101(Enrollment)
Student ⨝ Enrollment
2. Join with Student table on StudentID:
π_Name(Student ⨝ σ_CourseID=101(Enrollment))
3. Project only Student Names:
Summary
Relational Algebra is a set of operations that allows you to
manipulate and query data in relational databases
mathematically and precisely. It’s the theoretical basis for SQL
queries.
18. Concurrency Control
What is Concurrency Control?
Concurrency Control is the technique used in database
management systems (DBMS) to ensure that multiple users can
access and modify the database at the same time without
causing problems like data inconsistency or corruption.
Why is Concurrency Control Important?
Multiple users often access the database simultaneously.
Without control, simultaneous operations can lead to
conflicts and incorrect data.
Ensures data integrity and consistency when transactions
happen concurrently.
Key Concepts
1. Transaction
o A set of database operations that perform a single
logical task.
o Must be completed fully or not at all (Atomicity).
2. Problems without Concurrency Control
o Lost Updates: Two transactions update the same data
and one update is lost.
o Temporary Inconsistency: One transaction reads
uncommitted data from another.
o Deadlock: Two or more transactions wait for each
other forever.
Techniques of Concurrency Control
1. Locking
o Locks prevent other transactions from accessing data
until the current transaction is complete.
o Types:
Shared Lock (Read Lock): Multiple
transactions can read but cannot write.
Exclusive Lock (Write Lock): Only one
transaction can write and no one else can
read/write.
2. Timestamp Ordering
o Transactions get timestamps, and the system orders
them based on time to avoid conflicts.
3. Optimistic Concurrency Control
o Assumes conflicts are rare. Transactions execute
without restrictions but validate at commit time.
Summary
Concurrency Control ensures safe and reliable database
operations when many users access data simultaneously. It
prevents data corruption and maintains consistency.
19. Database Normalization
What is Normalization?
Normalization is a process in database design that organizes
data to reduce redundancy and improve data integrity. It
structures tables and their relationships to minimize duplicate
data.
Why Normalize?
To avoid data redundancy (repeated data).
To prevent update anomalies (errors when updating data).
To make the database more efficient and easier to maintain.
Normal Forms (Types of Normalization)
1. First Normal Form (1NF)
o Each column contains atomic (indivisible) values.
o No repeating groups or arrays in a table.
2. Second Normal Form (2NF)
o Must be in 1NF.
o All non-key attributes fully depend on the primary key
(no partial dependency).
3. Third Normal Form (3NF)
o Must be in 2NF.
o No transitive dependency (non-key attributes depend
only on the primary key).
4. Boyce-Codd Normal Form (BCNF)
o Stronger version of 3NF where every determinant is a
candidate key.
Example of Normalization
Suppose a table stores:
StudentID StudentName CourseID CourseName Instructor
Here, CourseName and Instructor depend on CourseID, not
StudentID → violates 2NF.
Normalization splits this into two tables:
1. Student (StudentID, StudentName)
2. Course (CourseID, CourseName, Instructor)
Summary
Normalization helps create a database that is free from
redundancy and update problems, ensuring data accuracy and
efficiency.
20. Functional Dependencies
What is a Functional Dependency?
A functional dependency (FD) is a relationship between two sets
of attributes in a database. It means if you know the value of
one attribute (or a set of attributes), you can uniquely
determine the value of another attribute.
Notation
If attribute B depends on attribute A, it is written as:
A→B
This means A functionally determines B.
Example
In a Student table:
StudentID → StudentName
If you know StudentID, you can find the StudentName.
Types of Functional Dependencies
1. Trivial Functional Dependency
o When an attribute depends on itself or a subset of
itself.
o Example: StudentID, Name → Name (this is trivial).
2. Non-Trivial Functional Dependency
o When the dependency is not trivial, i.e., B is not a
subset of A.
3. Fully Functional Dependency
o When an attribute depends on the whole composite
key, not just part of it.
4. Partial Dependency
o When an attribute depends on part of a composite key,
which violates 2NF.
5. Transitive Dependency
o When attribute A determines B, and B determines C,
so A determines C indirectly.
Importance in Database Design
Functional dependencies help identify keys and design
tables.
They guide the normalization process to reduce
redundancy.
Summary
Functional dependencies describe how one attribute’s value is
determined by another and are essential for understanding and
designing relational databases correctly.
21. Entity Relationship Model (ER Model)
What is an ER Model?
The Entity Relationship Model is a diagrammatic technique
used to represent the structure of a database. It shows entities,
their attributes, and the relationships among them visually.
Key Components
1. Entity
o A real-world object or concept with a distinct
existence.
o Example: Student, Teacher, Course.
2. Entity Set
o A collection of similar entities.
o Example: All students form a Student entity set.
3. Attributes
o Properties or details about an entity.
o Example: Student entity may have StudentID, Name,
Age.
4. Relationship
o An association among entities.
o Example: A student enrolls in a course.
5. Relationship Set
o A set of similar relationships.
6. Primary Key
o An attribute that uniquely identifies an entity in an
entity set.
Types of Entities
Strong Entity: Exists independently (has a primary key).
Weak Entity: Depends on another entity (no primary key
alone).
Types of Relationships
One-to-One (1:1)
One entity in set A relates to only one entity in set B.
One-to-Many (1:N)
One entity in set A relates to many in set B.
Many-to-Many (M:N)
Many entities in set A relate to many in set B.
ER Diagram Symbols
Rectangle: Entity
Oval: Attribute
Diamond: Relationship
Lines connect entities and relationships.
Why Use ER Model?
Helps in database design by visualizing data structure.
Simplifies communication between developers and users.
Summary
The ER Model visually represents database elements and their
relationships, making database design easier and more
organized.
✅ Question 1: How to use Entity–Relationship
(ER) modelling in database design?
💡 Answer:
ER modelling is the first step in designing a database. It
helps us understand and organize the data requirements before
actually creating the database structure.
🔧 Steps to Use ER Modelling in Database Design:
1. Understand Requirements
o Carefully read what the user or system needs.
o Identify the key data that must be stored.
2. Identify Entities
o Find the real-world objects (like Student, Course,
Teacher) that will be stored.
3. Identify Attributes
Decide what information you need to store about each
o
entity.
Example: Student → StudentID, Name, Age
4. Identify Relationships
o Figure out how entities are connected.
Example: A Student enrolls in a Course.
5. Draw the ER Diagram
o Use symbols (rectangles, ovals, diamonds) to create a
diagram of all entities, attributes, and relationships.
6. Apply Constraints
o Add keys (like primary key), relationship types (1:1,
1:N, M:N), and participation constraints (total or
partial).
7. Convert to Relational Schema
o Finally, convert the ER model to tables in a relational
database system.
✅ Question 2: Basic concepts associated with
ER model
💡 Answer:
Here are the basic concepts of the ER model:
Concept Description
A real-world object or concept
Entity
(e.g., Student)
Group of similar entities (e.g.,
Entity Set
All Students)
Attribute Properties of an entity (e.g.,
Concept Description
Name, Age)
Unique identifier for an entity
Key Attribute
(e.g., StudentID)
How entities are connected (e.g.,
Relationship
Student enrolls in Course)
Relationship Set Set of similar relationships
Relationship type: one-to-one
Cardinality (1:1), one-to-many (1:N), many-
to-many (M:N)
Total or partial participation of
Participation
an entity in a relationship
Cannot exist without a related
Weak Entity strong entity (e.g., Payment
depends on Order)
Inheritance in entities (e.g.,
Generalization/Specialization Employee → {Manager,
Developer})
✅ Question 3: Diagrammatic technique for
displaying ER model using Unified Modelling
Language (UML)
💡 Answer:
The UML Class Diagram is commonly used to show ER
models in modern systems.
🛠️UML Elements Used for ER Model:
ER Concept UML Representation
Entity Class in UML
Attribute Properties/fields inside the class
Primary Key Marked with PK or <<id>>
Relationship Association (line between classes)
Multiplicity 1..1, 0..1, 1..*, etc. (written near association line)
✅ Example UML Diagram:
pgsql
CopyEdit
+----------------+ 1 enrolls *
+---------------+
| Student |-------------------------------> |
Course |
|----------------|
|---------------|
| studentID (PK) | |
courseID (PK) |
| name | |
title |
+----------------+
+---------------+
1 means one student
* means many courses
✅ Question 4: How to build an ER model
from a requirements specification?
💡 Answer:
Here’s how to build an ER model step by step from a
requirement specification:
🪜 Step-by-Step Guide:
1. Read the Requirements Carefully
o Understand what the system needs to do and what data
it must handle.
2. Identify Entities
o Find all real-world objects that need to be stored.
E.g., Student, Teacher, Course, Department
3. Identify Attributes
o Note the data required for each entity.
E.g., Student → ID, Name, DOB
4. Determine Primary Keys
o Choose the unique identifier for each entity.
E.g., StudentID, CourseID
5. Identify Relationships
o Look at how entities are related.
E.g., Student enrolls in Course, Teacher teaches
Course
6. Find Relationship Cardinality
o Decide if it's 1:1, 1:N, or M:N.
E.g., One student can enroll in many courses (1:N)
7. Check Participation
o Is the relationship required (total) or optional (partial)?
8. Draw the ER Diagram
o Use ER symbols (rectangles, ovals, diamonds) or
UML if needed.
9. Review and Validate
o Recheck everything with the user or requirement to
make sure it’s correct.
What are Entity Types?
An Entity Type is a category or class of real-world objects
that have the same properties (attributes). It defines a set of
similar things you want to store information about in a database.
For example, in a school database:
Student is an entity type.
Teacher is another entity type.
Each actual student (like Ali, Sara) is an entity of the Student
entity type.
🔍 Types of Entity Types
There are mainly two types of entity types:
1. Strong Entity Type (also called Regular Entity)
✅ Definition:
A strong entity is an entity that has its own primary key and
can exist independently of other entities.
✅ Key Points:
Has a unique identifier (Primary Key).
Does not depend on any other entity.
Represented as a rectangle in ER diagrams.
✅ Example:
Student (StudentID is the primary key)
Course (CourseID is the primary key)
📌 These entities can be stored and identified on their own.
2. Weak Entity Type
✅ Definition:
A weak entity is an entity that does not have a primary key of
its own. It depends on a strong entity to be identified.
✅ Key Points:
Does not have a unique key by itself.
Must be linked to a strong entity using a foreign key.
Represented as a double rectangle in ER diagrams.
The relationship with the strong entity is shown with a
double diamond.
✅ Example:
Payment entity depends on Order.
Dependent entity (like a child) depends on Employee
entity.
Example:
Entity: Payment
PaymentNo
Amount
But we need OrderID from the Order entity to uniquely identify
a payment. So it is weak.
📘 Example to Show the Difference:
Entity Has Primary Can Exist Depends on Another
Type Key? Alone? Entity?
Student ✅ Yes ✅ Yes ❌ No
✅ Yes (depends on
Payment ❌ No ❌ No
Order)
🎨 ER Diagram Symbols
Symbol Meaning
Rectangle Strong Entity
Double Rectangle Weak Entity
Oval Attribute
Double Diamond Identifying Relationship (for weak entities)
✅ Summary
Entity Types define a class of real-world things stored in
the database.
Strong entities have their own primary keys and exist
independently.
Weak entities rely on strong entities and don’t have unique
keys.
13. Relationship in DBMS
📘 Definition:
A relationship in DBMS represents a connection or
association between two or more entity types.
Just like real life, entities interact with each other — and these
interactions are represented using relationships in a database.
✅ Example:
In a university database:
A Student can enrolls in a Course
→ This interaction is a relationship.
So:
Entities = Student, Course
Relationship = Enrolls
📂 Types of Relationships (by degree)
📌 Degree of a Relationship
The degree of a relationship is the number of entity types
involved in the relationship.
📌 1. Unary Relationship (Degree 1)
Involves only one entity type.
The entity has a relationship with itself.
✅ Example:
An Employee supervises another Employee.
| Entity: Employee → Relationship: supervises → Entity:
Employee |
📌 2. Binary Relationship (Degree 2)
Involves two different entity types.
This is the most common type of relationship.
✅ Example:
Student enrolls in Course
| Entity 1: Student → Relationship: enrolls → Entity 2: Course |
📌 3. Ternary Relationship (Degree 3)
Involves three entity types.
✅ Example:
Doctor prescribes Medicine to a Patient
| Entity 1: Doctor → Entity 2: Patient → Entity 3: Medicine |
📌 4. N-ary Relationship (Degree N)
A relationship involving N entities.
Used when more than 3 entities are related.
✅ Example:
Supplier supplies Product to Store at a specific Location
→ This is a 4-ary relationship.
🎨 ER Diagram Representation
Rectangle: Entity
Diamond: Relationship
Line: Connection between entity and relationship
📍 The number of lines connecting entities to the diamond shows
the degree of the relationship.
✅ Summary
Type Degree Involves Example
Unary 1 1 entity Employee supervises Employee
Binary 2 2 entities Student enrolls in Course
Ternary 3 3 entities Doctor prescribes Medicine to Patient
N-ary N N entities Supplier supplies Product to Store at Location
Structural Constraints in DBMS (ER Model)
📘 Definition:
Structural constraints define the rules or restrictions on how
entities participate in relationships.
They tell us how many times an entity can be linked to
another entity through a relationship.
These constraints are very important during ER modeling
because they help design accurate relationships in databases.
There are two main types of structural constraints:
✅ 1. Cardinality Ratio (Participation Count)
This defines the number of instances of one entity that can or
must be associated with instances of another entity.
🔸 Four types of Cardinality Ratios:
Type Meaning
One-to-One (1:1) One entity is related to only one other entity
One-to-Many One entity is related to many instances of
(1:N) another entity
Many-to-One
Many entities are related to one entity
(N:1)
Many-to-Many Many entities from both sides are related to
(M:N) each other
✅ Examples:
1. One-to-One (1:1)
One person has one passport.
2. One-to-Many (1:N)
One teacher can teach many students.
3. Many-to-One (N:1)
Many employees work in one department.
4. Many-to-Many (M:N)
Many students can register in many courses.
✅ 2. Participation Constraint (Total or
Partial Participation)
This defines whether all instances of an entity are required to
participate in a relationship or not.
🔸 Two types:
Type Meaning
Total All entities must participate in the
Participation relationship (represented by double line)
Partial Some entities may or may not participate
Participation (represented by single line)
✅ Examples:
Total Participation:
Every employee must work in a department.
→ So, Employee entity has total participation in the
"Works_In" relationship.
Partial Participation:
Some students may borrow books from the library. Not
all do.
→ So, Student entity has partial participation in the
"Borrows" relationship.
🎨 ER Diagram Notation
Concept Symbol
Total Participation Double line
Partial Participation Single line
One-to-One 1—1
One-to-Many 1—N
Many-to-Many M—N
✅ Summary
Constraint Type Purpose
Cardinality Ratio Defines how many instances are involved
Participation Defines whether all instances must participate
Structural constraints help us accurately model relationships
and ensure that data is consistent and meaningful.
14. Structural Constraints in Binary
Relationships
📘 What are Structural Constraints?
Structural constraints define the rules about how many
instances of an entity can be associated with instances of
another entity in a binary relationship.
Binary relationships are those that involve exactly two entities.
✅ Types of Binary Relationships (Cardinality
Ratios)
Binary relationships can be categorized into the following four
types, based on how many entities can relate to each other.
🔹 1. One-to-One (1:1)
📌 Definition:
Each entity in set A is related to at most one entity in set B, and
vice versa.
✅ Example:
Each person has one passport
Each passport belongs to one person
🔁 Diagram:
scss
CopyEdit
Person (1) <------> (1) Passport
🔹 2. One-to-Many (1:*) (Optional)
📌 Definition:
An entity in set A can be related to many entities in set B,
but each entity in set B is related to only one in set A.
✅ Example:
One teacher teaches many students
But each student has only one teacher (in that
relationship)
🔁 Diagram:
scss
CopyEdit
Teacher (1) <------> (*) Student
🔹 3. Many-to-One (*:1) (Mandatory)
📌 Definition:
Many entities in set A are associated with one entity in set B.
Each entity in set A must participate in the relationship.
✅ Example:
Many employees work in one department
Every employee must belong to a department
→ So, it is mandatory participation
🔁 Diagram:
Employee (*) ——> (1) Department
🔹 4. Many-to-Many (:) (Optional)
📌 Definition:
Entities in set A can be related to many entities in set B, and
vice versa.
✅ Example:
Students can enroll in many courses
Each course can have many students
This type of relationship is optional because:
A student may not be enrolled in any course
A course may not have any students yet
🔁 Diagram:
Student (*) <------> (*) Course
🧠 Quick Comparison Table
Relationship Type Description Example
One-to-One (1:1) One entity ↔ One entity Person ↔ Passport
One-to-Many
One ↔ Many (Optional) Teacher ↔ Students
(1:*)
Many-to-One Many ↔ One Employees ↔
(*:1) (Mandatory) Department
Many-to-Many (:) Many ↔ Many (Optional) Students ↔ Courses
✅ Summary:
Structural constraints control participation and define
how entities connect.
Binary relationships are classified as 1:1, 1:*, *:1, and *:*.
Some are optional, while others are mandatory,
depending on whether participation is required or not
15. Subtype and Supertype Entities
📘 Definition:
In database design (especially in the ER model), a supertype is
a general entity type, and a subtype is a more specific entity
type that inherits attributes from the supertype.
This concept is useful when some entities share common
properties, but also have unique properties of their own.
✅ 1. Supertype
A supertype is a general category.
It contains common attributes that are shared by all
subtypes.
There is always at least one attribute or characteristic
common to all subtypes.
🔹 Example:
If we have an entity “Employee”, all employees have:
Name
Employee ID
Address
So Employee is the supertype.
✅ 2. Subtype
A subtype is a specialized version of the supertype.
It has its own unique attributes in addition to those
inherited from the supertype.
Subtypes are connected to their supertype through a
relationship called "IS-A".
🔹 Example:
We can have two subtypes of Employee:
Manager (with extra attribute: DepartmentManaged)
Technician (with extra attribute: SkillSet)
Both Manager and Technician are Employees, but they have
their own specific fields too.
✅ 3. IS-A Relationship
The "IS-A" relationship means that:
Subtype IS-A Supertype
Example:
Manager is an Employee
Technician is an Employee
This relationship is shown using an arrow from subtype to
supertype in ER diagrams.
✅ 4. Disjoint and Overlapping Subtypes
🔹 Disjoint (Exclusive) Subtypes:
An entity instance can belong to only one subtype at a
time.
🟢 Example:
An employee is either a manager or a technician, but not both.
🔹 Overlapping Subtypes:
An entity instance can belong to multiple subtypes at the
same time.
🟢 Example:
A person may be both a teacher and a researcher.
✅ 5. Total vs Partial Specialization
🔸 Total Specialization
Every supertype entity must belong to at least one
subtype.
Shown using a double line in the diagram.
🟢 Example:
Every Employee is either a Manager or a Technician.
🔸 Partial Specialization
Some supertype entities may not belong to any subtype.
Shown using a single line in the diagram.
🟢 Example:
Some Employees may be general staff with no subtype.
✅ ER Diagram Example:
Employee (Supertype)
--------------------
Emp_ID
Name
Address
|
---------------------
| |
Manager (Subtype) Technician (Subtype)
--------------- -------------------
DeptManaged SkillSet
✅ Summary:
Term Meaning
Supertype General entity with common attributes
Subtype Specialized entity with extra attributes
IS-A Relationship that connects subtype to supertype
Disjoint Entity can be in only one subtype
Overlapping Entity can be in multiple subtypes
Term Meaning
Total All supertype entities must belong to a subtype
Some supertype entities may not belong to any
Partial
subtype
16. Keys in Database
📘 What is a Key?
In a database, a key is an attribute (or set of attributes) that is
used to:
Uniquely identify a record (row) in a table.
Establish relationships between different tables.
✅ 1. Types of Keys
Here are the main types of keys in a relational database:
🔹 1.1 Primary Key
A primary key uniquely identifies each record in a table.
It cannot be null and must be unique for every record.
A table can have only one primary key.
🟢 Example:
sql
CopyEdit
Student Table:
+------------+--------+
| StudentID | Name |
+------------+--------+
| 101 | Aisha |
| 102 | Usman |
Primary Key: StudentID
🔹 1.2 Candidate Key
These are all the possible keys that can uniquely identify
rows.
One of them is selected as the primary key, the rest are
called alternate keys.
🟢 Example:
If both Email and StudentID are unique, both are candidate
keys.
🔹 1.3 Alternate Key
A candidate key not chosen as the primary key.
🔹 1.4 Composite Key (Concatenated Key / Joint Key)
A composite key is formed by combining two or more
columns to uniquely identify a row.
Used when no single column can uniquely identify a row.
🟢 Example:
sql
CopyEdit
CourseRegistration Table:
+------------+----------+--------+
| StudentID | CourseID | Grade |
+------------+----------+--------+
| 101 | CS101 | A |
| 101 | MA102 | B |
Composite Primary Key = (StudentID + CourseID)
This combination is unique, but neither StudentID nor CourseID
alone is unique.
🔹 1.5 Foreign Key
A foreign key is a column that refers to the primary key of
another table.
It is used to link tables together.
🟢 Example:
In a table Orders, CustomerID can be a foreign key that refers to
Customers table.
✅ 2. Concatenated (Composite) Key
📌 What is it?
A concatenated key is just another name for a composite
key.
It is a key made by joining (concatenating) multiple
attributes.
🟢 Example:
In a LibraryIssue table:
BookID + MemberID + IssueDate can form a concatenated
key.
✅ 3. Types of Concatenated (Composite)
Primary Keys
There are no formal "types" but in practice, we consider:
🔸 a) Two-Attribute Composite Key
Made of two columns
Example: StudentID + CourseID
🔸 b) Multi-Attribute Composite Key
Made of more than two columns
Example: BookID + MemberID + IssueDate
✅ Summary Table:
Key Type Description
Uniquely identifies each record, must be
Primary Key
unique & not null
Candidate Key All possible unique keys in a table
Key Type Description
Alternate Key Candidate key not selected as primary
Composite Key A key made of two or more attributes
Foreign Key References a primary key from another table
Concatenated
Another name for Composite Key
Key
Types of Two-Attribute and Multi-Attribute based on
Composite the number of columns joined
17. Compound Primary Key, Foreign Key
(FK), and Candidate Key
✅ 1. Compound Primary Key (also called Composite Key)
📘 Definition
A Compound Primary Key is a primary key that consists of
two or more columns.
It is used when no single column is able to uniquely identify a
row.
🟢 Example:
Imagine a table called StudentCourses that stores which student
is enrolled in which course.
StudentID CourseID Grade
101 CS101 A
101 MA101 B
102 CS101 A
🔸 Compound Primary Key = (StudentID + CourseID)
This combination is unique for each row.
✅ 2. Foreign Key (FK)
📘 Definition
A Foreign Key is an attribute (or group of attributes) in one
table that refers to the Primary Key in another table.
It is used to create a relationship between two tables.
🟢 Example:
We have two tables:
Students Table
StudentID Name
101 Aisha
102 Usman
StudentCourses Table
StudentID (FK) CourseID
101 CS101
102 MA101
Here, StudentID in the StudentCourses table is a foreign key.
It refers to StudentID in the Students table.
This ensures that only valid students can be enrolled in a course.
✅ 3. Candidate Key
📘 Definition
A Candidate Key is any column (or set of columns) that can
uniquely identify a row in a table.
A table can have multiple candidate keys, but only one is
chosen as the Primary Key.
The others are called Alternate Keys.
🟢 Example:
Employees Table
EmpID Email CNIC
1 ali@gmail.com 35201-1234567-8
2 sara@hotmail.com 35202-9876543-1
In this case:
EmpID is a candidate key
Email is also unique → another candidate key
CNIC is also unique → another candidate key
You can choose any one of them as the Primary Key.
The remaining are Alternate Keys.
✅ Summary Table
Key Type Definition Example
Compound A Primary Key made of
(StudentID, CourseID)
Primary Key two or more columns
Refers to the Primary Key
Foreign Key StudentCourses.StudentID →
in another table to create
(FK) Students
a relationship
Any attribute that can EmpID, Email, CNIC in
Candidate Key
uniquely identify rows Employees table
18. Integrity, Unauthorized Users,
Authorized Users, and Constraints
✅ 1. Integrity in Databases
📘 Definition:
Integrity means maintaining the accuracy, consistency, and
correctness of data in a database over its entire life.
It ensures that:
The data entered is valid and meaningful
There are no conflicts, duplications, or invalid
relationships
✅ 2. Authorized Users vs Unauthorized Users
🔐 Authorized Users:
Users who are permitted to access or modify the database.
They have login credentials and specific permissions (like
read, write, update).
Examples:
o Database Administrator (DBA)
o Application developers
o Staff members with access rights
🚫 Unauthorized Users:
Users who are not allowed to access or use the database.
They could be hackers or people without login access.
They must be blocked for security reasons.
✅ 3. Constraints in DBMS
📘 Definition:
A Constraint is a rule or restriction that ensures that data
entered into the database is valid, consistent, and reliable.
These rules cannot be violated — otherwise, the system will
reject the invalid data.
📌 Purpose of Constraints:
Enforce data accuracy
Avoid errors and duplicates
Protect data integrity
Control what type of data can go into tables
✅ 4. Types of Constraints
🔹 a. Domain Constraint
Ensures that a value entered in a column is from a valid
data type or range.
Example:
o Age must be a number between 0 and 120
o Gender column must only have "Male", "Female", or
"Other"
🔹 b. Entity Integrity Constraint
The Primary Key cannot be NULL.
Ensures that each row in a table is uniquely identified.
🔹 c. Referential Integrity Constraint
Foreign key must refer to a valid value in the referenced
(parent) table.
Example: If a student is enrolled in a course, the course
must exist in the Course table.
🔹 d. Unique Constraint
A column must have unique values only.
Example: Email addresses must be unique in the user table.
🔹 e. Not Null Constraint
A column cannot be left empty.
Example: The "Name" field in the Students table must not
be left blank.
🔹 f. Check Constraint
Checks for a specific condition before inserting or
updating data.
Example:
o Check (Salary > 0)
o Ensures salary is always positive.
✅ Summary Table
Term Explanation
Ensures accuracy, consistency, and correctness of
Integrity
data
Authorized Users Have permission to access/modify database
Unauthorized Users Not allowed to access the database
Constraint A rule to restrict invalid data entry
Domain Constraint Data must be from a valid domain/type
Entity Integrity Primary key must be unique and not null
Referential Foreign key must match a value in the referenced
Integrity table
Unique Constraint No duplicate values allowed in a column
Not Null Constraint Field must not be left blank
Check Constraint Data must meet a specific condition
19. Normalization, Purpose, Characteristics
& Normal Forms
✅ 1. Normalization
📘 Definition:
Normalization is a process in database design that organizes
data to reduce redundancy (duplicate data) and improve data
integrity.
It breaks down a large table into smaller tables and defines
relationships between them.
✅ 2. Purpose of Normalization
Eliminate duplicate data to save space
Avoid data anomalies (problems when inserting, updating,
or deleting data)
Ensure data dependencies make sense (data stored
logically)
Improve database efficiency by structuring data properly
✅ 3. Characteristics of a Normalized Database
No repeated groups or arrays in a single table
Every data item depends only on the key, the whole key,
and nothing but the key
Data stored logically so update, insert, delete anomalies do
not happen
Data redundancy minimized
Relationships between tables are well defined
✅ 4. Normal Forms (NF)
Normalization is done in steps called Normal Forms. Each
form has rules to follow:
🔹 First Normal Form (1NF)
Table must have only atomic (indivisible) values in each
cell
No repeating groups or arrays
Each column contains unique values of a single type
🔹 Second Normal Form (2NF)
Table must be in 1NF
All non-key attributes must depend on the whole primary
key, not part of it
This means no partial dependency on part of the key
(applies if primary key is composite)
🔹 Third Normal Form (3NF)
Table must be in 2NF
No transitive dependency — non-key attributes should
not depend on other non-key attributes
All fields depend only on the primary key
🔹 Boyce-Codd Normal Form (BCNF)
A stronger version of 3NF
Every determinant must be a candidate key
Resolves some rare anomalies 3NF does not handle
🔹 Fourth Normal Form (4NF)
Table must be in BCNF
No multi-valued dependencies (no independent multi-
values in one table)
Data with multiple independent relationships should be
split into separate tables
🔹 Fifth Normal Form (5NF)
Table must be in 4NF
Handles cases where data can be reconstructed from
smaller pieces of information
Breaks table further to avoid redundancy caused by join
dependencies
✅ 5. Example
Imagine a table that stores students and the courses they take
along with their instructors:
StudentID StudentName Course Instructor
101 Ali Math Mr. Khan
101 Ali Science Mrs. Ahmed
StudentID StudentName Course Instructor
102 Sara Math Mr. Khan
Issues:
Redundancy: StudentName repeated multiple times
Update anomaly: If instructor changes, multiple rows need
update
Insert anomaly: Cannot add instructor without
course/student
After Normalization:
1. Students Table
| StudentID | StudentName |
|-----------|-------------|
| 101 | Ali |
| 102 | Sara |
2. Courses Table
| CourseID | CourseName | Instructor |
|----------|------------|-------------|
| 1 | Math | Mr. Khan |
| 2 | Science | Mrs. Ahmed |
3. Enrollments Table
| StudentID | CourseID |
|-----------|----------|
| 101 | 1 |
| 101 | 2 |
| 102 | 1 |
✅ Summary:
Normal Form Main Rule
1NF Atomic values, no repeating groups
2NF No partial dependency on part of primary key
3NF No transitive dependency
BCNF Every determinant is a candidate key
4NF No multi-valued dependencies
5NF No join dependencies causing redundancy
20. Anomalies in Databases
✅ 1. What are Anomalies?
Anomalies are problems or inconsistencies that happen when
performing insert, update, or delete operations on poorly
designed (non-normalized) databases.
They cause data redundancy, inconsistency, and sometimes
loss of data.
✅ 2. Types of Anomalies
🔹 a. Insertion Anomaly
Occurs when new data cannot be inserted into the
database without the presence of other data.
Example:
You cannot add a new course if no student is enrolled in it
yet (in a single table that stores both students and courses).
🔹 b. Update Anomaly
Occurs when updating data in one place requires
multiple updates in other places to keep data consistent.
If all places are not updated, inconsistency happens.
Example:
If an instructor’s name changes, but you update it in only
one row instead of all rows where it appears.
🔹 c. Deletion Anomaly
Occurs when deleting some data unintentionally causes
loss of other important data.
Example:
Deleting a student’s record also deletes the record of the
course they were enrolled in (if both are stored in the same
table).
✅ 3. Why Do Anomalies Occur?
Mainly due to data redundancy (repeated data)
When tables are not normalized properly
When multiple types of data are stored in a single table
instead of splitting into related tables
✅ 4. How to Avoid Anomalies?
By normalizing the database (breaking down tables into
smaller related tables)
Using Primary Keys and Foreign Keys to maintain data
consistency
Enforcing constraints to ensure valid data entry
✅ Summary Table
Anomaly Type What Happens Example
Insertion Cannot insert data without Cannot add a course
Anomaly other data without student
Update Update must be done in Instructor name update
Anomaly many places incomplete
Deletion Deleting data causes loss of Deleting student deletes
Anomaly other data course info
21. Database Backup & Recovery Techniques
✅ 1. What is Database Backup?
Backup means making a copy of the database data to
protect it from loss or damage.
It helps to restore the database in case of hardware
failure, software issues, accidental deletion, or any disaster.
✅ 2. Why Backup is Important?
Prevents permanent loss of data
Protects against system crashes or failures
Helps recover from human errors (like accidental deletion)
Enables recovery after virus attacks or hacking
✅ 3. Types of Database Backup
Full Backup
Copies the entire database every time.
Advantage: Complete copy
Disadvantage: Time-consuming, large storage needed
Incremental Backup
Copies only data changed since the last backup (full or
incremental).
Advantage: Faster, saves storage
Disadvantage: Recovery takes longer as multiple backups
needed
Differential Backup
Copies all data changed since the last full backup.
Advantage: Faster than full backup, easier recovery than
incremental
Disadvantage: Requires more storage than incremental
backup
✅ 4. Database Recovery
Recovery means restoring the database to a correct state
after failure or crash.
Uses backups and transaction logs to undo incomplete
transactions and redo completed ones.
✅ 5. Recovery Techniques
Rollback: Undo changes made by incomplete transactions
to maintain data consistency.
Rollforward: Reapply changes from backup logs to bring
the database up to date.
Shadow Paging: Keeps a copy of data pages before change
to revert if failure happens.
Checkpointing: Save current state periodically to reduce
recovery time.
✅ 6. Summary
Backup Type Description Pros Cons
Copies entire Complete Slow, needs
Full Backup
database restore more storage
Incremental Copies changes Fast, saves Recovery
Backup since last backup storage slower
Differential Copies changes Faster Uses more
Backup since last full recovery storage
22. Database Security & Authorization
✅ 1. What is Database Security?
Protecting database from unauthorized access, misuse, or
damage.
Ensures confidentiality, integrity, and availability of data.
✅ 2. Types of Threats
Unauthorized access (hacking)
Data theft or data loss
Data tampering or modification
Software bugs or hardware failure
Denial of Service (DoS) attacks
✅ 3. Database Authorization
Authorization means granting or restricting access to
database users.
Only authorized users can read, write, update, or delete
data based on permissions.
✅ 4. Methods of Database Security
Authentication: Verify user identity using passwords,
biometrics, or tokens.
Access Control: Define who can access what data (read,
write, delete).
Encryption: Convert data into unreadable form for
unauthorized users.
Audit Trails: Keep logs of user activities for monitoring
and tracking.
User Roles: Group users with specific permissions to
simplify management.
✅ 5. SQL Commands for Authorization
GRANT: Give permission to users
Example: GRANT SELECT, INSERT ON Students TO User1;
REVOKE: Remove permission from users
Example: REVOKE INSERT ON Students FROM User1;
✅ 6. Summary
Security
Description Example/Method
Aspect
Authentication Verifying user identity Passwords, biometrics
GRANT and REVOKE
Authorization Controlling user access
commands
Protecting data from Data encryption
Encryption
unauthorized use algorithms
Monitoring user
Audit Trails Database logs
actions
23. Transaction Processing & Optimization
✅ 1. What is a Transaction?
A transaction is a sequence of database operations (like
read, write, update) treated as a single logical unit.
It must either be completed fully (commit) or not
executed at all (abort/rollback).
Example: Transferring money from one bank account to
another involves multiple steps but is one transaction.
✅ 2. Properties of Transactions (ACID)
Atomicity: All parts of a transaction succeed or all fail
together.
Consistency: Database remains in a valid state before and
after the transaction.
Isolation: Transactions appear to run independently
without interference.
Durability: Once committed, transaction changes are
permanent even if system crashes.
✅ 3. Transaction States
Active: Transaction is being executed.
Partially Committed: Last operation done but changes not
yet permanent.
Committed: All changes are saved permanently.
Failed: Transaction is aborted due to error.
Terminated: Transaction ends successfully or
unsuccessfully.
✅ 4. Transaction Optimization
Aim to reduce transaction time and improve database
performance.
Techniques include:
o Minimizing locking time.
o Efficient use of indexes.
o Query optimization.
o Batch processing of transactions.
24. Concurrency Control
✅ 1. What is Concurrency Control?
Concurrency Control manages simultaneous execution of
transactions so that database consistency is maintained.
Ensures that multiple transactions do not interfere with
each other.
✅ 2. Why is Concurrency Control Important?
Without it, simultaneous transactions can cause:
o Lost updates (one update overwrites another).
o Temporary inconsistency (reading uncommitted
data).
o Deadlocks (transactions waiting indefinitely for each
other).
✅ 3. Concurrency Control Techniques
🔹 a. Lock-Based Protocols
Transactions acquire locks on data items before using them.
Types of locks:
o Shared Lock (S-lock): Allows read access.
o Exclusive Lock (X-lock): Allows read and write.
Locks are released after transaction commits or aborts.
🔹 b. Timestamp-Based Protocols
Each transaction gets a timestamp.
Transactions are ordered based on timestamps to avoid
conflicts.
🔹 c. Optimistic Concurrency Control
Transactions execute without restrictions.
At commit time, the system checks for conflicts and aborts
if needed.
✅ 4. Deadlock
Occurs when two or more transactions wait indefinitely for
each other’s locks.
Deadlock detection and resolution techniques include:
o Timeout (abort after waiting too long).
o Wait-for graph analysis.
o Transaction rollback.
✅ 5. Summary Table
Concept Description
Transaction Single logical unit of work
ACID Properties Atomicity, Consistency, Isolation, Durability
Lock-Based Control Uses locks to manage concurrent access
Timestamp Protocol Orders transactions by timestamps
Deadlock Waiting cycle among transactions
25. DBMS Available in Market
✅ 1. What is DBMS in Market?
DBMS (Database Management System) is software that
helps create, manage, and use databases.
There are many DBMS products available, each with
different features, licenses, and uses.
✅ 2. Popular DBMS Products
DBMS Name Type License Description
Oracle Powerful, enterprise-
Relational Commercial
Database level, widely used
Open Popular for web apps,
MySQL Relational
Source easy to use
DBMS Name Type License Description
Microsoft Integration with
Relational Commercial
SQL Server Microsoft products
Open Advanced features,
PostgreSQL Relational
Source standards compliant
NoSQL Open For flexible, scalable
MongoDB
(Document) Source JSON-like data
Lightweight, used in
Open
SQLite Embedded mobile and embedded
Source
systems
✅ 3. Choosing a DBMS
Depends on requirements like scale, data type, budget, and
platform.
Relational DBMS (RDBMS) are common for structured
data.
NoSQL DBMS are used for big data, flexible schemas.
26. Oracle SQL: Installation, History &
Features
✅ 1. History of Oracle
Oracle was founded in 1977 by Larry Ellison, Bob Miner,
and Ed Oates.
It was one of the first companies to develop a commercial
relational database.
Oracle Database has grown to become one of the most
popular and powerful DBMS worldwide.
✅ 2. Oracle SQL Installation Overview
Oracle SQL is part of Oracle Database software.
Installation steps (basic):
1. Download Oracle Database installer from the official
Oracle website.
2. Run the installer and choose installation type (Typical
or Custom).
3. Configure database settings like username, password,
storage options.
4. Complete installation and start Oracle services.
Oracle provides tools like SQL*Plus and Oracle SQL
Developer for managing databases and running SQL
commands.
✅ 3. Key Features of Oracle SQL
High Performance: Optimized for large data and complex
queries.
Scalability: Can handle very large databases and many
users.
Security: Supports advanced user authentication, roles, and
encryption.
Backup & Recovery: Powerful tools to backup and
recover data.
Concurrency Control: Manages multiple users efficiently.
Support for SQL and PL/SQL: Full SQL support plus
Oracle’s procedural language (PL/SQL).
Data Integrity: Strong support for constraints, triggers,
and relationships.
Cross-platform: Runs on Windows, Linux, Unix, etc.
Tools & Utilities: Oracle SQL Developer, Oracle
Enterprise Manager for easy database management.
✅ 4. Summary
Aspect Description
Oracle Founded 1977
Installation Download, run installer, configure DB
Key Features High performance, security, scalability
Languages Supported SQL, PL/SQL
Platforms Supported Windows, Linux, Unix
27. Relational Algebra and Calculus
✅ 1. Relational Algebra
A procedural query language used to retrieve data from
relational databases.
Works on relations (tables) and produces new relations as
results.
Basic operations include:
Operation Description
Select (σ) Selects rows that satisfy a condition.
Operation Description
Project (π) Selects columns (attributes) from a relation.
Combines rows from two relations, removing
Union (∪)
duplicates.
Returns rows in one relation but not in
Set Difference (-)
another.
Cartesian Combines every row of one relation with
Product (×) every row of another.
Rename (ρ) Renames the output relation or attributes.
Combines related tuples from two relations
Join
based on a condition.
✅ 2. Relational Calculus
A non-procedural query language (specifies what to
retrieve, not how).
Two types:
o Tuple Relational Calculus (TRC): Works with tuple
variables.
o Domain Relational Calculus (DRC): Works with
domain variables (attribute values).
Example: {t | t ∈ Employee and t.salary > 5000}
means retrieve all tuples t from Employee relation where
salary > 5000.
28. Basic SQL Statements
✅ 1. SQL (Structured Query Language)
Standard language to communicate with relational
databases.
Used for querying, updating, and managing data.
✅ 2. Common SQL Statements
Statement Purpose Example
Retrieve data
SELECT SELECT * FROM Students;
from a table
Add new records INSERT INTO Students VALUES
INSERT (1, 'Ali');
into a table
Modify existing UPDATE Students SET
UPDATE name='Ahmed' WHERE id=1;
data
DELETE FROM Students WHERE
DELETE Remove data id=1;
CREATE Create a new CREATE TABLE Students (id
TABLE table INT, name VARCHAR(50));
DROP
Delete a table DROP TABLE Students;
TABLE
29. DDL, DML, DCL
✅ 1. Data Definition Language (DDL)
Used to define or modify database structures.
Commands include:
o CREATE — create tables, indexes, etc.
o ALTER — modify structure of tables.
o DROP — delete tables or other objects.
o TRUNCATE — remove all records from a table quickly.
✅ 2. Data Manipulation Language (DML)
Used to manipulate data inside tables.
Commands include:
o SELECT — retrieve data.
o INSERT — add data.
o UPDATE — modify data.
o DELETE — remove data.
✅ 3. Data Control Language (DCL)
Used to control access to data.
Commands include:
o GRANT — give user permissions.
o REVOKE — remove permissions.
✅ Summary Table
Language
Purpose Commands
Type
Define or change CREATE, ALTER,
DDL
database structure DROP, TRUNCATE
Manipulate data in SELECT, INSERT,
DML
tables UPDATE, DELETE
Manage permissions
DCL GRANT, REVOKE
and security
30. Storage & File Structure
✅ 1. What is Storage in DBMS?
Storage refers to how data is physically saved on storage
devices like hard disks or SSDs.
Efficient storage helps fast data retrieval and better
database performance.
Data in DBMS is stored in files, and files are organized to
optimize access.
✅ 2. File Structure in Databases
A file is a collection of related records stored together.
Files can be organized in different ways to improve data
retrieval:
File Structure
Description
Type
Heap File Records stored in no particular order. Simple
(Unordered) but slow for search.
Records stored in sorted order based on key.
Sequential File
Faster search for range queries.
Has indexes (like a table of contents) to
Indexed File
quickly locate records.
Hash File Uses a hash function to determine record
location, giving very fast access for exact
File Structure
Description
Type
matches.
✅ 3. Records and Blocks
Data is stored in records (rows), and multiple records are
stored together in blocks or pages.
Blocks are the smallest units transferred between disk and
memory.
Efficient block management reduces disk I/O and speeds up
queries.
✅ 4. File Organization Goals
Minimize disk access time.
Allow fast insert, update, and delete operations.
Support efficient query processing.
31. File Storage Concepts
✅ 1. Storage Devices
Data is stored on physical devices like:
o Hard Disk Drives (HDD)
o Solid State Drives (SSD)
o Magnetic Tapes (less common now)
✅ 2. Storage Access Methods
Sequential Access: Data read in sequence; slow for
random access.
Direct (Random) Access: Data read directly using an
address or key; faster.
✅ 3. Buffer Management
A buffer temporarily holds data transferred between disk
and memory.
Efficient buffer management reduces disk I/O and speeds
up database operations.
✅ 4. Indexing
Indexes are special data structures stored separately to
speed up data retrieval.
Common types:
o B-Tree Index: Balanced tree structure, good for range
queries.
o Hash Index: Fast exact match queries.
✅ 5. File Storage Summary
Concept Explanation
Storage Devices HDD, SSD, tapes
File Structures Heap, Sequential, Indexed, Hash
Access Methods Sequential vs Direct
Concept Explanation
Buffer
Manages data between disk and memory
Management
Speeds up searches by maintaining pointers
Indexing
to data
✅ Summary
Efficient file storage and organization are essential for fast data
access and good DBMS performance. Understanding file
structures and storage concepts helps design databases that work
well with the underlying hardware.
32. Database Efficiency and Tuning
✅ 1. What is Database Efficiency?
Database efficiency means how well a database performs in
storing, retrieving, and managing data.
Efficient databases respond quickly to queries and use
resources (CPU, memory, disk) effectively.
✅ 2. Factors Affecting Database Efficiency
Query Optimization: Writing efficient queries and letting
the DBMS optimize execution plans.
Indexing: Proper use of indexes speeds up data retrieval.
Storage Management: Organizing files and data properly
for fast access.
Buffer Management: Efficient use of memory buffers
reduces disk access.
Concurrency Control: Managing multiple users without
conflict or delays.
Hardware Resources: CPU speed, RAM size, and disk
speed impact performance.
✅ 3. Database Tuning
The process of improving database performance.
Techniques include:
Technique Description
Creating, dropping, or modifying indexes
Index Tuning
for better query speed.
Query Analyzing and rewriting queries for
Optimization efficiency.
Database Adjusting DBMS settings like cache size,
Configuration connection limits.
Splitting large tables into smaller parts for
Partitioning
faster access.
Monitoring & Using tools to monitor database
Profiling performance and identify bottlenecks.
✅ 4. Benefits of Database Tuning
Faster query response.
Better resource usage.
Reduced downtime.
Improved user experience.
33. Introduction to Data Warehousing &
Data
✅ 1. What is Data Warehousing?
A data warehouse is a special type of database designed to
store large amounts of historical data from multiple
sources.
It is used for analysis and reporting, not for daily
transaction processing.
Helps businesses make informed decisions by providing a
consolidated view of data.
✅ 2. Characteristics of Data Warehouses
Subject-Oriented: Focused on specific areas like sales,
finance, etc.
Integrated: Combines data from different sources.
Non-Volatile: Data is stable; changes are done
periodically, not frequently.
Time-Variant: Stores historical data to analyze trends over
time.
✅ 3. Components of Data Warehousing
Component Description
Data
Original databases and external data sources.
Sources
ETL Extract, Transform, Load — process of cleaning
Process and loading data into the warehouse.
Data
The actual warehouse storing integrated data.
Storage
Metadata Data about data — defines warehouse contents.
Access Tools Tools for querying and reporting data.
✅ 4. Purpose of Data Warehousing
Support decision-making processes.
Perform complex queries and data analysis.
Provide historical data for trend analysis.
✅ Summary
Database efficiency and tuning ensure databases run fast and use
resources well. Data warehousing helps organizations analyze
big data collected over time for better business decisions.