(501) DBMS Notes
(501) DBMS Notes
UNIT-I Introduction:
Entity types, Entity set, attribute and key, relationships, relation types, roles and
structural constraints, weak entities, enhanced E-R and object modeling, Sub
classes; Super classes, inheritance, specialization and generalization.
Indexed sequential access files; implementation using B & B++ trees, hashing,
hashing functions, collision resolution, extendible hashing, dynamic hashing
approach implementation and performance.
A database approach involves organizing and managing data using a database management
system (DBMS), which provides several key advantages over traditional file systems. Here are
the characteristics explained in detail:
2. Data Models
Data models define how data is structured, stored, and manipulated within a database system.
There are several types of data models:
3. DBMS Architecture
The architecture of a DBMS defines how the system is designed to manage data, process
requests, and provide access to users.
1. Three-Level Architecture:
○ A layered approach that provides data abstraction:
■ Internal Level:
■ Details physical storage (e.g., how records are placed on disk).
■ Conceptual Level:
■ Represents the logical structure of the database.
■ Example: Relations, attributes, constraints.
■ External Level:
■ Tailored user views of the data.
■ Example: A report showing only sales data for a specific region.
2. Client-Server Architecture:
○ Client: User-facing application that sends requests.
○ Server: Processes requests and returns results.
○ Example: An online shopping website where the client (browser) requests
product details from the server (database).
3. Centralized vs. Distributed DBMS:
○ Centralized DBMS:
■ Data is stored and managed on a single server.
■ Example: A university database hosted on a mainframe.
○ Distributed DBMS:
■ Data is distributed across multiple locations but appears unified to the
user.
■ Example: Global banking systems.
4. Data Independence
Data independence refers to the ability to modify the database schema without affecting the
applications that use the database.
1. Entity:
○ An object in the real world that can be identified uniquely and stored as data.
○ Example: A "Student" in a university database.
2. Entity Types:
○ A collection of similar entities.
○ Example: "Student" is an entity type, while individual students like "John" or
"Priya" are entities.
3. Entity Set:
○ A collection of all entities of a particular entity type at any point in time.
○ Example: All students currently enrolled in a course.
1. Attributes:
○ Properties or characteristics of an entity.
○ Example: A "Student" entity might have attributes like "StudentID," "Name," and
"Age."
2. Types of Attributes:
○ Simple Attributes: Cannot be divided further.
■ Example: "Age."
○ Composite Attributes: Can be divided into smaller parts.
■ Example: "Full Name" divided into "First Name" and "Last Name."
○ Derived Attributes: Values derived from other attributes.
■ Example: "Age" calculated from "Date of Birth."
○ Multivalued Attributes: Can have multiple values.
■ Example: "Phone Numbers."
3. Keys:
○ Attributes that uniquely identify an entity in an entity set.
■ Primary Key: Unique and mandatory attribute for each entity.
Example: "StudentID."
■ Candidate Key: All attributes that could serve as a primary key.
■ Composite Key: A combination of two or more attributes that uniquely
identify an entity.
Example: "CourseCode" + "Semester."
1. Relationship:
○ A logical connection between entities.
○ Example: "Enrolls" relationship between "Student" and "Course."
2. Relationship Types:
○ Unary (Self-Relationship): Entities of the same type relate to each other.
■ Example: "Employee" supervises another "Employee."
○ Binary Relationship: Two entities are related.
■ Example: "Student" enrolls in "Course."
○ Ternary Relationship: Involves three entities.
■ Example: "Student" gets "Grades" in a "Course."
1. Roles:
○ Explain how entities participate in a relationship.
○ Example: In a "WorksFor" relationship, "Employee" and "Department" have roles
like "Worker" and "Employer."
2. Structural Constraints:
○ Cardinality: Minimum and maximum number of times an entity participates in a
relationship.
■ Example: "One-to-Many (1:N):" One teacher teaches many students.
○ Participation Constraints:
■ Total Participation: All entities must participate.
Example: Each "Student" must enroll in at least one "Course."
■ Partial Participation: Some entities may not participate.
Example: Some "Professors" might not supervise "Projects."
5. Weak Entities
1. Definition:
○ Entities that cannot be uniquely identified without a related strong entity.
○ Example: "Dependent" in a "Dependent-BelongsTo-Employee" relationship.
2. Characteristics of Weak Entities:
○ Lack a primary key.
○ Depend on a strong entity for identification.
○ Use a discriminator (partial key) to identify entities within the same weak entity
set.
○ Have a total participation constraint.
Enhanced E-R (EER) modeling extends the basic E-R model to support advanced concepts.
1. Entities:
○ Student (StudentID, Name, Age)
○ Course (CourseID, Title, Credits)
2. Relationships:
○ Enrolls (StudentID, CourseID, Grade)
3. Enhanced Features:
○ Specialization of "Student" into "Undergraduate" and "Graduate."
UNIT-III: File Organization
File organization refers to the method of arranging data in files on a storage medium to ensure
efficient data storage, retrieval, and manipulation.
1. Definition:
○ A hybrid of sequential and direct access methods.
○ Files are organized sequentially, and an index is maintained for quick access.
2. Structure:
○ Index Table: Contains keys and pointers to the actual records.
○ Data Blocks: Store the actual data in sequential order.
3. Advantages:
○ Faster searches due to indexing.
○ Supports sequential and random access.
4. Example:
○ A phone directory where names are stored alphabetically, and an index table
helps locate a specific name quickly.
1. B-Tree:
○ A self-balancing tree structure used for database indexing.
○ Ensures all leaf nodes are at the same level.
○ Keys and data are stored in the nodes.
2. B++ Tree:
○ An extension of B-Tree.
○ Data is stored only in leaf nodes, and internal nodes store only keys for
navigation.
3. Features:
○ Balanced height for efficient searches, insertions, and deletions.
○ Supports sequential access due to linked leaf nodes (in B++ Trees).
4. Applications:
○ Database indexing and file systems.
3. Hashing
1. Definition:
○ A technique to map data to a fixed-size key (hash code) using a hashing function.
2. Hashing Functions:
○ A mathematical formula that converts input data into a hash code.
○ Example: hash(key) = key % table_size
3. Collision:
○ Occurs when two keys generate the same hash code.
1. Separate Chaining:
○ Store collided keys in a linked list or bucket at the hashed location.
2. Open Addressing:
○ Linear Probing: Check the next available slot.
○ Quadratic Probing: Check slots at intervals of squares of numbers.
○ Double Hashing: Use a secondary hashing function.
5. Extendible Hashing
1. Definition:
○ A dynamic hashing technique that adjusts the hash table size based on the data
volume.
2. Implementation:
○ Uses a directory of pointers to buckets.
○ Hash function generates binary values, and the directory adjusts based on data
growth.
3. Advantages:
○ Efficient storage utilization.
○ Handles dynamic data growth without performance loss.
6. Dynamic Hashing
1. Definition:
○ A flexible hashing technique that grows and shrinks dynamically with data.
○ Used to address limitations of static hashing.
2. Features:
○ Hash table size increases with data volume.
○ Avoids overflow by redistributing records into new buckets.
3. Applications:
○ Databases with frequently changing data sizes.
7. Performance of Hashing
1. Evaluation Criteria:
○ Search Time: Average time to locate a record.
○ Storage Utilization: Efficient use of storage space.
○ Collision Handling: Effectiveness of the chosen collision resolution technique.
2. Factors Affecting Performance:
○ Quality of the hashing function.
○ Load factor (ratio of filled slots to total slots).
○ Handling of collisions.
UNIT-IV: Relational Data Model
The relational data model is a widely used model that organizes data into tables (relations) with
rows and columns.
1. Relation (Table):
○ A two-dimensional structure with rows (tuples) and columns (attributes).
Example:
Student Table:
+-----------+--------+-----+
| StudentID | Name | Age |
+-----------+--------+-----+
| 101 | John | 21 |
| 102 | Priya | 22 |
+-----------+--------+-----+
○
2. Tuple:
○ A single row in a table representing a single entity instance.
○ Example: (101, John, 21) is a tuple in the "Student" table.
3. Attribute:
○ A column in a table representing a property of the entity.
○ Example: "Name" is an attribute of the "Student" table.
4. Domain:
○ The set of all possible values for an attribute.
○ Example: The domain of the "Age" attribute is all integers between 18 and 60.
5. Relation Schema:
○ A description of a relation’s structure, including its name, attributes, and domains.
○ Example: Student(StudentID, Name, Age)
6. Relation Instance:
○ A snapshot of the relation at a given point in time, i.e., the actual data in the
table.
2. Relational Constraints
1. Domain Constraints:
○ Ensure attribute values fall within the defined domain.
○ Example: "Age" should be an integer between 18 and 60.
2. Key Constraints:
○ Primary Key: Uniquely identifies each tuple in a relation.
■ Example: "StudentID" in the "Student" table.
○ Candidate Key: Attributes that can serve as a primary key.
○ Foreign Key: An attribute in one relation that references a primary key in another
relation.
■ Example: "CourseID" in the "Enrollment" table references "CourseID" in
the "Course" table.
3. Integrity Constraints:
○ Entity Integrity: Primary keys cannot have NULL values.
○ Referential Integrity: Foreign key values must match primary key values in the
referenced table.
4. Other Constraints:
○ User-defined constraints like "GPA must be between 0 and 4."
3. Relational Algebra
1. Definition:
○ A procedural query language that uses operations to retrieve and manipulate
data from relations.
2. Basic Operations:
○ Selection (σ): Filters rows based on a condition.
■ Example: σ Age > 20 (Student) retrieves students older than 20.
○ Projection (π): Selects specific attributes.
■ Example: π Name, Age (Student) retrieves only the "Name" and
"Age" columns.
○ Union (∪): Combines tuples from two relations.
○ Intersection (∩): Returns tuples common to both relations.
○ Difference (-): Returns tuples in one relation but not in another.
○ Cartesian Product (×): Combines tuples from two relations.
○ Join (⋈): Combines related tuples from two relations based on a condition.
1. Definition:
○ A standard programming language for managing and querying relational
databases.
2. Basic SQL Queries:
○ SELECT Statement:
○ INSERT Statement:
INSERT INTO Student (StudentID, Name, Age) VALUES (103, 'Alice', 23);
○ UPDATE Statement:
○ DELETE Statement:
Remove records.
3. SQL Programming:
○ Joins:
Combine rows from multiple tables.
○ Nested Queries:
SELECT Name FROM Student WHERE Age = (SELECT MAX(Age) FROM Student);
The Enhanced Entity-Relationship model builds on the basic ER model by introducing additional
concepts to represent complex relationships and hierarchical structures.
2. ER to Relational Mapping
ER Diagram:
Relational Schema:
1. Functional Dependencies:
○ A relationship where one attribute uniquely determines another.
○ Example: StudentID → Name, Age
■ Knowing StudentID determines the Name and Age.
2. Normal Forms:
○ Different levels of database normalization, each addressing specific types of
redundancy.
Normal Forms
Example:
Non-1NF Table:
+-----------+-------+-----------+
| StudentID | Name | Subjects |
+-----------+-------+-----------+
| 101 | John | Math, Eng |
+-----------+-------+-----------+
1NF Table:
+-----------+-------+----------+
| StudentID | Name | Subject |
+-----------+-------+----------+
| 101 | John | Math |
| 101 | John | Eng |
+-----------+-------+----------+
Example:
Non-2NF Table:
+-----------+----------+-------+
| StudentID | CourseID | Name |
+-----------+----------+-------+
| 101 | C1 | John |
| 102 | C2 | Priya |
+-----------+----------+-------+
2NF Table:
+-----------+----------+
| StudentID | Name |
+-----------+----------+
| 101 | John |
| 102 | Priya |
+-----------+----------+
Example:
Non-3NF Table:
+-----------+-------+--------+
| StudentID | Name | Dept |
+-----------+-------+--------+
| 101 | John | CS |
| 102 | Priya | IT |
+-----------+-------+--------+
3NF Table:
+-----------+-------+
| StudentID | Dept |
+-----------+-------+
| 101 | CS |
| 102 | IT |
+-----------+-------+
2. Concurrency Control
1. Transaction Processing:
○ A transaction is a sequence of database operations treated as a single logical
unit.
○ Properties of transactions (ACID):
■ Atomicity: Transactions are all or nothing.
■ Consistency: Transactions maintain database integrity.
■ Isolation: Concurrent transactions do not interfere with each other.
■ Durability: Once committed, changes persist even if the system fails.
2. Locking Techniques:
○ Prevents concurrent access conflicts.
○ Types of locks:
■ Shared Lock (S): Multiple transactions can read data but not modify it.
■ Exclusive Lock (X): Only one transaction can read or write the data.
○ Two-phase locking (2PL):
■ Growing phase: Locks are acquired.
■ Shrinking phase: Locks are released.
3. Deadlocks:
○ Occur when two or more transactions wait for each other to release locks.
○ Resolution Strategies:
■ Timeout
■ Deadlock detection and recovery
4. Database Recovery:
○ Ensures data integrity in case of failure.
○ Types of recovery:
■ Deferred Update: Changes are applied only after the transaction
commits.
■ Immediate Update: Changes are applied as the transaction executes,
but logs ensure recovery.
5. Database Security and Authorization:
○ Protects data from unauthorized access and manipulation.
○ Techniques:
■ User authentication (passwords, biometrics)
■ Access control (GRANT and REVOKE in SQL)
■ Encryption for secure data transmission.