CS3492 DBMS Univ - QP Answer AM 2024


CARE COLLEGE OF ENGINEERING, TRICHY

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


Anna University Examinations - April / May 2024
Answer for University Questions

CLASS : II B.E CSE MAX MARKS : 100


SEMESTER : IV DURATION : 3 Hours
SUBJECT : Database Management Systems CODE : CS3492
COURSE NO : C212 DATE : 23-07-2024 (FN)
ACADEMIC YEAR : 2023 – 24 (EVEN) EXAM : University Exam

PART – A (10 × 2 = 20)
1. Differentiate File processing system and database management system (Understand)
A file system is a method of organizing files on a hard disk or other storage medium. The file system arranges the files and helps in retrieving them when required. It is compatible with different file types, such as mp3, doc, txt, mp4, etc., and these files are also grouped into directories.
DBMS, meanwhile, is the acronym for Database Management System. It is software used to store and retrieve users' data while also maintaining the required security measures. It includes a group of programs that can help to manipulate the database. In bigger systems, a DBMS helps users as well as third-party software to store and retrieve data.
2. List some relational algebra operations. (Understand)
Five basic operations in relational algebra: Selection, Projection, Cartesian product, Union, and Set
Difference.
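As a rough illustration (not part of the original answer), the SQL counterparts of these operations are sketched below; the mapping is approximate, since SQL uses bag rather than set semantics, and the table names r and s are hypothetical:

SELECT * FROM r WHERE a > 10;            -- Selection (σ)
SELECT DISTINCT a, b FROM r;             -- Projection (π); DISTINCT gives set semantics
SELECT * FROM r, s;                      -- Cartesian product (r × s)
SELECT * FROM r UNION SELECT * FROM s;   -- Union (tables must be union-compatible)
SELECT * FROM r EXCEPT SELECT * FROM s;  -- Set difference (MINUS in Oracle)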
3. Define Entity, Relationship and attributes in ER model. ( Remembering)

An entity is a thing or object in the real world that is distinguishable from other objects (for example, a student). An attribute is a property that describes an entity (for example, a student's name). A relationship is an association among two or more entities (for example, a student enrols in a course). An entity set is a group of similar entities, and these entities can have attributes. In terms of DBMS, an entity corresponds to a table and its attributes to the table's columns, so by showing the relationships among tables and their attributes, an ER diagram shows the complete logical structure of a database.
4. Why BCNF is preferred over 3NF? ( Understanding)

BCNF is a stronger form of normalization than 3NF because it eliminates the second condition for 3NF, which allows the right side of an FD to be a prime attribute. Thus, in BCNF the left side of every FD in a table must be a superkey. Every table that is in BCNF is also in 3NF, 2NF, and 1NF. For example, in R(student, course, instructor) with FDs {student, course} → instructor and instructor → course, R is in 3NF but not in BCNF, because instructor is not a superkey.

5. List the properties of transactions. ( Remembering)


ACID is an acronym that refers to the set of four key properties that define a transaction: Atomicity, Consistency, Isolation, and Durability.
6. How will you handle deadlock between two transactions in a database? ( Understanding)

If both transactions run concurrently, a deadlock can occur if Transaction A locks Account X and
waits for Account Y, while Transaction B locks Account Y and waits for Account X. Solution: Ensure
both transactions lock the accounts in the same order.
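A minimal sketch of this lock-ordering rule in SQL (PostgreSQL-style SELECT ... FOR UPDATE; the account table and ids are illustrative, not from the question paper). Because every transaction locks account 'X' before 'Y', no transaction can hold Y's lock while waiting for X's:

BEGIN;
SELECT balance FROM account WHERE acc_no = 'X' FOR UPDATE;  -- lock X first
SELECT balance FROM account WHERE acc_no = 'Y' FOR UPDATE;  -- then lock Y
UPDATE account SET balance = balance - 100 WHERE acc_no = 'X';
UPDATE account SET balance = balance + 100 WHERE acc_no = 'Y';
COMMIT;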
7. What is hash-based indexing? ( Remembering)
Hash-based indexing does not maintain any ordering among the indexed values; rather it is based on
mapping the search-key values on a collection of buckets. Therefore it can only address equality (or
membership) queries. Tree-based indices maintain order and can thus also address range queries.
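A hedged sketch (PostgreSQL syntax; the table and column names are illustrative):

CREATE INDEX emp_id_hash ON employee USING HASH (employee_id);

-- The hash index can serve an equality query:
SELECT * FROM employee WHERE employee_id = 'E101';

-- A range query instead needs an ordered (B+ tree) index:
CREATE INDEX emp_sal_btree ON employee (salary);
SELECT * FROM employee WHERE salary BETWEEN 30000 AND 50000;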
8. List three components of the query processor. ( Remembering)
The principal components of the query processor are the wrappers, the sources and services model (SSM), and the planner:
1. The Sources and Services Model (SSM), which stores the relationships between the concepts and roles in the ontology and the functions used to wrap sources in CPL.
2. The Query Planner.
3. The Wrappers.
9. Define Distributed Database. ( Remembering)
In the most basic terms, a distributed database is a database that stores data in multiple locations instead
of one location. This means that rather than putting all data on one server or on one computer, data is
placed on multiple servers or in a cluster of computers consisting of individual nodes.
10. What are the challenges faced when using an encrypted system? ( Remembering)
The complexity of managing encryption and keys, maintaining performance, ensuring accessibility,
complying with regulations, and covering costs are significant hurdles that organizations must
overcome.

PART – B (5 × 13 = 65)

11.a. What is a data model? List its different types. Explain with suitable examples. ( Understanding)

Data Models

Definition: It is a collection of conceptual tools for describing data, relationships among data, semantics
(meaning) of data and constraints.

• A data model is the structure that underlies the database.

• Data model provides a way to describe the design of database at physical, logical and view level.

• There are various data models used in database systems and these are as follows -

(1) Relational model:

• The relational model consists of a collection of tables that store data and represent the relationships among the data.

• A table is also known as a relation.

• A table contains one or more columns, and each column has a unique name.

• Each table contains records of a particular type, and each record type defines a fixed number of fields or attributes.

• For example - consider a relational model showing the relationship between a Student table and a Result table: student Ram lives in the city Chennai and his marks are 78. The relationship between these two tables is maintained through the SeatNo column.

Advantages:

(i) Structural Independence: Structural independence is the ability to make changes in one database structure without affecting others. The relational model has structural independence; hence making required changes in the database is convenient in the relational database model.

(ii)Conceptual Simplicity: The relational model allows the designer to simply focus on logical design
and not on physical design. Hence relational models are conceptually simple to understand.

(iii) Query Capability: Using a simple query language (such as SQL), a user can quickly retrieve information from the database, and a designer can manipulate the database structure.

(iv) Easy design, maintenance and usage: The relational models can be designed logically hence they
are easy to maintain and use.

Disadvantages:

(i) Relational model requires powerful hardware and large data storage devices.

(ii) May lead to slower processing time.

(iii) Poorly designed systems lead to poor implementation of database systems.

(2) Entity relationship model:

• As the name suggests, the entity relationship model uses a collection of basic objects called entities and the relationships among them.

• The entity is a thing or object in the real world.

• The entity relationship model is widely used in database design.

• For example - in an Entity Relationship model, the relationship works_for may hold between the entities Employee and Department.
Advantages:

i) Simple: It is simple to draw an ER diagram when we know the entities and relationships.

ii) Easy to understand: The design of an ER diagram is very logical; hence it is easy to design and understand.

iii) Effective: It is an effective communication tool.

iv) Integrated: The ER model can be easily integrated with the relational model.

v) Easy conversion: The ER model can be converted easily into other types of models.

Disadvantages:

i) Loss of information: While drawing an ER model, some information can be hidden or lost.

ii) Limited relationships: The ER model can represent only limited kinds of relationships compared to other models.

iii) No representation of data manipulation: It is not possible to represent data manipulation in an ER model.

iv) No industry standard: There is no industry standard for ER diagram notation.

(3) Object Based Data Model:

• Object-oriented languages like C++, Java, and C# are becoming dominant in software development.

• This led to the object-based data model.

• The object-based data model combines object-oriented features with the relational data model.

Advantages:

i) Enriched modelling: The object-based data model is capable of modelling real-world objects.

ii) Reusability: Object-oriented features such as inheritance and polymorphism help in reusability.

iii) Support for schema evolution: There is a tight coupling between data and applications; hence there is strong support for schema evolution.

iv) Improved performance: Using the object-based data model, there can be a significant improvement in performance.
Disadvantages:

i) Lack of universal data model: There is no universally agreed data model for an object based data
model, and most models lack a theoretical foundation.

ii) Lack of experience: In comparison with relational database management, the use of the object-based data model is limited. This model is more dependent on skilled programmers.

iii) Complex: More functionalities present in object based data model make the design complex.

(4) Semi-structured data model:

• The semi-structured data model permits the specification of data where individual data items of the same type may have different sets of attributes.

• The Extensible Markup Language (XML) is widely used to represent semi- structured data model.

Advantages

i) Data is not constrained by fixed schema.

ii) It is flexible.

iii) It is portable.

Disadvantages

i) Queries are less efficient than in other types of data models.


11.b. Discuss about domain integrity. Give an example. ( Understanding)

In DBMS, constraints are the set of rules that ensure that when an authorized user modifies the database, they do not disturb data consistency. Constraints are specified within DDL commands like "alter" and "create". There are several types of constraints available in DBMS:

Domain constraints
Entity integrity constraints
Referential integrity constraints
Key constraints

This answer discusses domain constraints.

Domain Constraints
A domain constraint restricts a column to values that match its declared data type, and if a wrong input is encountered, the user is informed that the column is not filled properly. In other words, it specifies all the possible values that an attribute can hold, such as integer, character, date, time, or string. It defines the domain, i.e. the set of values, for an attribute, and it ensures that the value taken by the attribute is an atomic value (one that cannot be divided) from its domain.

Domain constraint = data type (integer / character / date / time / string / etc.)
                  + constraint (NOT NULL / UNIQUE / PRIMARY KEY / FOREIGN KEY / CHECK / DEFAULT)
Types of domain constraints:
Two common constraints that come under the domain constraint are NOT NULL and CHECK.
1. Domain constraint - NOT NULL: Null values are values that are unassigned, unknown, or missing, and by default a column can hold null values. The NOT NULL constraint restricts a column from accepting null values; that is, the field must always contain a value, so you cannot insert a new record or update a record without adding a value to this field.

Example: In the ’employee’ database, every employee must have a name associated with them.

CREATE TABLE employee (
    employee_id   VARCHAR(30),
    employee_name VARCHAR(30) NOT NULL,  -- NOT NULL domain constraint
    salary        NUMBER
);
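A short companion sketch of the CHECK and DEFAULT domain constraints listed above (standard SQL; the table and column names are illustrative):

CREATE TABLE employee2 (
    employee_id   VARCHAR(30) PRIMARY KEY,
    employee_name VARCHAR(30) NOT NULL,
    salary        NUMERIC CHECK (salary >= 0),  -- domain: non-negative salaries only
    dept          VARCHAR(10) DEFAULT 'CSE'     -- domain: default value when omitted
);

-- This insert is rejected because -500 lies outside the salary domain:
INSERT INTO employee2 (employee_id, employee_name, salary)
VALUES ('E1', 'Ram', -500);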
12.a. What is normalization? List its benefits and explain briefly about 3NF, 4NF and BCNF with suitable example. ( Understanding)

Normalization is the process of eliminating data redundancy and enhancing data integrity in tables. Normalization also helps to organize the data in the database. It is a multi-step process that puts the data into tabular form and removes duplicated data from the relational tables.

Benefits: reduced data redundancy, improved data integrity and consistency, easier maintenance, and a more flexible database design.

A relation is in 3NF if, for every non-trivial FD X → Y, either X is a superkey or Y is a prime attribute; it is in BCNF if X is a superkey for every non-trivial FD X → Y.

Parameter      3NF                                    BCNF
Complexity     Less complex than BCNF.                More complex than 3NF.
Preservation   Preserves data integrity partially.    Preserves data integrity completely.
Practical use  Widely used in database design.        Used in specific cases where data integrity is critical.

Fourth Normal Form (4NF)

A relation is in fourth normal form (4NF) if it is in BCNF and has no multi-valued dependencies.
Multi-valued dependencies occur when an attribute depends on a set of values rather than a single
value. By eliminating multi-valued dependencies, data redundancy is further reduced.

The concept of 4NF refines the principles of BCNF. It goes a step further by ensuring the absence of multi-valued dependencies within a table, eliminating scenarios where two independent multi-valued facts about an entity are stored in the same relation, and thereby enhancing the overall integrity of the data structure.
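A small illustrative sketch (the course_info table is hypothetical, not from the question paper): if course ->> instructor and course ->> textbook hold independently, every (instructor, textbook) combination must be stored, and the 4NF decomposition removes that redundancy:

-- BCNF table with a multi-valued dependency:
CREATE TABLE course_info (
    course     VARCHAR(20),
    instructor VARCHAR(30),
    textbook   VARCHAR(50),
    PRIMARY KEY (course, instructor, textbook)
);

-- 4NF decomposition into two tables, one per independent fact:
CREATE TABLE course_instructor (
    course     VARCHAR(20),
    instructor VARCHAR(30),
    PRIMARY KEY (course, instructor)
);

CREATE TABLE course_textbook (
    course   VARCHAR(20),
    textbook VARCHAR(50),
    PRIMARY KEY (course, textbook)
);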
b. i. Illustrate functional dependency with an example. ( Analysis )

The functional dependency is a relationship that exists between two attributes. It typically exists
between the primary key and non-key attribute within a table.

X → Y
The left side of the FD is known as the determinant; the right side is known as the dependent.
For example:

Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.


Here, the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table, because if we know the Emp_Id, we can tell the employee name associated with it.

Functional dependency can be written as:

Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
Types of Functional dependency
1. Trivial functional dependency
A → B is a trivial functional dependency if B is a subset of A.
Dependencies such as A → A and B → B are also trivial.
Example:

Consider a table with two columns Employee_Id and Employee_Name.


{Employee_id, Employee_Name} → Employee_Id is a trivial functional dependency as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are trivial
dependencies too.
2. Non-trivial functional dependency
A → B is a non-trivial functional dependency if B is not a subset of A.
When A ∩ B is empty, A → B is called completely non-trivial.
Example:

ID → Name,
Name → DOB
b. ii. Discuss about dependency preservation. (Remembering)

Dependency-preserving decomposition is a technique used in a Database Management System (DBMS) to decompose a relation into smaller relations while preserving the functional dependencies among the attributes. The goal is to improve the efficiency of the database by reducing redundancy and improving query performance.

An FD X → Y is preserved in a relation R if R contains all the attributes of X and Y; the FD can therefore be checked by accessing R alone. For example, consider R(A, B, C, D) under F = {A → B, B → C} and the decomposed tables R1(AB), R2(BC), and R3(AD), all of which are in BCNF. A → B can be checked within R1 alone and B → C within R2 alone, so the decomposition is dependency preserving.
13.a. Demonstrate conflict serializability and view serializability. ( Analysis)

A schedule is conflict serializable if it can be transformed into a serial schedule by swapping its non-conflicting operations. A schedule is view serializable if it is view equivalent to some serial schedule.

Conflicting Operations
Two operations are said to be conflicting if all of the following conditions are satisfied:

They belong to different transactions

They operate on the same data item
At least one of them is a write operation
Example:

The pair (R1(A), W2(A)) is conflicting because the two operations belong to different transactions, operate on the same data item A, and one of them is a write operation.
Similarly, the pairs (W1(A), W2(A)) and (W1(A), R2(A)) are also conflicting.
On the other hand, the pair (R1(A), W2(B)) is non-conflicting because the operations work on different data items.
Similarly, the pair (W1(A), W2(B)) is non-conflicting.
Consider the following schedule:

S1: R1(A), W1(A), R2(A), W2(A), R1(B), W1(B), R2(B), W2(B)


If Oi and Oj are two operations in a transaction and Oi < Oj (Oi is executed before Oj), the same order will be followed in the schedule as well. Using this property, we can get the two transactions of schedule S1:

T1: R1(A), W1(A), R1(B), W1(B)


T2: R2(A), W2(A), R2(B), W2(B)
Possible Serial Schedules are: T1->T2 or T2->T1

-> Swapping non-conflicting operations R2(A) and R1(B) in S1, the schedule becomes,

S11: R1(A), W1(A), R1(B), W2(A), R2(A), W1(B), R2(B), W2(B)


-> Similarly, swapping non-conflicting operations W2(A) and W1(B) in S11, the schedule becomes,

S12: R1(A), W1(A), R1(B), W1(B), R2(A), W2(A), R2(B), W2(B)


S12 is a serial schedule in which all operations of T1 are performed before any operation of T2 starts. Since S1 has been transformed into the serial schedule S12 by swapping non-conflicting operations, S1 is conflict serializable.
b. i. Discuss in detail about Multiple Granularity. ( Remembering )
Multiple granularity locking is a locking mechanism that provides different levels of locks for different
database objects. It allows for different locks at different levels of granularity. This mechanism allows
multiple transactions to lock different levels of granularity, ensuring that conflicts are minimized, and
concurrency is maximized.

To illustrate, let's consider a tree structure that has four levels of nodes. The top level represents the
entire database, and below it are nodes of type "area", which represent specific areas of the database.
Each area has child nodes called "files", and each file represents a specific subset of data within that
area. Importantly, no file can span more than one area.

Finally, each file has child nodes called "records", which represent individual units of data within the
file. Like files, each record is a child node of its corresponding file and cannot be present in more than
one file. Therefore, the tree can be divided into the following levels, starting from the top −

Database

Area
File

Record

b. ii. Explain different types of locks. ( Understanding)
Shared Lock
It allows multiple transactions to read the same data simultaneously. It is used to prevent other
transactions from modifying the data while a transaction is reading it.

Exclusive Lock
It prevents any other transaction from accessing the data. It is used to prevent other transactions from
reading or modifying the data while a transaction is writing to it.
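A hedged sketch of explicit table-level locks (PostgreSQL LOCK TABLE syntax; many DBMSs acquire these locks implicitly, and the account table is illustrative):

BEGIN;
LOCK TABLE account IN SHARE MODE;            -- shared: others may read, not write
SELECT SUM(balance) FROM account;
COMMIT;

BEGIN;
LOCK TABLE account IN ACCESS EXCLUSIVE MODE; -- exclusive: blocks readers and writers
UPDATE account SET balance = balance * 1.05;
COMMIT;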

Different Types of Intention Mode Locks in Multiple Granularity

Intention mode locks are a type of lock used in multiple granularity locking that allows multiple
transactions to acquire locks on the same resource, but with different levels of access.

There are three types of intention mode locks in multiple granularity locking −

Intent Shared (IS) Locks


This lock is used when a transaction needs to read a resource but does not intend to modify it. It
indicates that the transaction wants to acquire a Shared lock on a resource.

Intent Exclusive (IX) Locks


This lock is used when a transaction needs to modify a resource but does not intend to share it. It
indicates that the transaction wants to acquire an Exclusive lock on a resource.

Shared with Intent Exclusive (SIX) Locks


This lock is used when a transaction needs to read an entire subtree and modify only some nodes within it. It indicates that the node is locked in Shared mode and that the transaction intends to acquire Exclusive locks on some nodes lower in the hierarchy.

These intention mode locks are used to optimize the locking mechanism in a database by allowing
transactions to acquire locks on multiple resources in a coordinated manner. They help prevent
deadlocks and improve concurrency in a database system.

The compatibility matrix for these lock modes is described below:

Compatibility Matrix

        IS    IX    S     SIX   X
IS      YES   YES   YES   YES   NO
IX      YES   YES   NO    NO    NO
S       YES   NO    YES   NO    NO
SIX     YES   NO    NO    NO    NO
X       NO    NO    NO    NO    NO
Intention lock modes are utilized in the multiple-granularity locking protocol to ensure serializability. According to this protocol, when a transaction T attempts to lock a node, it must adhere to the following guidelines:

1. T must observe the lock compatibility matrix above.
2. T must lock the root of the tree first, and it may lock it in any mode.
3. T can lock a node in S or IS mode only if it currently holds the parent of the node in IS or IX mode.
4. T can lock a node in X, SIX, or IX mode only if it currently holds the parent of the node in IX or SIX mode.
5. T can lock a node only if it has not previously unlocked any node (the two-phase rule).
6. T can unlock a node only if none of the node's children are currently locked by T.
14.a Discuss B+ tree. Discuss about this Dynamic Index Structure. ( Understanding)

In a B+ tree, data pointers are stored only at the leaf nodes of the tree. The structure of a leaf node differs from the structure of the internal nodes: the leaf nodes have an entry for every value of the search field, along with a data pointer to the record (or to the block that contains this record).

A dynamic index structure is an index that automatically adjusts its structure as the underlying data
grows, shrinks, or otherwise changes. Unlike static index structures, which are fixed in size and
organization, dynamic index structures reorganize themselves in response to changes in the data they
index. This makes them more flexible and scalable, capable of handling varying workloads and data
distributions efficiently.

Dynamic index structures are crucial for databases that are expected to evolve over time, especially
when it comes to insertion, deletion, and updating of records.

B+ Trees

A B+ Tree is a type of self-balancing tree structure commonly used in databases and file systems to
maintain sorted data in a way that allows for efficient insertion, deletion, and search operations. B+
Trees are an extension of B-Trees but differ mainly in the way they handle leaf nodes, which contain all
the key values and point to the actual records.

A B+ Tree of order `n` has the following properties:


Every node has a maximum of `n` children.
Every node (except the root) has a minimum of `n/2` children.
The tree is perfectly balanced, meaning that all leaf nodes are at the same level.
All keys are stored in the leaf nodes, and the internal nodes act as 'guides' to locate the leaf nodes faster.
Operations on B+ Trees:
Search: Starts at the root and traverses down the tree, guided by the key values in each node, until it
reaches the appropriate leaf node.
Insert: Inserts a new key-value pair and then reorganizes the tree as needed to maintain its properties.
Delete: Removes a key-value pair and then reorganizes the tree, again to maintain its properties.
14.b. Compare I/O costs for all File Organizations. ( Understanding)

The I/O costs for different file organizations in database management systems can vary
significantly depending on factors such as the type of access (sequential or random), the structure
of the file organization, and the type of operations (read, write, update, delete). Below is a
comparison of I/O costs for some common file organizations: Heap files, Sorted files, Hash files,
and B+ Tree files.
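As a rough worked comparison (assuming the standard textbook cost model, which is not part of the original answer: the file occupies B pages and the B+ tree has fanout F), the expected page I/Os for an equality search are approximately:

Heap file:    B/2       (scan half the file on average)
Sorted file:  log2(B)   (binary search)
Hash file:    1 to 2    (direct bucket access)
B+ tree file: logF(B)   (one I/O per tree level)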

1. Heap Files
Sequential Access:
Cost: Low
Reason: Data is stored in the order it is inserted, so sequential access involves reading blocks
sequentially.
Random Access:
Cost: High
Reason: No particular order to the records, so finding a specific record may require reading many
blocks.
Insertion:
Cost: Low
Reason: New records can be appended to the end of the file.
Deletion:
Cost: Variable
Reason: Deleting a record may require searching for it, which can be costly.
Update:
Cost: High
Reason: Finding the record to update can be costly.
2. Sorted Files

Sequential Access:
Cost: Low
Reason: Records are stored in sorted order, so sequential access is efficient.
Random Access:
Cost: Medium
Reason: Binary search can be used to locate records, but this still requires logarithmic I/O
operations.
Insertion:
Cost: High
Reason: Maintaining sorted order requires shifting records.
Deletion:
Cost: Medium to High
Reason: Similar to insertion, maintaining order requires shifting records.
Update:
Cost: Medium
Reason: Updating a record may require maintaining order.
3. Hash Files
Sequential Access:
Cost: Very High
Reason: No ordering to the records, making sequential access inefficient.
Random Access:
Cost: Low
Reason: Hash function allows direct access to records.
Insertion:
Cost: Low
Reason: Records are placed according to the hash function.
Deletion:
Cost: Low to Medium
Reason: Finding the record via hash function is efficient, but may involve rehashing.
Update:
Cost: Low
Reason: Similar to insertion and deletion.

4. B+ Tree Files
Sequential Access:
Cost: Low
Reason: Leaf nodes of B+ tree are linked, enabling efficient sequential access.
Random Access:
Cost: Medium
Reason: Logarithmic search through tree levels.
Insertion:
Cost: Medium to High
Reason: Maintaining tree balance requires additional operations.
Deletion:
Cost: Medium to High
Reason: Maintaining tree balance requires additional operations.
Update:
Cost: Medium
Reason: Efficient access to records, but maintaining tree balance requires additional operations.
Summary
Heap Files: Best for workloads with heavy insertions and limited searches or updates.
Sorted Files: Efficient for range queries and sequential access, but costly for insertions and
deletions.
Hash Files: Optimal for direct access based on a key, poor for sequential access.
B+ Tree Files: Balanced performance across different types of operations, especially efficient for
range queries and ordered data access.
Selecting the appropriate file organization depends on the specific requirements of the application
and the expected workload.
15.a Explain distributed database architecture in detail. ( Understanding)

Distributed database architecture involves a single logical database that is spread physically across
computers in multiple locations connected by a network. The design aims to improve performance,
availability, and reliability while maintaining the illusion of a single database to the user. Here’s a
detailed explanation:

Key Components of Distributed Database Architecture


Sites (Nodes):

Each site is a database system in its own right, with its own hardware, operating system, DBMS, and
applications.
Sites are connected via a network, which could be LAN, WAN, or the internet.
Distributed Database Management System (DDBMS):

Manages the distributed database and provides an interface between users and the distributed data.
Ensures that the distributed nature of the database is transparent to the user.
Types of Distributed Databases
Homogeneous Distributed Database:

All sites use the same DBMS software.


Provides better consistency and easier management since all sites are uniform.
Heterogeneous Distributed Database:

Different sites may use different DBMS software.


Requires middleware to handle differences between systems, making it more complex but flexible.
Distributed Database Design
Data Fragmentation:

Data is divided into fragments which can be stored at different sites.


Horizontal Fragmentation: Divides a table into rows (tuples).
Vertical Fragmentation: Divides a table into columns (attributes).
Hybrid Fragmentation: A combination of both horizontal and vertical fragmentation.
Data Replication:

Copies of the data are stored at multiple sites.


Full Replication: Entire database is replicated at each site.
Partial Replication: Only specific fragments are replicated at certain sites.
Data Allocation:

Deciding where to place the data and fragments in the network.


Can be static (fixed) or dynamic (changes based on demand).
Key Concepts in Distributed Database Systems
Transparency:

Location Transparency: Users do not need to know the physical location of the data.
Replication Transparency: Users are unaware of the data replication.
Fragmentation Transparency: Users are unaware of data fragmentation.
Distributed Transactions:

Transactions that access data in multiple sites.


Require mechanisms for ensuring Atomicity, Consistency, Isolation, and Durability (ACID properties)
across distributed systems.
Two-Phase Commit Protocol (2PC): Ensures that all parts of a distributed transaction either commit or
rollback in unison.
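A minimal sketch of 2PC as exposed by PostgreSQL's prepared transactions (PostgreSQL-specific syntax requiring max_prepared_transactions > 0; the account table and transaction name are illustrative):

-- Phase 1 (voting): each participant prepares and thereby votes "ready".
BEGIN;
UPDATE account SET balance = balance - 100 WHERE acc_no = 'X';
PREPARE TRANSACTION 'txn_42';   -- participant promises it can still commit

-- Phase 2 (decision): the coordinator tells every participant to commit:
COMMIT PREPARED 'txn_42';
-- ...or, if any site voted "no", to roll back instead:
-- ROLLBACK PREPARED 'txn_42';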
Distributed Query Processing:

Optimizing query execution across multiple sites.


Involves breaking down queries into sub-queries that run at different sites and combining results.
Distributed Concurrency Control:

Managing concurrent access to data across multiple sites.


Techniques include distributed locking and timestamp ordering.
Distributed Deadlock Management:

Detecting and resolving deadlocks in a distributed environment.


Techniques include centralized, distributed, and hierarchical deadlock detection.
Fault Tolerance and Recovery:

Ensuring the system can recover from failures.


Techniques include data replication, distributed checkpoints, and logging.
Advantages of Distributed Databases
Improved Performance:

Data can be located closer to where it is needed, reducing access time.


High Availability and Reliability:

Data replication ensures that even if one site fails, others can continue to provide data.
Scalability:

System can be expanded easily by adding more sites.


Modular Growth:

New sites can be added without affecting existing sites.


Challenges of Distributed Databases
Complexity:

Managing and maintaining a distributed database is more complex than a centralized database.
Security:

Ensuring data security across multiple sites and over a network can be challenging.
Consistency:

Keeping replicated data consistent across sites requires sophisticated protocols.


Latency:

Network latency can affect the performance of distributed transactions.


b. Explain in detail about key value stores and role based access control in advanced database
management systems. ( Understanding)

Key-Value Stores
Key-value stores are a type of NoSQL database that use a simple key-value pair to store data. They are
designed for simplicity, speed, and scalability, making them suitable for applications that require fast
read/write operations on large volumes of data.

Key Concepts
Key-Value Pairs:
Key: A unique identifier for the data. Typically a string, but can also be a number or other simple data
type.
Value: The data associated with the key. Can be a simple data type (string, integer) or a more complex
structure (JSON, binary data).
Storage:

Data is stored as a dictionary or hash table, where each key maps to a value.
The database can be distributed across multiple servers to ensure scalability and fault tolerance.
Operations:

Get (key): Retrieves the value associated with the given key.
Put (key, value): Stores the value associated with the given key.
Delete (key): Removes the key-value pair from the store.
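A minimal sketch of this get/put/delete interface modelled on a relational table (illustrative only; real key-value stores such as Redis or DynamoDB expose these as native commands, and the upsert syntax below is PostgreSQL-specific):

CREATE TABLE kv_store (
    k VARCHAR(255) PRIMARY KEY,  -- the unique key
    v TEXT                       -- the value: string, JSON, etc.
);

-- Put(key, value): insert, or overwrite if the key already exists.
INSERT INTO kv_store (k, v) VALUES ('user:42', '{"name": "Ram"}')
ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v;

-- Get(key):
SELECT v FROM kv_store WHERE k = 'user:42';

-- Delete(key):
DELETE FROM kv_store WHERE k = 'user:42';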
Scalability:

Horizontal scaling by distributing data across multiple nodes.


Consistent hashing is often used to distribute keys evenly across nodes.
Role-Based Access Control (RBAC)
Role-Based Access Control (RBAC) is a method for regulating access to resources based on the roles
assigned to users within an organization. It enhances security by ensuring users only have access to the
information and systems necessary for their roles.

Key Concepts
Roles:

Defined based on job functions within an organization.


Each role has a set of permissions associated with it.
Permissions:

Specific rights or privileges required to perform certain operations on resources.


Examples: Read, write, execute, delete.
Users:

Individuals who are assigned one or more roles.


Users inherit the permissions associated with their roles.
Resources:

The objects or entities that need protection.


Examples: Files, databases, applications, systems.
Components
Role Assignment:

Users are assigned roles based on their responsibilities and job functions.
Roles can be hierarchical, where higher-level roles inherit permissions from lower-level roles.
Role Permissions:

Permissions are granted to roles, not directly to users.


This abstraction simplifies management and enhances security.
User Role Review:
Regular reviews to ensure users have the appropriate roles and permissions.
Helps in maintaining compliance and reducing the risk of unauthorized access.
Implementation in Advanced Database Management Systems
Defining Roles and Permissions:

Database administrators define roles and associate them with specific database operations.
Examples: DBA role with full access, ReadOnly role with read permissions.
Assigning Roles to Users:

Users are assigned roles through database management commands or interfaces.


Examples: GRANT command in SQL.
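A hedged sketch of role assignment (PostgreSQL-flavoured SQL; role, user, and table names are illustrative):

CREATE ROLE readonly_role;
GRANT SELECT ON employee TO readonly_role;    -- permissions go to the role

CREATE ROLE dba_role;
GRANT ALL PRIVILEGES ON employee TO dba_role;

CREATE ROLE alice LOGIN;                      -- a user account
GRANT readonly_role TO alice;                 -- the user inherits via the role
REVOKE readonly_role FROM alice;              -- access removed in one step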
Enforcing Access Controls:

The DBMS enforces access controls based on the roles and permissions.
Unauthorized access attempts are logged and denied.
Auditing and Monitoring:

Regular auditing of role assignments and permissions.


Monitoring user activities to detect and prevent security breaches.
Benefits
Enhanced Security:

Users have access only to the data and operations necessary for their role.
Reduces the risk of data breaches and unauthorized access.
Simplified Management:

Easier to manage permissions through roles rather than individual user assignments.
Streamlines the process of adding, modifying, or removing access.
Compliance:

Helps in meeting regulatory requirements by ensuring proper access controls.


Facilitates auditing and reporting for compliance purposes.

PART – C (1 × 15 = 15)

16.a. Consider the following relational schemas for a library database:


Book(Title, Author, Catalog_no, Publisher, Year, Price)
Collection (Title, Author, Catalog_no)
The following are functional dependencies:
i. Title Author → Catalog_no
ii. Catalog_no → Title Author Publisher Year
iii. Publisher Title Year → Price
iv. Assume {Author, Title} is the key for both schemas. Apply the appropriate normal form for Book and Collection. ( Analysis)
Schema: Book(Title, Author, Catalog_no, Publisher, Year, Price)

Given functional dependencies:

1. Title, Author → Catalog_no
2. Catalog_no → Title, Author, Publisher, Year
3. Publisher, Title, Year → Price

Key:

• The given key for the Book schema is {Author, Title}.

Step-by-step normalization:

1. First Normal Form (1NF):

A relation is in 1NF if it only contains atomic (indivisible) values.

• The Book schema is already in 1NF as all attributes contain atomic values.

2. Second Normal Form (2NF):

A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.

• The primary key is {Author, Title}.
• Non-key attributes: Catalog_no, Publisher, Year, Price.
• From Title, Author → Catalog_no, Catalog_no is fully dependent on the primary key.
• From Catalog_no → Title, Author, Publisher, Year, the attributes Publisher and Year are not directly dependent on the primary key, but only through Catalog_no.
• From Publisher, Title, Year → Price, Price depends on a combination of attributes, but not directly on the primary key.

To achieve 2NF, decompose Book into:

Book1(Title, Author, Catalog_no)
Book2(Catalog_no, Publisher, Year, Price)

3. Third Normal Form (3NF):

A relation is in 3NF if it is in 2NF and no transitive dependencies exist.

• Book1(Title, Author, Catalog_no) is in 3NF as there are no transitive dependencies.
• Book2(Catalog_no, Publisher, Year, Price) has the transitive dependencies Catalog_no → Publisher, Year and Publisher, Title, Year → Price.

To achieve 3NF, decompose Book2 into:

Book2a(Catalog_no, Publisher, Year)
Book2b(Publisher, Title, Year, Price)

Schema: Collection(Title, Author, Catalog_no)

Given functional dependency:

• Title, Author → Catalog_no; the primary key is {Author, Title}.

Step-by-step normalization:

1. First Normal Form (1NF):

The Collection schema is already in 1NF as all attributes contain atomic values.

2. Second Normal Form (2NF):

• The primary key is {Author, Title}.
• Non-key attribute: Catalog_no.
• From Title, Author → Catalog_no, Catalog_no is fully dependent on the primary key.

Therefore, Collection is in 2NF.

3. Third Normal Form (3NF):

A relation is in 3NF if it is in 2NF and no transitive dependencies exist.

• Collection is in 3NF as there are no transitive dependencies.

Summary:

• Book schema decomposition:
  o Book1(Title, Author, Catalog_no) (3NF)
  o Book2a(Catalog_no, Publisher, Year) (3NF)
  o Book2b(Publisher, Title, Year, Price) (3NF)
• Collection schema:
  o Collection(Title, Author, Catalog_no) (already in 3NF)

Thus, the Book schema is decomposed into three schemas to achieve 3NF, and the Collection schema is already in 3NF and does not require further decomposition.
16.b Consider a B+ tree in which the maximum number of keys in a node is 5. Calculate the minimum
number of keys in any non-root node. ( Analysis)

Properties of a B+ tree:

• Each internal node can have at most m children, where m is the order of the B+ tree. Here m = 6, since the maximum number of keys is 5 and an internal node with k keys has k + 1 children.
• Each internal node (except the root) must have at least ⌈m/2⌉ children.

Minimum number of keys:

For a non-root node in a B+ tree:

• Since each internal node must have at least ⌈m/2⌉ children, and a node with k + 1 children holds k keys, the minimum number of keys is one less than the minimum number of children.

Given m = 6, calculate:

⌈m/2⌉ = ⌈6/2⌉ = 3

Thus, each non-root internal node must have at least 3 children. Therefore, the minimum number of keys in each non-root internal node is:

3 − 1 = 2

Summary:

The minimum number of keys in any non-root node of a B+ tree in which the maximum number of keys in a node is 5 is 2.

Faculty In-Charge HoD/CSE Principal


(S. Nageshwari, AP/CSE)
