Unit 4 Notes DBMS
Unit 4
Recovery System & Security: Failure Classifications, Recovery & Atomicity, Log-Based
Recovery, Recovery with Concurrent Transactions, Shadow Paging, Failure with Loss of
Non-Volatile Storage, Recovery from Catastrophic Failure, Introduction to Security &
Authorization, Introduction to Emerging Databases - OODBMS, ORDBMS, Distributed
Database, Multimedia Database, Special Databases, Limitations of Conventional Databases,
Advantages of Emerging Databases.
Recovery System & Security Failure Classifications
Recovery system security failures in a database management system (DBMS) can be classified
into several categories based on the nature of the security breach and the impact it has on the
system. Here are some common classifications of recovery system security failures in a DBMS:
1. Unauthorized Access:
• Unauthorized Read: An attacker gains access to data they are not authorized to
read.
• Unauthorized Write: An attacker gains access to data they are not authorized to
modify or delete.
2. Data Tampering:
• Data Integrity Violation: An attacker modifies or deletes data, causing data
integrity violations in the database.
• Data Corruption: Malicious changes or deletions of data that render it unusable
or unreliable.
3. Insider Threats:
• Employee Misuse: An authorized user (e.g., an employee) abuses their
privileges to perform unauthorized actions.
• Privilege Elevation: An attacker escalates their privileges within the DBMS to
gain unauthorized access to data or system functions.
4. Data Disclosure:
• Data Leakage: Sensitive or confidential data is exposed to unauthorized parties.
• Data Breach: Unauthorized access to sensitive data, leading to its disclosure.
5. Denial of Service (DoS) Attacks:
• Resource Exhaustion: Attackers overload the DBMS with requests, causing a
system outage or degradation of service.
• Data Availability: Attackers disrupt access to the database, making it
temporarily or permanently unavailable.
6. Authentication and Authorization Failures:
• Weak Authentication: Inadequate or compromised authentication mechanisms
that allow unauthorized users to access the DBMS.
• Authorization Bypass: Flaws in the authorization system that allow users to
bypass access controls.
7. Logging and Audit Trail Tampering:
• Log Manipulation: Attackers alter or delete log files, making it difficult to detect
their actions.
• Audit Trail Evasion: Attackers take steps to avoid detection by the auditing
mechanisms.
8. Injection Attacks:
• SQL Injection: Attackers exploit vulnerabilities to execute malicious SQL
queries, potentially gaining unauthorized access or causing data corruption.
• Command Injection: Attackers inject malicious commands that are executed by
the DBMS, potentially compromising system security.
9. Encryption and Decryption Failures:
• Key Management: Failures in key management processes can lead to data
exposure if encryption keys are compromised.
• Decryption Vulnerabilities: Weaknesses in the decryption process that allow
unauthorized access to encrypted data.
10. Backdoor Entry:
• Hidden Access Points: Attackers install or exploit hidden entry points (backdoors) that
bypass normal authentication and authorization, giving them persistent unauthorized
access to the DBMS.
It's essential to implement security measures, including access controls, encryption, intrusion
detection systems, and regular auditing and monitoring, to mitigate the risks associated with
these security failures in a DBMS. Additionally, a well-defined incident response plan should
be in place to address security breaches promptly and effectively.
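As a small illustration of mitigating the injection attacks listed above (item 8), here is a minimal Python sketch using the standard sqlite3 module; the table and column names are hypothetical.

```python
import sqlite3

def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user row by name using a parameterized query.

    The '?' placeholder makes the driver treat the input strictly as data,
    so a value like "x' OR '1'='1" cannot alter the query logic.
    """
    # Vulnerable alternative (never do this):
    #   conn.execute(f"SELECT * FROM users WHERE name = '{username}'")
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    print(find_user(conn, "alice"))          # (1, 'alice')
    print(find_user(conn, "x' OR '1'='1"))   # None - the injection attempt fails
```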
Together, the undo and redo operations help the DBMS ensure that the database returns to a
consistent state after a failure, maintaining the ACID properties of the database.
In summary, atomicity ensures that transactions are treated as indivisible units, and recovery
mechanisms (undo and redo) ensure that the database remains consistent and intact in the face
of system failures or errors. These concepts are fundamental for maintaining data integrity and
reliability in a DBMS.
4. Recovery Process: In the event of a system failure, the DBMS uses the log to recover
the database to a consistent state. The recovery process involves two main phases:
a. Forward Pass (Redo): In this phase, the DBMS replays the log from the last
checkpoint to the end, reapplying the changes to the database. This step ensures that all
committed transactions are applied to the database, even if some of them were lost due
to a crash.
b. Backward Pass (Undo): After the forward pass, the DBMS performs a backward pass
through the log. This phase undoes the effects of any transactions that were uncommitted or
only partially complete at the time of the failure, restoring the database to a consistent state
that reflects only the work of committed transactions.
Log-based recovery ensures that the database remains consistent and durable, even in the face
of system failures. It's a critical feature in DBMS, providing transactional integrity and helping
to maintain data reliability. This approach is widely used in many relational database
management systems like Oracle, PostgreSQL, SQL Server, and others.
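To make the redo and undo phases described above concrete, here is a deliberately simplified Python sketch of replaying a log after a crash; the log-record format and the in-memory "database" are assumptions for illustration, not how a real recovery manager (e.g., ARIES) is implemented.

```python
# Minimal sketch of log-based recovery (assumed log-record format).
# Each log record is a tuple:
#   ("write", txn_id, key, old_value, new_value)  - an update by a transaction
#   ("commit", txn_id)                            - the transaction committed

def recover(log, database):
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo (forward pass): reapply every logged write so committed work is not lost.
    for rec in log:
        if rec[0] == "write":
            _, txn, key, old, new = rec
            database[key] = new

    # Undo (backward pass): roll back writes of transactions that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, txn, key, old, new = rec
            database[key] = old

    return database

if __name__ == "__main__":
    log = [
        ("write", "T1", "A", 100, 150),
        ("commit", "T1"),
        ("write", "T2", "B", 200, 50),   # T2 never committed before the crash
    ]
    db = {"A": 100, "B": 200}
    print(recover(log, db))  # {'A': 150, 'B': 200}
```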
Shadow Paging
Shadow paging is a recovery technique used in database management systems (DBMS) as an
alternative to traditional transaction logging. The idea is to keep an unmodified "shadow"
copy of the affected database pages while a transaction executes, so the system always has a
consistent, recoverable version of the data to fall back on. Because it avoids maintaining a
log, shadow paging can be attractive in situations where logging would be less efficient.
Here's how shadow paging works in a DBMS:
1. Original Database Pages: In a database system, data is often stored in pages, and these
pages are the basic units of storage. The original database consists of these pages.
2. Shadow Pages: To create a snapshot or version of the database, shadow pages are
introduced. Shadow pages are a copy of the original pages. At the start, shadow pages
are identical to the original pages.
3. Updating the Shadow Pages: When changes are made to the database, instead of
directly updating the original pages, the DBMS updates the corresponding shadow
pages. This ensures that the original data remains intact.
4. Page Table: To keep track of which pages are shadowed and their corresponding
shadow pages, a page table is used. The page table contains entries that map each
original page to its corresponding shadow page.
5. Transaction Commit: When a transaction commits and its changes are to be made
permanent, the page table is updated to point to the shadow pages. This pointer switch
effectively makes the shadow pages the new primary pages.
Shadow paging offers several advantages, such as simplicity and the elimination of the
overhead of maintaining a transaction log for recovery. However, it has some limitations, such
as potentially high space requirements (as both original and shadow pages must be maintained
until commit) and the inability to efficiently support concurrent updates.
In contrast, traditional database systems use transaction logs to record changes, which can be
used for recovery and maintaining consistency. Depending on the specific requirements and
constraints of a database system, one approach may be more suitable than the other.
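The copy-on-write and pointer-switch idea described in the steps above can be sketched in a few lines of Python. This is a toy, in-memory illustration under assumed simplifications (no disk I/O, no concurrency, no crash handling), not a real implementation.

```python
# Toy illustration of shadow paging (assumed in-memory model).

class ShadowPagedDB:
    def __init__(self, pages):
        self.pages = dict(pages)                  # physical page id -> contents
        self.page_table = {p: p for p in pages}   # logical page -> physical page
        self.shadow_table = dict(self.page_table)
        self._next_id = max(pages) + 1 if pages else 0

    def begin(self):
        # The shadow table is a copy of the current table and is never modified.
        self.shadow_table = dict(self.page_table)

    def write(self, logical_page, data):
        # Copy-on-write: updates go to a fresh physical page,
        # leaving the page referenced by the shadow table untouched.
        new_physical = self._next_id
        self._next_id += 1
        self.pages[new_physical] = data
        self.page_table[logical_page] = new_physical

    def commit(self):
        # The single atomic step: the current page table becomes the new shadow.
        self.shadow_table = dict(self.page_table)

    def abort(self):
        # Discard updates by falling back to the shadow page table.
        self.page_table = dict(self.shadow_table)

    def read(self, logical_page):
        return self.pages[self.page_table[logical_page]]

db = ShadowPagedDB({0: "old A", 1: "old B"})
db.begin()
db.write(0, "new A")
db.abort()
print(db.read(0))   # "old A" - the original page was never overwritten
```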
Failure with Loss of Non-Volatile Storage
2. Downtime: The database system may become unavailable during and after the storage
failure. This can lead to significant downtime, disrupting operations and impacting
productivity.
5. Data Consistency: If a failure occurs during a database transaction, data may be left in
an inconsistent state. Recovery mechanisms need to ensure data consistency by either
rolling back or completing incomplete transactions.
6. Prevention: It's important to take measures to prevent non-volatile storage failures. This
includes using high-quality storage hardware, implementing proactive monitoring and
alerting systems, and performing routine maintenance and hardware health checks.
7. Disaster Recovery: In some cases, non-volatile storage failures may be part of a larger
disaster, such as a natural disaster or a catastrophic system failure. A well-defined
disaster recovery plan can help ensure business continuity in such situations.
9. Data Integrity and Security: Data stored in a database system should also be encrypted
and protected from unauthorized access. In the event of storage failure, data integrity
and security must be maintained.
10. Testing and Drills: Regular testing of backup and recovery procedures, as well as
disaster recovery drills, can help ensure that the database system can withstand storage
failures and be restored effectively.
In summary, a failure with the loss of non-volatile storage in a database system is a critical
event that can result in data loss, downtime, and potential financial and operational
consequences. Implementing robust backup, redundancy, and recovery strategies, as well as
proactive monitoring and preventative measures, is essential to mitigate the impact of such
failures and ensure the long-term health of the database system.
Recovery from Catastrophic Failure
Recovering from a catastrophic failure in a database system is a complex and critical process.
The specific steps and strategies depend on the database technology in use and the nature of
the failure. It is essential to plan ahead, regularly test recovery procedures, and take proactive
measures to minimize the risk of such failures.
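A common overall pattern for catastrophic recovery is to restore the most recent full backup and then roll forward by replaying archived logs. The sketch below is a hypothetical Python outline of that workflow; the directory paths and the apply_redo_log helper are assumptions, not the API of any real DBMS.

```python
import os
import shutil

# Hypothetical locations used only for illustration.
BACKUP_DIR = "/backups/latest_full"      # assumed last full database dump
ARCHIVE_DIR = "/backups/archived_logs"   # assumed archived redo logs
DATA_DIR = "/var/lib/mydb/data"          # assumed database data directory

def restore_full_backup():
    """Replace the damaged data directory with the last known-good full backup."""
    if os.path.exists(DATA_DIR):
        shutil.rmtree(DATA_DIR)
    shutil.copytree(BACKUP_DIR, DATA_DIR)

def apply_redo_log(path):
    # In a real system this step is performed by the DBMS recovery manager;
    # here it is only a placeholder marking where log replay happens.
    print(f"replaying {path}")

def replay_archived_logs():
    """Apply archived redo logs in order to roll the restored copy forward."""
    for name in sorted(os.listdir(ARCHIVE_DIR)):
        apply_redo_log(os.path.join(ARCHIVE_DIR, name))

if __name__ == "__main__":
    restore_full_backup()
    replay_archived_logs()
```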
Introduction to Security & Authorization
• Audit Trails and Logging: Maintaining logs and audit trails helps track who
accessed the database and what actions they performed. This is valuable for monitoring
and forensic purposes.
• Firewalls and Intrusion Detection Systems (IDS): These are network-level security
measures that protect the database server from external threats and can detect and
respond to potential intrusions.
• Patch Management: Keeping the DBMS and underlying system up to date with
security patches is essential to mitigate vulnerabilities.
• Roles and Privileges: Roles are collections of permissions grouped together, making
it easier to manage access control. Users can be assigned roles with predefined
privileges, simplifying authorization.
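To illustrate the roles-and-privileges idea above, here is a toy Python sketch of a role-based authorization check; the role names and privilege sets are hypothetical, and real DBMSs express the same idea with SQL statements such as GRANT and REVOKE.

```python
# Toy role-based access control check (hypothetical roles and privileges).
ROLE_PRIVILEGES = {
    "analyst": {"SELECT"},
    "clerk": {"SELECT", "INSERT"},
    "dba": {"SELECT", "INSERT", "UPDATE", "DELETE", "GRANT"},
}

USER_ROLES = {
    "alice": "analyst",
    "bob": "dba",
}

def is_authorized(user: str, action: str) -> bool:
    """Return True if the user's role includes the requested privilege."""
    role = USER_ROLES.get(user)
    return action in ROLE_PRIVILEGES.get(role, set())

print(is_authorized("alice", "DELETE"))  # False - analysts may only read
print(is_authorized("bob", "DELETE"))    # True  - the dba role includes DELETE
```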
Introduction to Emerging Databases
4. Multimedia Databases: Multimedia databases are optimized for storing and managing
multimedia content, such as images, videos, audio, and other non-textual data. They
provide efficient storage and retrieval mechanisms for large media files, and they
support content-based queries to locate and retrieve multimedia objects based on their
content characteristics.
5. Special Databases: Special databases refer to databases designed for specific industries
or applications, such as geospatial databases for mapping and navigation, time-series
databases for financial data, and bioinformatics databases for genetic and biological
data. These databases are tailored to the unique requirements of their respective
domains.
OODBMS and ORDBMS
OODBMS and ORDBMS are two different types of database management systems that offer
distinct features and capabilities. Let's explore each of them:
1. OODBMS (Object-Oriented Database Management System):
• OODBMS is a type of database management system that is designed to work
with data in an object-oriented programming paradigm.
• It stores data in the form of objects, which encapsulate both data and methods
(functions) that operate on the data. This allows for a more natural
representation of data and relationships in applications that use object-oriented
programming languages.
• OODBMS provides support for features like inheritance, polymorphism, and
encapsulation, which are fundamental concepts in object-oriented
programming.
• It is well-suited for applications where the data has complex structures and
relationships, such as CAD/CAM systems, multimedia databases, and
applications that involve complex business logic.
• Examples of OODBMS include GemStone/S, ObjectStore, and db4o.
2. ORDBMS (Object-Relational Database Management System):
• ORDBMS is a hybrid database management system that combines features of
both relational databases and object-oriented databases.
• It extends the traditional relational database model by adding object-oriented
features, allowing you to store and manipulate complex data structures more
easily.
• ORDBMS enables the storage of user-defined data types, methods, and
relationships in a manner similar to an OODBMS, while still supporting SQL
for querying and maintaining data.
• It offers the benefits of data integrity and consistency that relational databases
provide, making it suitable for applications that require the ACID (Atomicity,
Consistency, Isolation, Durability) properties.
• ORDBMS is commonly used in applications where data consistency is critical,
such as enterprise-level systems and applications with complex data models.
In summary, OODBMS focuses on storing data as objects with methods, making it ideal for
applications with complex, interconnected data structures. ORDBMS combines the features of
both relational and object-oriented databases, offering a balance between data consistency and
the flexibility of object-oriented modeling. The choice between these database systems depends
on the specific needs and requirements of the application you are developing.
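To make the contrast concrete, the hypothetical Python class below shows the kind of object (data plus behavior) an OODBMS persists directly, whereas an ORDBMS would typically map the same state onto a table, possibly using a user-defined type, and query it with SQL.

```python
# Hypothetical example of the kind of object an OODBMS persists directly.
class BankAccount:
    def __init__(self, owner: str, balance: float):
        self.owner = owner        # state (data)
        self.balance = balance

    def deposit(self, amount: float) -> None:
        """Behavior stored alongside the data it operates on."""
        self.balance += amount

acct = BankAccount("alice", 100.0)
acct.deposit(25.0)
# An OODBMS would store `acct` as an object, preserving its class, state, and
# relationships; an ORDBMS would instead keep the state in a table row
# (possibly via a user-defined type) and query it with SQL.
print(acct.balance)  # 125.0
```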
Distributed database
A distributed database in a database system refers to a database that is spread across multiple
locations or nodes in a network. Unlike a centralized database, where all the data is stored on
a single server, a distributed database stores its data across multiple servers or nodes, which
can be located in different geographical locations. This distribution of data offers several
advantages and challenges, making it a crucial concept in modern database management
systems. Here are some key aspects of distributed databases:
1. Data Distribution: In a distributed database, data is partitioned and stored across
multiple nodes. The distribution can be horizontal (splitting rows of a table), vertical
(splitting columns of a table), or a combination of both. Data can be replicated across
multiple nodes for fault tolerance or performance improvement.
2. Data Localization: Data may be stored closer to the users or applications that need it.
This can reduce data access latency and improve response times.
3. Scalability: Distributed databases can easily scale by adding new nodes or servers to
the network. This allows them to handle increased data volumes and user loads without
major disruptions.
4. Fault Tolerance: Redundancy in data storage helps ensure data availability in case of
hardware failures or network issues. Data can be replicated across multiple nodes, and
there are mechanisms for data recovery.
5. Load Balancing: Data distribution can be used to balance the workload among different
nodes, ensuring that no single node is overwhelmed with requests.
8. Security and Access Control: Distributed databases require robust security measures to
control data access and protect against unauthorized access, especially in a distributed
environment.
9. Data Replication: Data replication can be used to improve data availability, fault
tolerance, and load balancing. However, it also introduces challenges related to data
consistency and synchronization.
10. Data Synchronization: Ensuring that data remains consistent across distributed nodes
can be complex, and various synchronization techniques, like eventual consistency or
strict consistency, may be employed.
Distributed databases are commonly used in scenarios where data needs to be geographically
dispersed, accessed by users or applications in different locations, or scaled to accommodate a
large number of users. Examples of distributed databases include content delivery networks
(CDNs), cloud-based databases, global e-commerce platforms, and large-scale social media
networks.
Designing, managing, and maintaining a distributed database system requires careful planning
and consideration of the specific requirements of the application or use case to ensure data
consistency, availability, and performance.
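As a concrete illustration of the horizontal partitioning (sharding) and replication discussed above, here is a minimal Python sketch; the node names and the hash-based placement rule are illustrative assumptions, not any specific product's scheme.

```python
import hashlib

# Hypothetical node names for a three-node cluster.
NODES = ["node-a", "node-b", "node-c"]
REPLICAS = 2  # each row is stored on this many nodes for fault tolerance

def placement(key: str):
    """Pick the primary node (and replicas) for a row by hashing its key."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    primary = h % len(NODES)
    # Replicas go to the next nodes in the ring.
    return [NODES[(primary + i) % len(NODES)] for i in range(REPLICAS)]

# Each customer row is placed on the nodes chosen by its key (horizontal partitioning).
for customer_id in ["cust-101", "cust-102", "cust-103"]:
    print(customer_id, "->", placement(customer_id))
```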
Multimedia database
A multimedia database is a type of database that is designed to store and manage multimedia
data, such as text, images, audio, video, and other non-textual forms of data. Multimedia
databases are used in various applications, including digital libraries, content management
systems, e-commerce websites, entertainment industry databases, and more. Here are some key
aspects and challenges of multimedia databases in a database system:
1. Data Types: Multimedia databases store a wide variety of data types, including text,
images, audio, video, 3D models, and more. Each data type may require different
storage and retrieval mechanisms.
3. Data Retrieval: Retrieving multimedia data is more complex than querying traditional
text data. Users may want to search for multimedia content based on various attributes,
such as metadata, content similarity, and content-based queries (e.g., searching for
images based on color or shapes). This requires specialized retrieval techniques like
content-based image retrieval (CBIR).
4. Indexing and Search: To efficiently retrieve multimedia data, databases use various
indexing techniques. For example, in image databases, color histograms, texture
features, or shape descriptors can be used to create indexes that help speed up search
operations.
6. Streaming and Real-time Data: Some multimedia data, like live video streams or real-
time sensor data, may need to be handled in real-time, requiring low-latency access and
efficient data streaming capabilities.
10. Data Management: Multimedia databases need tools and functionalities for managing
data, including data insertion, deletion, updates, and version control.
11. Cross-Modal Retrieval: In some applications, users may want to search for multimedia
content across different modalities, like finding a video clip based on a textual
description. Cross-modal retrieval techniques enable this type of search.
12. Data Integration: In some cases, multimedia databases may need to integrate with other
types of databases, like relational databases, to provide a comprehensive view of data
across an organization.
Designing and managing multimedia databases can be challenging due to the complexity of
multimedia data and the diversity of user requirements. It often requires a combination of
specialized database systems, indexing techniques, and retrieval algorithms tailored to handle
multimedia content efficiently.
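As a simplified illustration of content-based retrieval using the histogram indexing mentioned above, the sketch below compares normalized grayscale histograms with an L1 distance; the "images" are plain lists of pixel values, and the whole example is a toy assumption rather than a real CBIR system.

```python
# Toy content-based image retrieval: compare quantized grayscale histograms.
def histogram(pixels, bins=4, max_value=256):
    """Count how many pixel values fall into each of `bins` equal-width buckets."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        counts[min(int(p / width), bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]   # normalize so image size does not matter

def l1_distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Hypothetical "images" as flat lists of grayscale pixel values (0-255).
database = {
    "dark.png":  [10, 20, 30, 15, 25, 5],
    "light.png": [240, 250, 230, 220, 245, 235],
}
query = [12, 18, 28, 22, 9, 14]  # mostly dark pixels

index = {name: histogram(px) for name, px in database.items()}
best = min(index, key=lambda name: l1_distance(index[name], histogram(query)))
print(best)  # dark.png - the closest histogram to the query image
```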
Limitations of Conventional Databases
3. Complex Queries: As the data and schema complexity grows, it can become
challenging to write and optimize complex SQL queries. Performance issues may arise
with joins, subqueries, and large datasets.
4. Limited Support for Unstructured Data: Conventional databases are not well-
suited for handling unstructured or semi-structured data like text documents, images,
or JSON, which are becoming increasingly important in modern applications.
10. Vendor Lock-In: Using a specific relational database system can lead to vendor
lock-in. Migrating data and applications to a different database system can be costly
and time-consuming.
To address these limitations, various alternative database systems have emerged, including
NoSQL databases, NewSQL databases, in-memory databases, and distributed databases. These
systems aim to provide solutions for specific use cases and trade-offs in terms of scalability,
flexibility, and performance. Choosing the right database system depends on the specific
requirements and constraints of your application.
Advantages of Emerging Databases
2. Flexibility: Emerging databases are often more flexible in terms of data modeling. They
support various data types and structures, including structured, semi-structured, and
unstructured data, making them suitable for a wide range of use cases.
4. High Availability and Fault Tolerance: Emerging databases are often designed with
built-in mechanisms for high availability and fault tolerance. This means they can
continue to operate even when there are hardware failures or other issues, ensuring data
reliability.
5. Real-time Data Processing: Some emerging databases excel at handling real-time data
processing and analytics, making them suitable for applications that require low-latency
data access and analysis.
6. Geospatial Data Support: Certain emerging databases offer advanced geospatial data
support, which is crucial for applications like location-based services, geographic
information systems (GIS), and mapping applications.
8. Machine Learning Integration: Some emerging databases come with built-in machine
learning capabilities, allowing you to integrate data analytics and machine learning
directly into the database, streamlining the process of deriving insights from data.
10. Cloud-native Support: Emerging databases are often designed to work seamlessly in
cloud environments, taking advantage of cloud-native features like auto-scaling, pay-
as-you-go pricing, and managed services.
11. Support for Time Series Data: Time series databases, a subcategory of emerging
databases, are optimized for storing and analyzing time-stamped data, making them
suitable for applications like IoT, monitoring, and financial data analysis.
13. Cost Efficiency: Some emerging databases are cost-effective, both in terms of licensing
and hardware requirements, which can be advantageous for organizations with budget
constraints.
It's important to note that the choice of a database system should be based on the specific
requirements and use cases of your application. While emerging databases offer many
advantages, traditional relational databases still have their place in various scenarios. The
decision should take into account factors such as data structure, data volume, query patterns,
and the skillset of your development and operations teams.