Summative Assessment 2 Activity 1

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 9

1.

Summative Assessment 2

ACTIVITY 1- Introduction to Database Management system

1.) A relational database stores and organizes data points that are related to one another. Based on the
relational database model, a relational database presents data sets as a collection of tables and provides
relational operators to manipulate the data in tabular form.

2.a) Entity integrity: requires that every table have a primary key: neither the primary key nor any

part of it, can contain null values

2b.) Constraints: the rules that forces DBMSs to check that data satisfies the semantics

2c.) Referential integrity: requires that a foreign key must have a matching primary key or it must be null

2d.) Identifying relationship: where the primary contains the foreign key, indicated in an ERD by a solid
line.

3.)
DBMS Type Characteristics of DBMS Example
 Data is organized into tables Microsoft SQL Server
 Tables are related to each other Oracle
by means of keys MySQL
 Each row in the table represents IBM DB2
a record, whilst each column in
Relational Database the database represents a
Management (RDBMS) unique attribute of the table.
Each individual field is an actual
data value. Each table can be
accessed by using a unique
column attribute
 RDBMS’s use SQL to query the
tables. Various operations such
as insert, update and delete can
be performed on the database
tables.

The hierarchical DBMS uses the IBM Information Management


hierarchical database structure which is System (IMS)
modelled in a hierarchical, tree-like
structure of different levels. The Windows Registry, Microsoft
information is stored in structured
records that are made up of different Windows O/s2
Hierarchical DBMS segment levels which are further
subdivided into fields. In this database
structure, a record is regarded as the
root segment and all its dependent
segments and fields.
• The hierarchical DBMS supports
parent-child relationships between the
tables.
• Each child table can only have a single
parent table, whilst a parent table can
have multiple child tables. Any data
entries within a child table must have
corresponding entries within the parent
tables.
Network DBMS’s use a tree-shaped data Integrated Database
model structure. In a network database Management System (IDMS)
however, the child nodes (known as
members in the network database) can TurboIMAGE
Network DBMS have more than one parent (known as
an occupier) in the network database. Integrated Data Store (IDS)
Because of this, the network DBMS lends
itself to complicated many-to-many
relationships which can exist between
database elements.

• The network database model looks


similar to a cobweb as it consists of a link
of interconnected network records.

• Network DBMS can also be used in


developments such as file sharing, blogs
and wiki applications
The Object-oriented DBMS represents ObjectDatabase ++
data in the form of objects as
underpinned by the object-oriented ObjectStore
(OO) programming paradigm which
supports concepts such as inheritance,
Object-Oriented DBMS encapsulation and polymorphism. When
compared to relational database
management systems which store data
in tables made up of rows and columns,
object-oriented DBMS do not store data
using relational tables but rather stores
data as objects.

• OODBMS are utilized for applications


that require complex data sets that can
have many-to-many relations. Most
commercial OODBMS follow the
standards set out by the Object Data
Management Group (ODMG) which is
responsible for establishing the
principles of designing object models for
OODBMS.

• OODBMS create independence and


encapsulation of data and object code,
whilst RDBMS, separates the application
program from the data. Furthermore,
RDBMS only store data,

Activity 2 - Database Management System (DBMS) Security


Terms Description
1 Database Security E The aspect of DBMS which deals with features such as
system-related controls and legal, ethical and policy
issues for protecting sensitive information.
2 Access Control A Concerns the control of access to data stored in the
database by using authentication and authorization
processes.
3 Data Storage Security J Implementing appropriate security measures that
safe-guard the physical storage data.
4 Privacy and data protection K Protecting data and the confidentiality of data in the
database by using techniques such Pseudonymisation,
data masking and data scrambling
5 Data Auditing B Keeping track of all database activities, database
operations and access to the data within the
database.
6 Audit Trails H Database log files that keep records of all changes
effected on the database.
7 Database Backup C Is about creating exact copies or replicas of the data
in the database which can be used to restore the
database after failures are experienced.
8 Database Recovery I Restoring the database from system crashes and
system failures back to a consistent state
9 Database Security Risk Plan F Identifies, evaluates and implements security control
plans for an organizations database resources focuses
on preventing security threats, defects and
vulnerabilities to organizational data.
10 Database Security Risk assessment D Consists of 4 steps: identification, assessment,
prevention and migration
11 Data administration G Performed by Database Administrators (DBA’s)

Activity 3 – Database Transaction Processing

1) Describe the different states that a transaction within a database can have.

Active - This is the initial state of a transaction. The transaction stays active while it is executing.

Partially completed - A transaction is partially committed when it has executed the final
statement.

Committed - A transaction is said to be committed once it has successfully executed and the
database has transformed from one consistent state to another consistent state. For example,
the successful effects of a write operation should persist or commit to the database. If it fails,
the entire transaction should be aborted. Once a transaction is committed, the changes can’t be
undone, unless a compensating transaction is executed.
Failed - a transaction is in a failed state if the normal execution of the transaction cannot
proceed.

Aborted - a transaction is said to be aborted when the transaction has rolled back and the database is
being restored to the consistent state prior to the start of the transaction.

2) How do Database Management Systems ensure the recovery of transactions within a


database?

DBMS use what is known as a transaction log which keeps track of all transactions that update the
database. The data is stored in a log which is used by DBMS for recovery (i.e. rollback).

Transaction recovery makes use of various methods such as:


Deferred-updates. This means that transaction operations do not immediately update the
physical database. Instead, only the transaction log is updated. The database is then physically
updated only after the transaction reaches a commit point. During recovery, the DBMS uses the
transaction log records to re-execute the transaction and update to the database accordingly.

Immediate updates- the database is updated immediately by transaction operations during


transaction execution (even before transaction is committed). If transaction is aborted before
completion, a rollback or undo operation is invoked to restore the database to a consistent state

3) Identify and describe the ACID properties of a transaction.

Atomic - This means that all the parts of a transaction must be executed completely, or the
transaction should fail. A transaction can’t be divided into smaller parts similarly to how an atom can’t
be broken down.

Consistent - This means that the data on all systems should reflect the same state. When a
transaction is completed the database is said to be in a consistent state.

Isolated - Transactions do not interact/interfere with one another, transactions act


independently. Data used during execution of a transaction can’t be used by a second transaction until
the first one is complete.

Durability - Once a transaction’s changes are done (committed), they can’t be undone, or lost,
even in the even event of a system failure.

4) Discuss the concept of “Concurrency Control” and why is concurrency control important in
Database Management Systems.

Concurrency control refers to the process of coordinating and managing the simultaneous transactions
in a shared database with multiple users. Essentially, concurrency control ensures that the outcomes of
transactions which are generated concurrently are correct and processed in the most efficient manner
with minimal impact on the performance of the database.
Concurrency control is an important task for the DBMS because it ensures that the simultaneous
execution of transactions doesn’t create problems that relate to data integrity and data consistency such
as:

• Lost Updates: This problem happens when one transaction has not yet committed when a
second transaction starts to execute. The value from the first transaction gets overwritten by the
value of the second transaction. This results in incorrect values in the database.

• Uncommitted Data: this occurs when two transactions are executed concurrently, and the first
transaction is rolled back after the second transaction has already accessed the uncommitted data
values, thus violating the isolation property of transactions.

• Inconsistent Retrievals: This issue occurs when a transaction accesses data before or after the
other transaction has updated successfully.

5) How can concurrency control impact on the performance of the Database Management
System?

Activity 4 – Distributed Database and Data Integration

1) Describe what is meant by a Distributed Database (DDB), and what is the rationale behind
implementing Distributed Database Management Systems.

A distributed database represents multiple interconnected databases spread out across several sites
connected by a network. Since the databases are all connected, they appear as a single database to the
users. Distributed databases utilize multiple nodes. They scale horizontally and develop a distributed
system.

https://www.google.com/search?q=Describe+what+is+meant+by+a+Distributed+Database+(DDB)
%2C&rlz=1C1GCEU_enZA982ZA982&oq=Describe+what+is+meant+by+a+Distributed+Database+(DDB)
%2C+&aqs=chrome...69i57j33i160l5.1993j0j15&sourceid=chrome&ie=UTF-8#imgrc=QsKP4RRCV6OPwM

The rationale behind implementing Distributed Database Management Systems.

 Business operations are becoming increasingly decentralized into different


geographically dispersed sites.
 Organizations have to compete at a global level, so it is essential to ensure global
footprint of operations. This is because customer demands, and market needs favor a
more decentralized management style.
 Companies require rapid ad-hoc data access across multiple sites in quick-response
times for critical decision making.
 There has been an increasing focus on aspects such as data analysis, data warehousing,
big data analytics etc., all which require data replication and distributed queries that
facilitate data extraction and integration.

2) Describe the concept of Data integration in Distributed Database Management Systems.

Data integration in a distributed database refers to the production of union-compatible views


for similar information expressed dissimilarly in different nodes. Such a facility is needed for
location transparency and for easier formulation of global queries over the apparently
incompatible data aggregated from different nodes. This paper examines the issues in data
integration within a relational context, and proposes a solution based on special relational
constructs which produce union-compatible relations.

https://www.semanticscholar.org/paper/Data-Integration-in-Distributed-Databases-Deen-Amin/
3580b0f60bca1730c4d308d4ffcd6894a1ffbafd

3) What are data access tools? Give examples of at least 2 data access tools and also describe the
basic features, limitations and ease of use of these tools.

 Microsoft SQL Server Management Studio


 Oracle SQL Developer
 SAS Enterprise Guide
Data Access Tools
MS SQL Server Management SAS Enterprise Guide
Studio
• Provides a front-end user
• Provides a front-end user interface which allows analysts to
interface which allows analysts to access data and perform data
access data easily and perform analysis.
data analysis (uses T-SQL).
• Frequent data access terms:
• Frequent data access terms: - Data set
- Observations & variables
- Table - Merge & Extract
- Rows & columns
Basic features - Joins & Queries • Contains a host of procedures
which perform specific tasks such
• Easy installation and set-up. as statistical analysis (i.e. ANOVA,
REGRESSION).
• SQL queries are submitted
individually to the database. • SAS has the ability to seamlessly
interact with other data
manipulation tools such as MS
EXCEL.
• SQL can be used within SAS by
using a procedure called PROC
SQL.

• SAS has no column limitations.

• Ideally suited for advanced data


analytics.
• SQL Server has column • SAS is only available
limitations (up to 1024 columns in commercially and is an expensive
non-wide tables, and 30 000 tool.
columns in a wide table)
Limitations • SAS is not available as an open-
• Whilst commercial versions of source product.
Microsoft SQL Server can be
relatively expensive, Open-source
SQL platforms such as MySQL are
free and easily accessible.

• Free course material can be • Free course material can be


accessed readily online. accessed readily online.

• User support community is large. • User support community is vast.

• Depends on user preference, • Depends on user preference


Ease of use some users prefer T-SQL.
4) How do Distributed Database Management Systems impact on the availability, reliability and
performance of a DBMS?

Distributed Databases result in increased availability and reliability of organizational data as faults are
isolated to specific sites without affecting other databases on the network.

Reliability refers to the ability of the database to function according to specifications and produces the
desired set of results for a specified time.

Reliability is used to measure how successful a system is in producing the service it was intended to.
DBMS reliability address transaction atomicity and durability.

This means that the DBMS functions at a specific point in time in such a manner that it fulfils the “all-or-
nothing” property of transactions and that in the event of a failure, the effects of all transactions are
reflected in the database, or none of the effects at all.

Improved Performance –

We can achieve inter-query and intra-query parallelism by executing multiple queries at different sites
by breaking up a query into a number of subqueries that basically executes in parallel which basically
leads to improvement in performance.

https://www.geeksforgeeks.org/advantages-of-distributed-database/

You might also like