Basic Database Concepts
Basic Database Concepts
A database is information that's set up for easy access, management and updating.
Computer databases typically store aggregations of data records or files that contain
information such as sales transactions, customer data, financials and product
information.
Databases are used for storing, maintaining and accessing any sort of data. They
collect information on people, places or things. This information is gathered in one
place so it can be observed and analyzed. Databases can be thought of as an
organized collection of information.
Databases are essential for storing large amounts of data in one place. With
databases, organizations can quickly access, manage, modify, update, organize and
retrieve their data
Business, government and scientific fields use databases for data storage, analysis
and management. Organizations use the data stored in databases to make informed
business decisions. Some of the ways organizations use databases include the
following:
Improve business processes. Companies collect data about business
processes, such as sales, order processing and customer service. They analyze
that data to improve these processes, expand their business and grow revenue.
While the different types of databases vary in schema, data structure and data types
most suited to them, they're all comprised of the following five basic components:
Hardware. This is the physical device that database software runs on. Database
hardware includes computers, servers and hard drives.
Data. This is the raw information that the database stores. Database
administrators organize the data to make it more meaningful.
Data access language. This is the programming language that controls the
database. The programming language and the DBMS must work together. One
of the most common database languages is SQL.
Procedures. These rules determine how the database works and how it handles
the data.
In any given column of a table, all the items are of the same kind, whereas
items in different columns may not be of the same kind.
For a row, each column must have an atomic value, and also, for a row, a
column cannot have more than one value.
All rows of a relation are distinct.
The ordering of rows in a relationship is immaterial.
The column of a relation are assigned distinct names, and the ordering of
these columns is immaterial.
Degree: The number of attributes in a relation determines the degree of the relation.
A relation having three attributes is said to have a relation of degree 3.
Types of databases
There are many types of databases. They're classified according to content type:
bibliographic, full text, numeric and images. In computing, databases are often
classified by the organizational approach they use.
1. There are several types of databases, that are briefly explained below.
2. Hierarchical databases
3. Network databases
4. Object-oriented databases
5. Relational databases
6. Cloud Database
7. Operational Database
8. NoSQL databases
A DBMS provides physical and logical independence from data. Users and
applications don't need to know either the physical or logical locations of data. A
DBMS can also limit and control access to the database and provide different views
of the same database schema to multiple users. Some examples of DBMSes include
Microsoft SQL Server, MySQL and Oracle Database.
After designing the conceptual model of the Database using ER diagram, we need to
convert the conceptual model into a relational model which can be implemented using
any RDBMS language like Oracle SQL, MySQL, etc. So we will see what the Relational
Model is.
The relational model uses a collection of tables to represent both data and the relationships
among those data. Each table has multiple columns, and each column has a unique name.
Tables are also known as relations. The relational model is an example of a record-based
model. Record-based models are so named because the database is structured in fixed-
format records of several types. Each table contains records of a particular type. Each record
type defines a fixed number of fields, or attributes. The columns of the table correspond to
the attributes of the record type. The relational data model is the most widely used data
model, and a vast majority of current database systems are based on the relational model.
Important Terminologies
Attribute: Attributes are the properties that define an entity.
e.g.; ROLL_NO, NAME, ADDRESS
Relation Schema: A relation schema defines the structure of the relation and
represents the name of the relation with its attributes. e.g.; STUDENT
(ROLL_NO, NAME, ADDRESS, PHONE, and AGE) is the relation schema for
STUDENT. If a schema has more than 1 relation, it is called Relational Schema.
Tuple: Each row in the relation is known as a tuple. The above relation contains
4 tuples, one of which is shown as:
NULL Values: The value which is not known or unavailable is called a NULL
value. It is represented by blank space. e.g.; PHONE of STUDENT having
ROLL_NO 4 is NULL.
Relational Model Concepts
As discussed earlier, a relational database is based on the relational model. This database
consists of various components based on the relational model. These include:
Relation: Two-dimensional table used to store a collection of data elements.
Tuple: Row of the relation, depicting a real-world entity.
Attribute/Field: Column of the relation, depicting properties that define the relation.
Attribute Domain: Set of pre-defined atomic values that an attribute can take i.e., it
describes the legal values that an attribute can take.
Degree: It is the total number of attributes present in the relation.
Cardinality: It specifies the number of entities involved in the relation i.e., it is the
total number of rows present in the relation. Read more about Cardinality in DBMS.
Relational Schema: It is the logical blueprint of the relation i.e., it describes the
design and the structure of the relation. It contains the table name, its attributes,
and their types:
For our Student relation example, the relational schema will be:
Super Key:
Super Key is defined as a set of attributes within a table that can uniquely identify each
record within a table. Super Key is a superset of Candidate key. In the table defined
above super key would include student_id, (student_id, name), phone etc.
Candidate Key
Candidate keys are defined as the minimal set of fields which can uniquely identify each
record in a table. It is an attribute or a set of attributes that can act as a Primary Key for a
table to uniquely identify each record in that table. There can be more than one candidate key.
In our example, student_id and phone both are candidate keys for table Student.
A candidate key can never be NULL or empty. And its value should be unique.
Primary key is a candidate key that is most appropriate to become the main key for any table.
It is a key that can uniquely identify each record in a table.
For the table Student we can make the student_id column as the primary key.
Composite Key
Key that consists of two or more attributes that uniquely identify any record in a table is
called Composite key. But the attributes which together form the Composite key are not a
key independentely or individually.
In the above picture we have a Score table which stores the marks scored by a student in a
particular subject.
In this table student_id and subject_id together will form the primary key, hence it is a
composite key.
Foreign Key FOREIGN KEY is a field (or collection of fields) in one table, that refers to the
PRIMARY KEY in another table. The table with the foreign key is called the child table, and
the table with the primary key is called the referenced or parent table. It is a key it acts as a
For example, DNO is a primary key in the DEPT table and a non-key in EMP
Use of Foreign Key
The use of a foreign key is simply to link the attributes of two tables together with the help of
a primary key attribute. Thus, it is used for creating and maintaining the relationship between
What is SQL?
Structured query language (SQL) is a popular query language that is frequently used
in all types of applications. Data analysts and developers learn and use SQL
because it integrates well with different programming languages. For example, they
can embed SQL queries with the Java programming language to build high-
performing data processing applications with major SQL database systems such as
Oracle or MS SQL Server. SQL is also fairly easy to learn as it uses common
English keywords in its statements
History of SQL
SQL was invented in the 1970s based on the relational data model. It was initially
known as the structured English query language (SEQUEL). The term was later
shortened to SQL. Oracle, formerly known as Relational Software, became the first
vendor to offer a commercial SQL relational database management system.
SQL table
A SQL table is the basic element of a relational database. The SQL database table
consists of rows and columns. Database engineers create relationships between
multiple database tables to optimize data storage space.
For example, the database engineer creates a SQL table for products in a store:
Then the database engineer links the product table to the color table with the Color
ID:
SQL statements
SQL statements, or SQL queries, are valid instructions that relational database
management systems understand. Software developers build SQL statements by
using different SQL language elements. SQL language elements are components
such as identifiers, variables, and search conditions that form a correct SQL
statement.
example
Every table is broken up into smaller entities called fields. The fields in the
Customers table consist of CustomerID, CustomerName, ContactName,
Address, City, PostalCode and Country. A field is a column in a table that
is designed to maintain specific information about every record in the
table.
A record, also called a row, is each individual entry that exists in a table.
For example, there are 91 records in the above Customers table. A record
is a horizontal entity in a table.
A column is a vertical entity in a table that contains all information
associated with a specific field in a table.
Data definition language (DDL) refers to SQL commands that design the database
structure. Database engineers use DDL to create and modify database objects
based on the business requirements. For example, the database engineer uses the
CREATE command to create database objects such as tables, views, and indexes.
Data query language (DQL) consists of instructions for retrieving data stored in
relational databases. Software applications use the SELECT command to filter and
return specific results from a SQL table.
The relational engine uses transaction control language (TCL) to automatically make
database changes. For example, the database uses the ROLLBACK command to
undo an erroneous transaction.
What is NoSQL?
NoSQL refers to non-relational databases that don't use tables to store data.
Developers store information in different types of NoSQL databases, including
graphs, documents, and key-values. NoSQL databases are popular for modern
applications because they are horizontally scalable. Horizontal scaling means
increasing the processing power by adding more computers that run NoSQL
software.
SQL vs. NoSQL
Recovery is the process whereby you retrieve and restore that backup data to your
production systems to avoid downtime.
Reliable backups and fast recovery together ensure business continuity and business
resilience.
What are the types of data recovery?
The amount of data organizations create, capture, and store has skyrocketed over the
last decade. And analysts anticipate the amount of new data generated will grow at
more than 50% compounded annually.
Because enterprises and people are storing data in more places, new categories of
backup data recovery have emerged. These include:
Here, you will learn head-to-head comparisons between Backup and Recovery. The
main differences between Backup and Recovery are as follows:
Database index
An index in a database is a data structure that improves the speed of data retrieval operations
on a database table. It acts like a lookup table that allows the database management system
(DBMS) to find data more efficiently without scanning the entire table.
1. Purpose:
o Indexes are used to quickly locate and access data in a database table,
significantly speeding up query performance, especially for large datasets.
2. Types of Indexes:
o Single-column Index: Created on a single column of a table.
o Multi-column (Composite) Index: Created on multiple columns to enhance
queries that filter on more than one column.
o Unique Index: Ensures that all values in the indexed column(s) are unique,
often used for primary keys.
o Full-text Index: Allows for efficient searching of text in large textual data.
3. Structure:
o Indexes are typically implemented using data structures like B-trees or hash
tables, allowing for efficient searching and retrieval.
4. Trade-offs:
o While indexes speed up read operations (SELECT queries), they can slow
down write operations (INSERT, UPDATE, DELETE) because the index needs to
be updated as well.
o Indexes consume additional disk space.
5. Creating an Index:
o In SQL, you can create an index using the CREATE INDEX statement. For
example:
sql
Copy code
CREATE INDEX idx_last_name ON employees(last_name);
6. Using Indexes:
o When a query is executed, the database engine decides whether to use an
index based on its query optimizer and can significantly reduce the number of
rows scanned.
Conclusion: