Updated Chapter 3 Database SQL

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 39

Data Engineering

Working with Database (SQL)


Outcomes
• Database
• Database in Data Engineering
• SQL
• SQL Basics – Data types
• SQL Commands
Database
• Any collection of data or information that is specially organized
for rapid search and retrieval by a computer.
• Databases are structured to facilitate the storage, retrieval,
modification, and deletion of data in conjunction with various
data-processing operations.
• A database is a computer system that holds large amounts of data.
Database in Data Engineering
• Data engineers are expert users of database systems.
• The Databases provide a structured way to store, retrieve, and manage
data efficiently.
• Databases enforce rules and constraints to ensure the accuracy and
reliability of data. E.g. Age column can’t be negative.
• Databases facilitate various data processing tasks, including data
transformation, aggregation, and analysis. Quires help in performing
complex queries and data manipulations.
Database- SQL
• SQL (Structured Query Language).
• SQL: is a programming language for
storing and processing information in a
relational database.
• Relational database: stores information in
tabular form, with rows and columns
representing different data attributes and the
various relationships between the data
values.

Rational Database example


SQL
• SQL plays a significant role in the entire data engineering project lifecycle, from data extraction to data
transformation and loading.
• It is a programming language that is commonly used by data engineers to interact with databases and
perform various tasks, such as extracting data, manipulating data, and creating and modifying tables.
• Learning SQL can help data engineers work more effectively with data analysts and data scientists as
they share a common language for querying and analyzing data: When all team members understand
SQL, they can communicate more effectively about how data is structured, accessed, and manipulated.
• SQL skills are highly transferable as they can be applied to a wide range of databases and data
management tools.
• Knowing SQL helps data engineers optimize data infrastructures for better performance and efficiency.
Importance of SQL for Data Engineers
 Data Integration and Analytical Queries
• Ex: Merging customer data from different sources to create a unified customer profile
 Database and Table Modification.
• Ex: Using SQL to modify database and table structures.
 Data Manipulation and Definition Languages (DML & DDL)
• Ex: Creating a new table and inserting data into it
 Data Analysis and Validation.
• Ex: Checking for duplicate customer entries.
 Data Loading into Data Warehouses.
 Ex: Importing data using SQL commands like COPY INTO.
 Data Transformation
• Example: Using SQL for data transformations like concatenation and date formatting.
SQL Queries
• Main component in SQL
• Queries are requests made to the database management system for
specific information
• As the database's structure becomes more and more complex, it
becomes more difficult to get the specific pieces of information we
want.
• A Google search is a query
SQL Basics

• Tables:
• A table is a collection of related data held in a
structured format within a database. It consists of rows
and columns.
• Each table has a unique name that identifies it within customer_id name email phone_number
the database. John
1 john@example.com 123-456-7890
• A database most often contains one or more tables. Doe
• Rows: 2
Jane
jane@example.com 987-654-3210
• Rows, also known as records or tuples, represent Smith
individual entries in the table. Each row in a table is a Bob
3 bob@example.com 555-666-7777
unique instance of data. Brown
• Columns:
• also known as fields or attributes, represent the
structure of the data in the table. Each column has a
specific data type and contains all the information of a
single type.
SQL Basics – Data types
1. Numeric data types (INT, TINYINT, BIGINT, FLOAT, REAL, etc.)
2. Date and Time data types (DATE, TIME, DATETIME, etc.)
3. Character and String data types (CHAR, VARCHAR, TEXT, etc.)
4. Unicode character string data types (NCHAR, NVARCHAR, NTEXT, etc.)
5. Binary data types (BINARY, VARBINARY, etc.)
6. Miscellaneous data types (CLOB, BLOB, XML, CURSOR, TABLE, etc.)
SQL Basics – Data types Numeric Data Types

Data Type Definition Example


INT A standard integer data type. INT age = 25;
TINYINT A very small integer, usually 1 byte in size. TINYINT flag = 1;
A large integer data type for storing very large
BIGINT BIGINT population = 9000000000;
numbers.
A floating-point number with approximate
FLOAT FLOAT price = 19.99;
precision.
A floating-point number, typically with higher
REAL REAL distance = 12345.6789;
precision than FLOAT.
A fixed-point number with exact precision, defined
DECIMAL DECIMAL(10, 2) salary = 1234.56;
by scale and precision.
Similar to DECIMAL, used for exact numeric
NUMERIC NUMERIC(8, 3) weight = 65.123;
values.
SQL Basics – Data types Date and Time Data Types

Data Type Definition Example

DATE Stores date values (year, month, day). DATE birthdate = '1990-01-01';

TIME Stores time values (hour, minute, second). TIME startTime = '13:45:30';

DATETIME Stores date and time values. DATETIME event = '2023-08-07 12:00:00';

Stores date and time with time zone TIMESTAMP ts =


TIMESTAMP
information, often used for tracking changes. CURRENT_TIMESTAMP;

YEAR Stores year values. YEAR year = 2024;


SQL Basics – Data types Character and String Data Types

Data Type Definition Example


Fixed-length character
CHAR CHAR(10) code = 'ABC';
string.
Variable-length character VARCHAR(50) name =
VARCHAR
string. 'John Doe';

Variable-length string used TEXT description = 'This is


TEXT
for large amounts of text. a long text.';
SQL Commands
1. Data Definition Language (DDL)
• DDL changes the structure of the table like creating a table, deleting a table, altering a table, etc.
Command Description
CREATE Creates a new table, a view of a table, or other object in the database.
ALTER Modifies an existing database object, such as a table.
DROP Deletes an entire table, a view of a table, or other object in the database.

2. Data Manipulation Language (DML)


• DML commands are used to modify the database. It is responsible for all forms of changes in the
database values.
Command Description
INSERT Creates a record.
UPDATE Modifies records.
DELETE Deletes records.
Data Definition Language (DDL)
1. Create:
• This statement is used for creating tables/databases and all its objects, including the tables, functions,
views, etc.
• A CREATE statement should be written using the following syntax:

• Example : Performing Create a query for the Customers table


Data Definition Language (DDL)
1. Create:
Data Definition Language (DDL)
2. Alter:
• This statement is used to change database structures and/or objects currently present in the database.
• The following syntax can be used for adding a column in the existing table:

ALTER TABLE table_name


**Make your wanted modifications

• **Make your wanted modifications: Alter also could be:


 Add a new column
 Drop a column
 Rename a column/table
Data Definition Language (DDL)
2. Alter:
 Add a new column
• Syntax:

• Example: add a new column (attribute) called Email:


Data Definition Language (DDL)
Email column is
added
2. Alter:
Data Definition Language (DDL)
• Alter:
 Drop a column
• Syntax:

ALTER TABLE table_name


DROP COLUMN column_name

• Example: drop the Email column:

ALTER TABLE Customers


DROP COLUMN Email
Data Definition Language (DDL)
2. Alter:
 Rename a column/table
• Syntax rename a table:

ALTER TABLE table_name


RENAME TO new_name

• Example: change the Customers table to clients:

ALTER TABLE Customers


RENAME TO clients
Data Definition Language (DDL)
3. Drop:
• This statement is used to delete a database object permanently.
• The following syntax should be used when writing a DROP statement to drop a table:

DROP TABLE table_name

• Ex: drop the clients table:

DROP TABLE clients


Data Manipulation Language (DML)
Data Manipulation Language (DML)
• INSERT:
• Data records or rows are added in a database table using the INSERT SQL command. In an INSERT statement, the
data value to be inserted and the column names for which the entry must be made are both specified.
• The following syntax is used for writing INSERT statements in SQL:

• Performing INSERT query to Customers table:


Data Manipulation Language (DML)
• INSERT:
Data Manipulation Language (DML)
• UPDATE:
• A database table's existing column values can be changed by using the UPDATE command or statement.
• The following syntax must be used when writing an UPDATE statement:

• Performing UPDATE query to Customers table:


Data Manipulation Language (DML)
• UPDATE:

Change to
Data Manipulation Language (DML)
• DELETE:
• One or more rows can be deleted from a database table using the DELETE statement in SQL.
• The data records are not permanently deleted. A DELETE command can always be reversed by using a rollback action.
• The following is the syntax for writing a DELETE statement:

• Performing DELETE query to the Customers table where the customer name is Matt Gregory:
Data Manipulation Language (DML)
• DELETE:

Change to
SQL Query (Select statement)
Select Query
• It is used to retrieve information from the database.
• Select statement doesn’t change the database; it just retrieves the data.
Simple Select Query
Product PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

SELECT
SELECT **
FROM
FROM Product
Product
WHERE
WHERE category=‘Gadgets’
category=‘Gadgets’

PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
“selection”
Select Query

• Retrieve specific information from the data.


• Basic form: (plus many many more bells and whistles)

SELECT
SELECT <attributes>
<attributes>
FROM
FROM <oneorormore
<one morerelations>
relations>
WHERE
WHERE <conditions>
<conditions>
Select Query
Assume that we have the following table:

1. Retrieving All the data:

SELECT * FROM students


Select Query

2. Retrieving Single Column Data:


Ex: “first_name” from the students

Syntax: SELECT first_name FROM students

Output:
Select Query

2. Retrieving Multiple Column Data:


Ex: “first_name” , “fees” from the students

Syntax: SELECT first_name, fees FROM students

Output:
Select Query

2. Retrieving Data based on a specific condition:


Ex: retrieve email where the fees are larger than 40000

SELECT email FROM students


Syntax:
WHERE fees > 40000

Output:
Select Query
• You can apply several select queries.

• The column name is case-sensitive

• The reserved words (e.g., select, update, delete…etc.) are not case-sensitive.

• Select stamen doesn’t change the database; it just retrieves the data.
The End
Thanks!

You might also like