DATABASE
DESIGN and
SQL
Database Design Understanding Phase
• Conceptual Database Design
• Logical Database Design
• Physical Database Design
Conceptual Database Design
• The Conceptual Database Design is used to develop high-level
data descriptions.
• This phase is carried out using an Entity-Relationship (E-R) model.
• Its purpose is to create a simplified representation of data that
aligns with the user's thinking or perspective.
ERD Component
Notation Description
Entities are a set of objects that can be uniquely and
distinctly identified from one another.
Each entity will invariably have attributes that serve to
explain or describe the characteristics of that entity.
Relationships in an Entity-Relationship Diagram (ERD)
denote the connections between one or more entities;
these relationships are represented through processes.
In an ERD, lines not only act as connectors but can also
indicate the direction or flow of the ERD.
Logical Database Design
• Transforming an Entity-Relationship (E-R) schema into a relational
database schema.
• Creating queries during the database or relational database
development.
• Converting the existing E-R schema into a relational database
schema.
• The objective is to obtain a conceptual schema in the relational
data model commonly referred to as a logical schema.
Database Normalization
The process of grouping data attributes to form simple,
non-redundant, flexible, and easily adaptable entities,
ensuring the creation of a high-quality database.
Objectives of normalization
● To eliminate and
reduce data
redundancy.
● To ensure data
dependencies so that
data resides in the
appropriate tables.
● To minimize
database storage
capacity usage.
Types of Normalization
● 1NF
● 2NF
● 3NF
● EKNF
● BCNF
● 4NF
● 5NF
● DKNF
● 6NF
Unnormalize
● Non-normal form represents a collection of data stored without adhering to a
specific format requirement.
● In non-normal form, there exists a repeating group, which in this condition, poses
data manipulation challenges (insert, update, and delete), commonly referred to as
anomalies.
customer_id title cluster city
1 sabun cair a a jakarta
sabun cair b
2 hp b depok
1 Normal Form
● The function of the first normal form is as follows:
1. Eliminating duplicate columns from the same table.
2. Creating separate tables for each related data group and identifying each row with a
unique column.
customer_id title cluster city
1 sabun cair a a jakarta
1 sabun cair b a jakarta
2 hp b depok
2 Normal Form
● The function of the second normal form is as follows:
1. Removing some subsets of data from the table and placing them in separate tables.
2. Creating relationships between the new tables and the original table by establishing
foreign keys.
3. Ensuring that no attribute in the table is functionally dependent on a candidate key of
that table.
customer_id title cluster city
1 sabun cair a a jakarta
1 sabun cair b a jakarta
2 hp b depok
2 Normal Form
id title
a transaksi di atas 5.000.000
b transaksi di bawah 5.000.000
customer_id title cluster city
1 sabun cair a a jakarta
1 sabun cair b a jakarta
2 hp b depok
3 Normal Form
● The function of the third normal form is as follows:
1. Eliminating partial 'transitive dependencies' within a table.
2. Transitive dependencies typically occur in relation tables or in
situations where there are three attributes A, B, and C. The condition
is A ⇒ B and B ⇒ C, making C a transitive dependency on A through
B.
3. If there is an attribute that depends on something other than the
primary key, it needs to be separated into a new table.
3 Normal Form
id title price id title
ab_1 sabun cair a 17000 a transaksi di atas 5.000.000
ab_2 sabun cair b 18000 b transaksi di bawah 5.000.000
hp_1 hp 10000000
customer_id product_code cluster city
1 ab_1 a jakarta
1 ab_2 a jakarta
2 hp_1 b depok
Physical Design
● Defining a set of table structures, data types for fields, and constraints on these
tables, such as primary keys, foreign keys, and not null.
● Identifying specialized storage structures and access methods for efficient
data retrieval.
● Designing security features for the database system, including creating
accounts, granting/revoking privileges, access protection, and setting security
levels.
Physical Database Design
• This stage involves creating indexes on several tables, grouping
some tables, or involving substantial redesign of certain parts
of the database schema.
• Typically, this is done directly within a DBMS like MySQL ,
PostgreSQL or a similar system.
Data Models
• Entity Relationship Diagram (ERD).
• An entity is represented in the form of a rectangle.
• Each entity is depicted with a set of attributes.
• Attributes describe specific characteristics of an entity and
are represented in the form of ellipses.
• A relationship is represented by a diamond shape (or
rhombus) that connects each entity.
• The connections between entities and relationships or
entities and their attributes are depicted as lines.
• The degree of a relationship/cardinality indicates the
degree of the relationship between entities and is
expressed in the form of 1:1, 1:N, N:1, N:M.
Data Models
• Entity
• Entities are objects that can be distinguished in the real world.
• A set of entities is a collection of similar objects, which can be
either:
• Physical objects: Houses, Vehicles, Equipment
• Conceptual objects: Jobs, Companies, Plans
• Types of Entities
• Composite Entities: Entities formed as a result of many-to-many
relationships (N:M).
• Weak Entities: Entities whose existence is highly dependent on
other entities and do not have a primary key."
Data Models
• Attribute
• An attribute is a characteristic of an entity or relationship that provides
detailed information about that entity or relationship.
• The value of an attribute is the actual data or information stored within
that entity's or relationship's attribute.
• Type of Attributes
• Key Attribute: An attribute that uniquely identifies an entity.
• Simple Attribute: An attribute that holds a single value.
• Multi-value Attribute: An attribute that holds a set of values for each entity
instance.
• Composite Attribute: An attribute that consists of several smaller
attributes, each with a specific meaning.
• Derived Attribute: An attribute that is generated from other attributes.
Data Models
Data Models
Data Models
• Relationship
• Relationship is a connection that exists between one or
more entities. A set of relationships is a collection of similar
relationships.
• A set of relationships is a collection of similar relationships.
• The degree of a relationship describes the number of
entities participating in a relationship, which includes:
• Unary degree
• Binary degree
• Ternary degree
Data Models
Data Models
• Cardinality Ratio Relationship
• Cardinality of a relationship indicates the maximum
number of entities that can be related to other entities in
another entity set.
• There are three types of relationship cardinalities, namely:
• One to One (1 : 1)
• One to Many (1 : N)
• Many to Many (N : M)
Data Models
Data Models
Data Models
Data Models
Data Models
Data Models
• Advantages and Disadvantages Model Data E-R
• Advantages
• Conceptually simple
• Easily understandable through visual representation
• Effective communication tool in database design
• Integration with the relational database model
• Disadvantages
• Limited representation of rules and constraints
• Limited representation of relationships
• Lack of a language for data manipulation
• Potential loss of informational content
SQL Introduction
Standard language for querying and manipulating data
Structured Query Language
Many standards out there:
• ANSI SQL
• SQL92 (a.k.a. SQL2)
• SQL99 (a.k.a. SQL3)
• Vendors support various subsets of these
• What we discuss is common to all of them
Command SQL
Data Definition Language (DDL)
Data Manipulation Language (DML)
•
Transact-SQL, dll
Data Types in SQL
Characters:
• CHAR(20) -- fixed length
• VARCHAR(40) -- variable length
Numbers:
• BIGINT, INT, SMALLINT, TINYINT
• REAL, FLOAT -- differ in precision
Times and dates:
• DATE
• DATETIME -- SQL Server
Others...
Tables in SQL
Table name Attribute
names
Product
PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Tuples or
Powergizmo $29.99 Gadgets GizmoWorks
rows
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Tables Explained
• The schema of a table is the table name and its attributes:
Product(PName, Price, Category, Manfacturer)
• A key is an attribute whose values are unique;
we underline a key
Product(PName, Price, Category, Manfacturer)
SQL Query
Basic form: (plus many many more bells and
whistles)
SELECT attributes
FROM relations (possibly
multiple)
WHERE conditions (selections)
Simple SQL Query
Product PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
SELECT *
FROM Product
WHERE
category=‘Gadgets’
“selection
PName Price Category Manufacturer
” Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
Simple SQL Query
Product PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
SELECT PName, Price,
Manufacturer
FROM Product
WHERE Price > 100
PName Price Manufacturer
“selection” SingleTouch $149.99 Canon
and
“projection” MultiTouch $203.99 Hitachi
A Notation for SQL Queries
Input
Schema
Product(PName, Price, Category, Manfacturer)
SELECT PName, Price,
Manufacturer
FROM Product
WHERE Price > 100
Answer(PName, Price, Manfacturer)
Output
Schema
Eliminating Duplicates
Category
SELECT DISTINCT
category Gadgets
FROM Product Photography
Household
Compare
to:
Category
Gadgets
SELECT category
FROM Product Gadgets
Photography
Household
Ordering the Results
SELECT pname, price, manufacturer
FROM Product
WHERE category=‘gizmo’ AND price >
50
ORDER BY price, pname
Ordering is ascending, unless you specify the DESC keyword.
Ties are broken by the second attribute on the ORDER BY list,
etc.
Ordering the Results
SELECT
category
FROM Product
ORDER BY
pname
PName Price Category Manufacturer
?
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Ordering the Results
Category
SELECT DISTINCT
category Gadgets
FROM Product Household
ORDER BY category
Photography
Compare
to:
SELECT
category
FROM Product
ORDER BY
pname
?
Joins in SQL
• Connect two or more tables:
Product PName Price Category Manufacturer
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Company Cname StockPrice Country
What is
the connection GizmoWorks 25 USA
between
them ? Canon 65 Japan
Hitachi 15 Japan
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
Find all products under $200 manufactured in Japan;
return their names and prices.
Join
between Product
and Company
SELECT pname, price
FROM Product, Company
WHERE manufacturer=cname AND
country=‘Japan’
AND price <= 200
Joins in SQL
Product
Company
PName Price Category Manufacturer Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks
Canon 65 Japan
SingleTouch $149.99 Photography Canon
Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi
SELECT pname, price
FROM Product, Company
WHERE manufacturer=cname AND country=‘Japan’
AND price <= 200
PName Price
SingleTouch $149.99
Joins
Product (pname, price, category, manufacturer)
Company (cname, stockPrice, country)
Find all countries that manufacture some product in the ‘Gadgets’ category.
SELECT country
FROM Product, Company
WHERE manufacturer=cname AND
category=‘Gadgets’
Joins in SQL
Product
Company
Name Price Category Manufacturer
Cname StockPrice Country
Gizmo $19.99 Gadgets GizmoWorks
GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks
Canon 65 Japan
SingleTouch $149.99 Photography Canon
Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi
SELECT country
FROM Product, Company
WHERE manufacturer=cname AND
category=‘Gadgets’ Country
??
What is ??
the problem ?
What’s the
solution ?
Joins
Product (pname, price, category, manufacturer)
Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)
Find names of people living in Seattle that bought some product in the ‘Gadgets’
category, and the names of the stores they bought such product from
SELECT DISTINCT persname, store
FROM Person, Purchase, Product
WHERE persname=buyer AND product = pname
AND
city=‘Seattle’ AND category=‘Gadgets’
Disambiguating Attributes
• Sometimes two relations have the same
attr:
Person(pname, address, worksfor)
Company(cname, address)
SELECT DISTINCT pname, Which
address address
FROM Person, Company ?
WHERE worksfor = cname
SELECT DISTINCT Person.pname,
Company.address
FROM Person, Company
WHERE Person.worksfor = Company.cname
Tuple Variables
Product (pname, price, category,
manufacturer)
Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)
Find all stores that sold at least one product that the
store
‘BestBuy’ also sold:
SELECT DISTINCT x.store
FROM Purchase AS x, Purchase AS y
WHERE x.product = y.product AND y.store =
‘BestBuy’
Answer
(store)
SQL – Select
Select <List of Columns and expressions (usually involving columns)>
From <List of Tables & Join Operators>
Where <List of Row conditions joined together by And, Or, Not>
Group By <list of grouping columns>
Having <list of group conditions connected by And, Or, Not >
Order By <list of sorting specifications>
JOIN
Join is a way to link
data retrieved from
tables through a
column that links
them.
How many types of joins are there?
• There are 4 types of Join
Tables, namely:
1. Inner Join
2. Left Join
3. Right Join
4. Full Join
Inner Join
Inner Join
• The inner join is
probably the most
widely used join type.
Inner join returns rows
from two or more tables
that meet the
conditions.
• Example :
SELECT columns FROM TableA
INNER JOIN TableB
ON A.columnName = B.columnName;
Left [Outer] Join
• A left outer join (often
abbreviated as left join)
will return all rows from
the table on the left that
are subject to the ON
condition and only rows
from the table on the
right that meet the join
condition.
• Example :
SELECT columns FROM TableA
LEFT OUTER JOIN TableB
ON A.columnName = B.columnName;
Left [Outer] Join without Intersection
• This join is a variation of
the left outer join. In this
join, we will only retrieve
data from the left table
which is subject to the ON
condition which also
fulfills the join condition
without data from the
right-hand table that
fulfills the join condition.
• Example :
SELECT columns FROM TableA
LEFT OUTER JOIN TableB
ON A.columnName = B.columnName
WHERE B.columnName IS NULL;
Right [Outer] Join
• A right outer join (often
abbreviated as right join)
will return all rows from
the right-hand table that
are subject to the ON
condition with data from
the left-hand table that
satisfy the join condition.
This technique is the
opposite of the left outer
join.
• Example :
SELECT columns FROM TableA
RIGHT OUTER JOIN TableB
ON A.columnName = B.columnName;
Right [Outer] Join without Intersection
• This technique is a
variation of the right outer
join. In this join, we will
only retrieve data from the
right-hand table that is
subject to the ON
condition which also
fulfills the join condition
without data from the
right-hand table that
meets the join condition.
• Example :
SELECT columns FROM TableA
RIGHT OUTER JOIN TableB
ON A.columnName = B.columnName
WHERE A.columnName IS NULL;
Full [Outer] Join
• A full outer join (often
abbreviated as full join)
will return all rows from
both tables that are ON
including data that are
NULL.
• Example :
SELECT columns FROM TableA
FULL JOIN TableB
ON A.columnName = B.columnName;
Full [Outer] Join without Intersection
• Another variation of the
full outer join will return
all data from both tables
that are ON without any
data having NULL
values.
• Example :
SELECT columns FROM TableA
FULL JOIN TableB
ON A.columnName = B.columnName
WHERE A.columnName IS NULL
OR B.columnName IS NULL;
ERD
ETL Pipeline
Data Architecture
Reference
Textbooks
1. Date, C.J. 2000, An Introduction to Database
System, Addison Wesley Publishing Company,
Vol. 7, New York.
2. Fathansyah, 1999, Basis Data, Informatika,
Bandung.
References
3. Elmasri, Ramez; Navathe, Shamkant B., 2001,
Fundamentals of Database Systems, The
Benjamin/ Cummings Publishing Company, Inc.,
California.
Project
(1)
Toko A membutuhkan sistem manajemen inventaris
modern untuk mengotomatiskan proses,
meningkatkan keakuratan data, dan meningkatkan
kontrol inventaris secara keseluruhan. Sistem ini
harus memungkinkan toko untuk mengoptimalkan
tingkat stok, mengurangi biaya inventaris, dan
meningkatkan kepuasan pelanggan.
Project
(2)
Sebuah Fitness Center membutuhkan sistem
manajemen hubungan pelanggan (CRM) modern
untuk mengotomatiskan proses, mendapatkan
wawasan pelanggan, dan meningkatkan kampanye
pemasaran yang dipersonalisasi. Sistem ini harus
memungkinkan pusat kebugaran untuk
meningkatkan retensi pelanggan, meningkatkan
penjualan keanggotaan, dan memperkuat loyalitas
merek.
Project
(3)
PNM (Permodalan Nasional Mandiri) menyalurkan dana pinjaman
usaha kepada pelaku usaha mikro di seluruh Indonesia. Proses
pengadaan barang dan jasa untuk mendukung operasional PNM saat
ini masih manual dan memakan waktu. Hal ini berdampak pada
lamanya waktu pengadaan, kurangnya transparansi, dan berpotensi
terjadi kesalahan. PNM membutuhkan sistem pengadaan yang lebih
efisien dan transparan untuk mengelola proses pengadaan barang
dan jasa. Sistem ini harus dapat menyederhanakan proses
permintaan dan persetujuan pengadaan., meningkatkan kolaborasi
antar departemen terkait pengadaan, memastikan kepatuhan
terhadap peraturan pengadaan pemerintah, mengurangi resiko
kesalahan dan kecurangan dalam proses pengadaan, menyediakan
data dan analitik untuk pengambilan keputusan yang lebih baik.
Project
(4)
PNM (Permodalan Nasional Mandiri) menyalurkan berbagai barang
modal kepada pelaku usaha mikro binaannya. PNM saat ini masih
menggunakan sistem manual untuk mengelola stok barang modal
tersebut. Hal ini berpotensi menimbulkan ketidakakuratan data stok,
kesulitan pelacakan barang, dan keterlambatan pengiriman barang
ke penerima. PNM membutuhkan sistem manajemen stok yang lebih
akurat dan efisien untuk mengelola barang modal yang disalurkan
kepada pelaku usaha mikro. Sistem ini harus dapat melacak
pergerakan barang modal secara real-time, mencegah terjadinya
stok barang modal yang berlebihan atau kekurangan, mempercepat
proses pengiriman barang modal ke penerima, menyederhanakan
proses pengembalian barang modal yang rusak.
Project
(5)
Rumah Sakit Sehat Bersama, sebuah rumah sakit swasta di Jakarta, mengalami
kesulitan dalam mengelola pengadaan barang medis. Proses pengadaannya
masih manual dan memakan waktu, sering terjadi kekurangan stok barang medis,
dan kurangnya transparansi dalam proses pengadaan. Hal ini berakibat pada
terhambatnya pelayanan medis kepada pasien. Rumah Sakit Sehat Bersama
membutuhkan sistem pengadaan barang medis yang lebih efisien, transparan,
dan terintegrasi dengan sistem informasi rumah sakit lainnya. Sistem ini harus
dapat menyederhanakan proses permintaan dan persetujuan pengadaan barang
medis, meningkatkan kolaborasi antar departemen terkait pengadaan,
memastikan kepatuhan terhadap peraturan pengadaan barang dan jasa di sektor
kesehatan, mengurangi risiko kesalahan dan kecurangan dalam proses
pengadaan, menyediakan data dan analitik untuk pengambilan keputusan yang
lebih baik terkait pengadaan barang medis, terintegrasi dengan sistem informasi
rumah sakit lainnya, seperti sistem inventaris dan sistem keuangan.
Thank You
G2Academy Xtion
Lippo St. Moritz Office Tower (Floor 9, Unit 901)
Jl. Puri Indah, South Kembangan, Kembangan,
West Jakarta City 11610