Course Intro & Relational Model: Intro To Database Systems Andy Pavlo
Course Intro & Relational Model: Intro To Database Systems Andy Pavlo
Course Intro & Relational Model: Intro To Database Systems Andy Pavlo
01 Relational Model
Wait List
Overview
Course Logistics
Relational Model
Relational Algebra
WA I T L I S T
COURSE OVERVIEW
COURSE OUTLINE
Relational Databases
Storage
Execution
Concurrency Control
Recovery
Distributed Databases
Potpourri
C O U R S E LO G I S T I C S
Academic Honesty:
→ Refer to CMU policy page.
→ If you’re not sure, ask the professors.
→ Don’t be stupid.
TEXTBOOK
COURSE RUBRIC
Homeworks (15%)
Projects (45%)
Midterm Exam (20%)
Final Exam (20%)
Extra Credit (+10%)
HOMEWORKS
PROJECTS
BUSTUB
Architecture:
→ Disk-Oriented Storage
→ Volcano-style Query Processing
→ Pluggable APIs
→ Currently does not support SQL.
L AT E P O L I C Y
P L A G I A R I S M WA R N I N G
D ATA B A S E
D ATA B A S E E X A M P L E
F L AT F I L E S T R AWM A N
F L AT F I L E S T R AWM A N
F L AT F I L E S T R AWM A N
F L AT F I L E S : D ATA I N T E G R I T Y
F L AT F I L E S : I M P L E M E N TAT I O N
F L AT F I L E S : D U R A B I L I T Y
D ATA B A S E M A N A G E M E N T S Y S T E M
E A R LY D B M S s
E A R LY D B M S s
R E L AT I O N A L M O D E L
Edgar F. Codd
CMU 15-445/645 (Fall 2019)
28
D ATA M O D E L S
D ATA M O D E L
D ATA M O D E L
Relational
Key/Value
Graph
← NoSQL
Document
Column-family
Array / Matrix
Hierarchical
Network
CMU 15-445/645 (Fall 2019)
29
D ATA M O D E L
Relational
Key/Value
Graph
Document
Column-family
Array / Matrix ← Machine Learning
Hierarchical
Network
CMU 15-445/645 (Fall 2019)
29
D ATA M O D E L
Relational
Key/Value
Graph
Document
Column-family
Array / Matrix
Hierarchical
← Obsolete / Rare
Network
CMU 15-445/645 (Fall 2019)
29
D ATA M O D E L
R E L AT I O N A L M O D E L
R E L AT I O N A L M O D E L
R E L AT I O N A L M O D E L : P R I M A RY K E Y S
R E L AT I O N A L M O D E L : P R I M A RY K E Y S
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
Artist(id, name, year, country)
id name year country
123 Wu Tang Clan 1992 USA
456 Notorious BIG 1992 USA
789 Ice Cube 1989 USA
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
Artist(id, name, year, country)
id name year country
123 Wu Tang Clan 1992 USA
ArtistAlbum(artist_id, album_id) 456 Notorious BIG 1992 USA
artist_id album_id 789 Ice Cube 1989 USA
123 11
123 22 Album(id, name, artists, year)
789 22 id name artists year
456 22 11 Enter the Wu Tang 123 1993
22 St.Ides Mix Tape ??? 1994
33 AmeriKKKa's Most Wanted 789 1990
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
Artist(id, name, year, country)
id name year country
123 Wu Tang Clan 1992 USA
ArtistAlbum(artist_id, album_id) 456 Notorious BIG 1992 USA
artist_id album_id 789 Ice Cube 1989 USA
123 11
123 22 Album(id, name, year)
789 22 id name year
456 22 11 Enter the Wu Tang 1993
22 St.Ides Mix Tape 1994
33 AmeriKKKa's Most Wanted 1990
R E L AT I O N A L M O D E L : F O R E I G N K E Y S
Artist(id, name, year, country)
id name year country
123 Wu Tang Clan 1992 USA
ArtistAlbum(artist_id, album_id) 456 Notorious BIG 1992 USA
artist_id album_id 789 Ice Cube 1989 USA
123 11
123 22 Album(id, name, year)
789 22 id name year
456 22 11 Enter the Wu Tang 1993
22 St.Ides Mix Tape 1994
33 AmeriKKKa's Most Wanted 1990
D ATA M A N I P U L AT I O N L A N G UA G E S ( D M L )
Procedural: ← Relational
→ The query specifies the (high-level) strategy Algebra
the DBMS should use to find the desired result.
Non-Procedural:
→ The query specifies only what data is wanted
and not how to find it.
D ATA M A N I P U L AT I O N L A N G UA G E S ( D M L )
Procedural: ← Relational
→ The query specifies the (high-level) strategy Algebra
the DBMS should use to find the desired result.
Non-Procedural: ← Relational
→ The query specifies only what data is wanted Calculus
and not how to find it.
R E L AT I O N A L A LG E B R A
R E L AT I O N A L A LG E B R A : S E L E C T
R(a_id,b_id)
Choose a subset of the tuples from a a_id b_id
R E L AT I O N A L A LG E B R A : P R O J E C T I O N
R(a_id,b_id)
Generate a relation with tuples that a_id
a1
b_id
101
contains only the specified attributes. a2 102
→ Can rearrange attributes’ ordering. a2 103
→ Can manipulate the values. a3 104
Πb_id-100,a_id(σa_id='a2'(R))
Syntax: A1,A2,…,An(R) b_id-100 a_id
2 a2
3 a2
R E L AT I O N A L A LG E B R A : U N I O N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
a1 101 a3 103
tuples that appear in either only one a2 102 a4 104
or both input relations. a3 103 a5 105
(R ∪ S)
Syntax: (R ∪ S) a_id b_id
a1 101
a2 102
(SELECT * FROM R) a3 103
UNION ALL a3 103
(SELECT * FROM S); a4 104
a5 105
R E L AT I O N A L A LG E B R A : I N T E R S E C T I O N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains only a_id b_id a_id b_id
a1 101 a3 103
the tuples that appear in both of the a2 102 a4 104
input relations. a3 103 a5 105
Syntax: (R ∩ S) (R ∩ S)
a_id b_id
a3 103
(SELECT * FROM R)
INTERSECT
(SELECT * FROM S);
R E L AT I O N A L A LG E B R A : D I F F E R E N C E
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains only a_id b_id a_id b_id
a1 101 a3 103
the tuples that appear in the first and a2 102 a4 104
not the second of the input relations. a3 103 a5 105
Syntax: (R – S) (R – S)
a_id b_id
a1 101
a2 102
(SELECT * FROM R)
EXCEPT
(SELECT * FROM S);
R E L AT I O N A L A LG E B R A : P R O D U C T
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
a1 101 a3 103
possible combinations of tuples from a2 102 a4 104
the input relations. a3 103 a5 105
(R × S)
Syntax: (R × S) R.a_id
a1
R.b_id
101
S.a_id
a3
S.b_id
103
a1 101 a4 104
a1 101 a5 105
SELECT * FROM R CROSS JOIN S; a2 102 a3 103
a2 102 a4 104
a2 102 a5 105
SELECT * FROM R, S; a3 103 a3 103
a3 103 a4 104
a3 103 a5 105
R E L AT I O N A L A LG E B R A : J O I N
R(a_id,b_id) S(a_id,b_id)
Generate a relation that contains all a_id b_id a_id b_id
tuples that are a combination of two a1 101 a3 103
R E L AT I O N A L A LG E B R A : E X T R A O P E R AT O R S
Rename (ρ)
Assignment (R←S)
Duplicate Elimination (δ)
Aggregation (γ)
Sorting (τ)
Division (R÷S)
O B S E R VAT I O N
R E L AT I O N A L M O D E L : Q U E R I E S
C O N C LU S I O N