0% found this document useful (0 votes)
15 views

Lec04 SQL Aggregates

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views

Lec04 SQL Aggregates

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 53

Introduction to Database Systems

Lectures 4: Joins & Aggregation


(Ch. 6.1-6.4)

1
Outline
• Inner joins (6.2, review)
• Outer joins (6.3.8)
• Aggregations (6.4.3 – 6.4.6)

2
UNIQUE
• PRIMARY KEY adds implicit “NOT NULL” constraint
while UNIQUE does not
– you would have to add this explicitly for UNIQUE:

CREATE TABLE Company(


name VARCHAR(20) NOT NULL, …
UNIQUE (name));
• You almost always want to do this (in real schemas)
– SQL Server behaves strangely with NULL & UNIQUE
– otherwise, think through NULL for every query
– you can remove the NOT NULL constraint later

3
(Inner) Joins
SELECT a1, a2, …, an
FROM R1, R2, …, Rm
WHERE Cond

for t1 in R1: (Nested loop


for t2 in R2: semantics)
...
for tm in Rm:
if Cond(t1.a1, t1.a2, …):
output(t1.a1, t1.a2, …, tm.an)

4
(Inner) joins
Company(cname, country)
Product(pname, price, category, manufacturer)
– manufacturer is foreign key

SELECT DISTINCT cname


FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

5
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

6
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname
Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA


7
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because country != ‘USA’


(also cname != manufacturer)
8
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because country != ‘USA’


9
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because category != ‘gadget’ (and …)


10
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because category != ‘gadget’


11
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because category != ‘gadget’


12
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

Not output because category != ‘gadget’ (with any Company)


13
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Camera Photo Hitachi Canon Japan

OneClick Photo Hitachi Hitachi Japan

restrict to category = ‘gadget’


14
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product (where category = ‘gadget’) Company


pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Canon Japan

Hitachi Japan

restrict to country = ‘USA’


15
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Product (where category = ‘gadget’) Company (where country = ‘USA’)


pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

Now only one combination to consider

(Query optimizers do this too.)


16
(Inner) joins
SELECT DISTINCT cname
FROM Product, Company
WHERE country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Alternative syntax:

SELECT DISTINCT cname


FROM Product JOIN Company ON
country = ‘USA’ AND category = ‘gadget’AND
manufacturer = cname

Emphasizes that the predicate is part of the join.


17
Self-Joins and Tuple Variables
• Ex: find companies that manufacture both products in
the ‘gadgets’ category and in the ‘photo’ category
• Just joining Company with Product is insufficient:
need to join Company with Product with Product
FROM Company, Product, Product

• When a relation occurs twice in the FROM clause we


call it a self-join; in that case every column name in
Product is ambiguous (why?)
– are you referring to the tuple in the 2nd or 3rd loop?

19
Name Conflicts we used cname / pname
to avoid this problem

• When a name is ambiguous, qualify it:


WHERE Company.name = Product.name AND …

• For self-join, we need to distinguish tables:


FROM Product x, Product y, Company

• These new names are called “tuple variables”


– can think of as name for the variable of each loop
– can also write “Company AS C” etc.
– can make SQL query shorter: C.name vs. Company.name

19
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

SingleTouch photo Hitachi Hitachi Japan

MultiTouch photo GizmoWorks

20
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
x pname category manufacturer cname country

Gizmo gadget GizmoWorks GizmoWorks USA

SingleTouch photo Hitachi Hitachi Japan

MultiTouch photo GizmoWorks

21
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
x pname category manufacturer cname country
y Gizmo gadget GizmoWorks GizmoWorks USA

SingleTouch photo Hitachi Hitachi Japan

MultiTouch photo GizmoWorks

22
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
x pname category manufacturer cname country z
y Gizmo gadget GizmoWorks GizmoWorks USA

SingleTouch photo Hitachi Hitachi Japan

MultiTouch photo GizmoWorks

restrict to country = ‘USA’


Not output because y.category != ‘photo’ 24
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
x pname category manufacturer cname country z
Gizmo gadget GizmoWorks GizmoWorks USA

y SingleTouch photo Hitachi Hitachi Japan

MultiTouch photo GizmoWorks

Not output because y.manufacturer != cname


25
Self-joins
SELECT DISTINCT z.cname
FROM Product x, Product y, Company z
WHERE z.country = ‘USA’
AND x.category = ‘gadget’
AND y.category = ‘photo’
AND x.manufacturer = cname
AND y.manufacturer = cname;
Product Company
x pname category manufacturer cname country z
Gizmo gadget GizmoWorks GizmoWorks USA

SingleTouch photo Hitachi Hitachi Japan

y MultiTouch photo GizmoWorks

x.pname x.category x.manufacturer y.pname y.category y.manufacturer z.cname z.country

Gizmo gadget GizmoWorks MultiTouch Photo GizmoWorks GizmoWorks 26USA


Outer joins
Product(name, category)
Purchase(prodName, store) -- prodName is foreign key
SELECT Product.name, …, Purchase.store
FROM Product, Purchase
WHERE Product.name = Purchase.prodName

Or equivalently:
SELECT Product.name, …, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName

But some Products may not be not listed. Why? 27


Outer joins
Product(name, category)
Purchase(prodName, store) -- prodName is foreign key

If we want to include products that never sold,


then we need an “outer join”:

SELECT Product.name, …, Purchase.store


FROM Product LEFT OUTER JOIN Purchase ON
Product.name = Purchase.prodName

28
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

28
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

29
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

30
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

31
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

32
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

33
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

34
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

Camera Wiz

35
SELECT Product.name, Purchase.store
FROM Product JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

Camera Wiz

36
SELECT Product.name, Purchase.store
FROM Product LEFT OUTER JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

Camera Wiz

37
SELECT Product.name, Purchase.store
FROM Product LEFT OUTER JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category ProdName Store

Gizmo gadget Gizmo Wiz

Camera Photo Camera Ritz

OneClick Photo Camera Wiz

Name Store

Gizmo Wiz

Camera Ritz

Camera Wiz

OneClick NULL

38
SELECT Product.name, Purchase.store
FROM Product RIGHT OUTER JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category
ProdName Store
Gizmo gadget
Gizmo Wiz
Camera Photo
Camera Ritz
OneClick Photo
Camera Wiz

Name Store Phone Foo

Gizmo Wiz

Camera Ritz

Camera Wiz

NULL Foo

39
SELECT Product.name, Purchase.store
FROM Product FULL OUTER JOIN Purchase ON
Product.name = Purchase.prodName
Product Purchase
Name Category
ProdName Store
Gizmo gadget
Gizmo Wiz
Camera Photo
Camera Ritz
OneClick Photo
Camera Wiz

Name Store Phone Foo

Gizmo Wiz

Camera Ritz

Camera Wiz

OneClick NULL
l
NULL Foo
40
Outer Joins
• Left outer join:
– Include the left tuple even if there’s no match
• Right outer join:
– Include the right tuple even if there’s no match
• Full outer join:
– Include both left and right tuples even if there’s no
match

• (Also something called a UNION JOIN, though it’s rarely used.)


• (Actually, all of these are used much more rarely than inner joins.)

41
Outer Joins Example

See lec04-sql-outer-joins.sql…

42
Aggregation in SQL
>sqlite3 lecture04

sqlite> create table Purchase(


pid int primary key,
product text,
price float, Other DBMSs have
quantity int, other ways of
month varchar(15)); importing data

sqlite> -- download data.txt


sqlite> .import lec04-data.txt Purchase

43
Comment about SQLite
• One cannot load NULL values such that they
are actually loaded as null values

• So we need to use two steps:


– Load null values using some type of special value
– Update the special values to actual null values
update Purchase
set price = null
where price = ‘null’
44
Simple Aggregations
Five basic aggregate operations in SQL

select count(*) from Purchase


select sum(quantity) from Purchase
select avg(price) from Purchase
select max(quantity) from Purchase
select min(quantity) from Purchase

Except count, all aggregations apply to a single value


45
Aggregates and NULL Values

Null values are not used in aggregates


insert into Purchase
values(12, 'gadget', NULL, NULL, 'april')
Let’s try the following
select count(*) from Purchase

select count(quantity) from Purchase

select sum(quantity) from Purchase

select sum(quantity)
from Purchase
where quantity is not null; 46
Aggregates and NULL Values
Null values are not used in aggregates
insert into Purchase
values(12, 'gadget', NULL, NULL, 'april')
Let’s try the following
select count(*) from Purchase
-- NULL is counted in count(*)
select count(quantity) from Purchase
-- NULL is ignored in count(quantity)

select sum(quantity) from Purchase

select sum(quantity)
from Purchase
where quantity is not null;
-- “is not null” is redundant
47
Counting Duplicates
COUNT applies to duplicates, unless otherwise stated:

SELECT Count(product) same as Count(*) if no nulls


FROM Purchase
WHERE price > 4.99

We probably want:
SELECT Count(DISTINCT product)
FROM Purchase
WHERE price> 4.99
48
More Examples

SELECT Sum(price * quantity)


FROM Purchase
What do
SELECT Sum(price * quantity) they mean ?
FROM Purchase
WHERE product = ‘bagel’

49
Simple Aggregations
Purchase Product Price Quantity
Bagel 3 20
Bagel 1.50 20
Banana 0.5 50
Banana 2 10
Banana 4 10
SELECT Sum(price * quantity)
FROM Purchase 90 (= 60+30)
WHERE product = ‘Bagel’
50
Simple Aggregations
Purchase Product Price Quantity
Bagel 3 20
Bagel 1.50 20
Banana 0.5 50
Banana 2 10
Banana 4 10
SELECT Sum(price * quantity)
FROM Purchase 90 (= 60+30)
WHERE product = ‘Bagel’
51
More Examples
How can we find the average revenue per sale?

SELECT sum(price * quantity) / count(*)


FROM Purchase
WHERE product = ‘bagel’

How can we find the average price of a bagel sold?

SELECT sum(price * quantity) / sum(quantity)


FROM Purchase
WHERE product = ‘bagel’

52
More Examples
SELECT sum(price * quantity) / count(*)
FROM Purchase
WHERE product = ‘bagel’

SELECT sum(price * quantity) / sum(quantity)


FROM Purchase
WHERE product = ‘bagel’

What happens if there are NULLs in price or quantity?

Lesson: disallow NULLs unless you need to handle them

53

You might also like