0% found this document useful (0 votes)
16 views

03 Advanced SQL Annotateddi

Gagahbs

Uploaded by

divakarmass444
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views

03 Advanced SQL Annotateddi

Gagahbs

Uploaded by

divakarmass444
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 77

Advanced SQL

Lecture 3: Advanced SQL

1 / 76
Advanced SQL

Today’s Agenda

Advanced SQL
1.1 Recap
1.2 Relational Language
1.3 Aggregates
1.4 Grouping
1.5 String and Date/Time Functions
1.6 Output Control
1.7 Nested Queries
1.8 Window Functions
1.9 Common Table Expressions
1.10 Joins

2 / 76
Advanced SQL Recap

Recap

3 / 76
Advanced SQL Recap

Relational Model

Proposed in 1970 by Ted Codd (IBM Almaden).


Data model to avoid this maintenance.

• Store database in simple data structures


• Access data through high-level language
• Physical storage left up to implementation

4 / 76
Advanced SQL Recap

Core Operators

• These operators take in relations (i.e., tables) as input and return a relation as output.
• We can “chain” operators together to create more complex operations.

• Selection (σ)
• Projection (Π)
• Union (∪)
• Intersection (∩)
• Difference (−)
• Product (×)
• Join (1)

5 / 76
Advanced SQL Relational Language

Relational Language

6 / 76
Advanced SQL Relational Language

Relational Language

• User only needs to specify the answer that they want, not how to compute it.
• The DBMS is responsible for efficient evaluation of the query.
▶ Query optimizer: re-orders operations and generates query plan

7 / 76
Advanced SQL Relational Language

SQL History

• Originally “SEQUEL" from IBM’s System R prototype.


▶ Structured English Query Language
▶ Adopted by Oracle in the 1970s.
▶ IBM releases DB2 in 1983.
▶ ANSI Standard in 1986. ISO in 1987
▶ Structured Query Language

8 / 76
Advanced SQL Relational Language

SQL History

• Current standard is SQL:2016


▶ SQL:2016 −→ JSON, Polymorphic tables
▶ SQL:2011 −→ Temporal DBs, Pipelined DML
▶ SQL:2008 −→ TRUNCATE, Fancy sorting
▶ SQL:2003 −→ XML, windows, sequences, auto-gen IDs.
▶ SQL:1999 −→ Regex, triggers, OO
• Most DBMSs at least support SQL-92
• Comparison of different SQL implementations

9 / 76
Advanced SQL Relational Language

Relational Language

• Data Manipulation Language (DML)


• Data Definition Language (DDL)
• Data Control Language (DCL)
• Also includes:
▶ View definition
▶ Integrity & Referential Constraints
▶ Transactions
• Important: SQL is based on bag semantics (duplicates) not set semantics (no
duplicates).

10 / 76
Advanced SQL Relational Language

List of SQL Features

• Aggregations + Group By
• String / Date / Time Operations
• Output Control + Redirection
• Nested Queries
• Join
• Common Table Expressions
• Window Functions

11 / 76
Advanced SQL Relational Language

Example Database

sid name login age gpa sid cid grade


1 Maria maria@cs 19 3.8 1 1 B
students 2 Rahul rahul@cs 22 3.5 enrolled 1 2 A
3 Shiyi shiyi@cs 26 3.7 2 3 B
4 Peter peter@ece 35 3.8 4 2 C

cid name
1 Computer Architecture
courses 2 Machine Learning
3 Database Systems
4 Programming Languages

12 / 76
Advanced SQL Aggregates

Aggregates

13 / 76
Advanced SQL Aggregates

Aggregates

• Functions that return a single value from a bag of tuples:


▶ AVG(col)−→ Return the average col value.
▶ MIN(col)−→ Return minimum col value.
▶ MAX(col)−→ Return maximum col value.
▶ SUM(col)−→ Return sum of values in col.
▶ COUNT(col)−→ Return number of values for col.

14 / 76
Advanced SQL Aggregates

Aggregates

• Aggregate functions can only be used in the SELECT output list.


• Task: Get number of students with a "@cs" login:
SELECT COUNT(login) AS cnt
FROM students WHERE login LIKE '%@cs'
SELECT COUNT(*) AS cnt
FROM students WHERE login LIKE '%@cs'
SELECT COUNT(1) AS cnt
FROM students WHERE login LIKE '%@cs'

CNT
3

15 / 76
Advanced SQL Aggregates

Multiple Aggregates

• Task: Get the number of students and their average GPA that have a "@cs" login.
SELECT AVG(gpa), COUNT(sid)
FROM students WHERE login LIKE '%@cs'

AVG CNT
3.6666 3

16 / 76
Advanced SQL Aggregates

Distinct Aggregates

• COUNT, SUM, AVG support DISTINCT


• Task: Get the number of unique students that have an "@cs" login.
SELECT COUNT(DISTINCT login)
FROM students WHERE login LIKE '%@cs'

COUNT
3

17 / 76
Advanced SQL Aggregates

Aggregates

• Output of columns outside of an aggregate.


• Task: Get the average GPA of students enrolled in each course.
SELECT AVG(s.gpa), e.cid
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
AVG e.cid
3.5 ???

18 / 76
Advanced SQL Aggregates

Aggregates

• Output of columns outside of an aggregate.


• Task: Get the average GPA of students enrolled in each course.
SELECT AVG(s.gpa), e.cid
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
AVG e.cid
3.5 ???
• column "e.cid" must appear in the GROUP BY clause or be used in an aggregate
function

18 / 76
Advanced SQL Grouping

Grouping

19 / 76
Advanced SQL Grouping

Group By

• Project tuples into subsets and calculate aggregates of each subset.


• Task: Get the average GPA of students enrolled in each course.
SELECT e.cid, AVG(s.gpa)
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
GROUP BY e.cid
e.cid AVG
1 3.8
3 3.5
2 3.8

20 / 76
Advanced SQL Grouping

Group By

• Non-aggregated values in SELECT output clause must appear in GROUP BY clause.


SELECT e.cid, AVG(s.gpa), s.name
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
GROUP BY e.cid
SELECT e.cid, AVG(s.gpa), s.name
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
GROUP BY e.cid, s.name

21 / 76
Advanced SQL Grouping

Having

• Filters results based on aggregate value.


• Predicate defined over a group (WHERE clause for a GROUP BY)
SELECT AVG(s.gpa) AS avg_gpa, e.cid
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid AND avg_gpa > 3.9
GROUP BY e.cid
SELECT AVG(s.gpa) AS avg_gpa, e.cid
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
GROUP BY e.cid
HAVING avg_gpa > 3.9

22 / 76
Advanced SQL Grouping

Having

• Filters results based on aggregate value.


• Predicate defined over a group (WHERE clause for a GROUP BY)
SELECT AVG(s.gpa) AS avg_gpa, e.cid
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid
GROUP BY e.cid
HAVING AVG(s.gpa) > 3.9

e.cid AVG
1 3.8
2 3.8

23 / 76
Advanced SQL String and Date/Time Functions

String and Date/Time Functions

24 / 76
Advanced SQL String and Date/Time Functions

String Operations

String Case String Quotes


SQL-92 Sensitive Single Only
Postgres Sensitive Single Only
MySQL Insensitive Single/Double
SQLite Sensitive Single/Double
DB2 Sensitive Single Only
Oracle Sensitive Single Only
WHERE UPPER(name) = UPPER('MaRiA') // SQL-92
WHERE name = 'MaRiA' // MySQL

25 / 76
Advanced SQL String and Date/Time Functions

String Operations

• LIKE is used for string matching.


• String-matching operators
▶ % : Matches any substring (including empty strings).
▶ _ : Match any one character
SELECT * FROM student AS s
WHERE s.login LIKE '%@%'
SELECT * FROM student AS s
WHERE s.login LIKE '%@c_'

26 / 76
Advanced SQL String and Date/Time Functions

String Operations

• SQL-92 defines string functions.


▶ Many DBMSs also have their own unique functions
• These functions can be used in any expression (projection, predicates, e.t.c.)
SELECT SUBSTRING(name,0,5) AS abbrv_name
FROM students WHERE sid = 1
SELECT * FROM students AS s
WHERE UPPER(e.name) LIKE 'M%'

27 / 76
Advanced SQL String and Date/Time Functions

String Operations

• SQL standard says to use || operator to concatenate two or more strings together.
SQL-92
SELECT name FROM students WHERE login = LOWER(name) || '@cs'
MSSQL
SELECT name FROM students WHERE login = LOWER(name) + '@cs'
MySQL
SELECT name FROM students WHERE login = CONCAT(LOWER(name), '@cs')

28 / 76
Advanced SQL String and Date/Time Functions

Date/Time Operations

• Operations to manipulate and modify DATE/TIME attributes.


• Can be used in any expression.
• Support/syntax varies wildly!
• Task: Get the number of days since 2000.
• Demo Time!
PostgreSQL
SELECT (now()::date - '2000-01-01'::date) AS days;
MySQL
SELECT DATEDIFF(CURDATE(), '2000-01-01') AS days;
SQL Server
SELECT DATEDIFF(day, '2000/01/01', GETDATE()) AS days;

29 / 76
Advanced SQL Output Control

Output Control

30 / 76
Advanced SQL Output Control

Output Redirection

• Store query results in another table:


▶ Table must not already be defined.
▶ Table will have the same number of columns with the same types as the input.
SQL-92
SELECT DISTINCT cid INTO CourseIds
FROM enrolled;
MySQL
CREATE TABLE CourseIds (
SELECT DISTINCT cid FROM enrolled
);

31 / 76
Advanced SQL Output Control

Output Redirection

• Insert tuples from query into another table:


▶ Inner SELECT must generate the same columns as the target table.
▶ DBMSs have different options/syntax on what to do with duplicates.
SQL-92
INSERT INTO CourseIds
(SELECT DISTINCT cid FROM enrolled);

32 / 76
Advanced SQL Output Control

Output Control

• ORDER BY <column*> [ASC|DESC]


▶ Order the output tuples by the values in one or more of their columns.
SELECT sid, grade FROM enrolled
WHERE cid = 2
ORDER BY grade
SELECT sid, grade FROM enrolled
WHERE cid = 2
ORDER BY grade DESC, sid ASC

sid grade
1 A
4 A

33 / 76
Advanced SQL Output Control

Output Control

• LIMIT <count> [offset]


▶ Limit the number of tuples returned in output.
▶ Can set an offset to return a "range"
SELECT sid, name FROM students
WHERE login LIKE '%@cs'
LIMIT 10
SELECT sid, name FROM students
WHERE login LIKE '%@cs'
LIMIT 20 OFFSET 10

34 / 76
Advanced SQL Nested Queries

Nested Queries

35 / 76
Advanced SQL Nested Queries

Nested Queries

• Queries containing other queries.


• They are often difficult to optimize.
• Inner queries can appear (almost) anywhere in query.
SELECT name FROM students --- Outer Query
WHERE sid IN
(SELECT sid FROM enrolled) --- Inner Query

36 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT name FROM students
WHERE ...

37 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT name FROM students
WHERE ...
SELECT sid FROM enrolled
WHERE cid = 2

38 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT name FROM students
WHERE sid IN (
SELECT sid FROM enrolled
WHERE cid = 2
)

name
Maria
Peter

39 / 76
Advanced SQL Nested Queries

Nested Queries

• ALL −→ Must satisfy expression for all rows in sub-query


• ANY −→ Must satisfy expression for at least one row in sub-query.
• IN −→ Equivalent to ’=ANY()’.
• EXISTS −→ Returns true if the subquery returns one or more records.

40 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT name FROM students
WHERE sid = ANY (
SELECT sid FROM enrolled
WHERE cid = 2
)

41 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT name FROM students AS s
WHERE EXISTS ( --- EXISTS operator
SELECT sid FROM enrolled AS e
WHERE cid = 2 and s.sid = e.sid
)

42 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students in course 2


SELECT (SELECT s.name --- Inner query in projection expression
FROM students AS s
WHERE s.sid = e.sid) AS sname
FROM enrolled AS e
WHERE cid = 2

43 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students not in course 2


SELECT name FROM students
WHERE sid ...

44 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Get the names of students not in course 2


SELECT name FROM students
WHERE sid != ALL (
SELECT sid FROM enrolled
WHERE cid = 2
)

name
Rahul
Shiyi

45 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Find students record with the highest id that is enrolled in at least one course.
--- Won't work in SQL-92
SELECT MAX(e.sid), s.name
FROM enrolled AS e, students AS s
WHERE e.sid = s.sid;

46 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Find students record with the highest id that is enrolled in at least one course.
--- "Is greater than every other sid"
SELECT sid, name
FROM students
WHERE ...
--- "Is greater than every other sid"
SELECT sid, name
FROM students
WHERE sid >= ALL(
SELECT sid FROM enrolled
)

sid name
4 Peter

47 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Find students record with the highest id that is enrolled in at least one course.
SELECT sid, name FROM students
FROM students
WHERE sid IN (
SELECT MAX(sid) FROM enrolled
)
SELECT sid, name FROM students
WHERE sid IN (
SELECT sid FROM enrolled
ORDER BY sid DESC LIMIT 1
)

48 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Find all courses that has no students enrolled in it.


SELECT * FROM courses
WHERE ...
--- "with no tuples in the 'enrolled' table"

49 / 76
Advanced SQL Nested Queries

Nested Queries

• Task: Find all courses that has no students enrolled in it.


SELECT * FROM courses
WHERE NOT EXISTS(
SELECT * FROM enrolled
WHERE course.cid = enrolled.cid
)

cid name
4 Peter

50 / 76
Advanced SQL Window Functions

Window Functions

51 / 76
Advanced SQL Window Functions

Window Functions

• Performs a “sliding” calculation across a set of related tuples.


• Unlike GROUP BY, tuples do not collapse into a group
• So needed if must refer back to individual tuples
SELECT ... FUNC-NAME(...) --- Special Window Functions, Aggregation Functions
OVER(...) --- How to slice up data? Can also sort.
FROM tableName

52 / 76
Advanced SQL Window Functions

Window Functions

• Special window functions:


▶ ROW_NUMBER() −→ Number of the current row
▶ RANK() −→ Order position of the current row.
• Aggregation functions:
▶ All the functions that we discussed earlier (e.g., MIN, MAX, AVG)
SELECT *, ROW_NUMBER()
OVER () AS row_num
FROM enrolled
sid cid grade row_num
1 1 B 1
1 2 A 2
2 3 B 3
4 2 A 4
53 / 76
Advanced SQL Window Functions

Window Functions

• The OVER keyword specifies how to group together tuples when computing the
window function.
• Use PARTITION BY to specify group.
SELECT cid, sid, ROW_NUMBER()
OVER (PARTITION BY cid) --- Note the row numbering
FROM enrolled
ORDER BY cid
cid sid row_number
1 1 1
2 1 1
2 4 2
3 2 1

54 / 76
Advanced SQL Window Functions

Window Functions

• You can also include an ORDER BY in the window grouping to sort entries in each
group.
SELECT cid, sid, ROW_NUMBER()
OVER (ORDER BY cid) --- Note the row numbering
FROM enrolled
ORDER BY cid
cid sid row_number
1 1 1
2 1 2
2 4 3
3 2 4

55 / 76
Advanced SQL Window Functions

Window Functions

• Task: Find the students with the highest grade for each course.
SELECT cid, sid, grade, rank FROM (
SELECT *, RANK() -- Group tuples by cid and then sort by grade
OVER (PARTITION BY cid ORDER BY grade ASC) AS rank
FROM enrolled
) AS ranking
WHERE ranking.rank = 1

cid sid grade rank


1 1 B 1
2 1 A 1
3 2 B 1

56 / 76
Advanced SQL Window Functions

Window Functions

• Task: Get the name of the students with the second highest grade for each course.
SELECT cid, sid, grade, rank FROM (
SELECT *, RANK()
OVER (PARTITION BY cid ORDER BY grade ASC) AS rank
FROM enrolled
) AS ranking
WHERE ranking.rank = 2 --- Update rank

cid sid grade rank


2 4 C 2

57 / 76
Advanced SQL Window Functions

Window Functions

• Task: Get the name of the students with the second highest grade for each course.
SELECT * FROM (
SELECT C.name, S.name, E.grade, RANK()
OVER (PARTITION BY E.cid ORDER BY E.grade ASC) AS grade_rank
FROM students S, courses C, enrolled E
WHERE S.sid = E.sid AND C.cid = E.cid --- Connect with students
) AS ranking
WHERE ranking.grade_rank = 2

name name grade rank


Machine Learning Peter C 2

58 / 76
Advanced SQL Common Table Expressions

Common Table Expressions

59 / 76
Advanced SQL Common Table Expressions

Common Table Expressions

• Provides a way to write auxiliary statements for use in a larger query.


▶ Think of it like a temp table just for one query.
• Alternative to nested queries and materialized views.
WITH cteName AS (
SELECT 1
)
SELECT * FROM cteName
column
1

60 / 76
Advanced SQL Common Table Expressions

Common Table Expressions

• You can bind output columns to names before the AS keyword.


WITH cteName (col1, col2) AS (
SELECT 1, 2
)
SELECT col1 + col2 FROM cteName
column
3

61 / 76
Advanced SQL Common Table Expressions

Common Table Expressions

• Task: Find students record with the highest id that is enrolled in at least one course.
WITH cteSource (maxId) AS (
SELECT MAX(sid) FROM enrolled
)
SELECT name FROM students, cteSource
WHERE students.sid = cteSource.maxId

62 / 76
Advanced SQL Common Table Expressions

Common Table Expressions – Recursion

• Task: Print the sequence of numbers from 1 to 10.


WITH RECURSIVE cteSource (counter) AS (
(SELECT 1)
UNION ALL
(SELECT counter + 1 FROM cteSource WHERE counter < 10)
)
SELECT * FROM cteSource

63 / 76
Advanced SQL Joins

Joins

64 / 76
Advanced SQL Joins

Types of Join

• Types of Join
▶ (INNER) JOIN (1) −→ Returns records that have matching values in both tables
▶ LEFT OUTER JOIN (=1) −→ Returns all records from the left table, and the matched
records from the right table
▶ RIGHT OUTER JOIN (1<) −→ Returns all records from the right table, and the matched
records from the left table
▶ FULL OUTER JOIN (=1<) −→ Returns all records when there is a match in either left or
right table

65 / 76
Advanced SQL Joins

Example Database

SQL Fiddle: Link

sid name sid hobby


1 Maria 1 Stars
students 2 Rahul hobbies 1 Climbing
3 Shiyi 2 Coding
4 Peter 5 Rugby

66 / 76
Advanced SQL Joins

Types of Join: Inner Join

• Task: List the hobbies of students.


SELECT name, hobby
FROM students JOIN hobbies
ON students.id = hobbies.user_id;

name grade
Maria Stars
Maria Climbing
Rahul Coding

67 / 76
Advanced SQL Joins

Types of Join: Left Outer Join

• Task: List the hobbies of all students.


SELECT name, hobby
FROM students LEFT OUTER JOIN hobbies
ON students.id = hobbies.user_id;

name grade
Maria Stars
Maria Climbing
Rahul Coding
Peter NULL
Shiyi NULL

68 / 76
Advanced SQL Joins

Types of Join: Right Outer Join

• Task: List all the hobbies of students.


SELECT name, hobby
FROM students RIGHT OUTER JOIN hobbies
ON students.id = hobbies.user_id;

name grade
Maria Stars
Maria Climbing
Rahul Coding
NULL Rugby

69 / 76
Advanced SQL Joins

Types of Join: Full Outer Join

• Task: List all the hobbies of all students.


SELECT name, hobby
FROM students FULL OUTER JOIN hobbies
ON students.id = hobbies.user_id;

name grade
Maria Stars
Maria Climbing
Rahul Coding
NULL Rugby
Peter NULL
Shiyi NULL

70 / 76
Advanced SQL Joins

More Types of Join

• SEMI JOIN (⋉)


▶ Returns record from the left table if there is a matching record in the right table
▶ Unlike regular JOIN, only returns columns from the left table and no duplicates.
▶ We do not care about the values of other columns in the right table’s record
▶ Used to execute queries with EXISTS or IN operators
• ANTI JOIN (▷)
▶ Opposite of a SEMI JOIN
▶ Returns record from the left table if there is no matching record in the right table
▶ Used to execute queries with NOT EXISTS or NOT IN operators
• LATERAL JOIN (▶◁) (a.k.a., Dependent Join, CROSS APPLY)
▶ Subqueries appearing in FROM clause can be preceded by the key word LATERAL
▶ Table functions appearing in FROM clause can also be preceded by the key word
LATERAL

71 / 76
Advanced SQL Joins

Types of Join: Semi Join

• Task: List the names of students with hobbies.


SELECT name
FROM students
WHERE sid IN
(SELECT sid
FROM hobbies);

name
Maria
Rahul

72 / 76
Advanced SQL Joins

Types of Join: Anti Join

• Task: List the names of students without hobbies.


SELECT name
FROM students
WHERE sid NOT IN
(SELECT sid
FROM hobbies);

name
Shiyi
Peter

73 / 76
Advanced SQL Joins

Types of Join: Lateral Join

• Task: List the names of students with hobbies.


SELECT name
FROM students, LATERAL (SELECT sid FROM hobbies
WHERE students.sid = hobbies.sid) ss;

name
Maria
Maria
Rahul

74 / 76
Advanced SQL Joins

Conclusion

• SQL is not a dead language.


• You should (almost) always strive to compute your answer as a single SQL statement.

75 / 76
Advanced SQL Joins

Next Class

• Storage Management

76 / 76

You might also like