Introduction to SQL
Part 2
© 2005 by Prentice Hall 1
Outline
Single table select
Aggregations
Group by and having
2
Single Table Select
select column1, column2,…
from table-name
where condition
The where clause checks each row in the table whether
it satisfies the condition
If so, the columns in the select clause are returned
Where clause is processed first, and select clause is
processed last
Select and from clause are required
Where clause is optional (if there is no where clause, all
rows are returned)
3
Select Clause
select column1, column2,…
All these columns must appear in the table
in the from clause
E.g. select ename, salary
from emp
To return all columns in the table, use
select *
E.g., select * from emp
4
Where Clause
Syntax:
Where condition
Condition can be:
Comparison operator: =, <, <=, >, >=, !=
Typically compare a column with a value, or
compare two columns.
E.g.
Select * from emp
Where salary >= 50000;
5
Where Clause
Use correct format for values
Varchar or char
E.g.
Select * from emp
Where ename = 'jeff';
Integer, number
Select * from emp
Where salary >= 50000;
6
Where Clause
Use correct format for values
Date
Select * from emp
where hiredate <= date '2004-1-1';
Before a date is <
After is >
7
Where Clause
Multiple conditions
When all conditions must be satisfied
Cond1 AND cond2
Select * from emp
Where salary >= 50000
and hiredate <= date '2004-1-1';
AND is used in most cases
Don’t write multiple where clauses for multiple
conditions (use AND instead)
8
Where Clause
Multiple conditions
When only one of the conditions must be
satisfied (the other condition may not be
satisfied)
Cond1 OR cond2
Select * from emp
Where salary >= 50000
OR hiredate <= date '2004-1-1';
9
Where Clause
Multiple conditions
We can also have more than two conditions. If
there is a mix of AND OR, use parenthesis to
avoid ambiguity
E.g. Select * from emp
Where (salary >= 50000 and hiredate > date
'2004-1-1') OR hiredate <= date '2004-1-1';
10
Example
Return all rows in dept table
select * from dept;
Return all employee hired after ‘2005-1-1’
(including ‘2005-1-1’)
select * from emp where hiredate >= date
'2005-1-1';
11
Steps To Write Select
First decide which table needs to be used.
The table name will appear in the from
clause
Next decide which columns to return and
put them in select clause. If all columns
are returned (or no specific columns are
mentioned), use select *
12
Steps To Write Select
Finally decide which conditions to satisfy and put
them in the where clause. Identify column,
comparison operator (=, <, >, <=, >=), and
the value to compare against
Pay attention to the format of value
Varchar and char, in single quote
Date, with date prefix
If there are more than one conditions, link them
with AND, OR
13
Exercise
Return all rows in emp table
Return all employee (in emp table) whose salary
is greater or equal to 60000
Return all employee whose name is jeff
Return all employee whose salary is greater or
equal to 60000 and is hired after January 1st,
2005
14
Common Errors
Missing from clause
Spell column names wrong
Use multiple where clause
Use wrong format for values in conditions
Miss AND for multiple conditions (or use
comma)
15
Membership Condition
• Syntax
column in (expression-list)
column not in (expression-list)
• E.g.
SELECT did
FROM dept
WHERE dname IN ('HR','IT');
16
Null Condition
• Syntax
column is null
column is not null
• E.g.
SELECT ename
FROM emp
WHERE salary is not null;
17
Null Condition
• Can not use =
• E.g.
SELECT ename
FROM emp
WHERE salary = null;
This condition will never be satisfied (even if
salary is null)
18
The select Clause (Cont.)
SQL allows duplicates in relations as well as in query results.
To force the elimination of duplicates, insert the keyword
distinct after select.
Example, return distinct employee names from emp table
Select distinct ename
From emp;
19
Exercise
Return distinct list of dept id from emp
table
20
Distinct
If multiple columns are returned, can use
only one distinct (right after select), and
distinct combinations will be returned
Select distinct did, ename
From emp;
21
Outline
Single table select
Aggregations
Group by and having
22
Aggregate Functions
Return the number of departments
select count(*)
from dept;
Type of aggregations:
avg(column): average value
min(column): minimum value
max(column): maximum value
sum(column): sum of values
count(column) or count(*): number of values,
count(*) returns number of rows
Tips for Aggregations
When to use aggregations:
When you see: total, number of, average,
highest, lowest, largest, smallest, …
Where to place aggregations:
In select clause
In having clause (later)
Example
Return the number of employees
select count(*) from emp;
Return average salary of employees
select avg(salary) from emp;
Return the highest salary
select max(salary) from emp;
Exercise
Return the number of departments
Return the lowest salary of all
employees
Aggregations + Where
Return highest salary for employees hired
after 2005-1-1
select max(salary) from emp
where hiredate > date '2005-1-1';
The aggregation is computed over rows
satisfying the where condition
Exercise
Return the lowest salary of all employees
hired after 2005-1-1
Common Errors
Forget to put column being aggregated
inside ()
Wrong: select count * from dept;
Correct: select count(*) from dept;
Common Errors
Do not mix aggregations with other columns, if
there is no group-by (later)
E.g., select max(salary), ename from emp
This will not work
There is only one highest salary for emp table, but
there are many ename
Comparison Over Aggregation
Aggregation can not appear directly in
where clause
So how can you check each row's value
against an aggregation?
E.g., = max(salary)
31
Example (cont.)
Return the name of employee with highest
salary
We can not use
Select ename
From emp
Where salary = max(salary)
Because the max function can be computed only after
seeing all rows in the table, but when executing where
clause we can see one row in the table
32
Example (cont.)
EMP table
Ename salary
Jeff 70000
Susan 50000
Bob 90000
When the DBMS executes where clause for Jeff’s row, it
does not know what is the highest salary yet
So we need to write a subquery to compute highest salary
first
33
Solution
Use subquery in where clause
Return the name of the employee with the highest
salary
Select ename
From emp
Where salary =
(select max(salary) from emp);
The subquery will be processed first
34
Outline
Single table select
Aggregations
Group by and having
35
Group By
Return average salary per department
Select did, avg(salary)
From emp
Group by did
Group-By Statement
Syntax
Select group-by-columns, aggregations
from …
where …
Group by columns
Divide results into groups
Each group has same values on group-by columns
Aggregations are computed per group
SQL COMMAND FOR DIVIDING INTO GROUP FOR AVG SALARY
select * from emp;
select avg (salary) from emp;
select did, avg (salary) from emp
group by did;
Group-By Process
Emp
Eid did salary
1 1 70000
2 2 50000
3 1 90000
4 1 60000
Rows not satisfying where clause are filtered out first
Group-By Process
Group by did
Emp
Eid did salary group
1 1 70000 1
2 2 50000 2
3 1 90000 1
4 1 60000 1
Rows are assigned to groups based on group by column
Group-By Process
Emp
Avg(salary)
Eid did salary group
1 1 70000 1 Group1 73333
2 2 50000 2 Group 2 50000
3 1 90000 1
4 1 60000 1
Aggregations are computed per group
Restrictions
Select clause can only contain the following
two types:
Group-by columns
Aggregation (computed over each group)
E.g. is the following correct?
select count(*), hiredate
from emp
group by did
Group-By Statement
E.g. is the following correct?
select count(*), hiredate
from emp
group by did
SQL semantics requires only one value returned per
group (dept).
Employees of a department may have different hire
dates.
Exercises
Return the highest salary for each department
select did, max (salary) from emp
group by did;
Having Clause
HAVING condition
Indicate the conditions under which a group will be included
select did, max (salary) from emp where hiredate > date '2004-1-1'
group by did;
Example
Return dept id with no less than 2
employees
Select did
From emp
Group by did
Having count(*) >= 2
Group-By Process
Emp
Group Count(*)
Eid did salary group
1 1 70000 1 Group1 3
2 2 50000 2 Group2 1
3 1 90000 1 Having count(*) >=2
4 1 60000 1 Only group 1 remains
Having is computed after group-by and aggregation
It will filter out some groups
Having Clause
Difference with where clause:
Where clause filter rows
Having clause filter groups
Where clause is computed before group-by
Having is computed after group-by
Having Clause
select did, count(*)
from emp
where hiredate > date '2004-1-1'
group by did
having count(*) >= 2
Those rows with what condition will be counted?
Those groups with what condition will be included?
Having Clause
select did, count(*)
from emp
where hiredate > date '2004-1-1'
group by did
having count(*) >= 2
Employees hired after 2004-1-1 will be counted
They will be divided into groups based on their department
Those departments with no less than two such employees
will be counted
Exercises
Return the dept id with average salary
below 60000
select did, avg (salary)
from emp
group by did
having avg (salary) < 60000
Having and Where are two different
Tips for Group By
When to use group by:
When you see: per …, each…, every…
When group-by is used, always compute
some aggregations for each group
Tips for Group By
Template:
Select group-by columns, aggregations-per-group
from table
where …
group by columns
Where condition filter out rows before grouping
Where clause is optional
Select clause can only contain group-by columns
and aggregations (which is computed for each
group)
Tips for Group By
When to use having: to filter out groups
Select group-by columns, aggregations-per-group
From table
Where conditions over rows
Group by group-by columns
Having aggregation-per-group > (or <, or…) constant
Semantics: filter out groups with the aggregation-per-group
> (or…) constant
May have group-by without having, but having always
come with a group-by
Common Errors
Miss group by
Select did
From emp
Having count(*) >= 2
Common Errors
Conditions to filter groups must show up in having, not
where (No aggregation in where clause)
Wrong:
Select did from emp
Where count(*) >=2
Group by did
Correct:
Select did from emp
Group by did
Having count(*) >=2
Order-By Clause
Syntax:
Order by column-list
Sorts the result according to specified criteria
Default is ascending order
Add desc if descending order
Example
Return employees by ascending order of their hiredate
Select * from emp order by hiredate;
Return employees by decending order of their hiredate
Select * from emp order by hiredate desc;
Exercise
Return employees in descending order of
their salary
Figure SQL
statement
processing order
String Functions
String manipulations are the most used
functions in programming
String manipulation function examples:
Concatenation
Printing in uppercase
Finding length of an attribute
Database Systems, 9th
59
Edition
Substring Matching Condition
In where clause: Column-name like pattern
Pattern is a quoted string, it contains normal
characters and two special characters:
percent (%). The % character matches any
substring.
underscore (_). The _ character matches any
character.
To match a real % or _ (not any substring or
character), add \ before the character
60
Example
Return name of employees whose name
start with j
Select ename
From emp
Where ename like 'j%';
61
Exercise
Return name of employees whose name
contains character 'e' (so jeff, alice, ethan
all match)
62