RELATIONAL DATABASE
MANAGEMENT SYSTEM
Introduction to MySQL
Part 2
THINGS TO LEARN:
■ Select statement
■ Where clause
■ Sorting Data
– Order By clause
■ SQL Functions
■ Single Row Functions
Basic SELECT statement
■ Data Retrieval using the SQL SELECT statement
– The SELECT statement is used to retrieve records/rows from one or more
tables.
– consists of at least the SELECT clause and the FROM clause.
– Syntax:
■ SELECT <select_expression> FROM <table_name>;
– select_expression is column name / comma-separated list of columns that
appears in the query result
– table_name is the name of the source table
Basic SELECT statement
Note: that world is the source schema used in all SQL statements presented.
■ Information requirement: List of all cities
– SELECT name FROM city;
■ result:
Basic SELECT statement
■ Information requirement: List of all cities along with their respective populations.
– SELECT name, population FROM city;
■ result:
Basic SELECT statement
■ Information requirement: Complete information on all cities.
– SELECT * FROM city;
■ result:
Basic SELECT statement
■ Information requirement: List of all districts.
– SELECT DISTINCT(district) FROM city;
■ result: the query returns all districts without duplicates
■ The keyword DISTINCT is used to remove duplicate rows.
– In the example, the district column may have duplicates, and must be
eliminated in the query result.
– Syntax:
■ SELECT DISTINCT<select_expression> FROM <table_name>;
Basic SELECT statement
■ Information requirement: List of all countries, their populations and their projected
populations if these increase by 25%.
– SELECT name country, population, population*1. 25 AS “projected population”
FROM country;
■ result:
Basic SELECT statement
■ SELECT name country, population, population*1. 25 AS “projected population”
FROM country;
Column alias
■ Basic arithmetic operators are typically used in column expressions. Parentheses
are used in complex expressions to indicate precedence.
■ A column alias may be used to rename a column headings, for readability purposes.
– The keyword AS is optional. Column aliases that consist of more than one word
are enclosed in double quotes.
WHERE clause
■ The WHERE clause is used to filter the records/ rows in a query.
– Syntax:
■ SELECT <select_expression> FROM <table_name> WHERE <conditional expression>;
– The WHERE keyword is followed by a conditional expression that must be satisfied
by a table row in order to be included in the query result. Comparison and logical
operators may be used for the condition.
– Comparison operators: , <=, >=, =, <> or !=
– Logical operators: AND, OR, NOT
WHERE clause
■ Special operators:
– LIKE evaluates to TRUE if a value matches a specified value. Utilizes the % and
_ characters. (e.g. „%range%‟, „_nd%‟, „at%‟)
– IN ( ) returns TRUE if a value matches any value in the list / set enclosed in
parentheses
– BETWEEN returns a range of values
– IS NULL is used to check if a value is NULL or not
– EXISTS returns TRUE if the value exists
WHERE clause
■ Information requirement: List of all cities with populations below 5000.
– SELECT name city FROM city WHERE population < 5000;
■ result:
WHERE clause
■ Information requirement: List of all cities with populations below 5000 and whose
names end with “Island”.
– SELECT name city, district FROM city WHERE population < 5000 AND district
LIKE '%Island’;
■ result:
The _ indicates a single Only 2 rows satisfy both conditions in
the WHERE clause.
WHERE clause
■ Information requirement: List of all cities that either have populations below 5000 or
whose names end with “Island”.
– SELECT name city, district FROM city WHERE population < 5000 OR district
LIKE '%Island’;
■ result:
There are more rows returned because a row that
satisfies either of the conditions in the WHERE
clause are included in the result.
WHERE clause
■ Information requirement: List of all cities whose populations range from 5,000 to
10,000. Include the population in the list.
– SELECT name city, population FROM city WHERE population BETWEEN 5000
AND 10000;
■ result:
Sorting data: ORDER BY clause
■ ORDER BY clause is used to sort query results
– ORDER BY clause is always the LAST clause in a SELECT statement (prior to the
LIMIT keyword)
– the keywords ASC and DESC are used to indicate the sort order (the default is
ascending order)
– sorting may be based on one or more columns, each with a possibly different
sort order
Sorting data: ORDER BY clause
■ Syntax:
– SELECT <select_expression> FROM <table_name> ORDER BY
<col_name/col_alias/col_number> [ASC/DESC]
[,col_name/col_alias/col_number> [ASC/DESC] [, …] ;
where:
– col_name : the name of the column
– col_alias : the column alias
– col_number : the column‟s order number in the select list
Sorting data: ORDER BY clause
■ Information requirement: List of all cities along with their respective populations.
Sort the data according to population.
– SELECT name, population FROM city ORDER BY population;
– SELECT name, population FROM city ORDER BY 2;
■ result:
Sorting data: ORDER BY clause
■ Information requirement: List of all cities whose populations exceed 10000000.
Sort the data in reverse alphabetical order.
– SELECT name city, population FROM city WHERE population > 10000000
ORDER BY city DESC;
■ result:
Sorting data: ORDER BY clause
■ Information requirement: List of all cities sorted in descending order by countrycode,
then alphabetically by city.
■ Heading: countrycode, siyudad, population
– SELECT countrycode, name AS siyudad, population FROM city ORDER BY
countrycode DESC, siyudad;
– SELECT countrycode, name AS siyudad, population FROM city ORDER BY 1
DESC, 2;
– SELECT countrycode, name AS siyudad, population FROM city ORDER BY 1
DESC, 2;
■ result: query returns rows sorted based on two columns
SQL Functions
■ RDBMSs like MySQL provide a variety of predefined functions
– A function takes 0 or more column expressions as arguments and always
return a value.
– Functions, possibly combined with operators and/or other functions can be
embedded in a SELECT statement in order to allow other types of data
processing such as data conversion, string manipulation, complex
mathematical operations, and the like.
■ Two types
– single-row functions
– group (a.k.a. multiple-row or aggregate) functions
Single-row functions
■ Single-row functions
– operate on single rows and return one result per row
– the function argument may be a column or an expression
■ the expression may contain another function, thus, making it possible to nest
functions
– manipulate different types of data and provide for the conversion from one
data type to another
Single-row functions
■ Some types of single-row functions
– String functions (e.g. CONCAT, LOWER, UPPER, LENGTH)
– Mathematical / Numeric (e.g., ABS, ROUND, TRUNCATE)
– Date and Time (e.g., ADDDATE, DATE_FORMAT, MONTH, YEAR, NOW)
– Control Flow (CASE.. WHEN.. THEN, IF, IFNULL, NULLIF)
Single-row functions
■ Information requirement: List of all Asian countries and their respective population
density, rounded up to integer values.
– SELECT Name, ROUND(Population/SurfaceArea,0) AS "Population Density“
FROM country WHERE continent = "Asia"
■ Result
Group functions
■ Group / Aggregate / Multiple-row Functions
– operate on sets of rows (the set may be the entire table or groups of rows in
the table) and return one result per set.
– unless specified otherwise, group functions ignore NULL values
– by default, group functions consider ALL non-null rows including duplicates,
unless the DISTINCT keyword is used
– group functions are primarily used with the GROUP BY clause of the SELECT
statement
– some examples: AVG, COUNT, MAX, MIN, SUM
Group functions
■ Consider the following examples and results:
– SELECT count(*) FROM country WHERE continent = 'North America’;
– SELECT count(name) FROM country WHERE continent = 'North America’;
– SELECT count(indepyear) FROM country WHERE continent = 'North America’;
COUNT(*) returns ALL rows. The 2nd query returns the same
value - the column name has no nulls. The 3rd query returns
a lower value; null indepyear values are not counted.
Group functions
■ Consider the following examples and results:
– SELECT count(region) FROM country WHERE continent = 'Asia’;
– SELECT count(DISTINCT region) FROM country WHERE continent = 'Asia’;
All rows with non-null values for region are counted,
regardless of duplicate values. With the use of DISTINCT,
duplicates are eliminated. Only the 4 distinct regions in Asia
are counted in the 2nd query.
Group functions
■ Information requirement: What is the average life expectancy?
– First approach:
■ SELECT AVG(LifeExpectancy) FROM country;
– Result: the computed average is based on non-null values; it only returns the
average based on countries with recorded life expectancy values
NOTE: In SQL, mathematical operations involving null values
return null (e.g., 10 + null = null). Hence a group function
excludes null values in the computation: average = sum of
LifeExpectancy values / number of non-null rows
Group functions
■ Information requirement: What is the average life expectancy?
– Second approach: get the overall average
■ SELECT AVG(IFNULL(LifeExpectancy, 0)) FROM country;
– Result: null values are replaced with 0 and are therefore included in the
computation of the average
No rows are null; all rows, even with 0 values, are included in
the sum: average = sum of LifeExpectancy values / number
of all rows
GROUP BY clause
■ Group Functions and the GROUP BY clause
– The group functions in the previous examples are applied to all rows.
– The GROUP BY clause is used to divide table rows into sets upon which the
group function is applied. It specifies the column/s that serve as the basis for
grouping the rows (e.g. by country, by product, by customer).
■ provides information such as:
– total sales per item / category / branch, average sales per year
– number of students per class, total units enrolled by each student
– maximum sales per agent, and so on
GROUP BY clause
■ Group Functions and the GROUP BY clause
– The aggregated data appears as a single row in the query result; hence,
columns in the SELECT clause typically include the items specified in the
GROUP BY list.
– By default, rows are sorted in ascending order of the columns included in the
GROUP BY list. The ORDER BY clause overrides this default order.
GROUP BY clause
■ Consider the following examples:
– SELECT continent, count(indepyear), avg(lifeexpectancy),
avg(ifnull(LifeExpectancy, 0)) FROM country GROUP BY continent;
– Result:
The group functions in the statement are applied to each set
of rows that belong to the same continent.
GROUP BY clause
■ SELECT countrycode, COUNT(population), AVG(population),
MAX(population),MIN(population), SUM(population) FROM city GROUP BY
countrycode;
– Result:
The group functions in the statement are computed per
country.
GROUP BY clause
■ SELECT continent, region, count(name), max(lifeexpectancy), avg(population) FROM
country GROUP BY continent, region;
– Result:
Group functions are applied to groups within groups.
GROUP BY clause
■ Filtering with WHERE and Group Functions
– SELECT region, AVG(population) FROM country GROUP BY region;
– SELECT region, AVG(population) FROM country WHERE population > 500000
GROUP BY region;
■ Rows with populations above 50000 are eliminated before the average per region
is computed.
– The 1st statement returns 25 rows, while the 2nd returns only 21 rows. All
countries in some regions have populations <= 500000.
■ The WHERE clause can be used to exclude rows before dividing them into groups.
However, it cannot be used to filter groups.
GROUP BY clause
■ Filtering with HAVING and Group Functions
– SELECT region, avg(population) FROM country GROUP BY region HAVING
avg(population) > 20000000;
■ The results of the averaged information are filtered, thus, only averages that satisfy
the condition specified in the HAVING clause appear in the result.
GROUP BY clause
■ Group functions may be nested in single-row functions. Single-row functions may be
nested in group functions.
– SELECT countrycode, ROUND(SUM(percentage),0) FROM countrylanguage
GROUP BY countrycode;
■ Group functions cannot be nested in MySQL.
LIMIT clause
■ The LIMIT clause is used to specify the number of rows to be returned by the query
based on the sort order specified (or implied) in the query.
– SELECT * FROM city LIMIT 10;
■ The query returns the first 10 rows from the city table based on the default order.
– SELECT * FROM city LIMIT 1,15;
■ The query returns 15 rows from the top.
LIMIT clause
■ Information requirement: Alphabetical list of all cities in the Philippines. Display only
the first 15 rows.
– SELECT name FROM city WHERE countrycode = ‘PHL’ ORDER BY 1 LIMIT 15;
■ Result:
LIMIT clause
■ SELECT * FROM city WHERE countrycode = ‘PHL’ ORDER BY name LIMIT 16,20;
– The query returns the next 20 rows, starting from the 16th row in the list.
TOP-N Queries
■ TOP-N queries provide information on such queries as:
– “Who are the top 10 students?”
– “What are the top 100 most saleable products?”
– “List the 50 employees with lowest salaries.”
■ Used with the LIMIT clause to get the desired N rows
– Data must be sorted accordingly in order to obtain the correct results.
TOP-N Queries
■ Information requirement: List of the top 5 Asian countries with the highest
populations.
– SELECT code, name, continent, population FROM country ORDER BY
population DESC LIMIT 5;
■ Result:
TOP-N Queries
■ Information requirement: List of the 10 smallest countries based on surface area.
– SELECT code, name, continent, surfacearea FROM country ORDER BY 4 LIMIT
10;
■ Result:
TOP-N Queries
■ What information does the following query provide?
– SELECT region, SUM(population) AS "region total" FROM country GROUP BY 1
ORDER BY 2 DESC LIMIT 5;
■ Result: