Course Slides - SQL Fundamentals
Course Slides - SQL Fundamentals
Tasks
Skills
Company Overview
• Large, multinational manufacturing company
• Focused on metal and composite bicycles to
North American, European, and Australian
commercial markets
• 290 employees
• Several regional sales teams
Business Goals:
• Broaden market share by targeting their
sales
• Extend product availability through an
external Website
Please refer to the SQL Credentials text file in your SQL Student Files folder
Create a new query and type the following code. Run the code using the PLAY button.
SQL Code
SELECT
CustomerKey AS CustomerID,
SUM(SalesAmount) AS SalesAmount
FROM FactInternetSales
GROUP BY CustomerKey
An advanced notepad is a great way to create a code repository, for SQL and other coding languages.
SELECT and FROM are the two most basic key words that form part of any SQL query.
SELECT * allows us to see all the columns in a table, however we should only use this for testing.
SQL Code
SELECT *
Query
FROM FactInternetSales
We can include specific columns in our SELECT statement to make our query more specific.
SQL Code
SELECT
SalesOrderNumber,
OrderDate,
Query SalesAmount,
TaxAmt,
OrderQuantity
FROM FactInternetSales
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
Code SalesAmount,
TaxAmt,
OrderQuantity
FROM FactInternetSales
WHERE allows us to filter the rows that are selected in the FROM tables.
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SalesAmount,
TaxAmt,
Query OrderQuantity,
SalesTerritoryKey
FROM FactInternetSales
WHERE SalesTerritoryKey = 6
After using a WHERE filter, use the ROW count at the bottom of the query to see how many rows remain.
We can use WHERE to filter to a single invoice number to help us test the group by in the next video.
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SalesAmount,
TaxAmt,
Query
OrderQuantity
FROM FactInternetSales
--WHERE SalesTerritoryKey = 6
WHERE SalesOrderNumber = ‘S051182’
InvoiceNumber OrderDate SalesAmount TaxAmt OrderQuantity InvoiceNumber OrderDate SalesAmount TaxAmt OrderQuantity
SalesOrderNumber AS InvoiceNumber,
Every column in the selected dataset must either be grouped, or
OrderDate,
aggregated.
SUM(SalesAmount) AS InvoiceSubTotal,
SUM(TaxAmt) AS TaxAmount,
• Groups are determined by the attributes we specify in the
SUM(OrderQuantity) AS TotalQuantity
GROUP BY statement
FROM FactInternetSales
• Aggregate functions are listed in the SELECT statement.
--WHERE SalesTerritoryKey = 6
WHERE SalesOrderNumber = ‘S051182’
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal,
SUM(TaxAmt) AS TaxAmount,
Query SUM(OrderQuantity) AS TotalQuantity
FROM FactInternetSales
--WHERE SalesTerritoryKey = 6
WHERE SalesOrderNumber = ‘S051182’
Updated request: View only invoices where total amount is more than $1,000.
The HAVING statement allows us to filter grouped rows in our output table.
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal,
SUM(TaxAmt) AS TaxAmount,
SUM(OrderQuantity) AS TotalQuantity
Query
FROM FactInternetSales
WHERE SalesTerritoryKey = 6
--WHERE SalesOrderNumber = ‘S051182’
This means the results of the SELECT statement such as column aliases cannot be used in earlier steps.
The ORDER BY statement allows us to sort the final rows in our table.
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal,
SUM(TaxAmt) AS TaxAmount,
SUM(OrderQuantity) AS TotalQuantity
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal,
SUM(TaxAmt) AS TaxAmount,
SUM(OrderQuantity) AS TotalQuantity,
SUM(SalesAmount) + SUM (TaxAmt) AS InvoiceTotal
Query
FROM FactInternetSales
WHERE SalesTerritoryKey = 6
--WHERE SalesOrderNumber = ‘S051182’
The TOP (N) statement allows us to return the first N rows from our output table.
You will need to use ORDER BY to ensure you get the true TOP rows.
SQL Code
SELECT TOP(10)
FROM
SalesOrderNumber AS InvoiceNumber,
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal, WHERE
SUM(TaxAmt) AS TaxAmount,
SUM(OrderQuantity) AS TotalQuantity,
SUM(SalesAmount) + SUM (TaxAmt) AS InvoiceTotal GROUP BY
Query
FROM FactInternetSales
WHERE SalesTerritoryKey = 6 HAVING
The TOP (N) PERCENT statement allows us to return the first N % of rows from our output table.
SQL Code
Query
FROM FactInternetSales
WHERE SalesTerritoryKey = 6 HAVING
--WHERE SalesOrderNumber = ‘S051182’
The OFFSET FETCH statement is similar to TOP but allows us to skip the first N rows from our output table.
SQL Code
SELECT
SalesOrderNumber AS InvoiceNumber, FROM
OrderDate,
SUM(SalesAmount) AS InvoiceSubTotal,
WHERE
SUM(TaxAmt) AS TaxAmount,
SUM(OrderQuantity) AS TotalQuantity,
SUM(SalesAmount) + SUM (TaxAmt) AS InvoiceTotal GROUP BY
FROM FactInternetSales
Query WHERE SalesTerritoryKey = 6 HAVING
--WHERE SalesOrderNumber = ‘S051182’
The DISTINCT statement allows us to return a unique list of rows in the output table.
SQL Code
FROM
WHERE
SELECT DISTINCT
CustomerKey GROUP BY
Query
FROM FactInternetSales HAVING
ORDER BY CustomerKey
SELECT & DISTINCT
ORDER BY
We can order our final output table using a column that is not present in our output table.
SQL Code
SELECT
CustomerKey
Query
FROM FactInternetSales
ORDER BY SalesAmount
1. Return only invoices that HAVE a total product cost 2 SO43702 2171.2942
4 SO43706 2171.2942
Hint: You first need to group by the invoice to get the
5 SO43707 2171.2942
total and then filter.
6 SO43709 2171.2942
7 SO43710 2171.2942
..
We need a detailed list of invoices and invoice line numbers, but we’re only interested in currency key 100.
2 SO43700 1 699.0982
1. Return only lines WHERE the currency key is 100.
3 SO43702 1 3578.27
Hint: Since line number is the lowest level of detail in the 4 SO43706 1 3578.27
FactInternetSales table, you won’t need to use GROUP 5 SO43707 1 3578.27
BY.
6 SO43711 1 3578.27
7 SO43713 1 3578.27
...
We have a new data analyst in the team who wants to see a unique list of sales territory keys. This will help
her to better understand the database.
SalesTerritory
1. Write a query to return the sales territory column
1 1
from the FactInternetSales table.
2 2
1. Return a unique list of territories only. 3 3
4 4
1. Order the results alphabetically for ease.
5 5
6 6
7 7
8 8
9 9
10 10
Tasks
Transform your data with the help Filter our data by creating
01. of numerical, date and text 02. conditions that test if something
functions. is true or false.
Skills
Aggregate functions help define how we want values to be treated when we use a
GROUP BY in our query.
Function Description
SUM Returns the sum of all the values, or only the DISTINCT values, in the
expression. SUM works with numeric columns only. Null values are ignored.
AVG Returns the average of the values in a group. It ignores null values.
COUNT Returns the number of items found in a group.
MAX Returns the maximum value in a group.
MIN Returns the minimum value in a group
COUNT(*) is a special aggregated function that allows us to count the rows in a table.
SQL Code
SELECT
COUNT(*) AS TotalCustomers,
Query AVG(YearlyIncome) AS AverageIncome,
--YearlyIncome AS TotalIncome –Non aggregated columns cannot be used alongside aggregated columns
FROM DimCustomer
Aggregate functions ignore NULL values. For example COUNT will not include NULL values.
SQL Code
--Notice that when you count the same table by middle name, you get a lower count. That’s because it ignore the NULLs.
Query 3
SELECT COUNT(MiddleName) AS MiddleNameCount FROM DimCustomer
SUM(SalesAmount) AS InvoiceSubTotal,
ROUND(SUM(SalesAmount),1) AS InvoiceSubTotalRounded,
SUM(TaxAmt) AS TaxAmount,
FLOOR(SUM(TaxAmt)) AS TaxAmountFloor,
Query
SUM(OrderQuantity) AS TotalQuantity,
SUM(SalesAmount) + SUM(TaxAmt) AS InvoiceTotal
FROM FactInternetSales
WHERE SalesTerritoryKey = 6
The BOOLEAN data type is represented by the BIT number type in SQL.
SQL Code
SELECT *
FROM DimProduct
Query
WHERE FinishedGoodsFlag = 1
DATETIME* ‘YYYYMMDD hh:mm:ss.nnn’ 8 bytes January 1st ,1753 though December 31st ,9999
2020-01-01 11:45:32.547
SMALLDATETIME* ‘YYYYMMDD hh:mm’ 4 bytes January 1st,1900 through June 6th,2079
2020-01-01 11:45
DATE ‘YYYY-MM-DD’ 3 bytes January 1st,0001 through Dec 31,9999
2020-01-01
TIME ‘hh:mm:ss:nnnnnnn’ 3-5 bytes Stores times only to an accuracy of 100
11:45:42.4356456 nanoseconds
DATETIME2 ‘YYYYMMDD hh:mm:ss:nnnnnnn’ 6 to 8 January 1st,0001 through Dec 31,9999
2020-01-01 11:45:42.4356456 bytes
DATETIMEOFFSET ‘YYYYMMDD hh:mm:ss:nnnnnnn’ +/- 8 to 10 January 1st,0001 through Dec 31,9999
hh:mm’ bytes
2020-01-01 11:45:42.4356456 + 08:00
*These data types are legacy types, meaning they are still supported, but are not as up to date or accurate.
Parameter Description
year, yyyy, yy Returns the year
quarter, qq, q Returns the quarter
month, mm, m Returns the month
day, dd, d Returns the day of the month
dayofyear, dy, y Returns the day of the year
week, ww, wk Returns the week
weekday, dw, w Returns the weekday
hour, hh Returns the hour
Date and time functions allow us to manipulate date and time values in a column.
SQL Code
SELECT
GETDATE() AS DateTimeStamp
DueDate,
Query ShipDate,
DATEDIFF (day, ShipDate, DueDate) AS DaysBetweenShippedAndDueDate,
DATEDIFF (hour, ShipDate, DueDate) AS HoursBetweenShippedAndDueDate
FROM FactInternetSales
SQL Code
SELECT
GETDATE() AS DateTimeStamp
DueDate,
ShipDate,
Query DATEDIFF (day, ShipDate, DueDate) AS DaysBetweenShippedAndDueDate,
DATEDIFF (hour, ShipDate, DueDate) AS HoursBetweenShippedAndDueDate,
DATEADD (day, 10, DueDate) AS DueDatePlusTenDays,
DATEADD (day, -10, DueDate) AS DueDateLessTenDays
FROM FactInternetSales
SQL Code
SELECT
MONTH(‘20201011’) AS MonthNumerical,
Query 1 MONTH(‘2020-10-11’) AS MonthNumerical
Query 2 SELECT
DATENAME(month, ‘02/12/2020’) AS MonthName
Collation tells us how our database deals with case sensitivity and what languages we can use.
SQL Code
Query 2 Right click on the dimProduct table and SELECT TOP 1000
REPLACE Replaces all occurrences of a specified string SELECT REPLACE('Buy Stock', 'Buy', 'Sell') Sell Stock
value with another string
UPPER Changes the format to UPPERCASE SELECT UPPER(‘futures’’) FUTURES
LEN Returns the string length, excluding trailing SELECT LEN (’stock ’) 5
spaces.
The CONCAT function joins several strings together into one long string.
SQL Code
SELECT
EnglishProductName AS ProductName,
EnglishDescription AS ProductDescription,
Query 1
CONCAT(EnglishProductName, '-', EnglishDescription) AS ProductNameAndDescription
FROM DimProduct
WHERE ProductKey=555
SELECT *
Query 2
FROM DimProduct
These are the most common string functions, allowing us to perform manipulations on text.
SQL Code
SELECT
EnglishProductName AS ProductName,
EnglishDescription AS ProductDescription,
CONCAT(EnglishProductName, '-', EnglishDescription) AS ProductNameAndDescription,
LEN(EnglishDescription) AS DescriptionLength,
Query 1
UPPER(EnglishProductName) AS UpperProductName,
LOWER(EnglishProductName) AS LowerProductName,
REPLACE(EnglishProductName, 'Front', 'Ultra Durable Front' ) AS EnglishProductNameReplaced
FROM DimProduct
WHERE ProductKey=555
SELECT *
Query 2
FROM DimProduct
We can use LEFT and RIGHT to return a dynamic number of characters from a string.
SQL Code
SELECT
ProductKey,
ProductAlternateKey,
EnglishProductName AS ProductName,
EnglishDescription AS ProductDescription,
CONCAT(EnglishProductName, '-', EnglishDescription) AS ProductNameAndDescription,
LEN(EnglishDescription) AS DescriptionLength,
Query UPPER(EnglishProductName) AS UpperProductName,
LOWER(EnglishProductName) AS LowerProductName,
REPLACE(EnglishProductName, 'Front', 'Ultra Durable Front' ) AS EnglishProductNameReplaced
LEFT(ProductAlternateKey,2) AS ProductShort,
RIGHT(ProductAlternateKey,LEN(ProductAlternateKey)-3) AS ProductSize
FROM DimProduct
WHERE ProductKey=555
Comparison operators are usually used to determine if a certain condition is TRUE OR FALSE.
However, comparisons that involve a NULL return UNKNOWN.
=
Equal To
>
Greater than
<
Less than
≠
Not equal to
≥
Greater or
≤
Less or equal
equal to to
NULLs are not captured with regular comparison operators. We have to identify them with IS NULL.
SQL Code
SELECT * SELECT * SELECT *
FROM DimProduct FROM DimProduct FROM DimProduct
Query 1 WHERE Class <> ‘H’ WHERE Class <> ‘H’
SELECT *
Query 3 FROM DimProduct
WHERE Class <> ‘H’ OR Class IS NULL
OR AND NOT
Example Class <> ‘H’ OR Class IS NULL Size = ‘XL’ AND Color = ‘Black’ NOT(Color = ’Black’)
SQL Code
SELECT
EnglishProductName,
EnglishDescription,
Color,
[Status],
Class
Query
FROM DimProduct
IN BETWEEN
IN and BETWEEN can help us simplify our code when we are dealing with multiple conditions.
SQL Code
FROM DimProduct
Query 1
WHERE (SafetyStockLevel BETWEEN 500 AND 1000) AND [Status] IS NOT NULL --BETWEEN IS INCLUSIVE OF BOTH END
-- WHERE (SafetyStockLevel >= 500 AND SafetyStockLevel <= 1000) AND [Status] IS NOT NULL
FROM DimProduct
Query 2
WHERE Color IN (‘Black’, ‘Silver’, ’White’ , ’Yellow’)
WHERE Color = ‘Black’ OR Color = ‘Silver’ OR Color =’White’ AND ‘Color’=‘Yellow’
WILDCARDS
LIKE
% represents any number of characters
_ represents one character
TRUE if the value matches a
specified pattern
EnglishProductName
Example
LIKE ‘%BRAKE%’
SQL Code
SELECT
FirstName,
EmailAddress
Query
FROM DimCustomer
SQL Code
SELECT
FirstName,
LastName,
Query YearlyIncome,
EmailAddress,
IIF(YearlyIncome > 50000, ‘Above Average’ , ’Below Average’ ) AS IncomeCategory
FROM DimCustomer
SQL Code
SELECT
FirstName,
LastName,
YearlyIncome,
EmailAddress,
IIF(YearlyIncome > 50000, ‘Above Average’ , ’Below Average’ ) AS IncomeCategory,
CASE
Query WHEN NumberChildrenAtHome = 0 THEN '0’
WHEN NumberChildrenAtHome = 1 THEN '1’
WHEN NumberChildrenAtHome BETWEEN 2 AND 4 THEN '2-4’
WHEN NumberChildrenAtHome >=5 THEN '5+’
ELSE 'UNKN’
END AS NumberOfChildrenCategory,
NumberOfChildrenCategory AS ActualChildren
FROM DimCustomer
SQL Code
SELECT
FirstName,
LastName,
YearlyIncome,
EmailAddress,
IIF(YearlyIncome > 50000, ‘Above Average’ , ’Below Average’ ) AS IncomeCategory,
CASE
Query WHEN NumberChildrenAtHome = 0 THEN '0’
WHEN NumberChildrenAtHome = 1 THEN '1’
WHEN NumberChildrenAtHome BETWEEN 2 AND 4 THEN '2-4’
WHEN NumberChildrenAtHome >=5 THEN '5+’
ELSE 'UNKN’
END AS NumberOfChildrenCategory,
NumberOfChildrenCategory AS ActualChildren
FROM DimCustomer
SQL Code
SELECT
FirstName,
LastName,
YearlyIncome,
EmailAddress,
IIF(YearlyIncome > 50000, ‘Above Average’ , ’Below Average’ ) AS IncomeCategory,
CASE
WHEN NumberChildrenAtHome = 0 THEN '0’
Query WHEN NumberChildrenAtHome = 1 THEN '1’
WHEN NumberChildrenAtHome BETWEEN 2 AND 4 THEN '2-4’
WHEN NumberChildrenAtHome >=5 THEN '5+’
ELSE 'UNKN’
END AS NumberOfChildrenCategory,
NumberOfChildrenCategory AS ActualChildren
FROM DimCustomer
WHERE IIF(YearlyIncome > 50000, ‘Above Average’ , ’Below Average’ ) = ‘Above Average’
FROM DimCustomer
SQL Code
SELECT
SalesAmount,
CAST(SalesAmount AS INT) AS SalesAmountCast,
Query
OrderDate,
CAST(OrderDate AS DATE) AS OrderDateCast
FROM FactInternetSales
In this example, the CAST operator is used to change two INTEGERS into DECIMALS.
SQL Code
/*SELECT
SalesAmount,
CAST(SalesAmount AS INT) AS SalesAmountCast,
OrderDate,
CAST(OrderDate AS DATE) AS OrderDateCast
FROM FactInternetSales*/
Query
SELECT
EnglishProductName,
ReOrderPoint,
SafetyStockLevel
CAST(ReOrderPoint AS DECIMAL(8,4)) / CAST(SafetyStockLevel AS DECIMAL(8,4)) AS PctOfTotalSafetyStock
FROM DimProduct
WHERE [Status] = ‘Current’
The National ‘N’ should be used before all strings interacting with a UNICODE column type.
SQL Code
SELECT
EnglishProductName,
EnglishDescription,
Color,
[Status],
Class,
Query
SafetyStockLevel
FROM DimProduct
Earlier, we mentioned that the AND operator is executed before the OR operator.
1. ( ) (Parentheses)
4. =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
5. NOT
6. AND
8. = (Assignment)
Sales territory 1 need a summary of their sales for the lead up period to Christmas.
Marketing need a list of homeowner customers, along with the number of cars owned.
Tasks
Skills
1979
Relational Software Inc. introduced
the first commercially available
implementation of SQL Today
Advantages Advantages
o Security o No upfront cost
o Flexibility o Easy to scale
Disadvantages Disadvantages
o Upfront cost o Targeted by hackers
o Requires expertise o Cost escalation
Personal Data
PII
(Personally
Identifiable Information)
OLTP DW
ETL
Extract, Transform, and Load
o Designed for efficient data entry o Efficient for analysis and reporting
o Reduce data redundancy o Data Warehouse 🡪 Organization
o Fast at reading / writing data o Data Mart 🡪 Specific team
Department
First Name Last Name Salary Hourly Salary Department Location
Manager
Department
First Name Last Name Salary Hourly Salary Department Location
Manager
Normalization
Category
CategoryID Category
Product Name DimProduct
1 Cell Phone Product Cateogy BrandID Product Product Category Brand Product
ID ID Name ID Name
2 Tablet
1 1 1 Iphone X 1 Cell Phone Apple Iphone X
2 1 1 Iphone
XI 2 Cell Phone Apple Iphone XI
Brand
3 2 1 Ipad 3 Tablet Apple Ipad 10.2
BrandID Brand Name 10.2
4 1 2 Galaxy 4 Cell Phone Samsung Galaxy
1 Apple Note20 Note20
2 Samsung
Dimension tables provide descriptive information about the attributes in the fact table.
A simple star schema has one central fact table, and a number of single dimension tables.
Sub-Category Table
(Dimension Table)
Order CustomerID ProductID SubCategory CategoryID(FK
Date (FK) Revenue SubCategory
ID (PK) (FK) (FK) ID (PK) )
152156 01/12/2020 261.24 1 3 1 Water Bottles 1
A snowflack schema has one central fact table and includes dimension tables which are further
normalized with additional dimension tables.
Tasks
Skills
Scenario
Marketing would like a list of top customers
by sales, along with email addresses.
Many to One
Each key can only appear once in one table, but
may appear many times in the other.
E.g. Product tbl to Sales tbl
One to One
A single occurrence of the key in each table.
E.g. Tax Payers tbl to Social Security tbl
Many to Many
Potentially many occurrences of each key in each
table.
E.g. Customer tbl to Addresses tbl
FROM JOIN
OrderID Revenue CustomerID CustomerID Customer Name
Inner Join
Returns only the rows
where the linking value(s) OrderID Revenue CustomerID Customer Name
match in both tables. 152156 261.24 CG-12520 Claire Gute
The INNER JOIN here matches the Customer Key (FK) from FactInternetSales to the Customer Key (PK)
from DimCustomer.
SQL Code
SELECT *
Here we GROUP BY Customer Names and Email Addresses, to ensure we have unique customers.
We then used ORDER BY – DESC to rank the unique customers, and TOP to select only 100.
SQL Code
SELECT TOP(100)
We can use INNER JOIN again, to JOIN a second table to our fact table.
SQL Code
SELECT TOP(100)
FROM FactInternetSales AS fs
INNER JOIN DimCustomer AS dc
Query
ON fs.CustomerKey = dc.CustomerKey
INNER JOIN DimCurrency AS dcy
ON fs.CurrencyKey = dcy.CurrencyKey
In this example, it is more efficient to filter the original data using WHERE.
SQL Code
SELECT TOP(100)
FROM FactInternetSales AS fs
INNER JOIN DimCustomer AS dc
ON fs.CustomerKey = dc.CustomerKey
Query
INNER JOIN DimCurrency AS dcy
ON fs.CurrencyKey = dcy.CurrencyKey
The RIGHT JOIN keeps the whole table right of the join (DimProduct)
SQL Code
SELECT
dp.EnglishProductName AS ProductName,
dp.Color AS ProductColor,
ISNULL(dp.Size,'UNKN') AS ProductSize,
ISNULL(SUM(fs.SalesAmount),0) AS SalesAmount
FROM FactInternetSales AS fs
Query
RIGHT JOIN DimProduct AS dp
ON fs.ProductKey = dp. ProductKey
Here we return all rows from our main table, and find matches where available from our JOIN table.
SQL Code
SELECT
dp.EnglishProductName AS ProductName,
dp.Color AS ProductColor,
ISNULL(dp.Size,'UNKN') AS ProductSize,
ISNULL(SUM(fs.SalesAmount),0) AS SalesAmount
FROM DimProduct AS dp
Query LEFT JOIN FactInternetSales AS fs
ON dp.ProductKey = fs. ProductKey
FactInternetSales
SalesOrder SalesOrder Revenue dimSalesReason
Number Line Reason Sales
SO51178 1 2319 ID Reason
FactInternetSalesReason
SalesOrder SalesOrder Reason
Number Line ID
FactInternetSales
SO51178 1 1
SalesOrder SalesOrder Revenue dimSalesReason
Number Line SO51178 1 2
Reason Sales
SO51178 1 2319 SO51178 2 1 ID Reason
SO51187 2 1
SO51187 2 2
The second INNER JOIN creates a link between the bridge table and the dimension.
SQL Code
SELECT
--fs.SalesOrderNumber AS InvoiceNumber,
--fs.SalesOrderLineNumber AS InvoiceLineNumber,
dsr.SalesReasonReasonType AS SalesReason,
SUM(fs.SalesAmount) AS SalesAmount
FROM FactInternetSales AS fs
Query
INNER JOIN FactInternetSalesReason AS fsr
ON fs.SalesOrderNumber = fsr.SalesOrderNumber AND fs.SalesOrderLineNumber = fsr.SalesOrderLineNumber
INNER JOIN DimSalesReason dsr
ON fsr.SalesReasonKey = dsr.SalesReasonKey
GROUP BY dsr.SalesReasonReasonType
The MANY to MANY nature of this connection means we should use caution when presenting results.
A UNION combines 2 or more tables by adding the rows from one table to another.
1 261.24 1
OrderID Revenue CustomerID
2 14.62 2
1 261.24 1
3 957.57 3 2 14.62 2
3 957.57 3
OrderID Revenue CustomerID
4 1706.18 4
3 957.57 3
4 1706.18 4
A union combines 2 or more tables by adding the rows from one table to another.
1 261.24 1
OrderID Revenue CustomerID
2 14.62 2
1 261.24 1
3 957.57 3 2 14.62 2
3 957.57 3
OrderID Revenue CustomerID
3 957.57 3
3 957.57 3
4 1706.18 4
4 1706.18 4
The UNION should combine two queries that have the same columns.
The ORDER BY can only be used after the UNION of the two queries.
SQL Code
-- For illustrative purposes only. For full code refer to Completed Queries folder.
SELECT
COL1
COL2
COL3
FROM FactInternetSales
Query UNION
SELECT
COL1
COL2
COL3
FROM FactResellerSales
ORDER BY COL3
SELF JOINS are particularly common when dealing with hierarchical data, such as manager employee relationships.
The Cross Join combines every row in the first table with every row from the second table
CardType
MasterCard
Visa CardType Color
Cross Join
MasterCard Blue
Visa Blue
Color MasterCard Yellow
Blue Visa Yellow
Yellow MasterCard Red
Red Visa Red
SQL Code
-- For illustrative purposes only. For full code refer to Completed Queries folder.
SELECT
Query
COL1
COL2
COL3
FROM FactResellerSales
ORDER BY COL3
We can query a pre-defined view, which means we don’t need to re-do all our hard work, including JOINS.
SQL Code
SELECT *
Query
FROM vwOrdersALL
A subquery (inner query) returns data that we can re-use in our main query (outer query).
SQL Code
SELECT *
FROM vwOrdersALL
SELECT
InvoiceNumber,
InvoiceLineNumber,
Query OrderDate,
SalesAmount,
ProductName,
ProductSubcategory
FROM vwOrdersALL
Summarize the Internet Sales by Subcategory and return the top 5 subcategories.
2. We’re only interested in sales from our website 1 Road Bikes 4289925.9
(internet sales). 2 Mountain Bikes 3417457.74
3 Touring Bikes 1292475.9
1. Finally, the data should only include sales where
the country is United States and the currency is 4 Tires and Tubes 88762.86
It’s performance review time. HR Europe need to see sales by sales representative, and by currency.
Tasks
Connect to our database and views Create three summary reports that
01. from popular BI tools 02. require more advanced
functions
Skills
When using GROUP BY across multiple columns, we lose the ability to see subtotals.
CUBE allows us to see the subtotals and totals of each row combination.
SQL Code
SELECT
Region,
ProductSubCategory,
SUM(SalesAmount) AS TotalSales
Query
FROM vwOrdersALL
ROLLUP gives a similar outcome to CUBE, but performs better on hierarchical data.
SQL Code
SELECT
Region,
ProductSubCategory,
SUM(SalesAmount) AS TotalSales
Query
FROM vwOrdersALL
We can use a subquery to calculate the grand total, and use it in our SELECT statement to calc % of total.
SQL Code
SELECT
Source AS Reseller,
SUM(SalesAmount) AS Sales,
SUM(SalesAmount) / (SELECT SUM(SalesAmount) FROM vwOrdersALL WHERE Country = N'United States' AND Source
<> N'Web') AS PctOfTotal
Query
FROM vwOrdersALL
GROUP BY Source
Create a summary of expenditure account totals, and then calculate a PCT of total.
Other Travel
29 Related 3053.93 0.00040814