0% found this document useful (0 votes)
26 views30 pages

Overview of Databricks SQL

Databricks SQL is a data warehousing solution optimized for SQL users, featuring an open file format (Delta) and separation of compute and storage. It supports integration with BI tools and offers various SQL warehouse configurations, including Classic, Pro, and Serverless types. Users can create interactive dashboards and visualizations, utilize query filters and parameters for data analysis, and leverage ANSI SQL for compatibility.

Uploaded by

animillasekhar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views30 pages

Overview of Databricks SQL

Databricks SQL is a data warehousing solution optimized for SQL users, featuring an open file format (Delta) and separation of compute and storage. It supports integration with BI tools and offers various SQL warehouse configurations, including Classic, Pro, and Serverless types. Users can create interactive dashboards and visualizations, utilize query filters and parameters for data analysis, and leverage ANSI SQL for compatibility.

Uploaded by

animillasekhar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

Overview of

Databricks SQL
D ATA B R I C K S C O N C E P T S

Kevin Barlow
Data Practitioner
DATABRICKS CONCEPTS
Databricks for SQL Users
Databricks SQL
Data Warehousing for the Lakehouse

Familiar environment for SQL users


SQL-optimized performance (Photon)

Connect to your favorite BI tools

Comes built into the platform!

DATABRICKS CONCEPTS
Databricks SQL vs. other databases
Databricks SQL Other Data Warehouses
Open file format (Delta) Proprietary data format

DATABRICKS CONCEPTS
Databricks SQL vs. other databases
Databricks SQL Other Data Warehouses
Open file format (Delta) Proprietary data format

Separation of compute and storage Storage often tied to compute

DATABRICKS CONCEPTS
Databricks SQL vs. other databases
Databricks SQL Other Data Warehouses
Open file format (Delta) Proprietary data format

Separation of compute and storage Storage often tied to compute

ANSI SQL Tech-specific SQL

DATABRICKS CONCEPTS
Databricks SQL vs. other databases
Databricks SQL Other Data Warehouses
Open file format (Delta) Proprietary data format

Separation of compute and storage Storage often tied to compute

ANSI SQL Tech-specific SQL

Integrated into other data workloads Usually lacking advanced analytics

DATABRICKS CONCEPTS
SQL in the Lakehouse Architecture

DATABRICKS CONCEPTS
Let's review!
D ATA B R I C K S C O N C E P T S
Getting started with
Databricks SQL
D ATA B R I C K S C O N C E P T S

Kevin Barlow
Data Practitioner
SQL Compute vs. General Compute
Designing compute clusters for data science is inherently different than designing compute
or data engineering workloads... for SQL workloads

import pyspark.sql.functions as F SELECT *


FROM user_table u
spark_df = (spark LEFT JOIN product_use p
.read ON u.userId = p.userId
.table('user_table')) WHERE country = 'USA'
AND utilization >= 0.6
spark_df = (spark_df
.withColumn('score',
F.flatten(...))
)

DATABRICKS CONCEPTS
SQL Warehouse

DATABRICKS CONCEPTS
SQL Warehouse
SQL Warehouse Configuration Options

1. Cluster Name

2. Cluster Size (S, M, L, etc.)


3. Scaling behavior

DATABRICKS CONCEPTS
SQL Warehouse
SQL Warehouse Configuration Options

1. Cluster Name

2. Cluster Size (S, M, L, etc.)


3. Scaling behavior

4. Cluster Type

DATABRICKS CONCEPTS
SQL Warehouse Types
Different types provide different benefits Classic

Most basic SQL compute

In customer cloud

Pro Serverless

More advanced features than Classic Cutting edge features

In customer cloud In Databricks cloud

Most cost performant

DATABRICKS CONCEPTS
SQL Editor

DATABRICKS CONCEPTS
Common SQL Commands
COPY INTO CREATE <entity> AS

Grab raw data and put into Delta Create a Table or View

The Extract of ETL The Transform in ETL

COPY INTO my_table CREATE TABLE events


FROM '/path/to/files' USING DELTA
FILEFORMAT = <format> AS (
FORMAT_OPTIONS ('mergeSchema' = 'true') SELECT *
COPY_OPTIONS ('mergeSchema' = 'true'); FROM raw_events
WHERE ...
)

DATABRICKS CONCEPTS
Let's practice!
D ATA B R I C K S C O N C E P T S
Databricks SQL
queries and
dashboards
D ATA B R I C K S C O N C E P T S

Kevin Barlow
Data Practitioner
Databricks SQL Assets

DATABRICKS CONCEPTS
Databricks SQL Assets

DATABRICKS CONCEPTS
Visualizations
Lightweight, in-platform visualizations
Support for standard visual types

Ability to quickly comprehend data in a graphical way

DATABRICKS CONCEPTS
Databricks SQL Assets

DATABRICKS CONCEPTS
Databricks SQL Assets

DATABRICKS CONCEPTS
Dashboards
Lightweight, easily created dashboards
Ability to share and govern across your organization

Scalable and performant

DATABRICKS CONCEPTS
Query Filters
Filters

Interactive query / dashboard components


that allow the user to reduce the size of the
result dataset SELECT *

Works on the client-side, so is very fast FROM nyctaxi.trips


WHERE pickup_zip = 10103
Supports single select, multi-select, text
AND dropoff_zip = 10023
fields, and date / time pickers

DATABRICKS CONCEPTS
Query Parameters
Parameters

More flexible than filters, and supports


more kinds of selectors SELECT *
Allow the user to provide a value that is FROM nyctaxi.trips
input into the underlying SQL query text WHERE pickup_zip = 10103
AND dropoff_zip = 10023
Created in the query by using the {{ }}
AND {{ nullCheck }} IS NOT NULL
syntax

DATABRICKS CONCEPTS
Let's practice!
D ATA B R I C K S C O N C E P T S
Creating a
Databricks SQL
Dashboard
D ATA B R I C K S C O N C E P T S

Kevin Barlow
Data Practitioner
Let's practice!
D ATA B R I C K S C O N C E P T S

You might also like