Query Execution
● Plan caching: Storing query execution plans to reuse them for similar queries, saving
compilation time.
○ Example: A frequently executed query might use a cached plan instead of
creating a new one.
● Plan validation: Ensuring the cached plan is still valid, especially if data statistics or
schema have changed.
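The cache/validate cycle above can be sketched as follows. This is a minimal illustration, not a real engine's API: plans are keyed on the normalized query text plus a schema version, so a schema change automatically misses the cache and forces recompilation. All names (`PlanCache`, `get_plan`, `compile_fn`) are hypothetical.

```python
# Hypothetical plan cache: keyed on (normalized SQL, schema version), so a
# schema change invalidates cached plans implicitly via a cache miss.
class PlanCache:
    def __init__(self):
        self._plans = {}

    def get_plan(self, sql, schema_version, compile_fn):
        key = (sql.strip().lower(), schema_version)
        if key not in self._plans:            # miss: pay compilation cost once
            self._plans[key] = compile_fn(sql)
        return self._plans[key]               # hit: reuse the stored plan

cache = PlanCache()
plan = cache.get_plan("SELECT * FROM t", schema_version=1,
                      compile_fn=lambda sql: ("plan", sql))
```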
3. Scans
Interesting Orderings
● An "interesting ordering" occurs when an operation’s output order aligns with
requirements of a subsequent operation (e.g., sorting data for a merge join).
● Access plans: Focus on how to retrieve data from storage (e.g., choosing between an
index scan or table scan).
● Dataflow plans: Represent the flow of data between operations in a query (e.g.,
pipelines in parallel execution).
● Serial execution: Executes one operation at a time. Simpler but slower for large
queries.
● Parallel execution: Splits the query into tasks to run concurrently on multiple CPUs or
nodes. Faster but more complex.
Co-partitioning
● Ensuring data is partitioned in the same way across different nodes to avoid shuffling
during operations like joins.
○ Example: Partitioning two tables by the same column used in a join.
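A minimal sketch of co-partitioning, with illustrative table and column names: both tables are hash-partitioned on the join column with the same function and partition count, so every partition pair can be joined locally with no cross-node shuffle.

```python
# Hash-partition rows on a key column; same function + column + partition
# count on both tables => co-partitioned.
def partition(rows, key, n_parts):
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts

orders   = [{"cust": 1, "amt": 10}, {"cust": 2, "amt": 20}]
payments = [{"cust": 1, "paid": 10}, {"cust": 2, "paid": 20}]

order_parts   = partition(orders, "cust", 4)
payment_parts = partition(payments, "cust", 4)

# Each "node" joins only its local partition pair -- no data movement.
joined = [(o, p)
          for op, pp in zip(order_parts, payment_parts)
          for o in op for p in pp if o["cust"] == p["cust"]]
```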
Interesting Partitioning
● A partitioning scheme that minimizes data movement and optimizes performance for
specific operations.
Query Execution Architecture
● Scans: The methods used to access data, as previously mentioned (e.g., full table
scan, index scan).
● Iterators: Implement a pull-based model for executing queries, where data is
retrieved one row (or block) at a time in a pipeline manner.
○ Example: A nested loop join uses iterators to fetch rows from each input
one by one.
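The pull-based model can be sketched with two toy operators (names are illustrative, in the spirit of the Volcano iterator model): each operator exposes a `next()` call that pulls one row at a time from its child, so rows flow through the pipeline without materializing intermediate results.

```python
# Pull-based iterator sketch: each operator's next() pulls one row from its child.
class Scan:
    def __init__(self, rows):
        self._it = iter(rows)
    def next(self):
        return next(self._it, None)           # None signals end-of-stream

class Filter:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next(self):
        row = self.child.next()
        while row is not None and not self.pred(row):
            row = self.child.next()           # keep pulling until a row qualifies
        return row

plan = Filter(Scan([1, 2, 3, 4]), lambda r: r % 2 == 0)
out = []
row = plan.next()
while row is not None:
    out.append(row)                           # rows arrive one at a time (pipelined)
    row = plan.next()
```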
2. Unary Matching
3. Binary Operations
● Joins: Combines rows from two datasets based on a condition. Types include:
○ Inner join: Matches rows with equal keys in both datasets.
○ Outer join: Includes unmatched rows from one or both datasets.
○ Semi join: Checks for the existence of a match in the second dataset but
returns only the first dataset’s rows.
● Set operations: Perform operations like INTERSECT, UNION, or EXCEPT.
● Bloom filters: Probabilistic data structures used to filter out rows early in the
pipeline.
○ Example: In a join, a Bloom filter can eliminate rows that are definitely not
in the join key set.
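A minimal Bloom-filter sketch (illustrative, not production-tuned): the build side of the join inserts its keys, and the probe side discards rows whose key is definitely absent before the join proper runs. False positives are possible; false negatives are not.

```python
# Tiny Bloom filter: k hash positions per key set in an m-bit vector.
class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0
    def _positions(self, key):
        return [hash((i, key)) % self.m for i in range(self.k)]
    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p
    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits >> p & 1 for p in self._positions(key))

bf = BloomFilter()
for key in [1, 2, 3]:                 # build-side join keys
    bf.add(key)

probe = [1, 5, 2, 9]
survivors = [k for k in probe if bf.might_contain(k)]   # early row elimination
```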
4. Nested Iteration
● Nested iteration: Used for implementing nested-loop joins, where one dataset is
scanned for each row of the other dataset.
○ Efficient when one dataset is small or indexed.
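A bare-bones sketch of nested iteration (illustrative names): the inner input is re-scanned for every outer row, which is why a small or indexed inner input matters so much.

```python
# Nested-loop join: scan `inner` once per row of `outer`.
def nested_loop_join(outer, inner, key):
    for o in outer:
        for i in inner:                   # O(|outer| * |inner|) comparisons
            if o[key] == i[key]:
                yield o, i

outer = [{"id": 1}, {"id": 2}]
inner = [{"id": 2, "v": "a"}, {"id": 1, "v": "b"}]
result = list(nested_loop_join(outer, inner, "id"))
```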
6. Memory Management
● Bushy plans: Query execution plans where multiple subtrees are processed
independently before being combined.
○ Example: (A JOIN B) JOIN (C JOIN D) is a bushy plan because the two
joins (A JOIN B) and (C JOIN D) can be executed in parallel.
○ Scheduling: Determines the order and concurrency of executing these
subtrees.
Query execution – scans
● Row stores:
Data is stored row by row (useful for OLTP
workloads).
● Column stores: Data is stored column by column (optimized for
analytical workloads like OLAP).
○ Scans in column stores can skip irrelevant columns,
improving efficiency.
Indexes
5. Shared Scans
7. Sampling Scans
● Instead of reading the entire dataset, a sampling scan retrieves a
subset of the data for approximate query processing or
statistical analysis.
○ Example: Selecting 1% of rows to estimate an aggregate.
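The 1% example can be sketched directly: read a random subset and scale the aggregate up by the sampling fraction to get an approximate answer.

```python
# Sampling scan sketch: estimate SUM from a ~1% random sample.
import random

random.seed(42)                            # fixed seed for a repeatable demo
rows = list(range(100_000))                # stand-in for a large table
fraction = 0.01
sample = [r for r in rows if random.random() < fraction]
estimated_sum = sum(sample) / fraction     # scale up by 1/fraction
```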
8. “Zone” Filters
Matching Techniques:
4. Exchange/Shuffle (Parallelism)
● In a join, data from two tables might be shuffled so rows with the
same key end up on the same node.
Example:
A|B
--+--
1|x
1|y
2|z
2|w
Approach:
Divide the dataset into segments where the value of A is the same,
then sort each segment on B.
Output:
A|B
--+--
1|x
1|y
2|w
2|z
Use Case: This is useful when you need to refine existing order
without re-sorting the entire dataset.
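The example above can be sketched in a few lines: because the input is already ordered on A, only each equal-A segment needs sorting on B.

```python
# Segmented sort: input already ordered on A, so sort each A-run on B only.
from itertools import groupby

rows = [(1, "x"), (1, "y"), (2, "z"), (2, "w")]   # already ordered on A
out = []
for _, seg in groupby(rows, key=lambda r: r[0]):  # split into equal-A segments
    out.extend(sorted(seg, key=lambda r: r[1]))   # sort each segment on B
# out == [(1, "x"), (1, "y"), (2, "w"), (2, "z")]
```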
Example:
A|B
--+--
1|x
1|y
2|w
2|z
Approach:
Identify runs already sorted on B within each A value (e.g., [x, y] and
[w, z]), then merge those runs on (B, A) without a full re-sort.
Output:
B|A
--+--
w|2
x|1
y|1
z|2
Use Case: This is common in queries where the primary sort key
becomes less relevant or when merging datasets with overlapping
sort orders.
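The example above can be sketched as follows: the input is sorted on (A, B), so each equal-A segment is already a run sorted on B, and a merge of those runs on (B, A) produces the new order without a full re-sort.

```python
# Merge "natural" runs: input sorted on (A, B) => each A-segment is a run
# sorted on B; merging the runs on (B, A) avoids re-sorting everything.
from heapq import merge
from itertools import groupby

rows = [(1, "x"), (1, "y"), (2, "w"), (2, "z")]   # sorted on (A, B)
runs = [list(seg) for _, seg in groupby(rows, key=lambda r: r[0])]
out = list(merge(*runs, key=lambda r: (r[1], r[0])))
# out == [(2, 'w'), (1, 'x'), (1, 'y'), (2, 'z')]
```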
Order-Based Algorithms
1. Joins
2. Set Operations
Index Nested-Loop Join
● Approach: For each row in the outer table, use an index on the
inner table to find matching rows.
● When to Use: Efficient if the inner table has an index on the join
key.
● Drawback: High overhead if many lookups are required or if the
index is not selective.
Sort-Based Algorithms
● Merge Join:
○ Both datasets are sorted on the join key, and a single pass
merges them.
○ Efficient for large datasets with pre-sorted inputs or when
sorting is inexpensive.
● Zigzag Merge Join:
○ A variation of merge join that efficiently handles multiple
sorted inputs or overlapping ranges.
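A minimal merge-join sketch over two inputs already sorted on the join key. The equal-key handling (rewinding the right cursor when the left side repeats a key) is the subtle part; names here are illustrative.

```python
# Merge join: single forward pass over two inputs sorted on the join key.
def merge_join(left, right, key):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            j0 = j                                  # remember start of equal-key group
            while j < len(right) and right[j][key] == lk:
                out.append((left[i], right[j]))     # emit matches for this left row
                j += 1
            i += 1
            # if the next left row has the same key, rewind the right cursor
            j = j0 if i < len(left) and left[i][key] == lk else j
    return out

left  = [{"k": 1}, {"k": 2}, {"k": 2}]
right = [{"k": 2, "v": "a"}, {"k": 3, "v": "b"}]
result = merge_join(left, right, "k")
```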
Hash-Based Algorithms
● Hash Join:
○ The smaller dataset is hashed on the join key, and the
larger dataset is probed against the hash table.
○ When to Use: Suitable for large datasets where sorting is
not feasible.
○ Variants:
■ Adaptive Join: Dynamically switches between join
strategies based on data characteristics (e.g.,
memory availability).
■ Symmetric Hash Join: Allows early output by building
hash tables for both datasets and probing them
incrementally.
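A minimal hash-join sketch (illustrative names): build a hash table on the smaller input, then probe it with the larger one.

```python
# Hash join: build phase on the small input, probe phase on the large one.
from collections import defaultdict

def hash_join(build, probe, key):
    table = defaultdict(list)
    for row in build:                 # build: hash the smaller input
        table[row[key]].append(row)
    return [(b, p)                    # probe: look up each large-input row
            for p in probe
            for b in table.get(p[key], [])]

small = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}]
big   = [{"k": 2, "b": 1}, {"k": 3, "b": 2}, {"k": 1, "b": 3}]
result = hash_join(small, big, "k")
```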
4. Filters
5. Co-Group-By
Summary of Strategies
This section touches upon critical optimization techniques and considerations for efficient query
execution. Here’s a breakdown:
1. Asynchronous Fetch
● Definition: Overlaps the fetching of data from storage with computation to avoid delays
caused by slow I/O operations.
● Advantage: Improves performance by keeping the processor busy while waiting for I/O.
● Example: Fetching the next block of rows while processing the current block.
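That overlap can be sketched with a worker thread; `fetch` is a hypothetical stand-in for a slow block read.

```python
# Asynchronous fetch sketch: fetch block N+1 on a worker thread while
# block N is being processed.
from concurrent.futures import ThreadPoolExecutor

def fetch(block_id):                      # stand-in for a slow I/O read
    return [block_id * 10 + i for i in range(3)]

results = []
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(fetch, 0)        # prefetch the first block
    for block_id in range(1, 4):
        rows = future.result()            # wait (if needed) for current block
        future = pool.submit(fetch, block_id)  # start fetching the next block...
        results.extend(rows)              # ...while "processing" this one
    results.extend(future.result())       # drain the last block
```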
2. Poor Man’s Merge Join
● Definition: A simplified or approximate merge join used when full sorting isn’t feasible or
necessary.
● How it Works:
○ Uses partially ordered data to perform a merge-like operation without the
overhead of complete sorting.
○ Often relies on "interesting orderings" already present in the data.
3. Smooth Scans
● Definition: Adjusts the level of parallelism dynamically during query execution based on
system load or query progress.
● Advantage: Prevents resource contention and optimizes performance in real-time.
7. Bottom-Up Activation
● Definition: Starts query execution with the most basic operations (e.g., scans) and
propagates results upwards in the query plan.
● Example: In a tree-structured query plan, the leaves (data sources) are processed first,
and intermediate nodes (joins, aggregates) are activated as their inputs become
available.
8. MDAM (Multi-Dimensional Access Method)
● Definition: Optimized search over multi-dimensional indexes, often used for complex
predicates.
● Example: Efficiently querying a dataset with conditions like (A > 10 AND B < 20).
9. Adaptive Join
● Definition: A join technique that adjusts its execution strategy based on runtime
information.
● Variants:
○ Switch between nested-loops, hash join, or merge join based on available
memory and data characteristics.
○ Output early results (symmetric join).
10. Group + Join
● Definition: Combines grouping and joining operations into a single pass to improve
performance.
● Example: Instead of grouping after a join, the group-by logic is integrated during the join
phase.
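A minimal sketch of the fused form, with illustrative table and column names: the running aggregate is maintained during the join's probe phase instead of materializing the join result and grouping afterwards.

```python
# Group + join fused into one pass: aggregate during the hash-join probe.
from collections import defaultdict

customers = [{"cid": 1, "region": "EU"}, {"cid": 2, "region": "US"}]
orders    = [{"cid": 1, "amt": 10}, {"cid": 1, "amt": 5}, {"cid": 2, "amt": 7}]

region_of = {c["cid"]: c["region"] for c in customers}   # build side
totals = defaultdict(int)
for o in orders:                        # single pass over the probe side
    region = region_of.get(o["cid"])    # join step
    if region is not None:
        totals[region] += o["amt"]      # group-by step, applied immediately
# totals: per-region order totals, with no materialized join result
```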
● Performance Cliffs:
○ Abrupt drops in performance caused by exceeding system limits, like memory
capacity or disk bandwidth.
○ Example: A hash join that spills to disk due to insufficient memory.
● Continuous Cost Functions:
○ Gradual performance degradation as workload increases.
○ Goal: Avoid performance cliffs by ensuring system behavior follows a continuous
cost function.
Summary of Concepts
Concept               | Primary Use                          | Benefit
----------------------+--------------------------------------+-----------------------------------------------
Poor Man’s Merge Join | Simplified join using partial order. | Reduces sorting overhead.
Adaptive Join         | Flexible join strategies.            | Handles varying data and resource constraints.
This section focuses on optimization strategies for query execution, particularly for handling
large datasets and improving performance by using advanced techniques for sorting, merging,
and resource allocation.
Definition: External merge sort is a sorting algorithm used when the dataset is too large to fit in
memory and must be sorted by accessing data from external storage (like disk). The goal is to
minimize I/O operations while sorting.
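A minimal sketch of the two phases under an assumed tiny memory budget: sort memory-sized runs, then merge all runs in one pass. A real system writes runs to disk; plain lists stand in for spill files here.

```python
# External merge sort sketch: run formation, then a single merge pass.
from heapq import merge

def external_sort(rows, mem_limit):
    runs = []
    for start in range(0, len(rows), mem_limit):
        chunk = rows[start:start + mem_limit]
        runs.append(sorted(chunk))        # run formation: sort what fits in memory
    return list(merge(*runs))             # merge phase: one pass over all runs

data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
result = external_sort(data, mem_limit=3)
```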
● Offset-Value Codes:
○ What It Is: A technique for representing data in a way that reduces the amount of
data transferred from disk by encoding the differences (offsets) between values
instead of the full values.
○ Benefit: Reduces data size during sorting, leading to fewer disk accesses.
● Graceful Degradation:
○ What It Is: The ability of the sorting algorithm to scale with available memory. If
memory is limited, the algorithm adjusts to perform efficiently despite having less
space to work with.
○ Benefit: Ensures that even with constrained resources, the sort operation
remains efficient.
● Operations like 'Distinct', 'Group By', 'Pivot', 'Top':
○ These operations are often performed during or after sorting to organize data.
For example:
■ Distinct removes duplicates,
■ Group By aggregates data,
■ Pivot reorganizes data based on key values,
■ Top is used to return a fixed number of top rows based on some criteria.
● Instant Aggregation, Wide Merging:
○ Instant Aggregation: A technique where aggregation (e.g., sum, count) is
applied as soon as data is read or streamed, instead of waiting until after all data
is processed.
○ Wide Merging: A merging strategy that handles wide (many-column) datasets
more efficiently, merging sorted runs or partitions while minimizing memory
usage.
● Co-group-by:
○ What It Is: A technique that allows for performing multiple group-by operations
on different columns simultaneously.
○ Benefit: Optimizes performance by reducing the number of passes over the data.
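The sorted-input operations above (Distinct, Group By, Top) each become a single pass once the data is sorted, since duplicates are adjacent, groups are contiguous, and Top is a prefix. A small sketch:

```python
# On sorted input: Distinct, Group By, and Top each need only one pass.
from itertools import groupby

rows = sorted([("a", 2), ("b", 1), ("a", 3), ("b", 4)])

distinct_keys = [k for k, _ in groupby(r[0] for r in rows)]   # Distinct: adjacent dupes
group_sums = {k: sum(v for _, v in g)                         # Group By: contiguous groups
              for k, g in groupby(rows, key=lambda r: r[0])}
top2 = rows[:2]                                               # Top: prefix of sorted stream
```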
2. MDAM Merge: Segmented Sort + Merging “Natural” Runs
● MDAM (Multi-Dimensional Access Method) merge refers to the strategy of sorting and
merging data in multi-dimensional indexes.
● Segmented Sort: Breaks down the data into manageable segments that can be sorted
and processed independently, reducing I/O and memory overhead.
● Merging “Natural” Runs: Combines pre-sorted runs of data (that may already follow
some natural order) into a larger sorted output without needing full-scale sorting.
● Load Balancing:
○ Distributes data processing tasks evenly across all available resources (e.g.,
processors, memory, nodes) to avoid bottlenecks.
○ Ensures that no single resource is overwhelmed with too much work, improving
overall throughput.
● Tail Balancing:
○ What It Is: Focuses on managing the final stages of a query, where the
remaining data may be small but requires significant processing.
○ Benefit: Optimizes the execution of query plans by focusing resources on the
end of the process, ensuring that even the "tail" end of the query performs
efficiently.
● Prior Order:
○ Establishes the initial sequence of operations in a query execution plan, taking
into account factors like data distribution, available indexes, and system
resources.
● Progress Estimation:
○ What It Is: A method of estimating how much progress will be made at each step
of the query execution based on the data and operations involved.
○ Benefit: Helps in predicting how much time or resources each step of the
execution will require, allowing for better optimization decisions.
● Resource Allocation:
○ What It Is: The dynamic allocation of system resources (like CPU, memory, disk
I/O) based on the progress of the query.
○ Benefit: Allows for adaptive resource management, ensuring the query runs as
efficiently as possible without overloading the system.
Summary of Key Concepts
Technique      | Description                                      | Benefit
---------------+--------------------------------------------------+-----------------------------------------------
Tail Balancing | Optimizing the final stages of query processing. | Ensures efficient final-phase query execution.