01 - Introduction To Optimization S1

Download as pdf or txt
Download as pdf or txt
You are on page 1of 119

Db2 for i

Db2 for i
SQL Performance Workshop

Best Practices for Monitoring, Tuning, and


Developing Optimal SQL Applications

Lab Services Db2 for IBM i Team


Rochester Minnesota USA

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Statements, Trademarks, Disclaimers, etc...
• Workshop content and examples are based on multiple software
versions and are all functional on IBM 7.1, IBM i 7.2, and IBM i 7.3.
System and application behavior may vary depending on the release and
group PTF levels of the operating system.
• Example SQL statements / syntax might or might not be correct. :-)
• NOTICE: This publication may refer to products that are not currently
available in your country.
• IBM makes no commitment to make available any products referred to
herein.
• IBM, IBM i, System i, i5, iSeries, AS/400, IBM i, i5/OS, OS/400, DB2, and
Db2 are trademarks of the IBM Corporation in the United States or other
countries or both.
• Other company, product, and service names may be trademarks or
service marks of others.
• A list of trademarks may be found on the worldwide web:
http://www.ibm.com/trademarks.html

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
About the charts…

• Used in this session to assist with explanations

• Used in this session to assist with illustrations

• Used after this session for reference

• Some charts break the rules of good presentation technique

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Audience – Where Are You?

SQL
Performance
Analyst

SQL System
Author or Administrator
Programmer

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Workshop objectives

• At the end of the workshop, you will have learned:

– How the SQL query optimizer works

– How you can use indexes and statistics to help the query
optimizer make better decisions

– How SQL manages implementation of queries

– Best practices for optimal SQL implementation

– What tools are available to help you with both proactive and
reactive performance analysis

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Agenda

• Introduction / Background / Foundation

• The Science and Art of SQL Query Optimization

• Indexing and Statistics Strategies

• SQL - Access Plans, ODPs and Performance Considerations

• SQL Interfaces - Best Practices

• Database Performance Tools

• Review / Best Practices / Final Questions & Answers

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

Introduction and Foundation

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM's Db2 Family

• Three code bases...

– Based on the system history, architecture and operating system

 Db2 for i (IBM i, System i, iSeries, AS/400)

 Db2 for z (System z, zSeries, S/390)

 Db2 for luw (Linux, UNIX, Windows , System p, System x)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i

Power System + IBM i


• System viewed as a database server, not just an application
system
• Db2 for i (integrated part of IBM i)
• Data Centric focus – support for business logic in the
database
• SQL (DDL and DML) as primary interface to the database
• GUI to operating system and database via Access Client
Solutions

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i ibm.com/systems/power/software/i/db2

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i ibm.com/systems/power/software/i/db2
• Getting Started -> Whitepapers

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i ibm.biz/Db2iWiki

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
New Wiki for Db2 Enhancements via PTF
• Regularly check (or subscribe to) the Db2 for i Updates Wiki!
– Contains details on new PTFs that deliver new Db2
capabilities
– Wiki : https://www.ibm.com/developerworks/ibmi/techupdates/db2

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
New Wiki for Db2 Enhancements via PTF
• The enhancements wiki is part of an IBM i zone in IBM
developerWorks: https://www.ibm.com/developerworks/ibmi/

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i - Logical Partitioning (LPAR)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Terminology / Foundation

• Architecture
– N-way (up to 192 Power 9 cores)
– SMP, Tightly Coupled Processing (TCP), shared everything

• IBM Power system running IBM i and Db2 for i


– Single Level Storage
 128 bit virtual address space
 "Objects are always persistent", each with a unique set of virtual addresses
– Very Large Memory System
 "Disk is the store, memory is the cache"
 16 TB of main storage, all addressable, all usable
– Storage Management
 Automatic striping and disk usage
– Work Management
– Independent I/O subsystems
– Single, integrated instance of Db2 within i
 Relational database is a native file system

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Power + IBM i Architecture

M
E Multiple CPUs
SQL M
O
R
N-way
Y
SMP
Single
Single Level 64 bit
System
Storage Power
Processing

Storage Management

IOA IOA IOA IOA IOA IOA IOA

Table

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
N-way Processing and Multi-User Support

Processors
Job A Thread J
1 2

3 4
Job B Thread I

5 6

7 8 Thread H
Job C

Job D Job G
Job E
Job F

• n Processors can work on several jobs or threads at one time without any special programming
• Memory is shared across all processors
• Database is shared across all jobs and all processors
• No one job is running on more than one processor

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects

SQL IBM i
schema/collection library
table physical file
view nonkeyed logical file
index keyed logical file
row record
column field
log journal

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects

One Database Management System with multiple interfaces

Structured Query Language (SQL)


Command Language (CL)
Embedded
ODBC
JDBC
CLI
CRTPF CREATE TABLE

Db2
database object SELECT...
High Level Language FROM...
Record Level Access
(RLA)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i and SQL Interoperability

Considerations:
SQL
SQL-created Multi-member &
Programs multi-format files
objects

*Restrictions:
DDS defined HLL RLA* EVIs, LOBs,
objects Programs UDTs, Datalinks,
etc.

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Physical File or Table

Object

Data Space

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Tables versus PFs

SQL Tables compared with Physical Files

• Advantages

– More data types


– Constraints like primary key specified on create statement
– Longer, more descriptive column names
– Data Modeling Tool support
– Remote access supported with three part naming

• Considerations

– Multi-member files, but SQL can use partitioned tables


SQL ALIAS provides solution:
CREATE ALIAS JanSales FOR SALES (JANUARY)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Physical File Members

Object

Data Space

Data Space

Data Space

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Physical File Members

• A physical file may contain more than one data space (member)
– SQL handles the members as if they are separate tables
– ALIAS or OVRDBF needed to access individual members

• Advantages
– Allows sharing of table definition amongst members
– Data can be physically subset based on Country, Year, etc.
– Can be helpful for “fast delete” of data

• Disadvantages
– No SQL indexes on individual members and other restrictions
– Can impact performance

• Considerations
– Consolidate members into a single member using a new key
column for logical separation
– Consider local table partitioning (other considerations apply!)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects

SELECT...
FROM Physical File
Library (Schema)
CREATE ALIAS...
Physical File (Table)

Member 1 Alias_1 SELECT...


FROM Alias_1

Member 2 Alias_2 SELECT...


FROM Alias_2

Member 3 Alias_3 SELECT...


FROM Alias_3

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Indexes or Keyed LFs

Object

Index
Data
Structure

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Indexes versus Keyed LFs

SQL Indexes compared with Keyed Logical Files


• Advantages
– Default Logical Page Size of 64K
– More efficient when multiple index entries are accessed
– Also can reduce index maintenance costs and contention
– Larger memory footprint BUT memory starved environment are
increasingly rare

• Disadvantages
– No multiformat or join logical file support
– Some DDS options not available – FIFO, LIFO, etc.

 Note that the modern tools represents Logical Files as Views

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Views or Nonkeyed LFs

Object

View Definition
(Query)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Views versus LFs

SQL Views compared with Logical Files


• Advantages
– More flexibility in terms of selecting and processing data
• CASE expressions and Date/Time functions
• Grouping & more advanced Join processing
– Can be used as logical files to enhance native functionality
– Can be very helpful for isolating and managing the data model

• Disadvantages
– Views cannot be keyed/ordered
• Does that mean Views have slower performance?
The answer is NO assuming you have the right set of
indexes/statistics in place.
• The optimizer always will decide whether or not to use the
index regardless of how and where it is created.

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Keyed Physical File

Object

Index
Data Space Data
Structure

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects

Keyed PFs = Data Space + Index

"DATA SPACE" + "INDEX"


CRTPF PF1

R PF1FMT
FIELD1 7 0 "DATA SPACE"
FIELD2 7 0 ƒ Will be used

FIELD3 10
K FIELD3

"INDEX"
• May or may not be used

• Based on optimization
SELECT *
FROM PF1
WHERE FIELD3 = 'ABC'

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Keyed Logical File

Object

View Definition
(Query)
Index
Data
Structure

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects

Keyed LFs = View + Index


CRTLF LF1 "VIEW" + "INDEX"

R LF1FMT PFILE(PF1)
FIELD1 "VIEW"
FIELD3 Will be used
FIELD5
FIELD7
K FIELD7
O FIELD5 COMP(EQ 123)
"INDEX"
• May be used

SELECT * • Based on optimization

• References only rows where


FROM LF1
FIELD5 <> 123
WHERE FIELD3 = 'ABC'
AND FIELD5 <> 123 Added from LF1 (view)

ORDER BY FIELD7
IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
IBM i Objects – Physical File Members

File Index View

DATA DATA* Select …

Metadata Metadata No Data

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

The Science and Art of Query Optimization

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Art and Science of Query Optimization

Applying knowledge – scientific or otherwise – is an art.

An artist is somebody who knows what to put where,


and when to put it.

- Wendell Berry

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

The Science of Query Optimization

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i Architecture

ODBC / JDBC / ADO .NET / DRDA / XDA

Network

Host Server CLI / JDBC / PHP

Static Dynamic Extended


Dynamic
Compiled Prepare Prepare
embedded every time once and
statements then
reference
The SQL Query Engine
Native SQL leverages capabilities
(Record in both XPF and SLIC
I/O)
Optimizer

SQE
Db2
(Data Storage & Management)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
What's in a name...?

• Db2 for i represents the entire integrated database for IBM i


running on the IBM Power System
• The Optimizer now determines what engine will handle the
query request through the Query Dispatcher
– Classic Query Engine (CQE)
– SQL Query Engine (SQE)

• There is still only one interface into the optimizer

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Post V5R2 Database Architecture

Optimizer

Query Dispatcher

CQE Optimizer SQE Optimizer

Technology Independent
Machine Interface
Db2 (Data Storage & Management)
SLIC
SLIC SQE Optimizer

CQE Database Engine SQE Primitives

SQE Statistics
Manager

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher

 Determines which engine will optimize and process


each query request
– Only SQL requests are considered for the SQL Query Engine
 Initial step for all query optimization that occurs in IBM i
 Ability to “back up” and use the Classic Query Engine
when non-standard indexes are encountered during
optimization
Optimizer

Query Dispatcher

 Goal is to use SQE CQE Optimizer SQE Optimizer

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Db2 for i Optimizer Interfaces

SQL Based Interfaces


ODBC / JDBC / CLI
Non-SQL Interfaces Embedded & Interactive SQL
OPNQRYF Run SQL Scripts
Query/400 Query Manager
QQQQry API Net.Data
RUNSQLSTM

Optimizer
Query Dispatcher

CQE SQE

Parts of the SQE


Optimizer reside in SLIC
System Licensed Internal Code (SLIC)

CQE DATA SQE

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Optimizer Evolution

IBM i 7.2
All Queries
IBM i 7.1
All SQL Queries
IBM i 6.1
LF limitations
IBM V5R2
Initial Version
Limited Function

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher – V5R2
Dispatched to CQE if:
– >1 Table (i.e. no joins)
– OR & IN predicates SQE support added into V5R2 - May 2003
– SMP requested (Latest DB Group + SI07650)

– Non-Read (INSERT with subselect can use new path)


– LIKE predicates Not part of
any PTF
– UNIONS package
– View or Logical File references
– Subquery
– Derived Tables & Common Table expressions, UDTFs
– LOB columns
– LOWER, TRANSLATE, or UPPER scalar function
– CHARACTER_LENGTH, POSITION, or SUBSTRING scalar function using UTF-8/16
– Sort Sequences & CCSID translation between columns
– Distributed queries via Db2 Multisystem
– Non-SQL queries (QQQQry API, Query/400, OPNQRYF)
– ALWCPYDTA(*NO) specified
– Sensitive Cursor

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher - V5R3
Dispatched to CQE if:
– LIKE predicates
– Logical File references
– UDTFs
– LOB columns
– LOWER, TRANSLATE, or UPPER scalar function
– CHARACTER_LENGTH, POSITION, or SUBSTRING scalar function using UTF-8/16
– Sort Sequences & CCSID translation between columns
– Distributed queries via Db2 Multisystem
– Non-SQL queries (QQQQry API, Query/400, OPNQRYF)
– ALWCPYDTA(*NO) specified
– Sensitive Cursor

SQE now optimizes Only SQE optimizes


– VIEWS, UNIONS, SubQueries – INTERSECT
– INSERT, UPDATE, DELETE – EXCEPT
– Star Schema Join queries – Materialized query tables

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher - V5R4

Dispatched to CQE if:


– Logical File references
– UDTFs
– LOWER, TRANSLATE, or UPPER scalar function
– CHARACTER_LENGTH, POSITION, or SUBSTRING scalar function using UTF-8/16
– Sort Sequences & CCSID translation between columns
– Lateral Correlation
– Distributed queries via Db2 Multisystem
– Non-SQL queries (QQQQry API, Query/400, OPNQRYF)

SQE now optimizes Only SQE optimizes


– LIKE predicates – Recursive common tables expressions
– LOB columns – OLAP functions
– ALWCPYDTA(*NO) specified RANK, ROW NUMBER
– Sensitive Cursor

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher – 6.1

Dispatched to CQE if:


– Logical File referenced
– Distributed queries via Db2 Multisystem All SQL
– Non-SQL queries syntax supported
(QQQQry API, Query/400, OPNQRYF) by SQE!
SQE now optimizes
– National Language Sort Sequences
– UPPER, LOWER, & TRANSLATE functions
– UDTFs
– LENGTH, POSITION, and SUBSTR with UTF-8 & UTF-16 arguments

Only SQE optimizes


– VALUES on FROM
– ROLLUP & CUBE
– Full Outer Join
– Grouping Sets
– SQL Derived Indexes

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Query Dispatcher – 7.1

Dispatched to CQE if:


–Distributed queries via Db2 Multisystem
–Non-SQL queries
• QQQQry API, Query/400, OPNQRYF
All DB
objects
supported
SQE now optimizes by SQE!
–Logical File references
–Sparse indexes
• CREATE INDEX… …WHERE C1 = 123

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
SQE Non-SQL Query Support in IBM i 7.2
SQE now handles non-SQL queries (OPNQRYF, RUNQRY, etc.)
and record level access use of SQL views as logical files
 Changes in behavior are possible
 Testing and benchmarking are highly recommended

This can be turned off by setting SQE_NATIVE_ACCESS to *NO


in a QAQQINI file to avoid possible changes in behavior:
Different results returned:
 Results in a different order
 Different values for null columns or columns with errors
 Suppression of some mapping error messages
 Loss of RRN positioning capabilities
 Differences in duplicate key behaviors
 Missing key feedback and larger open feedback record counts
Creation of new indexes
More plans in the SQL Plan Cache

See the IBM i 7.2 Memo to Users for complete details

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The process of learning starts at the bottom

ODBC / JDBC / ADO .NET / DRDA / XDA

Network

Host Server CLI / JDBC / PHP

Static Dynamic Extended


Dynamic
Compiled Prepare Prepare
embedded every time once and
statements then
reference Build a foundation based on query
Native SQL optimization, then cover other
(Record aspects such as interfaces and
I/O) performance monitoring and tuning
Optimizer

SQE
Db2
(Data Storage & Management)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Science and Art

• If you look at the time it takes to do work on a computer,


it can be loosely divided into four parts
– the time spent in the processor
– the time spent waiting for disk I/O
– the time spent waiting for communications I/O
– the time spent waiting for some resource used by
another job

• In most environments, the first two are the largest factors


• Query optimization attempts to minimize these factors

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Components of Work

END Output Results


User Display I/O ƒ Journaling
ƒ Index Maintenance
Local Data ƒ Constraint Enforcement
RunTime Locking
Processing ƒ

ƒ Trigger Processing
ƒ Disk Operations
Communications ƒ CPU Operations
Open ODP Creation
Database
ƒ

Processing ƒ Database
Request Authentication

Authentication
Optimization ƒ Access Plan Creation
ƒ Index Estimates
User Display I/O
Process Request
BEGIN

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Components of Work - Database Request

Output Results
ƒ Journaling
ƒ Index Maintenance
ƒ Constraint Enforcement
RunTime ƒ Locking
Trigger Processing
ƒ

ƒ Disk Operations
Identify,
ƒ CPU Operations minimize,
Open ƒ ODP Creation and eliminate
Processing Database
bottlenecks
ƒ

Authentication
in these areas
Optimization ƒ Access Plan Creation
ƒ Index Estimates

Process Request

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

Query Optimization

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
SQL Query Processing – New Request

SQL request Optimize Open Run

Db2 for i

Time

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Optimization

The Optimizer
Writes the best? program to fulfill your request

The Optimizer
Provides the directions
Provides the methods
Does no driving

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Optimization... the intersection of various factors

System or LPAR configuration System or LPAR attributes


Version/Release/Modification
Level
System or LPAR performance

SMP

Job, Query attributes The Plan Database design

SQL Request Table sizes, number of rows

Static
Dynamic
Extended Dynamic Indexes (Radix, EVI)
Work
Interfaces management

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
(Query) Access Plans

The output of query optimization

Contents
• A control structure that contains information on the actions
necessary to satisfy each SQL request
• These contents include:
– Access Method
– Instance related information on relevant tables and indexes
– Any applicable program and/or environment information

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Query Optimization

Cost Based Query Optimization

• The Db2 for i Optimizer performs "cost based" optimization

• "Cost" is defined as the estimated time it takes to run the request

• "Costing" refers to the comparison of a given set of algorithms and


methods in an attempt to identify the "fastest" plan

• Optimization is based on time, not on resource utilization

• Usually the fastest plan is also the most resource efficient plan, but this
is not necessarily true

• The goal of the optimizer is to eliminate I/O as early as possible


by identifying the best path to and through the data

• The optimizer has the ability and freedom to "rewrite" the query

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Application Input to Optimizer and Engine

Job’s Query
System Values
Attributes
CHGSYSVAL()
CHGQRYA()
QAQQINI file Examples
• Query Time Limit
• Parallel Degree
• Optimization Goal
High Level • Optimizer Feedback
Language • Isolation Level
Attributes

Connection
Attributes
Query
Optimization
and SQL SET
Execution SQL SELECT statement
statement
clause

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
QAQQINI Query Options File

• QAQQINI is implemented as a table and provides


input to the query optimizer.

• It is not a list of “hints.”

• Unlike many other databases, it is used very sparingly


if at all.

• Just because it exists doesn’t mean you should use it.

• It will be referenced in many of the charts but how and


when to use it will be covered later.

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Query Phases - Feedback

We can use information gathered at run


time to influence optimization the next time

SQL request Optimize Open Run

Db2 for i

Time

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Most Efficient Data Access Method

• Many different data access methods can be used to satisfy a


query -- each with their own strengths and weaknesses
• What data access method should we use to find the rows
that contain the color blue within a 1 million row table
... WHERE COLOR = ‘BLUE’ ...
• When...
– 1 row contains the color blue
– 1,000 rows contain the color blue
– 100,000 rows contain the color blue
– 1,000,000 rows contain the color blue
• How does the optimizer know which choice to make?

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Most Efficient Data Access Method

The optimizer will use the selectivity for the local selection and join
predicates to determine how many rows must be processed for each
data access method considered

• The selectivity will always be calculated for each predicate using:


– Default Sources (default filtering based upon the operator
used)
– Meta-Data Sources (existing indexes or column statistics)

• It's best to provide statistics for your most selective and least
selective columns

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Data Access Methods

Cost based optimization dictates that the fastest access method


for a given table will vary based upon selectivity of the query

High

Response
Time Method 3

Method 2

Low
Method 1

Few Many
Number of rows searched / accessed

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Strategy for Query Optimization

• Gather meta-data and statistics for costing


Selectivity statistics
Indexes available to be costed (sorted by usefulness based on
all parts of the query)
Environmental attributes that may affect the costs

• Generate default cost


Build an access plan associated with the default plan

• For each index:


Gather information needed specific to this index
Build an access plan based on this index
Cost the use of the index for this access plan
Compare that cost against the current lowest cost

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Why are Statistics important?

 If the correct statistics are collected and available, the


cost based optimizer will more accurately estimate the
number of rows to process

 Better estimates will allow for better query optimization


and the selection of the best query plan

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Statistics

• All query optimizer’s rely upon statistics to make plan decisions

– Db2 for i has always relied upon indexes as its preferred source for
statistics
– Other databases rely upon manual statistics collection based on
table data or from other sources

• SQE offers a hybrid approach where column stats will be


automatically collected for cases where indexes do not already
exist

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Statistics and SQE Query Optimization

Table
with Radix EVI
Stats

Statistics Manager

Q&A

SQE Optimizer Access


SQL Query
Plan
Costing

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Types of Stats Questions

Type of Question Description

How many rows will be selected


Selectivity by a given selection predicate or
combination of predicates?
How many distinct occurrences
of a value exist for a single
Cardinality
column or multiple columns in a
query?
How many rows exist within a
Meta-data table?

What indexes exist over a given


Meta-data table and what keys are
interesting?
How many I/O’s will be required
I/O Estimation to process this table or index?

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Sources for Answers

• Meta-data sources
– Existing indexes (Radix or Encoded Vector) Best!
 More accurately describes multi-column key values
 Stats available immediately as the index maintenance occurs
 Selectivity estimates from radix by reading n keys
 Selectivity from EVI by reading symbol table values
– Column Statistics
 SQE only
 Column Cardinality, Histograms & Frequent Values List
 Constructed over a single column in a table
 Stored internally as a part of the table object after created
 Collected automatically by default for the system
 Stats not immediately maintained as the table changes
 Stats are refreshed as they become “stale” over time|
– Default sources
• No representation of actual values in columns – information
is derived Worst

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
SQE's Automatic Statistics Collection

• The Statistics Manager controls the automatic collection of


column statistics based on the queries that are run and when it
determines statistics are “stale.”
– The job QDBFSTCCOL runs in the background at low priority to gather
column statistics.
– It is controlled by a system value of the same name - QDBFSTCCOL:
 *ALL Allow both user and system requested stats collections
 *NONE No background stats collection is allowed
 *USER Only user requested stats collection allowed
 *SYSTEM Only system requested stats collection allowed

• Column statistics can also be managed manually with Access


Client Solutions graphical tools and API’s.
• Column statistics are stored in the table object (adding about
12K to the object) which means they are saved and restored
with the table.

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
SQE's Automatic Statistics Collection

• Statistics are useful in many situations but rarely warrant


any human intervention.
• They are not as useful as indexes in some important ways:
– Even a completely accurate column statistic doesn’t help the database
engine find rows that match some criteria any faster.
– Because they reflect the value of a single column, they don’t provide
optimal information when the column values are correlated with each
other, for example columns containing month and quarter values.

• The best practices for column statistics are:


– Identify and implement a good indexing strategy first. Indexes provide
more accurate and timely information plus they provide a path to the data.
– Leave the system value QDBFSTCCOL set to *ALL (the default) so the
Statistics Manager can decide when they’re useful and gather them as
needed.

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Selectivity Statistics

• SQL to find the rows that contain the color blue, within a 1 million row
table, when...

– 300,000 rows contain the color blue

SELECT ORDER, COLOR, QUANTITY


FROM ITEM_TABLE
WHERE COLOR = ‘BLUE’

– Without index over COLOR, assume 100,000 rows (10% default from =)
– With radix index over COLOR, estimate 199,357 rows (read n keys)
– With EVI over COLOR, actual 300,000 rows (read symbol table)
– With column stat over COLOR, might be actual, might not, it depends...

• The Optimizer uses number of rows, not a percentage

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Selectivity Statistics and Data Skew

• Data Skew relates to the how VALUES are distributed in the DATA
• Ex: Country - 0.5% = “Iceland", 50% = “China"
Estimated number of rows
Table Size Column Value Actual Number of Rows
based on equal distribution
200,000 “Iceland” 1,000 4,000
“China” 100,000 4,000
200,000,000 “Iceland” 1,000,000 4,000,000
“China” 100,000,000 4,000,000

Choosing the "best" access plan is based on understanding the data


• Maintaining statistics in Tables and Indexes helps the
optimizer select the "best" access method
SELECT CUSTNAME, CUSTID SELECT CUSTNAME, CUSTID
FROM CUSTOMER_MASTER FROM CUSTOMER_MASTER
WHERE COUNTRY = “Iceland" WHERE COUNTRY = “China"

Probe the index and table Scan the table

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
What is the Goal?

If the goal is: first vehicle to the 10 meter mark, who will win?

10 meters

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
What is the Goal?

If the goal is: first vehicle to the 100 kilometer mark, who will win?

100 kilometers

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Optimization Goal
• Tells the optimizer how many rows you expect to fetch per transaction
• Optimizer builds a plan that is optimal for returning n or all rows expected
• Affects the query "start up" time and overall run time

SELECT *
FROM Big_Table
ORDER BY Col1
All rows
Two plans:
Ordering via Time
(A) INDEX or
(B) SORT Next n rows
First n rows
Read by key Read, select all
via an index Plan A Plan B and sort

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Which is the Winning Plan?

Plan A Plan B

All rows
WINNER!

Time

Next n rows
WINNER!

First n rows

First I/O All I/O

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Optimization Goal

Application Result Set


Behavior (1,000 rows)

Query plan for 10 rows

FETCH 10 rows
Wait

Different
Plans

FETCH ALL rows


Query plan for 1,000 rows
Wait

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Optimization Goal

Application Result Set


Behavior (10 rows)

Query plan for 10 rows

FETCH 10 rows
Wait
Same
Plan

Query plan for 10 rows


FETCH ALL rows
Wait

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Optimization Goal – Explicit Settings

This value can be set explicitly:

• Set via optional SQL statement clause


– OPTIMIZE FOR n ROWS
– OPTIMIZE FOR ALL ROWS
– FETCH FIRST n ROWS

• Set via QAQQINI options file -or- ODBC/JDBC


connection attributes
– *FIRSTIO
– *ALLIO

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
The Optimization Goal – Implicit Defaults

If you don't set it explicitly, here are the implicit defaults:

• Default for dynamic interfaces is First I/O


– ODBC, JDBC, STRSQL, dynamic SQL in programs
– SQE (equivalent to OPTIMIZE FOR 30 ROWS)

• Otherwise default is ALL I/O


– Extended dynamic, RUNSQLSTM, INSERT + subSELECT,
CLI, static SQL in programs
– All of expected result set

• Optimization goal will affect the optimizer's decisions


– Use of indexes, SMP, temporary results like hash tables, etc.
– Tell the optimizer as much information as possible
– If the application fetches the entire result set, use *ALLIO

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Fair Share of Memory

• During optimization, the optimizer calculates an expected fair share of memory


• This keeps the optimizer from over commiting memory for a given query
• This allows the optimizer to consider more memory intensive methods
• The fair share value will affect what query plans are choosen

Plan 1
Memory IX (index probe into index)
Pool
Query's
Fair
Share
Hash Plan 2
Table (hash probe into hash table)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Fair Share of Memory

Application
Fetch…
“Fair Share”

Memory Pool

Data Foot print will


constrict or expand
the flow of data to
the application

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Fair Share of Memory

• SQE fair share = memory pool size / min(max-active, max(avg-active, 5))


– Average Active is:
– 15 minute rolling average number of users when paging option set to
*CALC
– The number of unique users when paging option set to *FIXED
– Base IBM i 7.1 enforced a minimum of 10% of max-active value
– Subsequent PTF’s eliminate the max-active value from the
calculations

Net: Max active doesn’t impact fair share of memory


on current IBM i 7.1 PTF’s or on later releases

• Max active value can be viewed and changed via: WRKSYSSTS


command or using graphical tools

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
CPU Processing Power
• Optimizer understands the relative processing power of CPUs available
in the system or logical partition (LPAR)

– Speed of the processor


– Number of processors
– SMT (Symmetric Multithreading) capabilities

• Only Processing Units are considered and they are rounded up to


nearest whole number

• Virtual Processors and Uncapped Processors are not considered

• Remember the difference between query optimization and query


execution

* There is no benefit to specifying more virtual processors


than physical processors *

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Allow Copy Data - ALWCPYDTA

• The Allow Copy Data attribute is an old fashioned way to control whether
queries are obliged to use only “live” data or if they can use potentially
temporary copies of data that may be "stale".

– Controlled via the ALWCPYDTA parameter


 *NO - no temporary copies are allowed, query may fail
 *YES – temporary copies are allowed only if required
 *OPTIMIZE - copies are allowed and the optimizer decides what to use

– ALWCPYDTA parameter is hidden and set to *OPTIMIZE on all interfaces


except:
 High Level Languages (COBOL, RPG, C, SQL Procedures, etc.)
 Interactive SQL (STRSQL)
 OPNQRYF Watch these! Default is
 RUNSQLSTM Default is *YES *OPTIMIZE
since V5R2

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Review

• What is the optimizer's job?

• What is the optimizer's output?

• What are some of the key elements used for


cost based optimization?

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

Query Optimizer Feedback


via Visual Explain

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Lab Exercises with Visual Explain

• Facility within IBM i Access Client Solutions

• Visualization of...
– Query access plan
– Execution flow
– Database objects involved
– Query Environment

• Via dynamic queries within Run SQL Scripts

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Access Plan Information

SQE
Plan Cache Visual
Explain

SQE
Plan Cache
Snapshots
SQL request

SQL
Query
Performance
Optimization
Monitor

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Where to Start …

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Run Options

Explain
Optimize and Draw Plan

Run and Explain


Optimize, Draw Plan,
Run Query, Show
Results

Explain While Running


Optimize, Run Query,
Draw Plan with Actuals
(dynamic)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain - Anatomy

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Explain

Optimize
Open
Run

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Explain

Optimize
Open
Run
(close)

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Attributes Pane

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Menus and Toolbar

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Menus

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Menu Options

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Index and Stats Adviser

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Query Environment

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Toolbar

Size to fit Zoom in Zoom out Overview

Save collected data Print Statistics and


as SQL Graph Index Advisor
Performance
Monitor

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Toolbar …

Orient Orient Orient Orient Refresh the Request that


left right top bottom Visual Explain the database
picture with engine
runtime consider a
information new plan for
this statement

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Orientation

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Flyovers

Note that the same


information is in
the attributes pane
IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Flyovers …

Note: estimates are


based on statistics

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Object Information

Right click on
the object for
additional
information

Note: for this to


work, the objects
must be on the
system

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Expensive Icons

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Expensive Icons

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
Visual Explain – Index Advised

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
ACS Visual Explain – Legend

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
ACS Visual Explain – Legend…

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
ACS Visual Explain – Legend…

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
ACS Visual Explain – Font

Hint: Use control + and - to change the


font size in the right pane

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i
ACS Visual Explain – Find

Hint: Use the enter key to tab through the matches

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation
Db2 for i

LAB – Getting Started with Access Client


Solutions (ACS) and Visual Explain

IBM Systems Lab Services – January 2019 Copyright 2019 IBM Corporation

You might also like