SAP Database Administration With Oracle: Michael Höding, André Faustmann, Gunnar Klein, Ronny Zimmermann
Bonn Boston
Contents at a Glance
1 Introduction
1.1 Reasons for Owning This Book
1.2 Tasks of the SAP Basis
1.3 Structure of This Book
1.4 Conventions and Other Information
1.5 Acknowledgements
What is SAP? What are its capabilities? How does it work? In this chapter, we'll address these questions.
It seems easy to store data, but harder to store data safely. Usually, it takes a lot of effort to search stored data. Modern database management systems provide easy storage, fast searches, security, and other functions, even in multiuser mode.
8 Performance
8.1 Administrative and Program-Based Problems
8.2 Analyzing Administrative Performance Problems
8.2.1 Analyzing the Hardware and Operating System
8.2.2 Analyzing the Database
8.2.3 Analyzing the SAP System
8.3 Analyzing Program-Based Performance Problems: SQL Optimization
8.3.1 Two Goals: Functionality and Performance
Appendix
Performance is not everything, but without performance everything
is nothing, because most users find waiting almost impossible to
bear.
8 Performance
Usually, the challenge is to transfer a specific amount of data, so that
performance can be expressed as a quantity per unit of time. Standardized
units are essential for comparing different systems and their performance.
Such comparisons of defined and reproducible performance are referred to as
benchmarking.
In the real world, it is the user who determines whether the performance of
a system is good or poor. It's a matter of subjective perception. A user does
not necessarily recognize the scope of a task that a system has to process.
Consequently, good performance is, on the one hand, an absolute
characteristic when comparing systems and, on the other hand, has to be
judged in relation to the requirements. Sociological factors also play a
role: Five seconds of wait time for a data warehouse request is no problem
for a user in an enterprise, whereas a customer of a web shop might be less
patient.
Therefore, the overall goal of a system administrator must be to meet the
different performance requirements so that users can use a system efficiently.
[Figure: System Lifecycle, with the phases Planning, Implementation, Operation,
and Revision and the nested steps Analysis, Implementation, and Verification]
The "big" cycle covers the lifecycle phases of implementation, operation, and
revision. Optimizations that take more time to complete, because they involve
tests and affect system operation, must be performed within this wider
context. Modifications to application code or extensive reorganizations of
databases are also part of this cycle.
Administrative and Program-Based Problems 8.1
The third source for possible problems is the behavior of users, in other
words, user-specific causes. In this context, the problematic question is: Who
caused the problem? The user who, for example, runs extensive queries and
therefore causes the system performance to go down, or the programmer or
administrator who does not prevent different kinds of "excessive use," by
setting maximum values for input boxes or running plausibility checks?
User-specific performance problems are not further discussed in this book.
The solution to this type of problem is not the administration of SAP and
Oracle databases but the development of applications or the administration
of user permissions.
Besides the cause of a problem, its location is the second part of a problem
analysis. In this context, location means: Where does the performance problem
occur? To answer this question, a system must be divided into its individual
components. From a performance analysis point of view, an SAP system consists
of the following components:
1. Hardware
2. Operating system
3. Database
4. SAP Basis (that is, SAP Kernel + SAP Basis = SAP NetWeaver Application
Server)
5. SAP application
Table 8.1 shows an overview of the possible combinations of cause and location
for the assignment of performance problems. Note that this chapter focuses
only on the problematic points related to Oracle and SAP.
SAP Basis
Program-based: Errors in the ABAP code for basic functions, such as in communication components
Administrative: Inefficient buffer sizes; wrong parameters for the SAP kernel
We will now continue this chapter in two parts. The first part, Section 8.2,
Analyzing Administrative Performance Problems, deals with administrative
performance problems in all areas, including hardware, operating system, the
Oracle database, and the SAP system, with a particular focus on Oracle and the
SAP system, in line with the purpose of this book. Then, Section 8.3,
Analyzing Program-Based Performance Problems, deals with program-related
issues, such as expensive SQL statements, indexing, and, to a lesser extent,
ABAP programming.
Analyzing Administrative Performance Problems 8.2
The SAP workload analysis uses the times required for processing the
individual dialog steps (roll-in and roll-out, database time, CPU time, and so
on). These time values are collected in the system. This analysis is used to
monitor not only the components of the system, but also their interaction.
Time is obviously the relevant key figure in this context.
Experience has shown that starting with the workload analysis is useful when
individual users complain about performance problems or when the problem
occurs only at certain times. If the performance is generally poor or if a
system analysis is carried out on a regular basis, starting with a general
component analysis is preferable. For a complete analysis of the system
performance, you should carry out both analyses.
The SAP workload analysis is not further covered here, as this would go beyond
the scope of this chapter. Some excellent literature is already available on
this topic, such as the following: SAP Performance Optimization by Thomas
Schneider (SAP PRESS 2006).
All data that is available for a hardware and operating system analysis in the
SAP system is collected by the SAP OS Collector (SAPOSCOL), which is a
component of the SAP Kernel that depends on the hardware and operating system.
A background job (SAP_COLLECTOR_FOR_PERFORMANCE) reads the data and writes it
to the performance database of the SAP system (Table MONI). The analysis is
started via the operating system monitor, which uses the data from the
performance database or queries the SAPOSCOL directly. Transaction ST06 starts
the OS monitor for the local instance of the system. For systems that have
several instances on different servers, Transaction OS07 is used to navigate
to the corresponding operating system monitor of an instance that is installed
on a different server.
Table 8.2 provides information about the meaning of the most important key
figures in the operating system monitor and states critical performance limits
where possible or wherever it makes sense.
Key figure: Meaning. Critical limit
Utilization user: CPU load caused by user processes, including the SAP system and the database. Critical limit: Σ > 80% (Ø per h)
Load average: Number of processes waiting for a CPU. Critical limit: > 3.0 (some operating systems, such as Solaris, also count the active processes; in that case, > 3 + number of CPUs)
Phy. mem avail: Free memory of the server. Critical limit: < 3% Phy. mem avail (except AIX, which uses the free RAM as file cache)
Pages in/out: Number of memory pages paged in and out between RAM and swap. Critical limit: Windows: Kb-in × 3600 > 20% RAM; UNIX: Kb-out × 3600 > 5% RAM
Disk: Hard disk with currently highest response time (menu path Detail analysis menu → Disk). Critical limit: Utilization > 50% (Ø per h)
Errors in/out: Errors when sending and receiving network packets (total of all network interfaces). Critical limit: > 0; errors should no longer occur with the current state of technology and should therefore be checked whenever they appear
The critical values are not absolute limits; rather, they indicate possible
problems. If a value exceeds or falls below its limit, you should investigate
further (see footnote 1).
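As a rough illustration of how some of the limits in Table 8.2 might be applied, the following Python sketch flags key figures that cross their critical values. The function and argument names are hypothetical, and the paging check assumes that the monitor reports paging in KB per second, so that multiplying by 3,600 gives the volume per hour.

```python
def os_warnings(load_average, free_mem_pct, paging_kb_per_sec, ram_kb,
                paging_limit=0.05):
    """Return the names of key figures that cross their critical limit.

    paging_limit is the share of RAM that may be paged per hour
    (0.05 for page-outs on UNIX, 0.20 for page-ins on Windows).
    """
    warnings = []
    if load_average > 3.0:
        warnings.append("load average")
    if free_mem_pct < 3.0:
        warnings.append("physical memory available")
    # Assumed unit: KB per second, so * 3600 yields KB per hour.
    if paging_kb_per_sec * 3600 > paging_limit * ram_kb:
        warnings.append("paging")
    return warnings


ram_kb = 16 * 1024 * 1024  # hypothetical server with 16 GB of RAM
print(os_warnings(1.2, 10.0, 0.0, ram_kb))
print(os_warnings(4.5, 2.0, 500.0, ram_kb))
```

A flagged value is only a hint that a closer look is needed, not proof of a bottleneck.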
Recommendations for the size of the swap memory vary. SAP recommends using
three times more swap memory than physical memory, but at least 3.5 GB. This
recommendation, however, is unrealistic for systems with more than 64 GB of
memory. In that case, it is difficult to reserve the appropriate amount of
swap memory on the local disks. However, the operating systems often provide a
corresponding solution, such as the use of a pseudo-swap for HP-UX.
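The sizing rule quoted above can be written down in a few lines. This is only a sketch of the stated rule of thumb (three times the physical memory, but at least 3.5 GB); the function name is hypothetical.

```python
def recommended_swap_gb(ram_gb):
    """Swap size according to the rule of thumb: 3 x RAM, at least 3.5 GB."""
    return max(3.0 * ram_gb, 3.5)


for ram_gb in (1, 8, 64):
    print(ram_gb, "GB RAM ->", recommended_swap_gb(ram_gb), "GB swap")
```

For a server with 64 GB of RAM, the rule already asks for 192 GB of swap, which shows why it becomes impractical for large systems.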
Paging, that is, moving memory pages out of the main memory to the swap
partition or the swap file on the hard disk, should generally be regarded as
critical. The swap memory is merely an emergency aid that allows the operating
system to start more processes than the existing memory allows and to prevent
processes from failing in situations of extreme memory load. With paging, you
should always bear in mind that, theoretically, access to the hard disk is
slower than access to RAM by a factor of approximately 500,000 (8 milliseconds
for the positioning of the hard disk head versus 15 nanoseconds of latency
time for memory access). Although these are only theoretical values that can
change considerably with different hardware techniques, such as hard disk
arrays or parallel memory access, a considerable difference still remains.
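The factor of roughly 500,000 follows directly from the two latencies quoted above. A minimal calculation, using only those two figures:

```python
DISK_POSITIONING_S = 8e-3  # 8 milliseconds for positioning the disk head
RAM_LATENCY_S = 15e-9      # 15 nanoseconds of memory access latency

# How many memory accesses fit into one disk head positioning
access_speed_factor = DISK_POSITIONING_S / RAM_LATENCY_S
print(f"{access_speed_factor:,.0f}")
```

The result is a little over half a million, which the text rounds to approximately 500,000.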
Generally, we advise that you not page out more than 5% of the memory
within one hour. The best thing, however, is to entirely avoid paging and to
size the memory according to your specific requirements. As a rule, when
Footnote 1: Five percent can also be a poor result when the RAM is larger than
8 GB or the I/O for swap memory is too slow, for example, due to a software RAID.
The third component is the I/O load. When you double-click on the current
Disk with highest response time in the operating system monitor, a list of
all hard disks of the system is displayed, including their current status. If
a hard disk is shown with Utilization 100%, this does not necessarily mean
that there's a bottleneck. You should merely ensure that the average
Utilization per hour does not exceed 50%. The history of the I/O load of each
hard disk is displayed under Detail Analysis Menu → Disk.
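The difference between a momentary peak and the hourly average can be sketched as follows. The function name and the sample values are hypothetical; the 50% limit is the one stated above.

```python
def disk_is_bottleneck(utilization_samples_pct, limit_pct=50.0):
    """True if the average utilization over all samples exceeds the limit."""
    if not utilization_samples_pct:
        return False
    average = sum(utilization_samples_pct) / len(utilization_samples_pct)
    return average > limit_pct


# One sample at 100% does not make a bottleneck if the average stays low.
print(disk_is_bottleneck([100, 10, 5, 20]))
print(disk_is_bottleneck([100, 90, 60, 70]))
```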
The network can be checked from within the SAP system using a simple ping
test. This LAN check of the presentation servers (SAP GUI) only works if the
servers don't access the system via an SAP router. A second and much better
way of checking the network is to use the niping program, which can be
called from Transaction SM49 as an external operating system command.
SAP Note 500235 contains detailed instructions on how you can use niping.
The processes at the operating system level are responsible for the CPU
utilization. In Transaction ST06, the current processes of the server are
listed in the order of their CPU utilization under Detail Analysis Menu → Top
CPU. The displayed CPU utilization percentage always refers to one CPU of the
system; that is, in a system with multiple processors (n CPUs), the maximum
utilization is n × 100%. Where possible, you can also use the tools provided
by the operating system, such as "top", which is available in the different
UNIX derivatives.
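Because the percentage refers to a single CPU, it can be helpful to convert it into a share of the whole server. A small sketch with a hypothetical function name:

```python
def system_wide_cpu_pct(process_cpu_pct, cpu_count):
    """Convert a per-CPU percentage (0 to n * 100) into a share of the server."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return process_cpu_pct / cpu_count


# A process shown with 100% on a 4-CPU server uses a quarter of the machine.
print(system_wide_cpu_pct(100.0, 4))
```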
The further procedure depends on the processes that are identified as CPU
consumers. SAP work processes can be recognized by the <sid>adm user and the
process names dw.sap<instance> (UNIX) and Disp+Work (Windows). If these are
the processes that produce the CPU load, they are analyzed further in the SAP
process overview (Transaction SM50; see Section 8.2.3, Analyzing the SAP
System). Individual processes are identified by a process ID (PID), which is
displayed in both the operating system monitor and the SAP process overview.
Important Note
This book mainly refers to the application server of SAP Releases 6.40 and
7.00. Transaction ST04N, which will be mentioned frequently in the following
sections, will no longer be available in the coming Release 7.10. In the new
release, it will be called ST04 again. Moreover, from Release 7.00 onward,
Transaction DBACOCKPIT for Oracle is available.
If an external process causes the high CPU load, this process has to be
analyzed, and the bottleneck has to be eliminated in collaboration with the
operating system administrator.
First, you must use the options provided by the operating system to determine
whether the memory utilization is caused by the SAP system, the Oracle
database, or other processes. High memory utilization that is not caused by
SAP or Oracle is often caused by the file system cache, which reserves a
particular percentage of the memory as a buffer for
If the memory utilization occurs within the SAP system, that is, with the SAP
work processes, a further analysis of the SAP memory areas is executed. Note
that the SAP system is only capable of allocating the memory in accordance
with the relevant instance parameters (see Section 8.2.3). The analysis of the
memory usage of the Oracle database should be performed in the same way
(see Section 8.2.2).
There are three possible causes for a high I/O load: massive paging in the
swap area, a high load on the database, or an external program. If you
recognize a high paging rate in the operating system monitor, you can use the
disk analysis (ST06 → Detail Analysis Menu → Disk) to verify whether the hard
disk that contains the swap area has a high load. If that is the case, solving
the paging problem also solves the problem with the I/O load. Because the swap
area is never located on the same hard disk as the database, an I/O problem
caused by paging usually does not cause I/O performance problems in the
database.
If the high load occurs in the area where the database is installed, further
analysis is required (see Section 8.2.2).
The connection to the database server and possibly to the connected
third-party system, for example, as a data source in SAP NetWeaver BI, has a
particularly high bandwidth utilization. In this context, SAP requires the SAP
Buffers
The buffer areas of the Oracle database store frequently used information in
the main memory of the server to provide considerably faster access than is
possible with hard disk storage.
Wait event
The analysis of wait events indicates when the database has to wait during the
processing of a request and which event it waits for. This is a relatively
simple way to identify bottlenecks in the database.
General parameterization
In addition to the buffer parameters, there are many other
performance-relevant Oracle parameters. These must also be included in the
complete analysis.
Statistics
The Oracle Cost-Based Optimizer (CBO) calculates the costs of the potential
access paths (for example, full scan, index range scan) to determine the
fastest possible access path.
I/O
The task of a database is to read and write data blocks. Therefore, having the
best possible I/O is essential for the performance of an SAP system.
SQL analysis
The quality of SQL queries affects the speed of the database significantly.
Consequently, the identification and enhancement of bad SQL queries represent
an important task in the context of performance optimization.
When is an analysis of the database useful? The database's share of the total
response time in an SAP system is the best indicator. Usually, you can
As a prerequisite for the analysis of Oracle performance data, you must set
the TIMED_STATISTICS parameter to TRUE. However, this is already the case
after a standard SAP installation. Otherwise, you can set this parameter
dynamically as a SYS database user (logon via sysdba):
The buffer quality can be calculated based on these factors by using the
following formula:
Basically, you must consider that all buffers are initialized after a system
startup, so their key figures have no informational value at that point. For a
buffer analysis, the system has to be in an established state. In general, you
can assume that this state is reached after one or two days of operation. The
number of logical reads on the buffer cache of the database is another
reference value for the established state. This value should be greater than
50,000,000.
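The two rules of thumb for an established state can be combined into a simple check. This is a conservative sketch that requires both conditions; the function name and the one-day default are assumptions, while the threshold of 50,000,000 logical reads comes from the text.

```python
def is_established(uptime_days, logical_reads,
                   min_days=1, min_logical_reads=50_000_000):
    """True if the buffer statistics are likely to be meaningful."""
    return uptime_days >= min_days and logical_reads > min_logical_reads


print(is_established(uptime_days=0.2, logical_reads=3_000_000))
print(is_established(uptime_days=2, logical_reads=180_000_000))
```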
The database buffers of the Oracle database are located in the system global
area (SGA). You can find a detailed description of the individual buffer areas
and their functions in Chapter 3, Oracle Fundamentals.
[Figure: Data buffer in the SGA and physical reads from the data files]
The overview in Table 8.3 includes the most important buffers in the SGA of
the Oracle database.
Buffer: Description. Parameter
Data buffer: Contains the buffered data blocks from the data files on the hard disk. Parameter: DB_CACHE_SIZE
Shared pool: Contains the two main subcaches, the data dictionary cache and the library cache. Parameter: SHARED_POOL_SIZE
Java pool: Used by the Oracle JVM, but not by the SAP system. Parameter: JAVA_POOL_SIZE
Large pool: Buffer for special data (for example, message buffer for processes running parallel queries). This buffer is very small and hardly used in SAP systems. Parameter: LARGE_POOL_SIZE
Streams pool: New buffer area in Oracle 10g for Oracle Streams, which manages data and events in distributed environments; not used in the SAP system. Parameter: STREAMS_POOL_SIZE
As of Oracle Version 9i, the administrator can change the most important
parameters of the SGA (DB_CACHE_SIZE and SHARED_POOL_SIZE) while the Oracle
instance is running. This feature is referred to as dynamic SGA and should not
be confused with Automatic Shared Memory Management (ASMM). The old parameter
from Oracle 8.1.x for the data buffer (DB_BLOCK_BUFFERS) can no longer be
used. SAP has generally approved the use of the dynamic SGA, which is enabled
by default during SAP installations based on SAP Basis 6.40.
As of Oracle 10g, you can fully automate the SGA management function. In that
case, Oracle adjusts the individual areas, DB_CACHE_SIZE, SHARED_POOL_SIZE,
JAVA_POOL_SIZE, LARGE_POOL_SIZE, and STREAMS_POOL_SIZE, to your current
requirements. The SGA_TARGET parameter specifies the total size of the SGA and
enables ASMM. Moreover, if the DB_CACHE_SIZE and SHARED_POOL_SIZE parameters
are set, they define the lower limits for each buffer area. Due to the lack of
experience with ASMM, its use in an SAP system is not recommended.
Nevertheless, it makes sense to use this feature in a nonproduction
environment to minimize the administrative effort, but only if you do not
intend to use the system as an image of the production system for testing
purposes.
Access to analysis data in the SAP system occurs via the Oracle database
monitor (Transaction ST04N). This monitor provides all information about the
Oracle database that can be accessed from within the SAP system. The
information in the database monitor originates from the Oracle database,
specifically from the V$ views. Figure 8.4 shows the initial screen of the
database monitor.
Table 8.4 explains the meanings of the most important key figures in the
Oracle database monitor and provides recommended values for a database that
has reached an established state.
Key figure: Description. Recommended value
Data buffer: Main database buffer for the data blocks. Recommended value: > 94% (warning: This recommendation is very general, because there are extreme cases in both directions, which means there are systems running with 80% without problems and systems having serious problems at 98%.)
SQL area get ratio: The SQL cache stores the parse tree and execution plan of previously run SQL statements. Get ratio = Σ(hits) / Σ(requests) × 100. Recommended value: > 95%
SQL area pin ratio: Indicates the percentage of all requests for which the objects required for execution are still in the memory. Pin ratio = Σ(executions = pin hits) / Σ(requests for execution = pins) × 100. Recommended value: > 99%
User/Recursive calls: Ratio between user requests to the database and requests that the database executes in addition to the user requests (for example, due to a missing dictionary cache entry). Recommended value: > 2
Busy wait time: Total of all wait times of the database in seconds, without idle events (see Section 8.2.2.2, Analyzing Wait Events). Recommended value: ratio of busy wait time to CPU time approx. 60:40
CPU time: Total amount of CPU time consumed by all Oracle sessions. Recommended value: see Busy wait time
In the following text, we will look more closely at the relevant Oracle buffers
in the SAP environment with regard to performance.
Without a doubt, the data buffer for the actual data blocks of the database has
the greatest impact on performance because it reduces the total number of
physical disk accesses.
The logical reads include all reading requests to the database. During a buffer
get, the system tries to read the corresponding data block from the data
buffer for all requests that are not declared as direct path, which means they
don't have an explicit direct access to the database. A successful read access
is referred to as a buffer hit, whereas a failure leads to a physical read in
which the block is read from the data files on the hard disk (see Figure 8.4).
The hit ratio for the data buffer can be calculated as follows:
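The formula itself is not reproduced here. A common formulation, which the following sketch assumes, relates the reads that did not require a physical disk access to all logical reads. The function name is hypothetical.

```python
def data_buffer_hit_ratio(logical_reads, physical_reads):
    """Hit ratio in percent: share of logical reads served from the buffer.

    Assumed formula: (logical reads - physical reads) / logical reads * 100.
    """
    if logical_reads <= 0:
        raise ValueError("logical_reads must be positive")
    return 100.0 * (logical_reads - physical_reads) / logical_reads


# Hypothetical figures: 6% of all logical reads had to go to disk.
print(data_buffer_hit_ratio(logical_reads=1_000, physical_reads=60))
```

With these sample figures, the result lies right at the general recommendation of 94% from Table 8.4.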
You can view the number of direct path operations in Transaction ST04N under
Additional Function → Display V$ → V$SYSTEM_EVENT (see Figure 8.5). This
number should be very small (<0.5%) compared with the number of regular
accesses to data blocks via the data buffer (db file sequential read). SAP
NetWeaver BI systems are an exception, because substantially higher values are
acceptable there (see Chapter 12, SAP NetWeaver BI and Oracle). Direct path
operations are used, for example, to access the PSAPTEMP tablespace.
Increasing the PGA memory of individual database work processes can, for
example, help you reduce the number of these accesses to the temporary
tablespace for JOIN or SORT operations. Figure 8.5 shows an excerpt of the
V$SYSTEM_EVENT view.
Another note regarding this view: This view contains only the wait events
that occurred after the last database startup. You shouldn't be surprised if
you don't see all of the wait events described above in this excerpt.
If you can exclude expensive SQL statements and direct path operations as a
reason for a poor hit ratio, you should, if possible, try to improve the
performance by increasing the data buffer. If you can only implement this
increase by extending the hardware, you should first exclude all other
possible causes for a performance degradation before making a corresponding
investment.
Since the introduction of the dynamic SGA with Oracle 9i, the Oracle
administrator can test changes to the data buffer in a simple and convenient
way. The V$DB_CACHE_ADVICE view enables you to check how a change to the
buffer size would affect the number of physical reads. The estimated change in
physical reads relative to the current state shows by how many MB the buffer
could be reduced without a significant performance degradation, or by how many
MB it could usefully be enlarged to further minimize the physical reads. For
this purpose, you must enable the dynamic SGA and set the Oracle parameter,
DB_CACHE_ADVICE, to ON.
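One way to read such advice data is to look for the smallest buffer size whose estimated physical reads stay close to the current level. The sketch below assumes that each row carries the SIZE_FOR_ESTIMATE (in MB) and ESTD_PHYSICAL_READ_FACTOR values of V$DB_CACHE_ADVICE; the function name, the 5% tolerance, and the sample rows are hypothetical.

```python
def smallest_adequate_size(advice_rows, max_read_factor=1.05):
    """Smallest cache size (MB) whose read factor stays within the tolerance.

    advice_rows: iterable of (size_for_estimate_mb, estd_physical_read_factor),
    where a factor of 1.0 corresponds to the current buffer size.
    """
    candidates = [size for size, factor in advice_rows
                  if factor <= max_read_factor]
    return min(candidates) if candidates else None


rows = [(512, 1.80), (1024, 1.20), (1536, 1.04), (2048, 1.00), (3072, 0.97)]
print(smallest_adequate_size(rows))
```

In this made-up example, the current 2,048 MB buffer could be reduced to 1,536 MB at the cost of about 4% more physical reads, whereas enlarging it to 3,072 MB would save only about 3%.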
In addition to the actual data buffer, the keep pool and the recycling pool
also buffer data blocks. If you use the dynamic SGA (Oracle 9i), you can see
that these two pools are no longer part of the data buffer but are included
separately in the SGA. The keep pool can be used for tables and blocks that
should not be displaced from the data buffer. The recycling pool, on the other
hand, can be used for tables that should not displace other blocks from the
data buffer but whose own blocks can be displaced immediately. The standard
settings in SAP do not use these pools; however, their usage is recommended
under specific circumstances (see SAP Note 762808).
Apart from the data buffer, other important buffers of the Oracle database
are located in the shared pool, namely, SQL cache and dictionary cache.
The SQL cache (formerly known as shared cursor cache) is located in the
library cache and stores all Oracle-internal information for later reuse, if
required. This information is related to an SQL statement call, such as the
parse tree and the execution plan.
Another key figure for the SQL cache in the shared pool is the pin ratio. You
can calculate the pin ratio as follows:
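The formula is not reproduced here. Based on the definitions given in Table 8.4 (hits divided by requests, times 100), the get ratio and the pin ratio can be sketched as follows. The argument names mirror the GETS, GETHITS, PINS, and PINHITS columns of V$LIBRARYCACHE, which is an assumption about where the figures come from; the sample numbers are made up.

```python
def ratio_pct(hits, requests):
    """Share of successful requests in percent."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return 100.0 * hits / requests


def sql_area_ratios(gets, get_hits, pins, pin_hits):
    """Return (get ratio, pin ratio) in percent."""
    return ratio_pct(get_hits, gets), ratio_pct(pin_hits, pins)


get_ratio, pin_ratio = sql_area_ratios(gets=2_000, get_hits=1_900,
                                       pins=10_000, pin_hits=9_950)
print(get_ratio, pin_ratio)
```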
A buffer hit in the SQL cache simply means that the parsing of the queried
SQL statement was already performed. However, the pin ratio indicates the
number of successful reuses for a found cache entry. If the reuse fails, the
system reloads the corresponding component.
In the Oracle library cache, you can find further subcaches for the PL/SQL
(Procedural Language/Structured Query Language) packages as well as for
the control structures, such as locks and library cache handles. These play
only a minor role regarding the system performance.
The dictionary cache buffers rows from the dictionary of the Oracle database,
that is, information about structures of tables, authorizations, and so on.
This metadata of the database is needed regularly to process user requests.
According to SAP, the minimum size of the shared pool should be approximately
400 MB. If the hit ratio values are permanently below the values listed in
Table 8.4, it may be useful to increase the value of the SHARED_POOL_SIZE
parameter. However, you should take into account that, for instance, building
the database statistics may temporarily decrease the hit rate in the shared
pool significantly.
You may also be able to reduce the shared pool again if the performance values
(see Table 8.4) are acceptable and a larger part of the shared pool remains
free (>50 MB). You can find the free area in the shared pool in the V$SGASTAT
view (entry free memory) or by using the following SQL command:
As of Oracle 10g, you can display the history of the load of the shared pool.
Furthermore, the DBA_HIST_SGASTAT view shows how the free space has developed
over time.
The Program Global Area (PGA) is the part of the Oracle memory that is
assigned locally to a server process (shadow process or background process).
The entire PGA memory of an Oracle instance is the sum of the PGAs of all
database processes. You can find the total amount of allocated PGA memory in
Transaction ST04N under Additional Function → Display V$/GV$ Views and Values
→ V$PGASTAT → total PGA allocated or by using the following SQL call:
The PGA of a process contains only the data and information that the process
needs or has to process. The size of the PGA plays a particularly important
role for memory-intensive sort and hash operations. Consequently, the
administrator should place special emphasis on the optimum configuration of
this memory, in particular in the SAP NetWeaver BI environment (see Chapter
12).
To better understand the PGA tuning settings, we will now introduce some
terms. To execute an operation, the Oracle process needs local memory, the
work area. If the available PGA memory for the process is sufficient for the
entire work area, we refer to this as an optimal work area size, and the
corresponding operation is called optimal execution. If the PGA is not
sufficient, the operation uses the temporary tablespace (PSAPTEMP) on disk.
The resulting I/O activities (direct path operations without buffering) have a
significantly negative impact on the system's performance. If a single pass
(first recursion level) over PSAPTEMP is sufficient, we refer to it as a
one-pass operation. If PSAPTEMP is used for several passes, it is called a
multi-pass operation.
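The three terms can be summarized in a small classification helper. This is only an illustration of the terminology; the function name and its argument (the number of passes over the temporary tablespace) are hypothetical.

```python
def classify_work_area(passes_over_temp):
    """Map the number of passes over PSAPTEMP to the execution type."""
    if passes_over_temp < 0:
        raise ValueError("number of passes cannot be negative")
    if passes_over_temp == 0:
        return "optimal"      # work area fits completely into the PGA
    if passes_over_temp == 1:
        return "one-pass"     # one pass over the temporary tablespace
    return "multi-pass"       # several passes, the most expensive case


for passes in (0, 1, 3):
    print(passes, classify_work_area(passes))
```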
CPU time: the time during which the Oracle session uses the CPU
Wait event: the times during which the Oracle session waits for an event,
such as reading a data block from a hard disk
A wait event is a situation in which an Oracle session waits for an event. This
event can come from different database areas. For example, the wait event,
log buffer space, indicates that the session had to wait for free space in the
redo log buffer. After starting the database, all wait events are collected in X$
tables and can be queried using different V$ views. The most important of
these views are as follows:
V$SYSTEM_EVENT
Contains all wait events since the database was started, including their
frequency and average length.
V$SESSION_EVENT
Contains all waits since the database was started including their frequency
and the average and maximum lengths for every Oracle session.
V$SESSION_WAIT
Contains the current waits for every Oracle session or the information that
the CPU is currently being utilized.
With Oracle database Release 9i or lower, all monitoring data for the wait
events are deleted after restarting the database. Oracle 10g, however,
includes some history tables or views that store historical data. You can find
the history of wait events in the DBA_HIST_SYSTEM_EVENT view.
Wait events are always composed of an event name and up to three optional
parameters to include more specific information on the event, as described
in the following example:
Event: direct path read: Waiting for a read operation on a data block
from the hard disk while circumventing the data buffer
Parameter 1: file number: File number of the file to be read
Parameter 2: first dba: First block to be read in the file
Parameter 3: block count: Quantity of blocks to be read
You can use the following SQL command to determine the file name for a file
number and the corresponding tablespace:
Oracle 10g contains more than 850 wait events (Oracle 9i has about 400). As of
Oracle 10g, they are grouped in the classes shown in Table 8.5 to provide a
better overview.
Wait class: Number of wait events
Administrative: 46
Idle: 62
Application: 12
Network: 26
Cluster: 47
Scheduler: 1
System I/O: 24
Commit: 1
User I/O: 17
Concurrency: 24
Other: 591
Configuration: 23
The following SQL statement can be used to determine to which class a wait
event belongs:
It is important to know that some wait events don't influence the database
response time at all and can therefore be neglected in performance analyses.
On the one hand, these are all events that belong to the Idle class. These
events are reported if an Oracle process is idle (that is, not performing any
action). The best-known event of this class is SQL*Net message from client,
which occurs when an Oracle shadow process is waiting for a new query. On the
other hand, there are wait events that are irrelevant to the response time of
the database, especially in the context of SAP. One reason can be that the
time of an event is already included in another event; for example,
log file parallel write is already covered by log file sync. Furthermore, many
(but not all) events that occur in Oracle background processes (DBWR, PMON,
SMON, etc.) are only of secondary importance, because the corresponding
operations are performed asynchronously to the Oracle shadow processes.
The following list shows the most frequent wait events that are usually irrel-
evant from the SAP perspective.
As already described in Table 8.4, the ratio between Busy wait time and CPU
time (ideally 60:40) is generally a first indicator.
When starting a general wait event analysis, it is useful to create a list
containing the top wait events, that is, a list with the totaled wait times in
descending order. You can create this list in Transaction ST04N (see Figure
8.6) using the V$SYSTEM_EVENT view or by executing the following SQL
command:
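A minimal sketch of such a command, assuming the standard columns of V$SYSTEM_EVENT; the limit of 20 rows is an arbitrary choice. Idle events are deliberately not filtered out here, so they still have to be skipped when reading the list.

```sql
-- Sketch: top wait events since database start, ordered by total wait time.
-- TIME_WAITED and AVERAGE_WAIT are given in hundredths of a second.
SELECT *
  FROM (SELECT event,
               total_waits,
               total_timeouts,
               time_waited,
               average_wait
          FROM v$system_event
         ORDER BY time_waited DESC)
 WHERE ROWNUM <= 20;
```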
Column Description
TOTAL_WAITS Number of occurrences of the wait event since the last start of
the Oracle database
TOTAL_TIMEOUTS Number of waits for which the corresponding event has not
occurred
TIME_WAITED Total wait time for the wait event in hundredths of a second
Once the list has been created, it is searched from top to bottom to find
critical wait events; during this step, idle wait events are ignored.
Column Description
WAIT_TIME Time waited for the wait event (in hundredths of a second) once the
wait event is no longer active. The value of an active wait event is 0.
Moreover, there are two special values: value = -1 if the duration of
the event was below the measurement accuracy and value = -2 if
TIMED_STATISTICS is not active.
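The current waits per session can be displayed with a query along the following lines. This is a sketch that assumes the standard columns of V$SESSION_WAIT; excluding the idle event SQL*Net message from client is an illustrative choice to shorten the output.

```sql
-- Sketch: current waits of all Oracle sessions, longest waits first.
-- WAIT_TIME = 0 means that the session is waiting right now.
SELECT sid,
       event,
       p1text, p1,
       p2text, p2,
       p3text, p3,
       wait_time,
       seconds_in_wait,
       state
  FROM v$session_wait
 WHERE event <> 'SQL*Net message from client'
 ORDER BY seconds_in_wait DESC;
```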
Warning
A CPU bottleneck can also cause a large number of different wait events. If the
CPU load is very high, Oracle processes that currently hold a lock may be
preempted by the operating system. If other processes are waiting for this lock,
several wait times can increase drastically. You should therefore first ensure
that sufficient CPU resources are available.
Remark
As mentioned earlier, compared to the previous Release Oracle 9i, the number of
wait events was increased considerably in Oracle 10g. This is reflected, for
instance, in the splitting up of wait events for a more detailed root cause
specification. We will mainly use the wait events from Oracle 10g and only
point out the differences in comparison to Oracle 9i in a few situations.
The following sections describe the most important wait events and provide
some background information on these. You'll find several tables with the
most important details followed by a text section containing a description of
the wait event.
Meaning These events represent the process of waiting for one or more parallel
read operations to be performed on blocks on the hard disk. In this case,
parallel does not refer to reading several blocks successively, but to
simultaneous reads of different, nonsuccessive blocks.
Rating Average wait time should be less than 2, i.e., 2/100 s = 20 ms.
If there are problem values for the average wait time, this primarily indicates
an I/O performance bottleneck. For information on the analysis of I/O
problems, refer to Section 8.2.2.3, Analyzing the Database I/O. Another
important factor apart from wait time is the occurrence frequency of db file
sequential read. If this value is very high, the wait event usually occurs in
conjunction with a bad hit ratio of the data buffer. In this case, there are
two solution scenarios: You either tune potentially existing bad SQL statements
(see Section 8.3) or you increase the data buffer size.
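To compare the measured values with the rating, the average wait time can be converted to milliseconds. The following sketch assumes that TIME_WAITED in V$SYSTEM_EVENT is given in hundredths of a second, so multiplying by 10 yields milliseconds; the threshold of 20 ms is the one stated in the rating.

```sql
-- Sketch: frequency and average wait time (in ms) of db file sequential read.
-- time_waited is in 1/100 s, therefore * 10 converts to milliseconds.
SELECT event,
       total_waits,
       time_waited,
       ROUND(time_waited * 10 / NULLIF(total_waits, 0), 1) AS avg_wait_ms
  FROM v$system_event
 WHERE event = 'db file sequential read';
```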
Meaning If this event occurs, an Oracle session is waiting for a successive read
operation on several blocks from the hard disk.
SAP Note 619188 describes an SQL command that can be used as of Oracle
9i. Using this command, you can determine the 20 SQL statements that
generate the largest number of disk reads because of full scans. You should
then determine if these commands can be tuned. Attention: If you make extensive
use of the Oracle transactions in the SAP system (for example, Transaction
ST04N) during performance analysis, this may be reflected in the results. In
this case, some of the top 20 SQL statements contain queries on Oracle
specifications or Oracle monitoring data that are not associated with the
normal business-related SQL queries. These SQL statements should be ignored in
your analysis.
Meaning This wait event is registered if the data buffer is circumvented when data
blocks are accessed. As of Oracle 10g, waits are categorized by either
access to "normal" blocks or access to temporary blocks from the
PSAPTEMP tablespace (temp).
Rating None of these events should be among the first 10 in the wait event list (in
descending order according to the totaled wait time; see Figure 8.7).
Furthermore, similar to db file sequential read, a maximum value of 2
applies to the average wait time, i.e., 2/100 s = 20 ms.
If the problem is caused by a too long average wait time, this is probably also
caused by an I/O bottleneck. In this case, you should perform the steps
described in Section 8.2.1.2, Identifying the Causes of Bottlenecks in Hardware
Components.
If the direct path operations are performed too often and are therefore
displayed among the first entries in the list, you must distinguish between the
reasons for these operations in subsequent actions.
For the Oracle database, there are three reasons why direct path
operations are performed:
1. PSAPTEMP accesses
2. Parallel queries
3. Access to LOB data (large object)
Parallel queries, that is, performing special actions such as a full table scan in
parallel, are generally not used by SAP. They are only used for SAP
NetWeaver BI systems. The reason for this is that these queries have several
disadvantages regarding the CBO and the resulting resource allocation (see
SAP Note 651060).
Rating For all three wait events, a maximum average wait time of 4 applies, i.e.,
4/100 s = 40 ms. However, current hardware should achieve significantly
lower values of about 15 ms.
All three wait events described above usually depend directly on I/O
performance during write operations for redo log files. You should therefore
first analyze and examine if there are I/O problems and whether these areas
can be optimized (see Section 8.2.1.2, Identifying the Causes of Bottlenecks in
Hardware Components, and Section 8.2.2.3, Analyzing the Database I/O). The
redo log files are the most I/O-intensive area of an Oracle database and
therefore have special requirements regarding their storage location and
parameterization. Section 5.2.3, Storage and SAN Infrastructure, provides
further information on this topic.
Another aspect is the size of the redo buffer. If this buffer is configured with
less than one megabyte (contrary to the SAP recommendation), this may
also result in log buffer space wait events. If this situation occurs, you need
to change the LOG_BUFFER parameter to the size of one megabyte (offline).
There are a few other causes of the log file sync wait event, such as
enqueue wait situations (see SAP Note 745639, Section 12).
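The current size of the redo buffer can be checked and adjusted as sketched below. The sketch assumes parameter management with SPFILE; because LOG_BUFFER is a static parameter, the new value only takes effect after the database has been restarted, which matches the "offline" remark above.

```sql
-- Sketch: display the current redo buffer size in bytes.
SELECT name, value
  FROM v$parameter
 WHERE name = 'log_buffer';

-- Sketch: set the redo buffer to 1 MB (1,048,576 bytes) in the SPFILE.
-- The change becomes active only after the next database restart.
ALTER SYSTEM SET log_buffer = 1048576 SCOPE = SPFILE;
```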
Parameters
Meaning These wait events are reported if the system needs to wait for a log file
switch for different reasons (see below).
The term log file switch is a generic term for several wait events that occur
when switching to the next redo log file:
archiver stuck before a new write attempt is carried out. An archiver stuck
must not occur in an SAP production system, as this would cause a system
standstill. Only in a nonproduction system is a short archiver stuck
acceptable under certain circumstances, if the system reaches a very high load,
for instance, during client copies or data loading at night. To avoid this
standstill of the Oracle database, your backup strategy must ensure that the
disk volume on which the offline redo logs are saved (usually the directory
oraarch) is always backed up and purged so that there is sufficient space for
new offline redo logs after a redo switch. If you define other archiver
destinations using the LOG_ARCHIVE_DEST parameter, you need to ensure that
these are backed up as well.
In the case of a log file switch (checkpoint incomplete) wait event, the
"checkpoint not complete" error has occurred and was recorded in the
Oracle alert log. Checkpoints are performed during every log switch. Several
checkpoints can be active at the same time. The "checkpoint not complete"
error is recorded if a log switch is to be performed to a redo log with a
checkpoint that has not yet been completed.
The following four situations can cause a repeated occurrence of the wait
event or the "checkpoint not complete" situation:
If the Oracle database writes many redo logs, you should first examine
whether you are dealing with an operational load, that is, whether the
number of redo logs is caused by the normal system usage. If there is no
indication that your applications are responsible for the high redo log
frequency, there are several other possible reasons, such as misconfigurations
and Oracle bugs. Read SAP Note 584548 for a description of the possible
causes.
Usually, the reason for a high amount of redo logs can, of course, be found in
the system operation. As a first step, you should ensure that no more than
one redo log switch is performed per minute. If this is not the case, you
should increase the size of your redo log files. To do that, proceed as follows:
If the error ORA-01624 is reported, the current checkpoint has not yet
been completed. Wait for a few seconds and repeat the DROP command.
3. Delete the corresponding operating system files in the redo log directory.
4. Set up the log file group 11 with a new larger size (<new_size> in MB):
ALTER DATABASE ADD LOGFILE GROUP 11
('/oracle/<sid>/origlogA/log_g11_m1.dbf',
'/oracle/<sid>/mirrlogA/log_g11_m2.dbf')
SIZE <new_size>M;
In the standard SAP installation, the redo log files of the four groups have a
size of 50 MB each. Increase the files incrementally and verify whether this
solves the problem. The scope of the increase depends on the number of
redo logs that are written per minute. If five log switches are performed per
minute with a size of 50 MB, there is no point in increasing the log size to
100 MB; instead, change the size to, for instance, 300 MB right away.
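The reasoning behind the example can be made explicit: five switches per minute at 50 MB correspond to about 5 x 50 MB = 250 MB of redo data per minute, so a redo log file must hold at least 250 MB to get down to one switch per minute, and 300 MB leaves some reserve. The following sketch shows how the current sizes and the switch frequency could be determined, assuming the views V$LOG and V$LOG_HISTORY; the 24-hour window is an arbitrary choice.

```sql
-- Sketch: current size and status of the online redo log groups.
SELECT group#,
       members,
       bytes / 1024 / 1024 AS size_mb,
       status
  FROM v$log
 ORDER BY group#;

-- Sketch: minutes of the last 24 hours with more than one log switch.
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24:MI') AS switch_minute,
       COUNT(*) AS log_switches
  FROM v$log_history
 WHERE first_time > SYSDATE - 1
 GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24:MI')
HAVING COUNT(*) > 1
 ORDER BY log_switches DESC;
```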
If the log file switch (private strand flush incomplete) wait event
occurs or there are other indications of a bottleneck in the DBWR process,
for example, from the free buffer waits wait events (see below), you can
increase the number of DBWR processes to enhance write performance. To
do this, set the DB_WRITER_PROCESSES parameter using the following
command (prerequisite: parameter management with SPFILE):
Attention: The number of DBWR processes should not exceed the number of
available CPUs.
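A minimal sketch of such a command follows. The value 2 is only an example, and because DB_WRITER_PROCESSES is a static parameter, the change takes effect after the next restart of the database. The first query compares the current setting with the number of CPUs that Oracle has detected.

```sql
-- Sketch: compare the number of DBWR processes with the detected CPU count.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('cpu_count', 'db_writer_processes');

-- Sketch: raise the number of DBWR processes (example value 2) in the SPFILE.
-- The new value becomes active after the next database restart.
ALTER SYSTEM SET db_writer_processes = 2 SCOPE = SPFILE;
```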
Another way to enhance write performance is, of course, to tune the Oracle
environment, that is, all I/O relevant components. Refer to Section 8.2.2.3
for further information on this topic.
The log file switch completion wait event occurs if an Oracle shadow
process must wait for the completion of a log switch. As described above, too
many redo log switches during operation (more than one per minute)
result in a critical condition regarding the database performance. If this
happens, proceed as described earlier.
Parameter 3 ID
Meaning These wait events describe the process of waiting for a block in the data
buffer, because this block is currently being read (read by other session)
or modified (buffer busy wait).
Rating The average wait time value should be below 2, i.e., 2/100 s = 20 ms.
In Oracle 9i, both events were named buffer busy wait, and the value of
parameter 3 indicated the reason: The IDs started with 1 or 2 and contained
further digits depending on the exact reason. If the ID starts with 1
(ID = 1xx), the event deals with the reading of a block. If it starts with 2
(ID = 2xx), the wait event is caused by a write or change operation for a
block. As of Oracle 10g, the name of the wait event already distinguishes
whether the event was caused by a read or write operation. Detailed
information on the event cause can be obtained from the parameters.
As all data that are read from or saved in the database "pass through" the data
buffer (an exception is the already mentioned direct path access), high I/O
loads always result in buffer busy waits. You always have the option to
reduce I/O load to decrease the amount of waits on the data buffer. This can
either be done by redistributing data loads or by tuning SQL statements so
that fewer data blocks must be read (see Section 8.3).
The second criterion besides I/O load is the management of the data blocks
themselves. In this area, in particular, Oracle 9i provided significant
enhancements with the introduction of Automatic Segment Space Management
(ASSM). Previously, the individual blocks of a tablespace or a segment
Without ASSM, the database administrator had to or could decide for each
table how the individual blocks of the segments were used. This made it
possible to choose between performance and efficient space usage depending on
the change frequency. This task is now performed by ASSM. SAP made the
use of ASSM possible as of Oracle Version 9.2.0.5, and on installations with
SAP Basis 6.40 and higher all data tablespaces are set to ASSM by default. If
problems occur that are related to buffer busy waits, you can now switch to
ASSM to resolve issues relating to segment management. In Oracle 9i, this
switching procedure involves downtime, whereas Oracle 10g allows you to
make the transition online. SAP Note 620803 provides step-by-step
instructions for this transition.
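Whether a tablespace already uses ASSM can be checked in the data dictionary. The following sketch assumes the view DBA_TABLESPACES, in which the column SEGMENT_SPACE_MANAGEMENT shows AUTO for ASSM and MANUAL otherwise.

```sql
-- Sketch: list extent and segment space management for all tablespaces.
-- SEGMENT_SPACE_MANAGEMENT = 'AUTO' indicates ASSM.
SELECT tablespace_name,
       extent_management,
       segment_space_management
  FROM dba_tablespaces
 ORDER BY tablespace_name;
```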
Parameter 3 ID
Meaning These wait events occur if an Oracle process must wait for the DBWR process
to write a block into the relevant data file.
Rating Both wait events must not be among the first 10 entries in the wait event list.
If these wait events occur too often, the data buffer may be too small or the
performance of the DBWR process is poor. If possible, resolve this problem
by increasing the value for the DB_CACHE_SIZE parameter or optimizing the
I/O performance (Section 8.2.2.3). Furthermore, you can raise the number of
DBWR processes as described in the previous section.
Parameter 3
Meaning This wait event occurs if an Oracle shadow process must wait for a
background process.
Rating This event must not be among the top 10 entries in the wait event list.
In general, the occurrence of the rdbms ipc reply wait event is not a
problem, as there are various reasons why a process must wait for a background
process. The crucial factor is the duration of the wait time for the background
process. The main reason for this event is wait situations in the
BEGIN BACKUP, TRUNCATE, and DROP operations, because the CKPT process
must perform a checkpoint in these operations. In Oracle 9i and lower, there
is the additional drawback of a design weakness that results in the entire data
buffer being searched for affected blocks in a DROP or TRUNCATE operation;
this process can take quite some time with larger buffer sizes. This problem
no longer exists in Oracle 10g.
In general, the duration of rdbms ipc reply wait events is very short. For
this reason, the average wait time should not exceed 10 ms, as this would
indicate several wait periods that are much too long and increase the average
value. In this case, it is advisable to examine the enqueue wait events,
because some of them are closely related to the rdbms ipc reply wait event.
Refer to SAP Note 745639 for further information on this topic.
If you suspect a problem with this wait event, you can use the Oracle Session
Monitor (Transaction ST04N; Resource Consumption → Oracle Session) and
V$SESSION_WAIT to determine which Oracle work process is waiting for
which background process. In V$SESSION_WAIT you'll find the SID of the
process that is waiting for the rdbms ipc reply wait event, while parameter
1 (column P1) displays the PID of the background process for which the
work process is waiting. Subsequently, you can use the Session Monitor to
find out which actions the background process is currently performing. For
a more detailed analysis of the actions performed by Oracle processes, you
should use the functions of the ORADEBUG trace. The procedure is
described in SAP Note 613872.
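The assignment described above can also be read directly with SQL. The following sketch joins V$SESSION_WAIT with V$PROCESS under the assumption that P1 of the rdbms ipc reply event contains the Oracle process ID (column PID in V$PROCESS) of the background process.

```sql
-- Sketch: which session waits for which background process.
-- Assumption: P1 of 'rdbms ipc reply' is the Oracle PID of the background process.
SELECT w.sid     AS waiting_sid,
       w.p1      AS background_pid,
       p.spid    AS os_process_id,
       p.program AS background_program,
       w.seconds_in_wait
  FROM v$session_wait w
  JOIN v$process p
    ON p.pid = w.p1
 WHERE w.event = 'rdbms ipc reply';
```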
Meaning This wait event occurs if a process must wait for the release of a latch.
Rating The rating heavily depends on the specific latch wait event. In general,
latch wait events should not appear among the top 10 entries in the wait
event list.
A latch is a very low-level lock mechanism for the SGA memory structures.
In contrast to a lock, a latch is held only for a very short time. For this
reason, latch requests are not placed in a queue; instead, the requesting
processes repeatedly try to acquire the latch. This so-called spinning is
performed as many times as set in the _SPIN_COUNT parameter. If a process
holds a latch and another process tries to access the respective memory
area, but does not succeed in doing so during the spin phase, a latch
<latch_name> wait event is activated. Oracle 10g contains 27 latch wait events,
and their names specify the location or the memory structure in which the latch
is applied. All "irrelevant" latches are referred to as "latch free."
In older Oracle releases (before 10g), all waits are summarized under the
term latch free. As an analysis of the different latch wait events would go far
beyond the scope of this chapter, we refer you to SAP Note 767414 for more
detailed information on this topic.
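As a starting point for such an analysis, the latch wait events and the latches with the most sleeps can be listed. This is a sketch that assumes the views V$EVENT_NAME and V$LATCH; the limit of 10 rows is an arbitrary choice.

```sql
-- Sketch: names of the latch wait events known to the database.
SELECT name
  FROM v$event_name
 WHERE name LIKE 'latch%'
 ORDER BY name;

-- Sketch: the 10 latches with the highest number of sleeps,
-- that is, requests that were not satisfied during the spin phase.
SELECT *
  FROM (SELECT name, gets, misses, sleeps
          FROM v$latch
         ORDER BY sleeps DESC)
 WHERE ROWNUM <= 10;
```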
Parameter 1 Type
Meaning An event of this type occurs if a process is waiting for the release of an
Oracle lock.
Rating The average wait time value (of all enqueue events) should be below 10,
i.e., 10/100 s = 100 ms. As all possible lock situations are collected under
one event in 9i, it is generally not a problem if these events are listed
among the lower top 10 entries in the wait event list. With Oracle 10g,
however, the different enqueue waits should not appear among the top
10 (an exception is TX "row lock contention").
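To check the rating, the enqueue waits can be listed with their average wait time in milliseconds. The sketch assumes that enqueue events are named enq: <type> - <reason> in Oracle 10g and enqueue in Oracle 9i, and that TIME_WAITED is given in hundredths of a second.

```sql
-- Sketch: enqueue wait events with average wait time in milliseconds.
-- Oracle 10g: one event per enqueue type ('enq: ...'); Oracle 9i: 'enqueue'.
SELECT event,
       total_waits,
       time_waited,
       ROUND(time_waited * 10 / NULLIF(total_waits, 0), 1) AS avg_wait_ms
  FROM v$system_event
 WHERE event LIKE 'enq:%'
    OR event = 'enqueue'
 ORDER BY time_waited DESC;
```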
Term Clarification
Before getting started with the analysis, we should clarify the term hard disk. (Also
refer to Figure 8.11.) Regarding server systems for business-critical applications,
namely, the world of SAP and Oracle, the term hard disk has two meanings:
Physical: magnetic memory; the hardware part
Virtual: operating system resource (device); the software part
In the following text, the term hard disk is always used with the second meaning,
because the SAP system or the database considers only the operating system with
the resources available as the underlying level. Regardless of whether the hard disk
is visible as a hard disk with file system or as a raw device, the virtual hard disk in
a modern IT infrastructure is far more than a physical hard disk. In fact, there is a
complex storage architecture with many different components behind the operating
system as the abstraction layer. All of these parts of a storage system, for instance,
interface cards to a SAN, storage switches, or array controllers, can be relevant for
I/O problems. Unfortunately, you as an SAP or Oracle administrator usually
cannot identify and solve such problems on your own but need assistance from
specialists of the respective hardware partner. You should keep this definition
in mind and remember the "veiled" complexity of the term in the relevant parts
of the following sections.
Critical points in the structure of an Oracle database are as follows: The most
I/O-intensive areas are without any doubt the redo log files, followed by the
data files. Among the data files, the undo (or rollback) and PSAPTEMP
tablespaces stand out, because they show increased access rates either always
(undo) or especially in the OLAP environment (PSAPTEMP). Offline redo logs and
Oracle executables are less important. Read Section 5.2.3, Storage and SAN
Infrastructure, to find out more about the optimal distribution of Oracle files.
Figure 8.9 Overview of the Hard Disks in the Operating System Monitor
By double-clicking on one of the displayed disks, you can view the utilization
of the selected disk during the last 24 hours. The most important key figure,
Utilization, shows a mean value over a period of one hour (see Figure 8.10).
If you discover a problem with a hard disk via the I/O utilization, you must
identify which parts of the Oracle and SAP installation are on the disk.
Unfortunately, the SAP system does not enable you to trace the direct
assignment of files and raw devices to hard disks. The reason is the already
mentioned resource of the virtual hard disk that is displayed in the operating
system monitor. Between the virtual hard disk and the actual files, server
systems have another virtualization layer, such as the typical Logical Volume
Manager (LVM) for UNIX operating systems. Figure 8.11 illustrates the
relationship between the individual components.
For example, if you want to create a connection between the hard disk and
the Oracle data files or raw devices, you must use tools of the operating
system.
You have two options to increase the I/O performance: You can increase the
performance of the hard disk(s) or try to reduce the I/O load.
[Figure 8.11: Layers between hardware (disk array, storage network, interface
card, internal hard disks), devices (e.g., /dev/dsk/c12t0d3), the Logical
Volume Manager (volume groups, e.g., vg00; logical volumes, e.g.,
/dev/vg00/lvol1), and applications (application files of, e.g., Oracle and SAP;
files of the operating system, e.g., root directory (/); swap and data space),
including raw device access]
Let us first take a look at the options to increase the hard disk performance.
First, you should check if you have used all options of the operating system
for optimal performance:
1. Does the layout of the structure of your Oracle files correspond to the
recommendations, especially regarding the separation of load-intensive files
(see Chapter 5, Planning the System Landscape)?
2. Have you used all possible options of I/O processing?
3. Are all current drivers for I/O subsystems installed?
You should pay particular attention to the options of I/O processing. In
general, all operating systems provide two functions for I/O operations: file
system caching and file lock mechanisms. File system caching works in a similar
way to the Oracle data buffer (but is simpler) as a temporary storage for the
access of applications to I/O systems. File locks serve as write locks so that
data or files cannot be changed simultaneously by two different processes.
The options of I/O processing differ in how they handle these operating
system functions. Table 8.18 lists all options in descending order
of their performance.
Name Description
Raw I/O When raw I/O (or raw devices) is used, the operating system functions
are bypassed completely, and logical volumes and hard disks are directly
accessed. Furthermore, there is no file system and correspondingly no
files in the traditional sense.
Concurrent I/O As is the case for raw I/O, file system caching and file locks are bypassed
completely, but there is a file system and thus normal files. Of course,
the file system used must support concurrent I/O (e.g., Veritas VxFS).
This I/O type can be used for Oracle databases, as the Oracle-internal
locking functions already provide collision-free access and hence
ensure the integrity of data. Caution: Only volumes that contain nothing but
Oracle data or redo log files may be operated in this mode (Oracle
executables are also excluded).
Direct I/O Direct I/O disables only the file system caching of the operating system
but uses the locking mechanism for the data access.
Cached I/O Uses all operating system functions for I/O and is generally the default
setting for I/O processing.
In addition to the I/O modes already mentioned, there are two more
independent options: synchronous I/O and asynchronous I/O. When a process
performs a synchronous input or output, the process must wait until the
operation is completed and only then can continue to work or perform the next
input or output. With asynchronous I/O, the process does not have to wait and
can continue working while the I/O operation is in progress. Generally, you
should use asynchronous I/O.
SAP Note 834343 provides a table with the currently supported combinations
of operating system, file system, and I/O options.
Another remark on raw I/O: As a rule of thumb, you can assume that raw I/O
achieves an I/O performance that is about 20% higher than the performance
with other I/O options. Why is raw I/O not always used? The main
disadvantage of raw I/O is that it always involves significantly increased
administration efforts for setting up and managing the Oracle database.
However, a psychological aspect also plays an important role regarding
the usage of raw I/O: The administrator "misses" his files. The Oracle
recommendation for the usage of raw devices is as follows:
Oracle recommends that raw devices should only be considered when the
Oracle database is I/O bound.2
Raw I/O is fully supported and integrated by SAP, Oracle, and all relevant
monitoring and backup solutions so that it can be used especially for
performance-critical installations, such as SAP Business Information
Warehouse. In this case it is also possible to operate only parts of the Oracle
database, for instance, redo log files or the temporary tablespace, on raw
devices. For the usage of the described I/O options, the administrator also has
to keep two relevant Oracle parameters in mind:
2 Oracle recommends that raw devices should only be considered when the Oracle database
is I/O bound. See www.oracle-training.cc/oracle_tips_raw_devices.htm.
For example, this can be used for installations with the combination of raw
devices and file system to use asynchronous I/O on the raw device
(DISK_ASYNCH_IO=TRUE) on the one hand, but also to use cached I/O for file
system data (FILESYSTEMIO_OPTIONS=NONE).
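The current settings can be displayed and changed as sketched below. The sketch assumes the parameter names as they appear in V$PARAMETER (disk_asynch_io and filesystemio_options) and parameter management with SPFILE; both parameters are static, so changes take effect only after a restart. The values shown correspond to the example in the text and are not a general recommendation.

```sql
-- Sketch: display the two I/O-related parameters.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('disk_asynch_io', 'filesystemio_options');

-- Sketch: asynchronous I/O on raw devices, cached I/O for file system data.
-- Both changes become active after the next database restart.
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE = SPFILE;
ALTER SYSTEM SET filesystemio_options = NONE SCOPE = SPFILE;
```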
In addition to the options for I/O processing, SAP supports further
parameters for the different systems that can affect the I/O performance. SAP
Note 793113 is a good starting point that refers to the individual
operating-system-specific notes. To increase the hard disk performance, you can
also exchange hardware. Especially regarding the already described complex
storage systems that are now used in important areas, many components can
be significantly enhanced when they are replaced by a new generation.
Together with your hardware partner, you should decide whether such an
exchange makes sense or not. Let us now take a look at the second option for
increasing the I/O performance: the attempt to reduce the I/O load. Here,
you must bear in mind which I/O type occurs where. Table 8.19 provides
notes on the I/O reduction.
Data files Tuning of expensive SQL Time between log switches more than
statements (Section 8.3) one minute
Caching of LOB accesses Distribution of data files across differ-
(SAP Note 563359) ent volumes or hard disks
Extension of the Oracle Extension of the PGA
buffer pool Extension of the buffer pool (at free
Extension of the PGA buffer waits)
With the access statistics, the administrator can, for example, distribute the
top 10 data files to different media or data subsystems. You find an example
for moving data files below:
2. Move the file at the operating system level (to simplify matters from the
Oracle shell with a preceding "!"):
! mv /path with old volume/<sid>.dataX /path with new volume/<sid>.dataX
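The access statistics and the Oracle-side steps around the operating system move can be sketched as follows. This is an illustration under assumptions, not the original listing: the views V$FILESTAT and V$DATAFILE are used for the statistics, and <tablespace>, <old_path>, and <new_path> are placeholders that must be replaced before the statements can be executed.

```sql
-- Sketch: the 10 data files with the most physical read and write accesses.
SELECT *
  FROM (SELECT d.name,
               f.phyrds,
               f.phywrts,
               f.readtim,
               f.writetim
          FROM v$filestat f
          JOIN v$datafile d
            ON d.file# = f.file#
         ORDER BY f.phyrds + f.phywrts DESC)
 WHERE ROWNUM <= 10;

-- Sketch: take the tablespace offline before moving the file ...
ALTER TABLESPACE <tablespace> OFFLINE;

-- ... move the file at the operating system level, then publish the new path
-- and bring the tablespace online again (placeholders are hypothetical).
ALTER TABLESPACE <tablespace>
  RENAME DATAFILE '<old_path>/<sid>.dataX' TO '<new_path>/<sid>.dataX';
ALTER TABLESPACE <tablespace> ONLINE;
```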
Because the data files are generally rather big and the corresponding
tablespace needs to be offline, it is not possible to move the files while the
SAP system is running. When files of the SYSTEM or of the UNDO tablespace
need to be moved, the procedure mentioned above does not work. Instead,
the database needs to be shut down to move the files. The new path to the file
is then made known to the database in MOUNT status with the following command:
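A minimal sketch of this procedure in SQL*Plus, assuming a connection as SYSDBA and that the file has already been moved at the operating system level while the database was shut down; <old_path>, <new_path>, and <file> are placeholders:

```sql
-- Sketch (SQL*Plus): mount the database without opening it.
STARTUP MOUNT

-- Publish the new location of the moved file in the control file.
-- The path and file names are placeholders.
ALTER DATABASE RENAME FILE '<old_path>/<file>' TO '<new_path>/<file>';

-- Open the database with the new file location.
ALTER DATABASE OPEN;
```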
System enqueues are the enqueue types that occur in many Oracle-internal
management mechanisms. There are more than 40 different types.
However, most of them are only partially or not at all relevant. The most
important type in the SAP environment should be mentioned, though: ST
(space transaction enqueue). This enqueue is generated during extent
management in DMTS (dictionary-managed tablespace).
A data lock mostly affects a row of a table if it is changed via update, delete,
insert, or select for update. Such locks, as described above, are called TX
enqueues. Other enqueues lock, for example, entire segments (e.g., TM
enqueues) or critical paths (ST enqueues). A lock that has been set is always
held until the database command COMMIT and is then released again. Because a
commit or rollback is carried out after every SAP transaction step (caution:
not after every transaction), the database locks should generally be held no
longer than a few seconds.
Of course, database locks are generally important for data consistency in the
database. Therefore, their occurrence is normal and presents no problem for
the performance. However, this only applies when the locks are held only as
long as necessary and no serialization effects occur, that is, when there is
no growing number of processes waiting for the release of a database lock. You
should therefore observe the Oracle lock monitor in Transaction ST04N via
the menu path Exceptional Conditions → Lock Monitor (or Transaction
DB01). Caution: You will not see the current locks that are held in the
Oracle database but only the lockwaits, that is, the requested locks that
have not been immediately assigned (that are being waited on). Information
about the current database locks can be found in the views V$LOCK and
V$LOCKED_OBJECT. Figure 8.13 shows these two views, also called via
Transaction ST04N by going to Additional Function → Display V$/GV$
Views and Values.
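Outside the SAP system, the same information can be read with SQL. The following sketch assumes the standard columns of V$LOCK and V$LOCKED_OBJECT; the restriction to TM and TX enqueues and the join with DBA_OBJECTS to resolve the object name are illustrative choices.

```sql
-- Sketch: current TM and TX enqueues with held (LMODE) and requested (REQUEST) mode.
SELECT sid, type, id1, id2, lmode, request, ctime, block
  FROM v$lock
 WHERE type IN ('TM', 'TX')
 ORDER BY sid, type;

-- Sketch: locked objects with the locking session and the object name.
SELECT lo.session_id,
       lo.locked_mode,
       o.owner,
       o.object_name
  FROM v$locked_object lo
  JOIN dba_objects o
    ON o.object_id = lo.object_id
 ORDER BY lo.session_id;
```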
The figure demonstrates how the Oracle Process 31 (Session ID) locks the
object with ID 18992. V$LOCKED_OBJECT illustrates that it is a TM enqueue
in Mode 3. Furthermore, V$LOCK demonstrates that Process 31 also holds
another TX enqueue in Mode 6.
The mode shows how restrictively the object is locked. Table 8.20 illustrates
the possible modes of locks.
Mode Name
1 Null mode
A shared lock on a resource also allows other shared locks on that resource
but prevents exclusive locks. That means write access to the resource is
not possible with a shared lock. Therefore, the resource can be read
consistently.
In the individual modes, these principles are linked at table and row level.
Let us take another look at the example in Figure 8.13. What is happening? As
described, Process 31 holds a lock in Mode 3 on Table 18992. That means
it holds a shared lock on the entire table (TM enqueue) and intends to change
rows in it. This corresponds to the meaning of the mode row exclusive table
lock. Changing rows also requires an exclusive lock on the
corresponding row, which can be seen in V$LOCK. The last entry shows
that Process 31 holds an exclusive lock (Mode 6) on a row (TX enqueue). In other
words, Process 31 is simply inserting or changing rows in Table 18992.
Finally, we will take another look at the V$LOCK view and, in particular, at
the lock mode column in Figure 8.13. (Caution: Don't confuse this with the
column of the same name in V$LOCKED_OBJECT.) This column contains
the mode of a requested lock. That is, if the value is >0, the process is waiting
for a lock in the mode that it requested. This is how you recognize
a lockwait. As mentioned before, such a requested but not yet granted lock
would also be displayed by the Oracle lock monitor (Transaction DB01) in
the SAP system. You can also determine the Oracle process that
holds the lock on the requested object (lock mode = 0) via the columns
Lock ID1 and Lock ID2 (objects).
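The matching of waiters and holders described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row type Lock is hypothetical and mirrors a subset of the V$LOCK columns (SID, TYPE, ID1, ID2, LMODE, REQUEST), the requested mode is assumed to correspond to REQUEST, and the second session (42) and the TX identifiers are invented sample data, not values from Figure 8.13.

```python
from collections import namedtuple

# Hypothetical row type mirroring a subset of the V$LOCK columns.
Lock = namedtuple("Lock", "sid type id1 id2 lmode request")


def find_lockwaits(rows):
    """Return (waiter_sid, holder_sid, type, id1, id2) for every lockwait."""
    waits = []
    for waiter in rows:
        if waiter.request == 0:
            continue  # this session is not waiting for anything
        for holder in rows:
            same_resource = (holder.type, holder.id1, holder.id2) == (
                waiter.type, waiter.id1, waiter.id2)
            # A holder has a granted mode (> 0) on the same resource.
            if same_resource and holder.lmode > 0 and holder.sid != waiter.sid:
                waits.append((waiter.sid, holder.sid,
                              waiter.type, waiter.id1, waiter.id2))
    return waits


# Invented sample data: session 31 holds the locks, session 42 waits.
rows = [
    Lock(31, "TM", 18992, 0, 3, 0),
    Lock(31, "TX", 65551, 1234, 6, 0),
    Lock(42, "TX", 65551, 1234, 0, 6),
]
lockwaits = find_lockwaits(rows)
```

With this sample data, session 42 is reported as waiting for the TX enqueue held by session 31; the TM enqueue produces no entry because nobody is waiting for it.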
If you discover Oracle locks that are held longer or that are being waited on,
you should identify the SAP work process that causes the lock via the client
host and the client ID. Afterwards, you can determine which program sets
the lock for which user and does not release it. A further analysis can then be
made with the help of the user or the developer.
If you cannot identify an SAP process with the respective lock, there are two
possibilities: First, the SAP work process was cancelled and the "attached"
database shadow process was not closed properly. In this case, you can
remove the lock manually by cancelling the corresponding database shadow
process using operating system tools. (Caution: Ensure that you don't
cancel the wrong database shadow process or even an Oracle system
process.) To prevent such a situation, you can set the parameter SQLNET.EXPIRE_TIME
in the file sqlnet.ora, which enables an automatic cleanup of cancelled
sessions. See SAP Note 20071 for further information. Second, the lock
could also be held by an external process or its attached database shadow
process.
See the corresponding Oracle documentation and SAP Note 745639 for fur-
ther information on Oracle enqueues.
The table statistics of the Oracle database are another essential performance
aspect. The Oracle database optimizer is supposed to determine the
optimum access path to the data. Generally, there are two types of
optimizers: the Rule-Based Optimizer (RBO) and the Cost-Based Optimizer
(CBO). The RBO determines the access paths according to rules that are derived
from the where clause of the SQL statement to be optimized. However,
because all databases running under SAP systems use the CBO, the exact workings
of the RBO are not relevant for us. The only exceptions were R/3 systems of
version 3.x or older on Oracle older than 7.3.3, where the SAP applications
were developed for the RBO due to technical problems
with the newly introduced CBO.
The cost-based optimizer calculates the access path to the data on the basis of
the costs required for the access. In Oracle systems, you recognize the usage
of the CBO with the parameter OPTIMIZER_MODE. In SAP systems on Oracle
9i, the parameter has the CHOOSE value, which determines that the CBO is
always used when statistics are available for a table and that otherwise the
RBO is used. From Oracle 10g onward, the RBO is no longer supported, so
different levels of the CBO are provided for selection. As of Version 10g, the
SAP default value for the parameter is therefore ALL_ROWS.
The exact definition of the access costs depends on the database. For Oracle
releases older than 9i, the costs are exclusively determined via the blocks to
be read, whereas versions higher than Oracle 10g allow several key values
(single-block reads, multiblock reads, and CPU) for cost determination. The
exact working method of the CBO is kept secret by Oracle. A description and
summary of the known facts regarding the CBO can be found in SAP Note
750631, Rules of Thumb for Cost Calculation of the CBO. The most important
thing to keep in mind is that the costs are mainly calculated on the basis of
In the first step, all tables of the database are analyzed to determine for
which tables the statistics have to be renewed. Because the actual creation of
statistics is very resource-intensive, this two-phase process prevents
unnecessary new statistics from being created when the content of a table has
changed only insignificantly. SAP strongly recommends that you use only the SAP tools
to create optimizer statistics, as these tools are specially tailored to
the requirements of the SAP software for the Oracle database. The BR*Tools
introduced in Chapter 4 can be called for the creation of statistics from
the command line (as <sid>adm user)
or can be scheduled via the DBA planning calendar (Transaction DB13) (see
Figure 8.14).
For every execution of the BR*Tools, and thus for every table analysis run
and statistics creation, you find the corresponding log file in the log display
for DBA operations (Transaction DB14). There, by clicking on the BRCON-
NECT button, you can view all brconnect logs. You can identify the logs for
the update of the optimizer statistics by means of the description or abbre-
viation "sta" in the FID column. By double-clicking, you can view the log file
and find an entry for every table, whose statistics are recalculated (method =
C) or estimated (method = E), such as:
Which table of the Oracle database is analyzed and how it is analyzed is con-
trolled by two aspects:
Most important are the rules that are programmed in the BRCONNECT pro-
gram. Those are:
First, it is determined whether new statistics are required or not (on
the basis of the number of changed rows), followed by the actual cre-
ation of the statistics (two-phase concept).
No statistics on pool and cluster tables for Oracle 9i or earlier.
Accuracy of the statistics based on the number of entries in the table
and so on.
The second aspect is the content of the DBSTATC table. Using this table,
the system administrator can influence the statistics creation for individ-
ual tables. Therefore, the table is also referred to as exception table. In
every row of DBSTATC, the parameters for running the statistics creation
are set for a specific table. The maintenance of table DBSTATC is per-
formed via Transaction DB21 (see Figure 8.15).
Table 8.21 lists the most important columns of the DBSTATC table.
You can also add new tables to the DBSTATC table, for
example, tables created through custom developments. See SAP Note 106047 for further
information on the maintenance of the DBSTATC table.
Column Description
Active Controls if the statistics for the table are renewed. Possible values are,
for example:
A: Active (is checked and updated, if required)
I: Ignore
U: Unconditional (statistics are always updated)
N: No statistics
R: Only temporary statistics
Method How the statistics are generated: either by exact analysis of the entire
table (C) or by estimation according to procedure <sample> (E).
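The interplay of the two-phase concept and the Active flag of DBSTATC can be pictured with the following Python sketch. It is not the BRCONNECT implementation: the function name, the treatment of the flags, and in particular the threshold of 50% changed rows are assumptions chosen for illustration, and the flag R (temporary statistics) is deliberately not modeled.

```python
def needs_new_statistics(active, old_rows, changed_rows, threshold=0.5):
    """Phase 1 of the two-phase concept: decide whether phase 2 should run.

    active       -- DBSTATC Active flag: "A", "I", "U" or "N"
    old_rows     -- number of rows when the statistics were last created
    changed_rows -- rows inserted, updated, or deleted since then
    threshold    -- assumed share of changed rows that triggers an update
    """
    if active in ("I", "N"):
        return False          # ignored, or no statistics wanted
    if active == "U":
        return True           # unconditional: always update
    if active != "A":
        raise ValueError(f"flag {active!r} is not modeled in this sketch")
    if old_rows == 0:
        return changed_rows > 0
    return changed_rows / old_rows >= threshold


# Hypothetical tables: 10% changed rows versus 60% changed rows.
small_change = needs_new_statistics("A", 1000, 100)
large_change = needs_new_statistics("A", 1000, 600)
```

Only the second table would be passed on to the expensive second phase, the actual creation of statistics.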
The criteria that are relevant for the performance of an SAP instance can be
divided into two main categories:
The following sections briefly describe these categories and give an overview
of the most important options and settings.
Note
The following text refers only to the SAP memory management for typical UNIX
operating systems (HP-UX, Solaris, and AIX). The concepts for configuring the
memory of other platforms supported by SAP, such as Linux, Windows, or IBM
iSeries, sometimes deviate considerably. For example, Windows and Linux provide
an option for Zero Administration Memory Management, where only one param-
eter defines the total memory that is available for the instance and where the indi-
vidual memory areas are automatically configured.
Roll memory
Every work process contains a roll memory area that is located in the local
process memory. It stores the initial user context that is swapped to the
roll buffer when the process is changed (SAP process multiplexing). The
roll buffer itself (don't confuse it with the roll memory) is also referred to
as the roll area and, like the extended memory, is a shared memory area.
From SAP R/3 3.0 onward, the roll memory plays only a minor role,
because the main part of the user context is stored directly in the extended
memory and a context switch only requires changing a pointer. This
method is significantly faster than copying the data in memory.
Extended memory
This shared memory is used by all work processes and is the most impor-
tant memory area of an SAP instance. It contains all user contexts of the
users that are logged in to the instance, with the exception of the small
initial context that is being copied between roll memory and roll buffer.
Heap memory
This memory area is a local memory that belongs to one work process.
The work process type determines when the heap memory is used. It is
used for dialog processes when the extended memory or at least the part
of the extended memory that may be used by a single work process is
entirely utilized. For nondialog processes, the local heap memory is used
immediately after the roll memory, because here no process multiplexing
is performed, and the extended memory is therefore reserved for the dia-
log processes.
Paging memory
Previously, this memory area served to reduce the load of the roll memory
for operations with large amounts of data via a paging procedure similar
to the paging process of an operating system. Today, the memory is only
used when the ABAP commands, EXTRACT and EXPORT ... TO MEMORY...,
are used.
Roll memory and      ztta/roll_first   Defines the size of the initial local roll memory
roll buffer (area)                     of a work process. The default value is only 1 byte.
Roll memory and      ztta/roll_area    Defines the size of the entire local roll memory
roll buffer (area)                     of a work process.
Roll memory and      rdisp/ROLL_SHM    Defines the size of the roll buffer in the
roll buffer (area)                     shared memory.
The Current use column provides information on the current use of the
instance. Max. use lists the maximum use since the last start of the instance.
The In memory column contains the maximum values of the individual areas
in kilobytes. The areas are defined according to the parameters from Table
8.22. Note that the parameters of the extended and heap memories are
defined in megabytes or bytes, whereas the roll buffer (roll area) and paging
memory (paging area) sizes are specified as a number of blocks of
8 KB each. The last column, On disk, shows the maximum sizes of the overflow
files of the roll buffer and of the paging memory.
The Detail analysis menu button in Transaction ST02 enables you to obtain
a selection with further analysis options. Using the SAP memory function,
you can display a detailed analysis of the SAP memory areas. Here, the
administrator can view which user is using how much of which SAP memory
at a given time. You can also obtain a historical overview.
Before the extended memory was introduced, the roll-in and roll-out
processes, that is, copying the user context during process multiplexing,
were critical for performance. Thanks to the use of pointers, this problem has
become obsolete. The task now is to give each SAP memory
area enough memory without wasting memory
that could be of more use somewhere else (for instance, for SAP buffers
or the Oracle database). However, if not enough memory is available for
one of the areas, the performance of the SAP instance may
decrease considerably, or program aborts may occur. The following list
summarizes the most important rules for the configuration of SAP memory areas:
Choose the roll buffer and the paging memory so that the overflow files
are never used, that is, the value in Max. use is always smaller than the
value in In memory, as shown in Figure 8.16. The SAP default values for
these areas are usually sufficient. An exception is SAP NetWeaver BI,
which sometimes places considerably more load on the paging memory.
The extended memory is the most important SAP memory area. Therefore,
you should size it more generously. Earlier recommendations of
6 to 10 MB per user are too small for today's requirements. Consequently,
you should plan for 20 to 30 MB per user.
As a contingency reserve, 20% of the extended memory should remain
free, that is, the Current use should never exceed 80% of the In memory
value. If the Max. use value indicates that this rule has not been adhered
to, you should check in the history whether this contingency reserve was often
used and increase the extended memory, if required.
If Max. use or the history clearly shows that the extended memory is
never used up to 80%, you can reduce it accordingly and use the memory
somewhere else.
Regarding the maximum amount of the extended memory of a single
work process, SAP recommends 10% to 20% of the entire extended mem-
ory.
You can use heap memory (Current use) without a problem as long as you
use it only for nondialog processes. However, if dialog processes use local
heap memory, they change to the so-called PRIV mode, that is, they can
no longer perform a user change and are bound to the running transac-
tion. You can identify this status in the SAP process overview (Transaction
SM50). You should avoid this status by all means. If required and possible,
you should specify more extended memory.
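The numeric rules from this list can be written down as simple checks. The following Python sketch is only an illustration: the function names are hypothetical, and the values would have to be read manually from the Current use, Max. use, and In memory columns of Transaction ST02.

```python
def overflow_file_used(max_use_kb, in_memory_kb):
    """Roll buffer / paging memory: Max. use should stay below In memory."""
    return max_use_kb >= in_memory_kb


def reserve_touched(max_use_kb, in_memory_kb, reserve_percent=20):
    """Extended memory: True if more than 80% (100% minus the reserve) was used."""
    # Integer arithmetic avoids rounding effects at the exact limit.
    return max_use_kb * 100 > in_memory_kb * (100 - reserve_percent)


def extended_memory_range_mb(users, mb_per_user=(20, 30)):
    """Sizing range based on the 20 to 30 MB per user mentioned in the text."""
    low, high = mb_per_user
    return users * low, users * high


# Hypothetical instance with 100 users.
sizing = extended_memory_range_mb(100)
```

For 100 users, the sketch yields a range of 2,000 to 3,000 MB of extended memory; whether this fits has to be checked against the physical memory of the server.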
Of course, you should keep an eye on the hardware resources when configuring
the memory for an SAP instance. It is important that there is enough
space in your operating system for all SAP memory areas plus the SAP buffers
plus a contingency reserve. If there isn't enough space, the SAP instance
won't start. For performance, the best option is that everything (including
the maximum heap memory) can be mapped onto the physical memory of
the server.
[Figure: Access paths of the ABAP processor. Open SQL requests go through the SAP buffers (table buffer, single buffer, program buffer, screen buffer); Native SQL or BYPASSING BUFFER goes directly to the database. On the Oracle side, requests are served from the Oracle buffer or, if there is no buffer hit, from the data files.]
bly to a physical read of the database. That is, the objects and data that are
buffered in the SAP application server are generally swapped from the data-
base buffer, which causes a physical data access when they are queried again
from the database.
Therefore, SAP buffers primarily contain different data than the database buffer.
Table 8.23 shows which individual buffers are available.
Table definition     Buffers the entries of the DDNTT table. Explanation: The name table
(TTAB)               (NTAB) contains all information on the table and field definitions in
                     the ABAP repository. It consists of the DDNTT (table definitions) and
                     DDNTF (field descriptions) tables.
Program              Here, the compiled executable versions of the ABAP codes are buffered.
                     The swaps are performed on the basis of the LRU concept (least
                     recently used).
CUA                  Buffers objects from the SAP GUI, such as menus or button definitions,
                     based on the LRU concept.
Screen               Stores the dynpro screens that have been generated previously.
Calendar             Here, all defined factory and holiday calendars from the TFACS and
                     THOCS tables are buffered (also on the basis of the LRU concept).
OTR                  This is the Online Text Repository buffer that stores texts that are, for
                     example, used in BSPs.
Table generic/       These are the entries for table buffering in an SAP instance (as
Table single         described below).
Export/Import        This buffer is used for all work processes and stores data clusters using
Exp/Imp SHM          specific ABAP commands (see SAP Note 702728).
Transaction ST02 enables you to access the memory monitor of the instance
to which you are currently logged in. Figure 8.18 shows the part of the mon-
itor that lists and evaluates the individual SAP buffers. The individual fields
and their meanings are described in Table 8.24.
Figure 8.18 SAP Memory Monitor (Excerpt Containing the SAP Buffers)
Field Description
Hit ratio The hit ratios of a buffer are indicated as a percentage. They are calculated
in the same way as the hit ratio of the database buffers:
Buffer quality (hit ratio) = (Buffer requests − Database requests) /
Buffer requests × 100%
Swaps Number of swaps from the buffer since the last start of the instance.
Hit ratio and swaps are the essential criteria for SAP buffers. SAP recom-
mends that the hit ratio of the buffers should be >98%. The two export and
import buffers are an exception with an optimum hit ratio that is supposed
to be >80%. If possible, swaps from buffers should be avoided. However,
depending on the utilization and usage of the SAP instance, it is not always
possible to avoid them. For example, if an instance runs in another operating
mode to execute more batch jobs at night, other ABAP programs and tables
are required that are loaded into the program or nametable buffer. This usually
leads to swaps. You can therefore tolerate a small number of swaps (a
few hundred swaps per day). According to SAP, up to 10,000 swaps per
day are acceptable for the program buffer.
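The hit ratio formula and the target values named above can be checked with a short calculation. The following Python sketch is a minimal illustration; the function names and the request counts are hypothetical, and the buffer names are simply taken from the text.

```python
# Buffers for which the text names a lower target of 80%.
EXPORT_IMPORT_BUFFERS = {"Export/Import", "Exp/Imp SHM"}


def hit_ratio(buffer_requests, database_requests):
    """Hit ratio in percent: (buffer requests - database requests) / buffer requests."""
    if buffer_requests == 0:
        return None  # no requests yet, so the ratio is undefined
    return 100.0 * (buffer_requests - database_requests) / buffer_requests


def hit_ratio_ok(buffer_name, ratio_percent):
    """Compare a hit ratio with the target: >80% for export/import, else >98%."""
    target = 80.0 if buffer_name in EXPORT_IMPORT_BUFFERS else 98.0
    return ratio_percent > target


# Hypothetical counters: 1,000 buffer requests, 10 of which went to the database.
ratio = hit_ratio(1000, 10)
```

In this example, the ratio is 99%, which meets the target for a regular buffer. A ratio of 90% would only be acceptable for the export and import buffers.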
If the number of swaps increases considerably, you must check if the free
space or the number of free directory entries of the buffer is insufficient. The
respective instance parameter must then be increased step by step to eliminate
the swaps. You can find the parameters for the SAP buffers using the
Current parameter button in the memory monitor.
In addition to the hit ratio and swaps, you should also keep an eye on the
free space in the SAP buffers. You may find unused memory resources
if large parts of a buffer remain unused in the steady state. For example, in
Figure 8.18, this is the case for the field description buffer because 30 MB are
unused. It makes more sense to use this memory somewhere else.
1. Single buffering
In this process, each record (row of a table) is stored individually in the
TABLP buffer once it has been read from the database. To use single buffering,
it is important that all key fields are qualified in the where condition
of a query.
2. Full buffering
If a record is read from the database, the entire table is stored in the TABL
buffer.
3. Generic buffering
This buffering type is specified by the number of key fields used for selection.
If a record is read from the database from a table buffered as generic 1,
all other data records that match the record initially read
in key field 1 are buffered as well. Corresponding buffering with n key fields is also
possible. The generically buffered data records are stored in the TABL buffer.
Note
When full buffering is activated, a client-dependent table, that is, a table that
always has the MAN key first, is automatically buffered as generic 1.
The entire buffer management is implemented in the database interface of the
individual SAP work processes, that is, this is where it is decided when which buffer
is accessed. Whether the buffer can be used depends on whether all required
keys of a table are specified in the where condition. Let us assume that the
TAB table with the key fields KEY1, KEY2, and KEY3 is buffered as generic 2,
that is, via KEY1 and KEY2. The following call would benefit from
buffering:
In contrast, the following calls could not be buffered and would thus cause
an access to the database:
Additionally, there are numerous exceptions where the table buffer is not
used either:
When the SQL commands SELECT FOR UPDATE or SELECT DISTINCT are
used
When the aggregate functions SUM, MIN, MAX, and AVG are used
When the Native SQL statements or the Open SQL condition
BYPASSING BUFFER is used
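The decision described above, whether a read access to the generically buffered TAB table can be served from the table buffer, can be modeled roughly as follows. This Python sketch is a simplification and not the logic of the SAP database interface: the function name, the representation of a query as a set of key fields with equality conditions, and the feature labels are assumptions for illustration.

```python
# Statement features that, according to the list above, bypass the table buffer.
BUFFER_BYPASSING = {
    "SELECT FOR UPDATE", "SELECT DISTINCT",
    "SUM", "MIN", "MAX", "AVG",
    "NATIVE SQL", "BYPASSING BUFFER",
}


def uses_table_buffer(key_fields, generic_n, equality_fields, features=()):
    """True if a query on a generically buffered table can use the buffer.

    key_fields      -- key fields of the table in their defined order
    generic_n       -- number of leading key fields of the generic key
    equality_fields -- fields qualified with "=" in the where condition
    features        -- special features of the statement (see above)
    """
    if BUFFER_BYPASSING & set(features):
        return False
    generic_key = key_fields[:generic_n]
    return all(field in equality_fields for field in generic_key)


keys = ["KEY1", "KEY2", "KEY3"]  # table TAB, buffered as generic 2
buffered = uses_table_buffer(keys, 2, {"KEY1", "KEY2"})
not_buffered = uses_table_buffer(keys, 2, {"KEY1", "KEY3"})
```

A query that qualifies KEY1 and KEY2 can use the buffer, whereas a query that qualifies only KEY1 and KEY3 leaves the generic key incomplete and therefore goes to the database.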
Because all SAP buffers are kept separately for each instance, the table buffers
of the instances have to be synchronized. If the content of a buffered table is
changed, the running work process writes a corresponding entry to the
DDLOG database table. The instances read this table regularly to invalidate
the data records in their own table buffers that are affected by the changes. That
is, if a table is single-record buffered, only the affected data record is declared
invalid; however, if a table is generic or fully buffered, the complete
generic region of the table or the entire table in the buffer is invalidated.
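The different invalidation scopes can be illustrated with a small model. In the following Python sketch, buffered records are represented only by their key tuples; the function name and the sample keys are hypothetical, and the sketch ignores how the instances actually read and process the DDLOG entries.

```python
def invalidated_keys(buffering, buffered_keys, changed_key, generic_n=0):
    """Return the buffered keys that become invalid after one record changes.

    buffering     -- "single", "generic", or "full"
    buffered_keys -- set of key tuples currently held in the table buffer
    changed_key   -- key tuple of the changed record
    generic_n     -- number of leading key fields of the generic key
    """
    if buffering == "single":
        return {key for key in buffered_keys if key == changed_key}
    if buffering == "generic":
        prefix = changed_key[:generic_n]
        return {key for key in buffered_keys if key[:generic_n] == prefix}
    if buffering == "full":
        return set(buffered_keys)
    raise ValueError(f"unknown buffering type: {buffering!r}")


# Hypothetical buffer content of a table with three key fields.
in_buffer = {("A", "1", "x"), ("A", "1", "y"), ("A", "2", "x"), ("B", "1", "x")}
changed = ("A", "1", "x")
single = invalidated_keys("single", in_buffer, changed)
generic = invalidated_keys("generic", in_buffer, changed, generic_n=2)
full = invalidated_keys("full", in_buffer, changed)
```

One changed record invalidates one entry with single buffering, two entries with generic 2 buffering, and all four entries with full buffering. This is why frequently changed tables are poor candidates for generic or full buffering.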
The synchronization of the table buffers between the instances may have a
negative effect on the system performance. Therefore, some criteria must be
met by tables and views (they can also be buffered) for useful buffering:
1. Transaction data
All data that is generated and changed in large quantities during operation,
such as invoices, delivery notes, sales orders, material movements, and so
on. The tables grow rapidly during operation and can thus reach a size of
several gigabytes. Therefore, they are generally not suited for buffering in
the SAP system.
2. Master data
Master data is changed rarely or never during live operation and contains
information on material, customers, vendors, and so on. The respective
tables change less than the transaction data but still reach a size of several
hundred megabytes. Therefore, master data tables are also not included in
the SAP table buffer.
3. Customizing data
Data that is generated when mapping the enterprise processes onto the
SAP system (customizing). The most common examples are company
codes, factories, sales organizations, conditioning, and so on. The respec-
tive table records are changed or supplemented rarely during operation
and are therefore usually buffered in the SAP system.
On the basis of SAP's own experience, table buffering is already
configured in the delivered versions of the different SAP software solutions.
To decide whether and how a table can reasonably be buffered, you have to
monitor the SAP table buffering. This can be done in the SAP
memory monitor (ST02) via Detail analysis menu → Call statistic or using
Transaction ST10. A selection screen is displayed where you must select the
Table type, Period, and SAP instance. Because every instance has its
own buffers, you would theoretically have to analyze each instance. In practice,
however, this is only relevant if different tasks are performed on these instances,
such as batch versus dialog instances, or if organizational parts of the enterprise
(for example, international branch offices) are distributed across different
instances and thus different data must be buffered.
Figure 8.19 shows an excerpt from the table access statistics, and Table 8.25
lists the most important columns and their meanings. (Note: You can expand
the individual detail columns via buttons.)
Column Description
Buffer key opt       Buffering type: ful = full, gen = generic, sng = single.
ABAP/IV processor    Number of ABAP requests for the table, which can be broken down as
requests             follows: direct reads, sequential reads, and changes (updates, inserts,
                     deletes).
DB rows affected     Number of data records that are transferred from the database to the
                     SAP system. Exception: initial load of the buffer.
The following list describes how to check the buffered tables and how to
decide if it makes sense to buffer them:
Conversely, the administrator searches for tables that are not buffered but
should or could be buffered. For this purpose, here is a brief overview of the
most important criteria:
Remark
How do you recognize customizing tables? There are numerous customizing tables,
such as the condition tables Axxx (xxx = 000 to 999). If you search for a table in the
standard SAP system, you can find further information as follows:
1. Look at the short text of the table in the Data Dictionary (Transaction SE11) and
check the specifications under Goto → Technical Settings in the Logical storage
parameters field (see Figure 8.20).
2. Search for the table for your application (for example, ERP or SCM) in the SAP
documentation under http://help.sap.com. Because Customizing is documented
very well, this documentation mentions or describes nearly all Customizing
tables.
3. Look for SAP Notes on the table in the SAP Support Portal. There are some
explicit notes on buffering for some tables.
The buffering settings for a table can be made in the Data Dictionary (Transaction
SE11). There, you enter the respective table or view and
change the settings via Goto → Technical Settings (see Figure 8.20).
8.3 Analyzing Program-Based Performance Problems: SQL Optimization
Having viewed the individual components and areas of an SAP system with
an Oracle database regarding the performance under administrative aspects,
the last section deals with aspects related to the program.
On the one hand, this will provide you with material for advising the devel-
oper team regarding a high-performance use of SQL in a qualified manner.
On the other hand, you will learn how to use these tools to identify prob-
lems in that area, and you will get to know basic solution approaches.
The example shown in Figure 8.21 shows two functionally identical variants
of a report that, based on the flight data table SBOOK, lists all days on
which Cindy Lindworm booked flights. In the variant on the
left, all data records of the SBOOK table are read from the database. After that,
the data records needed for further processing (here: output) are selected in the program.
In the other variant, the selection is performed by the where clause, which is
evaluated by the Oracle DBMS. The second variant has two advantages:
1. The Oracle server process transfers only the data that is really needed for
the work process.
2. Oracle can select the correct data records very quickly by using existing
auxiliary data structures (such as indexes).
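The difference between the two variants can be reproduced outside of ABAP. The following Python sketch is only an analogy under stated assumptions: it uses the sqlite3 module from the standard library in place of Oracle, a strongly simplified sbook table with just two columns, and invented booking data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sbook (passname TEXT, fldate TEXT)")
conn.executemany(
    "INSERT INTO sbook VALUES (?, ?)",
    [
        ("Cindy Lindworm", "2007-03-01"),
        ("Another Passenger", "2007-03-01"),
        ("Cindy Lindworm", "2007-04-15"),
    ],
)


def dates_filtered_in_application(conn, name):
    """Variant 1: read every record, then discard most of them in the program."""
    result = []
    for passname, fldate in conn.execute("SELECT * FROM sbook"):
        if passname == name:
            result.append(fldate)
    return sorted(result)


def dates_filtered_in_database(conn, name):
    """Variant 2: let the database select the records and return only FLDATE."""
    rows = conn.execute("SELECT fldate FROM sbook WHERE passname = ?", (name,))
    return sorted(row[0] for row in rows)


slow = dates_filtered_in_application(conn, "Cindy Lindworm")
fast = dates_filtered_in_database(conn, "Cindy Lindworm")
```

Both functions return the same two flight dates. The difference lies in the amount of data that has to be read and transferred to obtain them, which is exactly what the two advantages listed above describe.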
An old IT saying goes that mistakes are made wherever they are possible.
Functional errors can either be avoided by using proper specifications or
identified by means of testing. A performance-critical programming style,
however, may remain undetected for a long time. That's because hardware performs at
different speeds and the SAP systems in use have different loads. Consequently,
the statement that the execution of the SAPBC_DATA_GENERATOR
report takes about 138 seconds for creating the flight data model is of no real
relevance (see footnote 3).
Note
Not every "bad" SQL statement causes performance problems. SAP and Oracle
proactively provide mechanisms to process such statements quickly (see Chapters
2, 3, and 4). On the other hand, not every optimized SQL statement is fast. Some
processes simply need a lot of time.
8.3.2 Effects
A poorly formulated SQL statement has both indirect and direct effects on
performance. Figure 8.22 illustrates the direct effects. In the example shown
in this figure, exactly one SQL query is submitted from the SAP
application server to Oracle, and it
requests the entire SBOOK table. Therefore, the Oracle database needs to
deliver the entire table, which can involve many physical reads. In addition,
the entire table is transferred, which means that an unnecessary number
of data packages is delivered. The Oracle system itself cannot perform
any optimization because it does not know that numerous data
records are discarded in the application. In the application server itself, all
data records are checked (sequentially) to see whether they
satisfy the IF condition. This is quite time-consuming for the generated
flight database, which contains 90,000 data records in the SBOOK table.
3 We have chosen the monster data record variant and created the database in a background
job (Transaction SM36).
Figure 8.23 shows the processing of a good SQL statement. The statement is
improved because it does not request all attributes of the desired data records (which
would be marked with *); only the FLDATE attribute is requested. Based on an
exact description of the desired result set, the Oracle system can deliver
the minimum result set. Consequently, only a small number of loop passes is
necessary for the work process, and only a small amount of data is transferred, which
fits into a single package in our example. Moreover, Oracle can create an
optimized execution plan. For example, by using indexes, the physical read
operations can be limited to some index data blocks as well as to those data
blocks containing the requested data records.
However, not every SQL statement that requests only the necessary data, like
the one in Figure 8.23, is optimized. You also need to consider the process-
ing by the Oracle system, particularly the use of indexes. From the point of
view of performance, it is sometimes useful to extend the SQL queries with
seemingly redundant where conditions to use an existing index. However,
sometimes you may have to create a new index. We will describe these two
aspects in greater detail in Section 8.3.6, Indexes for Faster Access.
In addition to the direct effects, there are also some indirect effects we
should take a look at. For example, data that is requested unnecessarily occu-
pies space in the buffers of the application server and database system. In the
case of write requests, buffered data must be processed by transaction man-
agement.
For experiments and optimizing operational processes, SAP tables allow you
to define whether and to what extent they are buffered in the table buffer of
the application server. To do that, you must select a table in Transaction
of the entire system and others that enable a detailed analysis of an SQL
statement or program.
In this context, the analysis of the transaction profiles (Transaction ST03) and
Oracle performance (Transaction ST04 or ST04N) again plays a major role
(see Figure 8.25). You can use these tools to identify and isolate problems
from the point of view of the entire system first.
SQL area getratio describes the ratio between matches and requests for an
object in the library cache and should be close to 100% for a production
system. A value in this range shows that the shared pool size is well chosen for the actual
query load, which can be the result of good queries or of generous
sizing.
SQL area pinratio describes the ratio between matches and requests for
reading and executing objects and should also be close to 100%. Here
again, expensive queries can have a negative impact due to displacements.
Vice versa, SQLA.Reloads/Pin describes the ratio between the necessary
reloads (SQL query parsing) and the accesses. Consequently, a value close
to 0 should be reached here.
The quality of the dictionary cache should also be close to 100 because it is
required for the plan creation and query processing. In addition, the size of
478
Analyzing Program-Based Performance Problems: SQL Optimization 8.3
the data dictionary is known because of the fixed structure4. Therefore, the
system can size the memory area in an optimal manner. This is not the case
with the system used in Figure 8.25 because it has been in use only for a
short period.
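The three library cache ratios above can be written compactly. This is only a
sketch: it assumes that the monitor derives the values from the counters GETS,
GETHITS, PINS, PINHITS, and RELOADS that Oracle provides in V$LIBRARYCACHE for
the SQL AREA namespace. The exact calculation and scaling used by Transaction
ST04 may differ.

```latex
\begin{align*}
\text{getratio}    &= \frac{\text{GETHITS}}{\text{GETS}} \times 100\,\% \\
\text{pinratio}    &= \frac{\text{PINHITS}}{\text{PINS}} \times 100\,\% \\
\text{reloads/pin} &= \frac{\text{RELOADS}}{\text{PINS}}
\end{align*}
```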
You should know the typical values for your own system. Deviations may
occur when maintenance work is carried out on the system. Usually, you as
an administrator will carry out this work or at least will be involved. In this
respect, you will develop a feeling for the behavior of your system.
4 Apart from custom developments, patches, and upgrades, the database structures generally
remain stable.
The top SQL statements should be analyzed primarily in terms of disk reads,
buffer gets, and elapsed time. Two derived values are particularly informative
for a thorough analysis of an individual SQL statement: the read accesses per
execution (bgets/exec) and the read accesses per record (bgets/row). A high
bgets/exec value can be critical if a request carries out many buffer accesses
that may be redundant, as shown earlier in the example in Figure 8.21. If the
value for bgets/row is high, many blocks must be read from the database to
deliver a small number of records to the application server, possibly because
the access does not use an appropriate index. However, complex joins and
multiple IN lists can cause high bgets/exec values that offer no technical
potential for optimization.
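As a worked example of the two ratios, consider figures that are invented
purely for illustration and are not taken from the system shown in the
figures: a statement that was executed 10 times, caused 50,000 buffer gets,
and returned 20 records in total.

```latex
\begin{align*}
\text{bgets/exec} &= \frac{\text{buffer gets}}{\text{executions}}
                   = \frac{50\,000}{10} = 5\,000 \\
\text{bgets/row}  &= \frac{\text{buffer gets}}{\text{records returned}}
                   = \frac{50\,000}{20} = 2\,500
\end{align*}
```

Reading 2,500 blocks for each delivered record would be a strong hint to check
whether a suitable index exists for the statement.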
The ABAP Dictionary (Transaction SE11; see Figure 8.28) provides more
information about the available indexes. No index is defined on the PASSNAME
attribute of the SBOOK table, which is the attribute queried by the where
clause.
Starting from here, a runtime analysis of the program can be carried out (use
Transaction SE30 or follow menu path Tools → ABAP Workbench → Test →
Runtime Analysis). In the first step, the program to be examined must be
executed. In the second step, you can view the evaluation (see Figure 8.29).
Figure 8.30 shows the results for the inefficient SQL statement used at the
beginning, which filters the records with an IF condition. With a total
execution time of three seconds, the program has already slowed down
noticeably. Most of the time is spent in the database system, which has to
transfer about 900,000 records of the SBOOK table to the application server.
From an architectural point of view, this transfer is unavoidable because
buffering of the SBOOK table is not permitted.
In contrast, the program that has been optimized by moving the name check
into the where condition shows a runtime that is better by a factor of more
than 10 (see Figure 8.31). Whereas processing with IF requires approximately
3 seconds, the variant using WHERE takes only 0.2 seconds. On the one hand,
the load on the database system is lower because only a few records need to
be delivered. On the other hand, the application server is also under less
load because it runs through the SELECT loop only a few times.
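The two program variants compared above are not reproduced in this section.
As a rough sketch only, they might look as follows. The report name
zh_sbook_if_vs_where and the output fields are assumptions made for
illustration; the table SBOOK, the attribute PASSNAME, and the passenger name
come from the example.

```abap
REPORT zh_sbook_if_vs_where.  " hypothetical report name

DATA booking TYPE sbook.

* Variant 1: every SBOOK record is transferred to the application
* server, and the name is checked there with IF.
SELECT * FROM sbook INTO booking.
  IF booking-passname = 'Cindy Lindworm'.
    WRITE: / booking-carrid, booking-connid,
             booking-fldate, booking-bookid.
  ENDIF.
ENDSELECT.

* Variant 2: the name check is part of the WHERE condition, so the
* database returns only the matching records.
SELECT * FROM sbook INTO booking
  WHERE passname = 'Cindy Lindworm'.
  WRITE: / booking-carrid, booking-connid,
           booking-fldate, booking-bookid.
ENDSELECT.
```

To reproduce a comparison like the one in Figures 8.30 and 8.31, each variant
would be placed in its own report and measured separately in Transaction SE30.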
Note that runtime measurements do not always produce the same results. For
buffered tables, the initial execution can involve a high database load,
whereas the second execution can use the data from the buffer of the
application server, so the query does not access the Oracle database at all.
The Oracle buffers behave in a similar manner: the Oracle data buffer also
caches tables for which the SAP system does not allow buffering in the
application server. It is therefore not always easy to construct clear
examples. The current system load also has an impact on the measurements:
- Other transactions generate load and require CPU time and data transfer
volume.
- Other SQL queries displace data from the buffers at the application server
and Oracle levels.
- Other SQL queries have already placed the data viewed during the
measurement in the buffers at both levels (which is actually positive in
terms of the overall performance).
You can use the SQL Trace (Transaction ST05 or menu path Tools → ABAP
Workbench → Test → SQL Trace) to examine a single program in great detail.
Here, the database interface on the SAP side logs the processing steps for
SQL queries in detail. This includes the operations provided by the
record-oriented interface of the five-layer architecture (see Chapter 3).
Figure 8.32 shows the large number of fetch operations that occur with a
nonspecific SQL query against the SBOOK table.
An optimized statement that performs the selection using the where clause
needs only one fetch for the few Cindy Lindworm records. However, this
fetch takes relatively long (see Figure 8.33).
Note that an SQL trace logs all SQL queries unless you filter by specific
tables, users, or work process numbers. The result can thus be distorted by
the side effects of other transactions, so you should use the filtering
options. Tracing itself also takes time and can distort the measurement.
You can also view the execution plan for individual SQL statements from
within the SQL trace (Explain F9). In addition, you can change and test SQL
statements on a trial basis (Ctrl-F6).
We have now briefly discussed some of the main approaches for detecting
and analyzing problematic SQL queries. Let's take a closer look at the
solution to these problems.
The examples also illustrate the five golden rules of high-performing SQL pro-
gramming (see Schneider, Thomas: SAP Performance Optimization, SAP
PRESS, 2005), which we will describe briefly here:
1. The number of data records to be transferred between the DBMS and SAP
application server must be kept as small as possible. The impact of non-
specific SQL requests that transport large amounts of data has been dis-
cussed several times in this section. In addition, multiple reads of identical
data by a program can cause large data volumes. You can detect such
behavior in the SQL Trace using the Display Identical Selects function
(Ctrl-Shift-F8).
2. The transported data volume must be kept as small as possible. In addition
to Rule 1, you must ensure that no complete data records are transferred,
that is, you should avoid using select *. Furthermore, for typical calcu-
lations, such as calculating averages or totals, you should use the aggregate
functions of the Oracle system (see the example in Figure 8.34).
3. The number of transfers between the Oracle database and the application
server should also be kept small. Consequently, it is better to use a few
SQL requests with large return quantities than many SQL requests with
very small return quantities. On the one hand, processing each SQL request
involves a certain overhead. On the other hand, data packages on the network
have a constant size, so packages that contain only small amounts of data
consume resources unnecessarily. With many small selects, the transferred
gross quantity of data can be a multiple of the net quantity of data.
If you succeed in determining all necessary data with a single SQL request,
waste occurs only in the last package, which is probably not completely
filled with data.
For this reason, you should avoid placing SQL requests inside nested loops.
Current SAP versions provide join operators for this purpose. In older SAP
versions (older than 4.0), joins could be emulated by defining views. The
alternative, intuitive way of formulating joins as nested loops can still be
found in older developments and offers some potential for optimization.
4. You can keep the overhead for processing the request small by using a
where clause that corresponds to the existing indexes. For this purpose,
the where condition should be simple, that is, it should consist of AND
links as far as possible. AND constrains the search area, whereas OR extends
it. When formulating a request, it is often possible to transform OR
conditions into AND conditions (see footnote 6). We'll describe the use of
the correct index in this context in greater detail later in this chapter.
5. It can be useful to shift the load more toward the application server. The
architecture of the SAP system can best be scaled at the level of the appli-
cation server. It is possible to use a larger number of instances to increase
the computing performance at that level. However, most SAP installations
use only a single integrating database server (see Chapter 2, SAP Funda-
mentals). For operations such as sorting or grouping, which can be per-
formed equally at the database server and application server levels, a shift
toward the application server is advisable. Here, you can also use particu-
larly efficient algorithms internally, such as a sorting algorithm. Of course,
you need to take Rules 1 to 3 into account in this context, that is, this pro-
cedure only makes sense if the entire data quantity considered is used in
the application.
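Rules 2 and 3 can be illustrated with a short ABAP sketch. It is not taken
from the figures of this chapter: the report name, the carrier 'LH', the plane
type '747-400', and the use of table SFLIGHT from the SAP flight data model
alongside SBOOK are assumptions chosen for illustration.

```abap
REPORT zh_golden_rules_sketch.  " hypothetical report name

TYPES: BEGIN OF ty_booking,
         carrid   TYPE sbook-carrid,
         connid   TYPE sbook-connid,
         fldate   TYPE sbook-fldate,
         bookid   TYPE sbook-bookid,
         passname TYPE sbook-passname,
       END OF ty_booking.

DATA: total    TYPE p LENGTH 16 DECIMALS 2,
      bookings TYPE STANDARD TABLE OF ty_booking,
      booking  TYPE ty_booking.

* Rule 2: let the database calculate the total instead of
* transferring every booking and adding up on the application server.
* LOCCURAM is the booking price in the local currency of the airline.
SELECT SUM( loccuram ) FROM sbook INTO total
  WHERE carrid = 'LH'.
WRITE: / 'Total in local currency:', total.

* Rule 3: one join with an array fetch instead of a SELECT on SBOOK
* nested inside a SELECT loop over SFLIGHT. Only the columns that
* are actually needed are read (no SELECT *).
SELECT b~carrid b~connid b~fldate b~bookid b~passname
  FROM sflight AS f INNER JOIN sbook AS b
    ON  b~carrid = f~carrid
    AND b~connid = f~connid
    AND b~fldate = f~fldate
  INTO TABLE bookings
  WHERE f~planetype = '747-400'.

LOOP AT bookings INTO booking.
  WRITE: / booking-carrid, booking-connid, booking-fldate,
           booking-bookid, booking-passname.
ENDLOOP.
```

With the join, the result set is requested in a single SQL statement and
transferred in full packages, whereas the nested variant would send one
additional SQL request per flight.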
6 Procedures for the conversion into disjunctive normal form and conjunctive normal form
(by negation) are described in the algorithm literature, for example the application of De
Morgan or the Quine-McCluskey procedure.
specific installation. If the application level and database level are located
on the same server computer, this would cause a load on the same CPUs,
irrespective of the execution level. If the database server is comparatively
oversized, executing the operation in the database layer would certainly have
some advantages. In either case, it is essential to understand the underlying
mechanisms to obtain a working solution.
You can create several indexes for a table, each covering a single attribute
or a combination of attributes. First, there is always the primary index,
which contains the key attributes. It can be implemented by storing the data
records in sorted order. Other indexes are referred to as secondary indexes.
These secondary indexes store the combinations of attribute values with
references to the associated data records in additional memory pages. They
can be organized, for example, as a B-tree, via hash procedures, or as a
bitmap index.
Note that by taking into account existing indexes, you can often accelerate
the program execution considerably. In this context, you should particularly
consider attributes that do not change from the perspective of a specific
7 The costs (Estim. Costs, Estim. Rows) are relative values, which support the comparison of
alternative plans. See also SAP Note 766349.
The following example shows how an index is created and becomes effec-
tive. As changes to the structure of Table SBOOK are not allowed without an
access key, we'll first create a copy of that table. To do that, you must start
Transaction SE11 and click on the Copy icon. This creates an empty table,
ZBOOK, which has the same structure as SBOOK including the secondary
indexes.
The following small ABAP program allows you to copy all data records from
SBOOK to ZBOOK:
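REPORT zh_sbook_copy.
* Work area with the line structure of SBOOK
DATA booking TYPE sbook.
* Read SBOOK record by record and insert each record into the copy ZBOOK
SELECT * FROM sbook INTO booking.
  INSERT INTO zbook VALUES booking.
ENDSELECT.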
For Table ZBOOK, the runtime analysis shows the values measured in Figure
8.36 for a selection by "Cindy Lindworm." No index is used here, because no
suitable index exists for that passenger name. Compared to a selection using
Table SBOOK, the values shown here are slightly different. One possible rea-
son might be the different physical characteristics of the two tables.
As shown in Figure 8.37, you can use Transaction SE11 to create an index,
PNI, for the PASSNAME attribute.
When measuring the unchanged program, the new results are significantly
better, as shown in Figure 8.38. The processing in the application server
requires a similar amount of time. The number of packages transported
between the database and application server is the same. However, in the
example, processing in Oracle is 250 times faster due to the use of the index.
The display of the selected plan (see Figure 8.39) shows the use of index
PZBOOK-PNI via the PASSNAME attribute, as expected.
8.4 Summary
Performance is a far-ranging and complex topic. Hardware, operating
system, Oracle, SAP system: nobody knows everything. The intention of
this chapter was to provide an overall picture to enable you to handle per-