An Introduction To System Sizing For Data Warehousing Workloads
Figure 1 Data movement in systems (networks, system, and storage)
Due to the high volume of data moving through data warehousing systems, special consideration should be given to the I/O subsystem for this type of workload.
With some analytical work a reasonable configuration that meets the business requirements can be estimated. As the quality of data increases, sizing estimates become more accurate.
Sizing methodology
This section introduces the sizing methodology using a sample sizing effort. The workload used is a benchmark from the Transaction Processing Performance Council (TPC) for data warehousing. The TPC Benchmark H (TPC-H) is a well-recognized data warehousing benchmark and its detailed description is publicly available. "TPC-H Benchmark overview" on page 30 contains more information about the TPC and the TPC-H benchmark. The following diagram illustrates the methodology used.
Figure 2 Sizing methodology: data collection (workload characteristics, product data, business requirements) feeding the sizing process to produce a sizing estimate
This example characterizes the behavior of a data warehouse workload and sets specific performance goals for achieving the business objective. Performance data sheets on the IBM eServer pSeries 655 (p655) and IBM TotalStorage FAStT900 Storage Server were used to establish a system sizing estimate to meet the goals. Although not part of this example, the data was used to run and publish a TPC-H benchmark that validated our work. More details on this benchmark result can be found on the TPC Web site:
http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=103120801
The following sections describe the steps required to collect data and size the system.
Choosing a system
Selecting a vendor, a product line, and a system for a new project is a complicated process beyond the scope of this paper. It should be noted that
most selection processes are influenced by organizational preferences, historical purchasing patterns, and other non-technical issues. Regardless of the reasoning, the selection team is responsible for ensuring that a selected system is capable of meeting the technical requirements of the workload and providing the return on investment sought by the business. For this project the following products were selected:
- Clustered configuration of IBM eServer pSeries 655 systems
- IBM FAStT900 Storage Server
- IBM DB2 UDB
The choice of products was influenced by the project requirements, as well as the desire to highlight these products.
Figure 3 p655 I/O architecture (MCM on the GX bus, GX to RIO-2 bridge, RIO-2 to PCI-X bridge, PCI buses, internal PCI devices and PCI slots)
The following sections describe the major components of the p655. For general description and configuration information about the p655, refer to the following IBM Web site:
http://www-1.ibm.com/servers/eserver/pseries/hardware/midrange/p655_desc.html
Processing power
The p655 system is powered by a single multi-chip processor module. A Multi-Chip Module (MCM) has either four or eight 1.7 GHz POWER4+ processor cores. Each processor core contains 32 KB of data cache and 64 KB of instruction cache. Each processor chip has a 1.5 MB L2 cache on board that operates at chip frequency. On the 8-way MCM, the two cores on each processor chip share that chip's L2 cache, while on the 4-way MCM each core has a dedicated L2 cache. A 32 MB L3 cache is located between each processor chip and main memory and operates at one-third of the chip frequency. For more detailed information on the p655 configuration, refer to the following white paper:
http://www-1.ibm.com/servers/eserver/pseries/hardware/whitepapers/p655_hpc.pdf
IBM eServer pSeries 655: Ultra-dense Cluster Server for High Performance Computing, Business Intelligence and Data Warehousing Applications, by Harry M. Mathis, John D. McCalpin, and Jacob Thomas
Memory configuration
The p655 system has four memory slots that allow from 4 GB to 64 GB of memory to be installed. Memory cards are available in 4 GB, 8 GB, and 16 GB sizes. The following table shows possible memory configurations.
Table 1 System memory configuration options

Total memory   Slot 1   Slot 2   Slot 3   Slot 4
4 GB           4 GB     -        -        -
8 GB           4 GB     4 GB     -        -
16 GB          4 GB     4 GB     4 GB     4 GB
16 GB          8 GB     8 GB     -        -
32 GB          8 GB     8 GB     8 GB     8 GB
32 GB          16 GB    16 GB    -        -
64 GB          16 GB    16 GB    16 GB    16 GB
I/O subsystem
The p655 has two RIO-2 (Remote I/O) buses. The first RIO-2 bus supports the service processor, two Ethernet ports, an integrated SCSI adapter, and three hot-plug/blind-swap PCI-X slots on the system board (see Figure 3 on page 7). The second RIO-2 bus can be connected to the 7040-61D I/O drawer for additional I/O adapter slots and performance. The p655 supports a maximum of one I/O drawer with two RIO-2 ports. The I/O drawer contains two PCI I/O planars. Each planar has three PCI Host Buses (PHBs); the first PHB has four 64-bit (133 MHz) PCI slots, and the second and third each have three 64-bit (133 MHz) PCI slots. Figure 4 on page 9 shows the detailed configuration of the I/O drawer connected to the RIO-2 bus.
Figure 4 7040-61D I/O drawer connected to the RIO-2 bus (RIO-2 hub with active and passive/failover links; sustained 1050 MB/s duplex)
The FAStT900 can saturate the four 2 Gb FC host interfaces and deliver more than 720 MB per second of I/O to the system. For more information about the FAStT900 features and performance refer to the following IBM Web site:
http://www.storage.ibm.com/disk/fastt/fast900/index.html
Once the raw data size is established, it is necessary to estimate the database space requirement for storing the data in tables using the internal format of the database. For this, the schema, page size, and row density per page are required. A database administrator and a data architect must be involved in this process. Most database vendors provide accurate estimates once the schema and data size are known. For this example, the information provided in the DB2 Administration Guide: Planning, SC09-4822, was used. This manual has a specific section that can help estimate table, index, log, and temporary space requirements for the schema.
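As an illustration only (the DB2 planning guide's formulas are more detailed), table space can be roughed out from row count, average row size, and page density. All figures below are hypothetical:

```python
def table_space_gb(row_count, avg_row_bytes, page_size=4096, page_fill=0.9):
    """Rough table-space estimate from row density per page.

    page_fill models free space left on each page; real databases
    (and the DB2 planning guide) use more detailed per-column rules.
    """
    rows_per_page = int(page_fill * page_size // avg_row_bytes)
    pages = -(-row_count // rows_per_page)  # ceiling division
    return pages * page_size / 1024**3

# Hypothetical example: a 6-billion-row fact table with ~120-byte rows
print(round(table_space_gb(6_000_000_000, 120), 1))
```

The point of the sketch is that internal storage exceeds raw data size because of page headers and free space, which is why the 1,000 GB of raw data grows to 1,350 GB of table space in this example.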
Considering the above information, the space requirement for storing all the indices was estimated to be 258 GB. Once again, the database product documentation should be consulted for information on estimating index space requirements.
Estimating the temporary storage space requirements for a DW workload is difficult because several external factors such as system memory size and concurrency levels impact the need for space. It usually takes an experienced data warehousing architect with help from database vendors to estimate temporary space needs. Underestimating the temporary space requirements of a workload can prevent the proper execution of large queries or limit concurrency levels of query execution. The following information was considered when estimating our temporary storage needs:
- Percentage of data used in large sort and join queries
- Number of concurrent queries that can be running at any one time
- Number of concurrent query segments per query
- Previous experience with the workload memory usage
- Expert guesses and rules of thumb available from product vendors
- Comparative estimates provided by the database vendor
- Future growth of data and increase of concurrency levels
Based on the above information, 140 GB of temporary space was estimated for the worst-case query, and since seven query streams could run concurrently, about 1,000 GB of temporary space was needed.
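The temporary-space arithmetic above can be written out; the figures come from this example, and rounding the 980 GB result up to 1,000 GB is headroom, not a formula:

```python
# Temporary space scales with the worst-case query's sort/join
# footprint times the number of concurrent streams (example figures).
worst_case_temp_gb = 140        # worst-case query's temporary space
concurrent_streams = 7
raw_estimate_gb = worst_case_temp_gb * concurrent_streams  # 980 GB
print(raw_estimate_gb)
```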
Staging space
Most data warehouse projects require some space for staging raw data for loading or maintaining the warehouse. Depending on the operational procedures, the space requirement can vary drastically. The following information was considered when estimating the staging space requirement:
- Data warehouse loading procedures
- Location and storage needs for the raw data
- Data warehouse maintenance procedures
- Location and storage needs for the maintenance data
- Future growth of data
Based on the operational needs to store update data and some load data it was estimated that 1,500 GB of space was sufficient for the project.
- Number of disks required to meet the performance requirements
- Number of disks needed to balance the I/O performance. For example, suppose that the storage requirement can be satisfied with 11 disk drives; if the system has two disk controllers, it might be better to use six disks per controller to evenly distribute the I/O across both controllers.
Adjustments to the number of disks for performance reasons may result in having more space than the minimum required, but as always one must balance the performance needs and cost based on the project priorities. The following table shows the overall storage requirements for the configuration.
Table 2 Minimum storage space requirements for the 1,000 GB TPC-H warehouse

Data                          1,350 GB
Index                           258 GB
Database log                     36 GB
Database temporary storage    1,000 GB
Staging space                 1,500 GB
Total storage space           4,144 GB
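The controller-balancing adjustment described above (11 disks across two controllers rounded up to 12) can be sketched as follows; the function name is illustrative, not from the source:

```python
def balanced_disk_count(min_disks, controllers):
    """Round a disk count up so it spreads evenly across controllers."""
    per_controller = -(-min_disks // controllers)  # ceiling division
    return per_controller * controllers

# The example above: 11 disks across 2 controllers -> 6 per controller
print(balanced_disk_count(11, 2))
```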
A well balanced data warehouse system maximizes one or more of the resources in the system. Unlike OLTP workloads, a single unit of DW work (a query) can be parallelized to maximize the utilization of the system and return results in a minimum amount of time. When additional concurrent queries are added, the database starts to distribute system resources among the active queries. The time to complete a query in a DW workload will vary from one execution to another depending on the mix of queries running at the same time. Only an expert in data warehousing workloads, with help from the database vendor and a data architect, can analyze a schema and anticipate the resource needs of potential queries. Without the ability to experiment and run test queries, even experts can have a hard time gauging the CPU needs of a query. The process of categorizing the various queries starts by careful examination of the database schema, data characteristics, and queries. The goal is to determine the following:
1. Estimate the size of the data accessed by each group of queries.
2. Categorize the queries based on the dominant resource needs.
3. Select some larger queries to represent these categories.
4. Use these queries as a guide for system sizing.
For this workload, the various queries were organized in three groups:
- Small: Less than 15 percent of data is needed to produce results.
- Medium: Between 15 and 50 percent of data.
- Large: More than 50 percent of the data.
In estimating the data set size for queries, it was assumed that the database can be optimized using indices and other schemes to minimize data access to what is necessary to calculate and produce the query results. Different databases and schema definitions may behave differently for the same queries. For example, a missing index may force the database to execute a full table scan and result in significantly more access to data.
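The grouping above can be expressed as a small classification function (a sketch; the 15 and 50 percent thresholds come from this example):

```python
def size_class(data_fraction):
    """Group a query by the fraction of the database it must read."""
    if data_fraction < 0.15:
        return "Small"
    if data_fraction <= 0.50:
        return "Medium"
    return "Large"

print(size_class(0.10), size_class(0.30), size_class(0.80))
```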
Since it was intended to use the largest queries for this characterization work, there was less concern about the ability of the database to optimize data access. The following chart shows the approximate minimum data size required to complete the six largest queries in the workload.
Figure 5 Approximate minimum data size (GB) required by the six largest queries
The details of the size categorization for all the queries are gathered in Table 3 on page 17.
Categorizing queries
TPC-H queries are categorized with respect to the most dominant resource they use. To do this, the intensity of the processing being applied to the data being read for the query is estimated. For example, the following segment of SQL code shows a query with a significant number of calculations for every row of data to be processed:
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
    sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc
from
In contrast, the following query has modest processing requirements for each row being scanned. This query will run as fast as the data can be read by a database and therefore it is I/O bound.
select
    sum(l_extendedprice*l_discount) as revenue
from
    lineitem
A few of the diverse queries were compared and contrasted with each other, and based on the team's experience with DB2 query processing, they were categorized in the following table.
Table 3 Query categorization

Query  Data set size  Memory  CPU     Sequential I/O  Random I/O  Network
21     Large          High    High    Low             Low         Low
9      Large          High    High    Low             Medium      High
17     Large          Low     Low     High            None        Low
19     Large          Low     Low     High            None        Low
1      Large          Low     Medium  Medium          None        Low
18     Large          High    High    Low             Medium      Low
7      Medium         High    High    Medium          Medium      High
5      Medium         High    High    Low             None        High
13     Medium         Medium  High    Low             Low         Low
11     Small          Medium  Low     High            None        Low
6      Small          Low     Low     High            None        Low
2      Small          Low     Medium  Medium          None        Low
22     Small          Low     High    Low             Low         Medium
16     Small          High    High    Low             Low         Medium
14     Small          Low     High    Low             Low         High
15     Small          Medium  High    Low             Low         High
Once the characteristics of the various queries are established, it can be seen that most of the queries have high CPU or sequential I/O requirements. To balance the system, sufficient I/O performance is needed to keep CPU resources maximally utilized. As can be seen in the above table, the top six largest queries can easily represent the entire query set with respect to resource needs. All the major system resource categories can be maximized by one or more of the six largest queries.
Figure 6 Relative resource usage of queries 1, 21, 19, and 17 (normalized scale from 0.2 to 1.0)
Figure 6 can be used to make system sizing decisions based on the relative information. For example, if a system is configured with just enough I/O bandwidth to maximize the CPU utilization during query 1, then all queries to the right of query 1 will be I/O bound. It can also be concluded that configuring the system with more I/O than is necessary for query 17 will not provide any benefit. Since it is not possible to configure the system optimally for every possible query, this type of categorization can be used to optimize for the largest class of queries within a given budget.
To complete a given query in a fixed period of time, the system needs enough I/O bandwidth to read the required data in that time and enough CPU power to keep up with the processing of the data. The point representing query 17 on the far right edge of the chart in Figure 6 on page 19 indicates that this query has an insatiable requirement to read data when compared to the other queries in the workload.
If a single stream of queries takes one CPU hour to complete, then seven streams of similar queries could potentially take about seven hours of CPU time. This rule of thumb provides reasonable estimates. However, sometimes simultaneously running query streams can benefit from sharing data caches, while at other times the reduced availability of memory results in resource conflicts. To be safe a margin of error should be anticipated. If a system is mostly I/O bound during the execution of single stream queries, then more than one stream of queries is required to fully utilize the system resources. To get the best return on investment, the system should be configured to fully utilize CPU resources as much as possible.
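The stream-scaling rule of thumb above can be sketched with a safety margin; the 20 percent margin is an assumption for illustration, not a figure from the source:

```python
def stream_cpu_hours(single_stream_hours, streams, margin=0.2):
    """Scale single-stream CPU time to N streams with a +/- margin.

    The margin hedges for cache sharing (faster) versus memory
    contention (slower) among concurrent streams.
    """
    base = single_stream_hours * streams
    return base * (1 - margin), base * (1 + margin)

low, high = stream_cpu_hours(1.0, 7)
print(low, high)  # roughly 5.6 to 8.4 CPU hours for 7 streams
```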
Business requirements
Obviously, the primary purpose of the data warehouse infrastructure is to solve business problems, so it is critical to collect the business requirements applicable to the sizing effort and translate them to system requirements. For example, a business unit may require that a warehouse be reloaded from transactional data once per week and that the task be completed in a six-hour window of time. This requirement must be translated to the various read, write, and process rates for the data to ensure the system has the appropriate capacity. The following table shows the list of business requirements and expectations addressed in this sizing example. This list is by no means exhaustive, but it does capture the most critical elements for this purpose.
Table 4 Business requirements

Requirement: Raw data size
Expectation: 1000 GB of raw text data
Comment: The raw data size is only the base number for calculating storage space requirements.

Requirement: Annual growth rate of data
Expectation: Less than 2%
Comment: Normal growth rate must be accommodated without major architectural changes in the system.

Requirement: Tolerance for performance decline due to growth rate
Expectation: Less than 2%
Comment: Slower response time for any operation can be tolerated if the percentage of degradation is less than or equal to that of the growth in data.

Requirement: Service life expectancy
Expectation: 3 years
Comment: The system is expected to operate for at least three years without major changes.
Requirement: Load rate
Expectation: 145 MB/sec
Comment: The 145 MB per second load rate is derived from the need to load the 1000 GB of data in less than two hours. At this rate, a DBA will have enough time to rebuild the warehouse from scratch and prepare it for query processing in less than four hours.

Requirement: Scan rate
Comment: This requirement is critical because a significant number of ad hoc queries in the workload perform scan and filter operations on all or part of the data. In addition, extract operations from the warehouse are bound by the scan rate of the system. The response time for several large and small queries with simple calculations will be impacted by the scan rate. Query 17 is our guide for this criterion.

Requirement: Report completion time
Comment: Our workload has several queries that frequently run to generate reports. These queries require several complicated computations and sort operations on all or parts of the data. The intent of this requirement is to ensure the worst-case report will complete in what is considered reasonable time based on the business requirement. We also intend to run multiple simultaneous queries and want to ensure the work can be completed in a reasonable time. We will use query 18 as our guide for sizing the system to meet this requirement.

Requirement: Concurrency
Expectation: 7 query streams
Comment: The workload requires that at least 7 query streams can operate concurrently at any one time. In addition, query response times must not degrade by more than a factor of 7 to 8 when compared to a single-stream execution.

Requirement: Performance
Comment: In this exercise, reaching the specific performance targets had the highest priority. Although the overall hardware cost had a budget ceiling, price/performance was secondary to overall performance. Our goal was to achieve the performance target with an optimal configuration and price within the budget guidelines.

Requirement: Data protection
Expectation: RAID level 5
Comment: Our workload requires RAID protection to prevent a single disk failure from disabling the system. RAID level 5 is most appropriate for us because it allows protection with reasonable overhead. A 5+P RAID level 5 configuration requires 5 disks for data and one disk for parity; this adds a 17% disk overhead for RAID-5.
The above table maps the business requirements to a set of checklist items for the system. When sizing the system, this table should be used as a boundary guide to ensure the business needs are met.
Since 122 disks do not distribute evenly among 16 EXP700 enclosures, the number of disks should be rounded up from 7.6 per EXP700 to 8. Additionally, to configure 5+P RAID level 5 arrays, the number of data disks in each EXP700 should be divisible by 5, so once again the number of disks in each EXP700 is rounded up, to 10. For every 5 data disks, a parity disk should be added, so the total number of disks per EXP700 is 12. The following diagram shows the logical configuration of a FAStT900 with the enclosures and disks attached.
Figure 7 FAStT900 disk configuration (controllers A and B with FC HBAs, attached to EXP700 Enclosures 0 and 1 of 12 disk bays each)
Twelve disks per EXP700, 2 EXP700 per FAStT900, and 8 FAStT900 servers bring the total number of disks to 192. Obviously 192 disks is significantly more than the initial requirement of 122 disks, but this configuration provides a balanced, evenly distributed load with RAID level 5 protection that meets the performance requirement. Random read and write requirements for this workload were relatively low when compared to the rest of the I/O needs. Considering the number of disks and FAStT900 servers, the random I/O requirements of the workload are easily met. Although a single FAStT900 can easily accommodate the space needs for the workload, it cannot possibly meet the performance requirements. Space should never be the only determining factor for storage configuration.
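The disk-count arithmetic above can be written out step by step (all figures from this example):

```python
import math

# Round 122 disks across 16 EXP700 enclosures up so that data disks
# per enclosure divide evenly into 5+P RAID-5 arrays.
min_disks, enclosures, raid_width = 122, 16, 5

per_enclosure = math.ceil(min_disks / enclosures)                # 7.6 -> 8
data_disks = raid_width * math.ceil(per_enclosure / raid_width)  # 8 -> 10
parity_disks = data_disks // raid_width                          # 5+P -> 2
total = (data_disks + parity_disks) * enclosures                 # 12 * 16
print(per_enclosure, data_disks + parity_disks, total)
```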
Based on the storage sizing, the system has to accommodate 8 storage servers, each with 4 Fibre Channel interfaces. A total of 32 Fibre Channel host bus adapters are required to match the performance of the storage subsystem. Each p655 can easily provide 1400 MB/sec of I/O (see "I/O subsystem" on page 8). From this it can be determined that at least four p655 nodes are required to satisfy the 5000 MB/sec I/O needs of the workload. Depending on the CPU and memory needs of the workload, the storage can be connected to 8 or 16 nodes to satisfy the I/O bandwidth requirements. Considering the flexibility of the FAStT900, any size of system can be connected, as long as access to the 32 HBAs is evenly distributed among all processors, providing 175 MB/sec of bandwidth for each HBA. This can be a single node with any number of CPUs or a multi-node cluster of systems with an aggregate I/O bandwidth of 5000 MB/sec. The following table sums up the storage requirements.
Table 5 Storage configuration

FAStT900 Storage Servers                   8
EXP700 Enclosures                         16
36 GB disk drives                        192
2 Gb Fibre Channel host bus adapters      32
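The bandwidth arithmetic above can be sketched as follows (figures from this example):

```python
# I/O-bandwidth sizing: HBA count, minimum node count, and aggregate
# bandwidth at 175 MB/s per HBA.
storage_servers, hbas_per_server = 8, 4
total_hbas = storage_servers * hbas_per_server   # 32 HBAs

workload_io, per_node_io = 5000, 1400            # MB/s required / per p655
min_nodes = -(-workload_io // per_node_io)       # ceiling division -> 4

aggregate_mb_s = total_hbas * 175                # 175 MB/s per HBA
print(total_hbas, min_nodes, aggregate_mb_s)
```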
A small-scale test can be used to measure the time it takes to process a fixed amount of data with the worst-case query. If the test system completes a 1 GB query in 50 seconds, then it is known that a processor that is ten times faster can complete the work in 5 seconds. To process 1000 GB in 500 seconds, 10 of the new processors are needed. This type of estimate can only be applied to simple compute-intensive queries that are known to scale and that have stable query plans regardless of their size. Complex multi-table join queries are not good candidates for simple CPU testing because most databases try to apply query optimizations that may behave differently based on data size.

The memory usage of the test system needs to be limited to a relative size. It would be unreasonable to allow the 1 GB test query to consume 1 GB of memory unless it was intended to configure the target system with 1000 GB of memory. To simplify the estimates it is preferable to establish a fixed relation between memory size and data size during testing. For this testing, it was established that at least 100 MB of database memory was needed for each 1 GB of data, and a similar ratio was maintained throughout the testing. This ratio can be drastically different from one workload to another. Workloads that concurrently run many large queries with multi-table join and sort operations require more memory than workloads that mostly run scan/filter queries and aggregate data.

Alternatively, some sizing experts use the ratio of memory per CPU as a guide. A common rule of thumb for data warehouse workloads used to be 1 GB of memory per CPU, but as memory prices have gone down and processor speeds have gone up, this ratio has increased to 4 GB. Too little memory will result in increased use of database temporary storage, which will require more I/O operations and possibly more idle processing power. Too much memory will fail to provide performance improvements once processing power is saturated.
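The scaled-test reasoning above, using its 1 GB / 50 second example, can be written out directly:

```python
# Time a 1 GB query on a test processor, scale for a 10x faster
# processor, then size for the full data volume in a target time.
test_seconds_per_gb = 50          # 1 GB query takes 50 s on the test CPU
speedup = 10                      # new processor is ten times faster
new_seconds_per_gb = test_seconds_per_gb / speedup   # 5 s per GB per CPU

target_gb, target_seconds = 1000, 500
cpus_needed = target_gb * new_seconds_per_gb / target_seconds  # 10 CPUs
print(cpus_needed)
```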
Adding memory to a production system is much simpler than adding I/O or processing capacity, so this area of the configuration can be further improved as long as some flexibility is built into the plan. For this project, query 18 was selected as a guide for sizing the system processing needs. Query 1 was known to be a much simpler query to run tests with, and it scales for all data sizes. Query 1 was used for small-scale testing and measurement, and the relative resource usage graph (Figure 6 on page 19) was used to estimate the processing needs for query 18. Based on the business requirements, it was known that the worst-case query running on 1000 GB of data needed to complete in 900 seconds. Assuming query 18 was the worst case and knowing that it used about 80 percent of the data, it was estimated that it would take 720 seconds to complete a query similar
to 18. Using the relative resource requirement chart (Figure 6 on page 19), it was estimated that query 1 used about 2.5 times less processing resources to process the same size data as query 18 in a fixed period of time, and from this it was determined that enough CPU processing power was needed to complete query 1 in approximately 300 seconds. Using experience with older systems and a subset of the data, it was calculated that the 1.7 GHz processor available for the p655 system could process query 1 at the approximate rate of 180 MB per second. For query 1 to process 800 GB of data in 300 seconds a processing rate of 2,730 MB per second was needed. At a rate of 180 MB per processor, at least 16 processors were needed to meet the performance target. It was estimated that 16 1.7 GHz p655 processors could complete seven concurrently executing queries similar to query 18 in less than 90 minutes. These timing estimates were well within the business requirements. Based on these estimates it was assumed a configuration with 16 processors and approximately 128 GB of memory would satisfy the requirements.
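The sizing chain above can be written out step by step; all figures come from this example, including the 2,730 MB/s processing-rate figure used in the text:

```python
import math

# Work backward from the business limit to a processor count.
worst_case_s = 900                 # business limit for the worst query
q18_target_s = worst_case_s * 0.8  # query 18 reads ~80% of the data
q1_target_s = q18_target_s / 2.5   # query 1 needs ~2.5x less CPU

rate_needed = 2730                 # MB/s for query 1 over 800 GB (from text)
per_cpu_rate = 180                 # MB/s per 1.7 GHz POWER4+ processor
cpus = math.ceil(rate_needed / per_cpu_rate)
print(round(q18_target_s), round(q1_target_s), cpus)
```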
Sizing the system to balance I/O against each processor and maximize utilization helps achieve the return on the initial investment needed to be successful. The following is a general diagram of the system configuration.
Figure 8 Overall system configuration (four IBM eServer p655 nodes connected by a Gigabit switch, each attached to two FAStT900 storage servers with two EXP700 enclosures apiece)
The four-node cluster was connected with a gigabit Ethernet switch for all inter-node communications. The overall configuration provided about 5,500 MB per second of read bandwidth from disk to system memory. Each node was directly connected to one-fourth of the total storage configuration. This configuration was well balanced and flexible and could be easily extended for larger DB2 installations. The project was completed by building the configuration based on the sizing estimate and executing the TPC-H benchmark. The sizing effort was a success since the minimum requirements for the workload were met. Some of the I/O bound queries performed better than expected due to conservative performance estimates for various components, but the results were within 5 to 10 percent of expectations. The CPU bound queries performed within 10 percent of expectation, with roughly equal numbers on the plus and minus side. The benchmark was submitted to the TPC and published in December, 2003. For further details, see the following TPC Web site:
http://www.tpc.org/tpch/results/tpch_result_detail.asp?id=103120801
Conclusion
System sizing for a data warehouse is a complicated task that requires some expertise to accurately estimate the configuration that can meet the needs of a business. The quality of the sizing estimate depends on the accuracy of the data put into the process. With some analysis and workload characterization, it is possible to drastically improve a sizing estimate. In the absence of the workload characterization data, the expectation of an accurate system sizing estimate should be set appropriately. To safeguard a project, build flexibility into plans by configuring systems that are well balanced in resources and are extensible. For clustered configurations, ensure that the basic building block system is well balanced and meets both I/O and CPU requirements for its subset of the workload. The reliability, availability, and serviceability of a cluster are only as good as that of the building blocks.
TPC Council
The Transaction Processing Performance Council (TPC) was founded in August 1988 by eight leading hardware and software companies as a non-profit organization with the objective of producing common benchmarks to measure database system performance. More than 20 companies are currently members of the council. There are four active benchmarks (TPC-C, TPC-H, TPC-R and TPC-W) for which results can be published. IBM was the very first company to publish a TPC-H result at the 10,000 GB scale, on December 5, 2000. Prior to publication, results must be reviewed and approved by designated auditors, and a full disclosure report documenting compliance is submitted to the TPC. Published benchmarks as well as the benchmark specifications are accessible on the TPC Web site:
http://www.tpc.org
TPC-H overview
The TPC-H benchmark models a decision support system by executing ad-hoc queries and concurrent updates against a standard database under controlled conditions. The purpose of the benchmark is to provide relevant, objective performance data to industry users. All implementations of the benchmark, in addition to adhering to the specifications, must be relevant to real-world (that is, customer) implementations.

TPC-H represents information analysis for an industry that must manage, sell, or distribute a product worldwide. The 22 queries answer questions in areas such as pricing and promotions, supply and demand management, profit and revenue management, customer satisfaction, market share, and shipping management. The refresh functions are not meant to represent concurrent online transaction processing (OLTP); they are meant to reflect the need to periodically update the database.

The TPC-H database size is determined by the scale factor (SF). A scale factor of 1 represents a database with 10,000 suppliers and corresponds approximately to 1 GB of raw data. Only a subset of scale factors is permitted for publication: 1, 10, 30, 100, 300, 1000, 3000 and 10000. The database is populated with a TPC-supplied data generation program, dbgen, which creates the synthetic data set. The set of rows to be inserted or deleted by each execution of the update
functions is also generated by using dbgen. The database consists of eight tables.
Table 6 The eight TPC-H tables and their cardinalities

Table name   Cardinality
REGION       5
NATION       25
SUPPLIER     SF * 10 K
CUSTOMER     SF * 150 K
PART         SF * 200 K
PARTSUPP     SF * 800 K
ORDER        SF * 1500 K
LINEITEM     SF * 6000 K (approximate)
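For convenience, Table 6 can be expressed as a function of the scale factor (a sketch; dbgen itself generates the actual data):

```python
# TPC-H table cardinalities from Table 6 as a function of scale factor.
def cardinalities(sf):
    return {
        "REGION": 5, "NATION": 25,
        "SUPPLIER": sf * 10_000,  "CUSTOMER": sf * 150_000,
        "PART": sf * 200_000,     "PARTSUPP": sf * 800_000,
        "ORDER": sf * 1_500_000,  "LINEITEM": sf * 6_000_000,  # approximate
    }

rows = cardinalities(10_000)       # the SF=10,000 (10 TB) database
print(rows["LINEITEM"])            # 60,000M rows
```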
The chart below gives a more detailed view of the relationships between the tables and the number of rows per table when the scale factor is 10,000, that is, for a 10,000 GB (10 TB) database.
PART (P_), 2,000M rows: PARTKEY, NAME, MFGR, BRAND, TYPE, SIZE, CONTAINER, RETAILPRICE, COMMENT
PARTSUPP (PS_), 8,000M rows: PARTKEY, SUPPKEY, AVAILQTY, SUPPLYCOST, COMMENT
CUSTOMER (C_), 1,500M rows: CUSTKEY, NAME, ADDRESS, NATIONKEY, PHONE, ACCTBAL, MKTSEGMENT, COMMENT
NATION (N_), 25 rows: NATIONKEY, NAME, REGIONKEY, COMMENT
LINEITEM (L_), 60,000M rows: ORDERKEY, PARTKEY, SUPPKEY, LINENUMBER, QUANTITY, EXTENDEDPRICE, DISCOUNT, TAX, RETURNFLAG, LINESTATUS, SHIPDATE, COMMITDATE, RECEIPTDATE, SHIPINSTRUCT, SHIPMODE, COMMENT
REGION (R_), 5 rows: REGIONKEY, NAME, COMMENT
SUPPLIER (S_), 100M rows: SUPPKEY, NAME, ADDRESS, NATIONKEY, PHONE, ACCTBAL, COMMENT
TPC-H enforces the ad-hoc model by severely restricting the implementation of auxiliary data structures such as indices and materialized query tables (sometimes known as automatic summary tables or materialized views). It also restricts how horizontal partitioning (by row) may be implemented. The partitioning column is constrained to primary keys, foreign keys, and date columns. If range partitioning is used, the ranges must be divided equally between the minimum and maximum values. By imposing these restrictions, the TPC-H benchmark keeps the server platform as part of the performance equation and represents an ad-hoc environment. The TPC-R benchmark, which does not restrict auxiliary structures, models a reporting environment. The table below summarizes the differences between the TPC-H and TPC-R benchmarks.
Table 7 Differences between TPC-H and TPC-R benchmarks

Auxiliary data structures
  TPC-H (ad-hoc): Restrictions on indices; no aggregates
  TPC-R (reporting): Extensive indices and aggregates OK

Simulated environment
  TPC-H (ad-hoc): Ad-hoc queries
  TPC-R (reporting): Pre-planned, frequently asked queries

Side effects
  TPC-H (ad-hoc): Heavy stress on the system; average response times of several minutes; update function times similar to query times
  TPC-R (reporting): Lots of tuning by the DBA; sub-second response times for several queries; much longer load time; update function times much longer than query times
The TPC-H benchmark exercises the following areas.

Twenty-two queries with the following characteristics:
- Left outer join
- Very complex queries with nested subqueries
- Aggregates with a "HAVING" clause
- Queries with multiple "OR" predicates
- A query combining "EXISTS" and "NOT EXISTS"
- A query with multiple "SUBSTRING" operators
- Large scans with multi-table joins
- Aggregate operations with a large number of distinct values
- Large numbers of aggregations and sorts
- Queries relying on index access as well as table access
- Long running queries as well as short running queries, exercising all aspects of query processing

Database refresh functions to perform inserts and deletes on the database.

The benchmark specifications require that the implementation chosen for the benchmark satisfy the Atomicity, Consistency, Isolation, and Durability (ACID) properties. Specific tests are designed to show:
- That the system either performs individual operations on the data completely, or ensures that no partially completed operations leave any effects on the data (A).
- That the execution of transactions takes the database from one consistent state to another (C).
- That concurrent database transactions are handled correctly (I).
- That committed transactions and database consistency are preserved after recovery from hardware failures such as loss of power, communications, memory, or data and log disks (D).

The concurrent updates insert into and delete from the two large tables, LINEITEM and ORDER. Each refresh function represents 0.1 percent of the initial population of these two tables, so each pair of refresh functions alters 0.2 percent of them. A single update pair must be run for the power test, and a set of update pairs, one per query stream, is run in the multi-user throughput test.

The exact implementation of the refresh functions is left to the vendor, but certain rules must be followed. The TPC-H specification states that each refresh function (RF1 or RF2) can be decomposed into any number of database transactions as long as the following conditions are met:
- All ACID properties are satisfied.
- Each atomic transaction includes a sufficient number of updates to maintain logical database consistency. For example, when adding or deleting an order, the LINEITEM and ORDER tables are both updated within the same transaction.
- An output message is sent when the last transaction of the update function has completed successfully.
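A refresh-function transaction of the kind described above can be sketched as follows. This is a minimal illustration using SQLite with invented, heavily simplified table layouts; it is not the TPC toolkit or any published implementation. The point is that the order row and its line items change together, atomically, or not at all:

```python
import sqlite3

# Minimal stand-ins for the ORDER and LINEITEM tables (columns invented
# for illustration; the real TPC-H schema has many more).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders   (o_orderkey INTEGER PRIMARY KEY, o_custkey INTEGER);
    CREATE TABLE lineitem (l_orderkey INTEGER, l_linenumber INTEGER,
                           l_quantity INTEGER);
""")

def rf1_insert_order(conn, orderkey, custkey, lines):
    """Insert one new order and its line items in a single atomic
    transaction, as the consistency condition requires."""
    with conn:  # commits on success, rolls back on any exception
        conn.execute("INSERT INTO orders VALUES (?, ?)", (orderkey, custkey))
        conn.executemany(
            "INSERT INTO lineitem VALUES (?, ?, ?)",
            [(orderkey, n, qty) for n, qty in lines],
        )

def rf2_delete_order(conn, orderkey):
    """Delete an order and its line items atomically (the RF2 counterpart)."""
    with conn:
        conn.execute("DELETE FROM lineitem WHERE l_orderkey = ?", (orderkey,))
        conn.execute("DELETE FROM orders WHERE o_orderkey = ?", (orderkey,))

rf1_insert_order(conn, 1, 42, [(1, 17), (2, 3)])
rf2_delete_order(conn, 1)
```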
TPC-H Metrics
The benchmark specification provides details on how to report results, which consist of two performance metrics and one price/performance metric:
The composite metric (QphH@Size) is the primary performance metric. It is the geometric mean of two pieces, the power metric and the throughput metric:

  QphH@Size = sqrt( QppH@Size * QthH@Size )

The power metric (QppH@Size) is derived from a power run (single stream) in which all queries and update functions are run in a specified sequence:

  QppH@Size = (3600 * SF) / (Q1 * Q2 * ... * Q22 * RF1 * RF2)^(1/24)

where Q1, Q2, ..., Q22, RF1, and RF2 are the timing intervals, in seconds, of the queries and updates. The geometric mean of the queries and updates is used here to give equal weighting to all the queries, even though some may run much longer than others.

The throughput metric (QthH@Size) is derived from the multi-user throughput test, in which S query streams and an update stream run concurrently:

  QthH@Size = (S * 22 * 3600 * SF) / Ts

where Ts is the total elapsed time of the throughput test, in seconds.
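The power computation can be sketched numerically. The following helpers (hypothetical illustrations, not TPC reporting code) compute QppH@Size from the 22 query timings and the two refresh-function timings, and combine power and throughput into the composite metric:

```python
import math

def qpph(query_seconds, rf_seconds, scale_factor):
    """Power metric: 3600 * SF divided by the geometric mean of the
    24 timing intervals (22 queries plus RF1 and RF2), all in seconds."""
    intervals = list(query_seconds) + list(rf_seconds)
    assert len(intervals) == 24, "expected 22 query and 2 refresh timings"
    # Geometric mean via logs avoids overflow when multiplying 24 values.
    geo_mean = math.exp(sum(math.log(t) for t in intervals) / 24)
    return 3600 * scale_factor / geo_mean

def qphh(power, throughput):
    """Composite metric: geometric mean of the power and throughput metrics."""
    return math.sqrt(power * throughput)

# If every interval took 100 seconds at SF = 100, the geometric mean is 100,
# giving a power of 3600 * 100 / 100 = 3600 QppH@100GB.
print(round(qpph([100.0] * 22, [100.0, 100.0], 100), 6))  # 3600.0
```

Because the geometric mean is used, halving the runtime of any single query improves the metric by the same factor regardless of whether that query was long or short running.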
Benchmark evolution
The following table shows the evolution of the TPC Decision Support benchmarks.
Table 8 Evolution of TPC Decision Support benchmarks

Benchmark   Version     Date released    Date obsolete
TPC-H       2.*         November 2002    Current
TPC-H       1.*         February 1999    November 2002
TPC-D       1.* & 2.*   May 1995         April 1999
Although the TPC-H benchmark evolved from TPC-D, the two are vastly different and cannot be compared in any way. TPC-H Version 2 became effective in November 2002; although the basic performance aspects of V1 and V2 are identical, the pricing methodology changed. The price/performance metric of TPC-H V2 is based on a 3-year cost, while V1 was based on 5 years.
Performance evolution
Since the inception of the TPC Decision Support benchmarks, there have been some general trends in the industry-standard benchmark results, which reflect the growth pattern of the business intelligence market. Although the incompatibilities between the four different versions of the TPC Decision Support benchmarks make it impossible to chart a continuous trend line from 1995 to 2003, the following points are undisputed:
- Price/performance has improved steadily.
- Scale factor sizes have increased, that is, database sizes have increased.
- Processing power has increased.
- Memory and disk requirements have increased.

The TPC-H benchmark measures a server's I/O, CPU, and memory capabilities via various database operations such as full table scans, sorting, joins, and aggregation. It also measures how well a DBMS performs these basic database operations, and rewards those with efficient code paths, advanced query optimizers, and parallel technology.
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. 
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurement may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing, or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. This document was created or updated on July 23, 2004.

Send us your comments in one of the following ways:
- Use the online Contact us review redbook form found at: ibm.com/redbooks
- Send your comments in an email to: redbook@us.ibm.com
- Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. JN9B, Building 905, Internal Mail Drop 9053D005, 11501 Burnet Road, Austin, Texas 78758-3493 U.S.A.
Trademarks
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: DB2 Eserver IBM ibm.com POWER4 pSeries Redbooks Redbooks (logo)
Other company, product, and service names may be trademarks or service marks of others. TPC Benchmark, TPC-D, TPC-H, TPC-R, QppD, QppH, QppR, QthD, QthH, QthR and QphD, QphH, QphR are trademarks of the Transaction Processing Performance Council.
The following terms are trademarks or registered trademarks of IBM in the United States and/or other countries: IBM, AIX, DB2, DB2 Universal Database, Power PC Architecture, POWER, IBM Eserver, pSeries.
Performance results described in this paper were obtained under controlled conditions and may not be achievable under different conditions. All information is provided AS IS and no warranties or guarantees are expressed or implied by IBM. Actual system performance may vary and is dependent upon many factors including system hardware configuration and software design and configuration.