Paper 479-2013

Benchmarking SAS® I/O: Verifying I/O Performance Using Fio
Spencer Hayes, DLL Consulting
INTRODUCTION
The author's goal in this paper is to explain how to perform accurate, repeatable I/O benchmarking using the fio tool. The paper discusses I/O performance and how it relates to SAS, reviews high-level SAS architecture, and explains why I/O is a common bottleneck. It then explores the various components of the I/O subsystem and how each may impact performance, introduces the fio tool and how it operates, and walks through a typical SAS workload and the construction of a fio job file that simulates an example SAS program. It describes the details of the fio job file, how to customize a job file for a target I/O workload, and how to interpret the fields in fio's output in order to evaluate a benchmarked system for adequate I/O performance. Finally, the paper compares the fio results to an actual SAS program to validate that the simulated workload is accurate.
Each component in the chain from disk to computer system may impact performance.
Figure 1: Layout of a typical Storage Area Network (SAN)

The performance of traditional spinning disks is largely dependent on the rotational speed of the platters and the time it takes to move the head to the correct position. 7200 RPM disks are common in desktop PCs, while 10k or 15k RPM disks are typically configured in large enterprise storage arrays. Spinning disks perform very well for single, sequential read and write operations. However, when multiple users access the same disk, performance rapidly degrades due to the seek time associated with moving the head from place to place on the disk. Combining or pooling many disks into an array is a common way to increase storage I/O performance by aggregating the bandwidth of several disks at once.

A bus connection attaches disks to computer systems. Examples of a bus connection are the internal wiring for locally-attached disks within a server, or fiber optic cables running from a storage array to a SAN switch or from the SAN switch to a Host Bus Adapter (HBA) card in a server. The bandwidth of the bus depends on the technology used; Fibre Channel, Serial Attached SCSI and Serial ATA are three common bus technologies.

OS-level components may also affect I/O performance. Multipathing software allows a host to send data to a disk via multiple bus connections. Volume management software such as Veritas Volume Manager or Linux LVM may be used to create logical volumes that combine multiple disks into a higher-level device on which a filesystem is built. A common volume type is a stripe, which reads and writes chunks of a file to each disk in order, effectively aggregating the performance of multiple disks.

Finally, the filesystem itself contributes to system performance. Various filesystems have different characteristics, some of which may or may not make them well suited for a SAS system. On Red Hat Enterprise Linux, the ext3, ext4 and xfs filesystems are available, but SAS best practices dictate that only ext4 and xfs should be used due to performance impacts from the ext3 journaling feature.
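As an illustration of the volume management layer, the following is a minimal sketch of building a striped logical volume with Linux LVM and an xfs filesystem. The device names, volume names, stripe settings and mount point are hypothetical and should be adapted to the actual hardware:

    pvcreate /dev/sdb /dev/sdc                            # hypothetical disks presented to the host
    vgcreate sasvg /dev/sdb /dev/sdc                      # pool both disks into one volume group
    lvcreate -n sasdata1 -i 2 -I 64 -l 100%FREE sasvg     # stripe across 2 disks with a 64KB stripe unit
    mkfs.xfs /dev/sasvg/sasdata1                          # xfs, per the SAS best practices above
    mkdir -p /sasdata1
    mount /dev/sasvg/sasdata1 /sasdata1

The stripe count and stripe unit are tuning knobs: more disks in the stripe aggregate more bandwidth, and the stripe unit should generally align with the dominant I/O block size.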
The following fio job file will simulate the SAS program above:

    [interleave]
    directory=/sasdata1
    direct=0
    invalidate=1
    blocksize=128k
    rw=readwrite
    size=3335m

The [interleave] job section defines the directory to use as well as the direct=0 and invalidate=1 options. The direct=0 option tells fio to use the OS file system cache for reads and writes. In a real-world scenario, there's no guarantee that the data sets will reside in the OS cache; to account for this, the invalidate=1 option invalidates the cache for the fio files prior to starting I/O. SAS will take advantage of the OS file system write cache, and that usage accurately reflects the actual storage subsystem performance in the majority of scenarios. To simulate a true worst-case scenario where the OS file system cache is full, use direct=1 to bypass it. The blocksize=128k option specifies 128KB blocks for I/O. The rw=readwrite option sets the I/O pattern to mixed sequential reads and writes; the default mix for rw=readwrite is 50% reads, 50% writes. The size=3335m option sets the total amount of I/O, so fio concurrently reads 1.7GB and writes 1.7GB of data. That characteristic is important given the SAS design of processing one record at a time: SAS will read one record from each BY group in each data set, then output the appropriate record to the new data set, read another row, write another row, and repeat until the program completes.
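Running the benchmark requires only saving the job section to a file and passing it to fio; the file name below is an arbitrary choice. fio creates its own test file in /sasdata1 (typically named after the job), so no SAS data sets need to exist there:

    # Save the job section above as interleave.fio, then run it.
    fio interleave.fio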
Once the fio program completes, a significant amount of detailed, verbose output is displayed. The output has been trimmed for readability:

    interleave: (groupid=0, jobs=1): err= 0: pid=13774: Fri Sep 21 16:16:13 2012
      read : io=1676.0MB, bw=36839KB/s, iops=287, runt= 46587msec
      ...
      write: io=1659.0MB, bw=36465KB/s, iops=284, runt= 46587msec
      ...
    Run status group 0 (all jobs):
       READ: io=1676.0MB, aggrb=36839KB/s, minb=36839KB/s, maxb=36839KB/s, mint=46587msec, maxt=46587msec
      WRITE: io=1659.0MB, aggrb=36465KB/s, minb=36465KB/s, maxb=36465KB/s, mint=46587msec, maxt=46587msec
    Disk stats (read/write):
      dm-6: ios=25653/373464, merge=0/0, ticks=68580/181800744, in_queue=181869348, util=89.15%, aggrios=13118/6696, aggrmerge=12993/366972, aggrticks=64771/3232191, aggrin_queue=3296957, aggrutil=90.39%
      sda: ios=13118/6696, merge=12993/366972, ticks=64771/3232191, in_queue=3296957, util=90.39%

The first line denotes the beginning of the section for the specific job and the statistics pertaining to it. If multiple jobs are used, there will be multiple sections containing data about each individual job. The designation of the start of the job-specific section and the timestamp are the two important pieces of data on this line.

The read and write lines that follow describe the read and write performance for the job. The io=* field indicates the amount of data transferred. The bw=*KB/s field describes the bandwidth of the operation in KB per second. Understand that these statistics are only for this one job and may vary greatly if multiple jobs are running concurrently. The iops=* field describes the number of I/O operations per second (IOPS); this statistic is generally more useful when observing high transaction-rate systems such as Online Transaction Processing (OLTP) databases, and SAS performance tuning does not typically target IOPS as a main factor. Finally, the run time of the job is listed in the runt=*msec field.

The run status lines at the bottom of the output are the most important for the majority of test cases. These summarize and aggregate the data for all the jobs executed. In this example, the data matches the individual read and write lines contained in the interleave job section above. However, when executing multiple jobs, these lines provide the overview of the total throughput and utilization. In the summary section, the io=* fields designate the total amount of data read and written for all jobs. The aggrb=* fields are the average bandwidth of all jobs and are the key to understanding the storage subsystem performance. In this example, these numbers add up to 73304KB/s, or approximately 72MB/s. As noted above, SAS requires I/O bandwidth in the range of 25-135MB/s. The Disk stats section indicates that the disk was 90.39% utilized during our job; any additional concurrent jobs would very likely saturate the disk and degrade overall performance on this small system. This desktop Linux workstation appears to be sufficient as a single-user SAS system with data set sizes of a few GB.
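When comparing repeated benchmark runs, it can be handy to extract the aggregate bandwidth programmatically. A minimal sketch, assuming the classic fio output format shown above has been captured to a file (the file name is a placeholder):

    fio interleave.fio > fio.out
    # Sum the READ and WRITE aggrb values (KB/s) and report the total in MB/s.
    awk -F'aggrb=' '/aggrb=/ { split($2, a, "KB"); total += a[1] }
                    END { printf "aggregate bandwidth: %.1f MB/s\n", total / 1024 }' fio.out

Against the output above, this sums 36839 and 36465 KB/s and reports roughly 71.6 MB/s, matching the approximately 72MB/s figure quoted earlier.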
The execution of the following SAS data step creates the merged data set work.censusmerge:

    data work.censusmerge;
        set work.alla work.allb;
        by serialno;
    run;

The program produced the following output in the SAS log:

    2917  data work.censusmerge;
    2918      set work.alla work.allb;
    2919      by serialno;
    2920  run;

    NOTE: There were 3877316 observations read from the data set WORK.ALLA.
    NOTE: There were 3877316 observations read from the data set WORK.ALLB.
    NOTE: The data set WORK.CENSUSMERGE has 7754632 observations and 28 variables.
    NOTE: DATA statement used (Total process time):
          real time                      46.93 seconds
          user cpu time                  6.85 seconds
          system cpu time                9.89 seconds
          memory                         487.13k
          OS Memory                      10720.00k
          Timestamp                      09/21/2012 05:44:25 PM
          Page Faults                    0
          Page Reclaims                  0
          Page Swaps                     0
          Voluntary Context Switches     5730
          Involuntary Context Switches   8520
          Block Input Operations         0
          Block Output Operations        0
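The extended statistics in the log (memory, page faults, context switches) indicate that the FULLSTIMER system option was in effect. A sketch of collecting the same metrics from a batch run on Unix; the program and log paths are hypothetical:

    # -fullstimer enables the extended performance statistics in the log.
    sas -fullstimer /sasdata1/programs/censusmerge.sas -log /tmp/censusmerge.log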
As well as the merged data set:

    sas /sasdata1/work/SAS_work1E5700005C87_gordon > ls -alhtr
    total 4.0G
    ...
    -rw-rw-r-- 1 sas sas 1.7G Sep 21 17:44 censusmerge.sas7bdat

The SAS program interleaving two data sets required 46.93 seconds to complete. The fio job above simulating this SAS program took 46587 milliseconds, or 46.59 seconds. Subsequent runs of both the SAS program and the fio job will produce slight variations in the program durations, but overall the fio job appears to be a very close fit for the example SAS scenario it was designed to simulate.

As an additional test of scalability, the scenario was re-run with larger data sets. The second test used source data sets of 4.1GB to produce a merged data set of 8.2GB. fio was configured with a job file identical to the first, with the exception of the size=16674m option. The same SAS code, along with a larger sample of the census data, was used to produce the SAS data sets. The SAS program finished in 3:59.35, or roughly 239 seconds. The fio job completed in 235485msec, or roughly 235 seconds. As before, the fio job appears to accurately simulate the real-world SAS program.
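Note that in both tests the size option equals the total data read plus the total data written (roughly twice the size of the merged output: 3335m for the 1.7GB merge, 16674m for the 8.2GB merge). To size a job for a new scenario, one approach is to measure the input data sets and double the result; the paths below are placeholders:

    # Measure the input data sets in MB (paths are placeholders) and double the
    # total: the inputs are read once and a merged output of equal size is written.
    INPUT_MB=$(du -cm /sasdata1/work/*/all?.sas7bdat | awk 'END { print $1 }')
    echo "size=$((INPUT_MB * 2))m"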
CONCLUSION
SAS performance tuning is a tricky and difficult problem to solve. Without the right tools and methodologies, it can be a black hole of lost time and productivity. However, administrators can quickly identify bottlenecks in storage I/O by using fio to model and evaluate this major constraint on SAS performance. In addition, fio provides a repeatable process that allows businesses not only to benchmark baseline configurations, but also to compare performance against new IT purchases and quantify the return on investment of capital expenditures.
REFERENCES
Augustine, Bob. 2012. "Storage 101: Understanding Storage for SAS Applications." Cary, NC: SAS Institute Inc. Available at http://support.sas.com/resources/papers/proceedings12/416-2012.pdf

Axboe, Jens. 2012. "fio HOWTO." Available at http://git.kernel.dk/?p=fio.git;a=blob;f=HOWTO

Moll, Michael. 2006. "Schema of a Storage Area Network (SAN)" (source of Figure 1). Available at http://commons.wikimedia.org/wiki/File:Schema_SAN_german.png
ACKNOWLEDGMENTS
I would like to thank Bob Augustine for writing the paper cited above, which gave me the inspiration for this paper. Also, thanks to Jens Axboe and the fio developers for creating an outstanding and useful tool. Credit also goes to Don Hayes, Rebecca Hayes, Jennifer Hayes, Roger Hayes and Deborah Hayes, also known as my dad, sister, wife, uncle and mom, several of whom either are or plan to be SAS professionals as well. Their encouragement and support were invaluable. Finally, thanks to Jeff Holoman and Jeremy Reynolds for their valuable input and advice.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:

Name: Spencer Hayes
Enterprise: DLL Consulting
City, State ZIP: Johns Creek, GA 30005
Work Phone: 404-668-0830
E-mail: spencer.hayes@dllbi.com
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.