SQL Server 2012:
Installation and Configuration
Module 3: Pre-installation Tasks
for SQL Server 2012
Glenn Berry
Glenn@SQLskills.com
Introduction
Using a standardized naming scheme for disks and directories
Considering your workload type for storage
Choosing the storage type
Introduction to RAID
RAID levels and SQL Server workloads
Provisioning your logical drives
Testing logical drive performance
Using a Standard Naming Scheme for Disks
This makes it easier to work with multiple database servers
It is also very useful for database mirroring and AlwaysOn AGs
This also makes it easier for restoring during disaster recovery
Example naming scheme:
Drive letter and directory Purpose
L:\SQLLogs SQL Server log files
K:\SQLLogs SQL Server log files
M:\SQLBackups SQL Server backup files
P:\SQLData SQL Server data files
Q:\SQLData SQL Server data files
R:\SQLData SQL Server data files
S:\SQLData SQL Server data files
T:\TempDB SQL Server tempdb files
Considering Your Workload for Storage
SQL Server can have several different workload types
Three most common types:
Online Transaction Processing (OLTP)
Relational Data Warehouse (DW)
Online Analytical Processing (OLAP)
These workload types have different I/O access patterns
OLTP workload has frequent writes to data files and log file
Also has random reads from data files if database does not fit in memory
Random I/O performance is very important
DW workload has large sequential reads from data files
Sequential I/O performance is very important
OLAP workload has lots of random reads from cube files
Random I/O performance is very important
Additional Workload Considerations
You may have a mixed I/O workload for several reasons
If you have multiple databases on the same instance
This complicates and randomizes the I/O workload
If you have multiple databases with log files on the same LUN
This will make the I/O workload on that LUN random instead of sequential
If you will be using HA/DR features that read from the transaction log
This will cause reads from the LUN where the log files are located
Index creation and maintenance will cause sequential I/O pressure
Reads and writes to data files, and writes to log file
Database backups will cause sequential I/O pressure
Reads from data files and log files, and writes to the backup file(s)
Database restores will cause sequential I/O pressure
Reads from backup file(s) and writes to data files and log file
Choosing the Storage Type for SQL Server
Depends on server usage, performance requirements, budget
Existing infrastructure, employee skillset, and politics also matter
Four main storage types
Internal drives - traditional magnetic drives or solid state drives (SSDs)
PCI-E storage cards
Direct-attached storage (DAS) - traditional magnetic drives or SSDs
Storage area networks (SAN) - traditional magnetic drives or SSDs
Internal, DAS and SAN can use hybrid or tiered-storage
Mixture of magnetic storage and SSD storage
Good compromise between space, performance and cost
Storage details can make a huge difference for I/O performance
10K drives versus 15K drives, 3Gbps SAS versus 6Gbps SAS
Bandwidth of RAID controller, HBA or iSCSI NIC is very important
Tips for Requesting Storage
Don’t just ask for storage based on space requirements
You will be more likely to get poor performance storage
Specify your necessary performance requirements
Sequential performance in MB/second or GB/second
Random performance in input/output operations per second (IOPS)
Specify your necessary redundancy requirements
What will the storage be used for?
Data files, log files, backup files, etc.
Consider using “short-stroking” to improve I/O performance
Intentionally using a smaller percentage of your available space
You ask for much more space than you think you will need
RAID Basics
Redundant array of inexpensive disks (RAID)
Standardized method of managing multiple drives with a controller
Provides redundancy and higher performance than a single drive
Allows higher capacity logical drives than is possible with one drive
Hardware RAID controllers manage multiple drives
Server RAID controllers have dedicated cache memory
Cache can be used for reads or writes or both
Several different RAID levels are commonly used
RAID 1
RAID 5
RAID 50
RAID 10
RAID 1
RAID 1 is called mirroring
Requires two physical drives
Data is copied to both drives
Requires 50% storage space overhead
Drive array can survive the loss of one drive
You need to replace the failed drive and allow the RAID controller to
automatically rebuild the mirror as soon as possible
No performance impact after the loss of one drive
Very common to install the OS to a RAID 1 volume on a server
Usually done with two internal drives in the server
This allows the server to operate normally after losing one drive
RAID 5
RAID 5 is called striping with parity
Requires at least three physical drives
Data is striped between all drives
After data is written to all drives, parity information is calculated and then
striped to all of the drives
This causes a write performance penalty
This allows the array to survive the loss of one drive in the array
Performance is severely affected after the loss of one drive
Failed drive must be replaced as soon as possible
Requires 1/(the number of drives) as storage overhead
RAID 5 is very popular with I.T. departments
It is quite economical because of low storage overhead
Risk of failure goes up as you add drives to the array
RAID 50
RAID 50 is called striping across multiple RAID 5 data sets
Requires at least six physical drives
Minimum of two, three-drive RAID 5 arrays
Requires 1/(number of drives) in each RAID 5 array for storage overhead
Can survive the loss of one drive in each RAID 5 array
Performs better than RAID 5 after the loss of one drive
Can be a good compromise between RAID 5 and RAID 10
Less expensive than RAID 10
More expensive than RAID 5, but provides better redundancy
Not all RAID controllers support RAID 50
RAID 10
RAID 10 is called a striped set of mirrors
Data is mirrored and then striped
Possible to survive the loss of more than one drive
Requires a minimum of four physical drives
Must be an even number of physical drives
No write performance penalty
Very well-suited to write intensive workloads
Ideal for SQL Server log files
Requires a 50% storage space overhead
More expensive than RAID 5
RAID 10 is very popular with database administrators
Provides better write performance and better redundancy than RAID 5
It is more expensive than RAID 5
Raid Level and SQL Server Workloads
The number of spindles in an array is extremely important
A larger number of smaller drives will perform much better than a small
number of larger drives
RAID 5 has a write performance handicap
RAID 5 cannot survive the loss of more than one disk in an array
RAID 5 arrays with larger numbers of disks are more likely to lose a disk
Try to put infrequently accessed data on RAID 5 to save money
RAID 10 and RAID 1 have very good write performance
RAID 10 also has more redundancy than RAID 5
Always try to use RAID 1 or RAID 10 for log files
As a DBA, don’t negotiate with yourself on storage
Ask for RAID 10, and then negotiate down if necessary
Use RAID 5 for data files and backup files if necessary
Provisioning Your Logical Drives
Drive arrays must be created and presented to the host
This is done differently depending on the storage type
Internal drives
Internal PCI-E storage
Direct attached storage (DAS)
Storage area network (SAN)
Windows Disk Manager is used to initialize and format drives
You also assign drive letters and name the LUN
Use your standardized naming scheme
It is a good idea to also create the directories for SQL Server
This will make the SQL Server installation go more quickly
Testing Your Logical Drive Performance
CrystalDiskMark is a quick way to test drive performance
Enables you to test each logical drive in a few minutes
Tests sequential and random I/O performance
Use for first round of testing, before you use SQLIO
SQLIO allows you to do much more detailed logical drive testing
Does not require SQL Server to be installed
Does not generate a database specific workload
Can be much more time consuming to run comprehensive tests
CrystalDiskMark 3.0.1 x64
Very easy to use, no complicated configuration required
You can choose the file size for the test runs
50MB, 100MB, 500MB, 1000MB, 2000MB, 4000MB
You can choose the file type
Random data or non-random data
Some SSD controllers use compression for performance
Random data is not very compressible
You can choose the number of test runs (1-9)
Quickly measures sequential and random I/O performance
Sequential reads and writes in MB/second
Large and small random reads and writes at different queue depths
Measured in MB/sec and IOPS
Free download
http://bit.ly/TDoGOi
CrystalDiskMark Example Output
SQLIO Disk Benchmark
Despite the name, it has nothing to do with SQL Server
Free tool developed by Microsoft to evaluate I/O performance
You can use it on any server running a recent version of Windows
Command-line utility
Requires some expertise to properly configure and run
Can take a long time to run a comprehensive set of tests
Allows you to test the limits of your I/O subsystem
Measures IOPS
Sequential throughput in MB/second
Latency in milliseconds
Download location
http://bit.ly/QxwUV8
Summary
Use a standardized naming scheme for drives and directories
Makes it easier to work with multiple database servers
Makes it easier to use HA/DR features like AlwaysOn AGs
Consider your workload type when you think about storage
Read versus write frequency
Sequential versus random I/O performance
Choose the appropriate RAID level for your logical drives
This affects performance, redundancy and cost
Make sure your drives are provisioned and tested before you install
SQL Server 2012
Use CrystalDiskMark for the first round of I/O testing
Use SQLIO for more comprehensive I/O testing
What is Next?
Module 4 will cover actually installing SQL Server 2012
Selecting which features to install
Specifying Service accounts for SQL Server Services
Changing SQL Server Agent to automatic start
Database Engine configuration tasks
Specifying data directories for SQL Server