
SQL Server Storage Best Practices


Server storage is one of the most important factors in SQL Server performance. If the storage system is slow, the chances of putting lots of RAM and powerful CPUs to good use are extremely small.

IO and Throughput

Two of the main properties of storage are IO and throughput. So what is IO? IO is also called IOPs or IOs/sec, and it determines how many operations per second the storage can perform. Throughput, measured in MB/sec, determines how much data can be written and/or read per second.

You can use the following Performance Monitor counters to figure out the numbers on your system.

Performance Monitor: Logical Disk

  • Disk Read Bytes/sec
  • Disk Write Bytes/sec
  • Disk Reads/sec
  • Disk Writes/sec
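If you would rather see these numbers from inside SQL Server, a rough equivalent is the sys.dm_io_virtual_file_stats DMV. A minimal sketch (the values are cumulative since SQL Server startup, so take two samples and subtract to get a per-second rate):

```sql
-- Cumulative read/write counts and volume per database file since startup.
-- Sample twice and diff the results to approximate IOPs and MB/sec.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.num_of_bytes_read,
       vfs.num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY vfs.num_of_bytes_read + vfs.num_of_bytes_written DESC;
```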

Latency

Latency is the time between a request for data and the return of that data. It is measured in milliseconds (ms).

Performance Monitor: Logical Disk

  • Avg. Disk sec/Read
  • Avg. Disk sec/Write

More on healthy latency values later.
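The same sys.dm_io_virtual_file_stats DMV also tracks cumulative IO stalls, so you can sketch an average latency per database file. Keep in mind these are averages since startup, so they smooth out short bursts:

```sql
-- Approximate average read/write latency (ms) per database file.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;
```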

Capacity

Storage capacity determines how much data you can store.

It is measured in GB or TB.

If you're using some kind of RAID, keep it in mind when calculating storage capacity: you'll get much less usable space with RAID 10 than with RAID 5. On the flip side, RAID 10 will be considerably faster and more reliable.
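As a quick worked example: with eight 900GB drives, RAID 10 mirrors everything and leaves roughly 8 × 900GB / 2 = 3.6TB usable, while RAID 5 spends one drive's worth of space on parity and leaves roughly (8 − 1) × 900GB = 6.3TB usable. The price RAID 5 pays is on writes: each small write typically costs four physical IOs (read data, read parity, write data, write parity) versus two for RAID 10.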

Terminology

Below is some storage-specific terminology.

JBOD – Just a Bunch of Disks

SAME – Stripe and Mirror Everything

RAID – Redundant Array of Inexpensive Disks

DAS – Direct Attached Storage

NAS – Network Attached Storage

SAN – Storage Area Network

CAS – Content Addressable Storage

 

Traditional Spindle throughput

The numbers below refer to the HDDs themselves.

  • 10K RPM – 100-130 IOPs at ‘full stroke’
  • 15K RPM – 150-180 IOPs at ‘full stroke’
  • You can achieve 2x or more when ‘short stroking’ the disks (using less than 20% of the physical spindle's capacity)
  • These numbers are for random 8K I/O

Aggregate throughput for sequential access:

  • Between 90MB/sec and 125MB/sec for a single drive
  • If access is truly sequential, any block size over 8K will give you these numbers
  • Throughput depends on drive form factor; 3.5” drives are slightly faster than 2.5”

Approximate latency: 3-5ms
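As a quick, hypothetical sizing exercise with these numbers: suppose a workload needs 3,000 random 8K IOPs with a 70/30 read/write split on RAID 10. Each write costs two physical IOs, so the spindles must service about 3,000 × 0.7 + 3,000 × 0.3 × 2 = 3,900 IOPs; at 150-180 IOPs per 15K drive, that works out to roughly 22-26 spindles before any controller cache helps out.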

Storage selection 

  • Number of drives matter
  • More drives typically yield better speed; do not confuse physical drives with LUNs (logical drives)
  • True for both SAN and DAS
  • … Less so for SSD, but still relevant (especially for NAND)
  • If designing for performance, make sure the topology can handle it
  • Understand the path to the drives

Best Practice: Validate and compare configurations prior to deployment

Random or sequential?

Knowing whether your workload is random or sequential in nature can be a hard question to answer, and it depends a lot on application design. The SQL Server:Access Methods counters can give some insight: high values of Readahead pages/sec indicate a lot of sequential activity, while high values of Index Searches/sec indicate a lot of random activity.
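You can read the same counters from inside SQL Server through sys.dm_os_performance_counters. A small sketch (per-second counters are exposed as raw cumulative values, so sample twice and diff to get a real rate):

```sql
-- Access Methods counters hinting at sequential vs. random activity.
-- Per-second counters are stored cumulatively: sample twice and subtract.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name IN ('Readahead pages/sec', 'Index Searches/sec');
```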

The transaction log, for example, is always written sequentially.

Best Practice: Isolate the transaction log on dedicated drives (LUNs)

Configuring disks in Windows 

  • Use Disk Alignment at 1024KB
  • Use GPT if MBR not large enough
  • Format partitions at 64KB allocation unit size
  • One partition per LUN
  • Only use Dynamic Disks when there is a need to stripe LUNs using Windows striping  (i.e. Analysis Services workload)

Tools:

  • Diskpar.exe, DiskPart.exe and DmDiag.exe
  • Format.exe, fsutil.exe
  • Disk Manager
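As an illustrative sketch with placeholder drive letters: fsutil can verify the allocation unit size of an existing volume, and format can create a new one at 64KB:

```
rem Check the allocation unit size of an existing volume;
rem "Bytes Per Cluster" should read 65536 for a 64KB unit.
fsutil fsinfo ntfsinfo E:

rem Format a new data volume with a 64KB allocation unit.
format F: /FS:NTFS /A:64K /V:SQLData /Q
```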

Sample patterns 

| Read/Write percent | Type | Block | Threads/queue | Simulates |
| --- | --- | --- | --- | --- |
| 80/20 | Random | 8K | #cores/Files | Typical OLTP data files |
| 0/100 | Sequential | 60K | 1/32 | Transaction log |
| 100/0 | Sequential | 512K | 1/16 | Table scans |
| 0/100 | Sequential | 256K | 1/16 | Bulk load |
| 100/0 | Random | 32K | #cores/1 | SSAS workload |
| 100/0 | Sequential | 1MB | 1/32 | Backup |
| 0/100 | Random | 64K-256K | #cores/Files | Checkpoints |
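These patterns map directly onto synthetic benchmark runs. As a sketch using Microsoft's DiskSpd tool (not named above, but the usual choice for this; the file path, size, duration, and the thread/queue values standing in for the #cores/Files column are all placeholders), the first row, typical OLTP data files, could be approximated like this:

```
rem 80/20 read/write, random 8K blocks, 4 threads, 8 outstanding IOs each,
rem 60 second run against a 10GB test file, caching disabled, latency stats on.
diskspd.exe -b8K -r -w20 -t4 -o8 -d60 -Sh -L -c10G E:\diskspd\test.dat
```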

Typical IO workloads

OLTP (Online Transaction Processing)

Typically heavy on 8KB random reads and writes

Some amount of read-ahead

Size varies – multiples of 8K (driven by read-ahead)

Many “mixed” workloads observed in customer deployments

Rule of Thumb: Optimize for Random I/O (spindle count)

RDW (Relational Data Warehousing)

Typically 64-512KB reads (table and range scans)

128-256KB writes (bulk load)

Rule of Thumb: Optimize for high aggregate throughput I/O

Analysis Services

Up to 64KB random reads; average block size is often around 32KB

Highly random and often fragmented data

Rule of Thumb: Optimize for Random, 32KB blocks

OLTP workloads

I/O patterns generally random in nature

Selective reads

Writes to data files through periodic checkpoint operations

Random in nature with heavy bursts of writes

Can issue a large amount of outstanding I/O

Steady writes to transaction log

Many "OLTP" deployments consist of a "mixed" workload with some amount of online reporting

This results in larger-block, sequential I/O running concurrently with the small-block (~8K) I/O

That can make sizing more challenging
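To watch these write patterns on a live system, one option is the Buffer Manager and Databases counter objects (a sketch; as before, per-second counters are cumulative, so sample twice and diff):

```sql
-- Checkpoint write bursts and steady transaction log writes.
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Checkpoint pages/sec',
                       'Log Flushes/sec',
                       'Log Bytes Flushed/sec');
```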

If you have any storage-specific questions or want to discuss the topic in more detail, please don't hesitate to leave a comment.

 
