Every now and then you may have had to move nonclustered indexes between filegroups. There are two ways it can be done: drop the existing indexes first then create new ones or execute a create statement with the DROP_EXISTING = ON option. At face value it may appear that SQL Server will do the same work regardless of which way you choose but in reality one of them will result in significantly higher overhead than the other. I'm going to show the differences between the two, and for fun I'll also look at what SQL Server is doing under the hood when you rebuild a nonclustered index.
(Don't care about how and what I did, trust me that I'm right, and just want to get to the endgame? Then skip to the conclusions. Otherwise, read on…)
The Setup
To get a real world example I used a production database that's been restored on a test server running SQL 2008 Standard SP1 CU 9 on Windows Server 2003 Standard x86. Data + Indexes take up ~20 GB, there are two filegroups - PRIMARY and INDEXES, and all clustered indexes have been rebuilt with a FILLFACTOR of 100. The SQL service was restarted in between each test to ensure that nothing hanging around in cache from a previous test influenced the next. The tests I ran were:
- Drop indexes in the PRIMARY filegroup and create new indexes in the INDEXES filegroup. Do the same thing back the other direction.
- CREATE INDEX with DROP_EXISTING = ON, indexes originating in the PRIMARY filegroup and created in the INDEXES filegroup. Again, do the same thing back the other direction
- Index rebuild, once with the indexes in the PRIMARY filegroup and once with the indexes in the INDEXES filegroup
In each test I specified PADINDEX = ON, FILLFACTOR = 80 to leave 80% free space in the intermediate and leaf pages of the nonclustered indexes. I used profiler to capture reads, writes, CPU, and duration for the batch and the sys.dm_io_virtual_file_stats DMV to see reads and writes specific to the files in the PRIMARY and INDEXES filegroups.
Results
Here's the breakdown of each test run:
Operation | From Filegroup | To Filegroup | Profiler | sys.dm_io_virtual_file_stats | |||||
Reads | Writes | CPU | Duration (ms) | Filegroup | Reads | Writes | |||
Create w\ Drop Existing | PRIMARY | INDEXES | 1,631,186 | 445,530 | 699,512 | 196,874 | PRIMARY | 15,202 | 149 |
INDEXES | 9 | 17,297 | |||||||
Create w\ Drop Existing | INDEXES | PRIMARY | 1,634,950 | 445,563 | 712,171 | 197,892 | PRIMARY | 189 | 21,844 |
INDEXES | 9,566 | 50 | |||||||
Drop & Create | PRIMARY | INDEXES | 3,639,243 | 445,796 | 1,787,090 | 518,424 | PRIMARY | 106,298 | 248 |
INDEXES | 34,517 | 53,089 | |||||||
Drop & Create | INDEXES | PRIMARY | 3,662,077 | 445,976 | 1,780,435 | 518,943 | PRIMARY | 140,967 | 56,415 |
INDEXES | 8 | 90 | |||||||
Rebuild | PRIMARY | PRIMARY | 1,634,969 | 445,546 | 693,139 | 192,294 | PRIMARY | 14,998 | 21,882 |
INDEXES | 0 | 0 | |||||||
Rebuild | INDEXES | INDEXES | 1,635,322 | 445,513 | 707,967 | 195,565 | PRIMARY | 171 | 79 |
INDEXES | 9,237 | 18,286 |
Observant eyes may notice the huge difference between reads & writes coming from profiler vs. sys.dm_io_virtual_file_stats. On the read side it's because profiler is reporting logical reads and sys.dm_io_virtual_file_stats is showing physical reads. Writes are a different story - profiler is reporting physical writes and BOL indicates sys.dm_io_virtual_file_stats shows the "Number of writes made on this file". Louis Davidson seems to think that means physical writes and so does Dave Turpin, but clearly the profiler and DMV numbers don't match up. Also interesting is the consistency between writes reported by profiler whereas writes from sys.dm_io_virtual_file_stats are all over the map. In any case, for this exercise the difference doesn't matter as I'm not comparing the capture methods against one another; instead, I'm using them to support each other relative to each index operation.
Conclusions
Based on the test results we can draw a few conclusions:
- The reads & writes from sys.dm_io_virtual_file_stats show that when rebuilding\creating the index with DROP_EXISTING = ON the SQL engine is reading from the existing index pages and not from the clustered index. However, when creating a new nonclustered index the engine will read from the clustered index (or heap if no clustered index exists).Rebuilding a nonclustered index and creating the index with DROP_EXISTING = ON have the same I/O and CPU cost and will take roughly the same amount of time to complete.
- Dropping an index first and then creating it again is an average of 2-3 times more costly in I/O, CPU, and duration vs. rebuilding\creating the index with DROP_EXISTING = ON.
So at the end of the day the lesson here is that if you need to move a nonclustered index to another filegroup stick with the CREATE INDEX…WITH (DROP_EXISTING=ON) syntax. Now you know!