One good database maintenance practice is to keep the indexes in good working order. This is typically done via index defragmentation routines. Some people perform routine index maintenance, some people do not, and still others have done some sort of “set it and forget it” script provided by a vendor and may or may not be doing index maintenance while thinking they are.
When done properly, this is a good thing. Sadly, there are many cases where good practice is not adhered to (let alone best practice). Some of these less than ideal situations could be when an application vendor has something built into their code to perform some sort of index maintenance, unbeknownst to you. In my opinion this is a near-worst case scenario. Being diligent, a good DBA would try to establish routine maintenance for the database based on best practices (or at least good practices). Should that DBA do such a thing and there is index maintenance baked into compiled code somewhere that is not accessible to the DBA, you just doubled the work.
Corruption
Why is this a near-worst case scenario beside the point just made? Imagine being stuck, due to vendor requirements, on one of the versions of SQL Server affected by the online index rebuild bug (read more about it here by Aaron Bertrand), and for some reason you end up with a corrupted index every night. In fact, it is the same index that becomes corrupt on a nightly basis. Every day you drop and recreate the index, and lucky for you, it is not a unique index or a clustered index. But despite those efforts, the corruption continues to return. And since you know you are performing your index maintenance with maxdop of 1 or you have decided to do the rebuilds offline, you know for certain that your process is not the cause.
Take it a step further and you disable your maintenance routine entirely just to be 100% certain your process is not causing it. Yet when you check the recent update date for the index you find that it was still somehow updated/rebuilt while your maintenance was disabled. This is indeed a near-worst case scenario. This means that you have some digging to do to find when and how the index maintenance is being run and who/what is doing it.
How in the world are you going to find what/when this is happening without causing too much increased load on the server? Lightbulb!
Being on SQL Server 2012 or later, it suddenly occurs to you that Extended Events is an awesome tool and could very well be of some use in solving this problem. Let’s pull out that handy dandy script that we have used time and again to try and find potential Extended Events that could help us with this problem.
SELECT c.OBJECT_NAME AS EventName,p.name AS PackageName,o.description AS EventDescription FROM sys.dm_xe_objects o INNER JOIN sys.dm_xe_object_columns c ON o.name = c.OBJECT_NAME and o.package_guid = c.object_package_guid INNER JOIN sys.dm_xe_packages p ON o.package_guid = p.guid WHERE object_type='event' AND c.name = 'channel' AND (c.OBJECT_NAME like '%index%' or o.description like '%index%') ORDER BY o.package_guid;
Running that code produces two very interesting results that seem plausible in this exercise. Those events are index_build_extents_allocation and progress_report_online_index_operation. Even the descriptions seem exceedingly promising.
Especially promising is the event named progress_report_online_index_operation. Look at it, the name tells us that it is a report on online index operations. Querying the available fields seems viable, the description is good, so give it a go.
Well, I will save you the trouble. I tried to create an index with the online = on syntax, tried a rebuild with online option, tried a reorganize option, and tried every way I could think to rebuild an index online. I just wanted to get some sort of data and I could not get it to fire while using the asynchronous_file_target (I really wanted to use the file target for ease of investigation later). Though the event_file target doesn’t appear to work, I was able to get results by sending data to the ring_buffer or to the Livestream target.
What does that mean? If you decide to watch the live data from SSMS, then you invoke the Livestream target, and you can see the data as it streams. Of course that also means you have to sit and watch to try and find when this phantom online rebuild occurs. If you choose to use the ring_buffer, then you run the risk of events cycling out of the buffer. This may or may not be a bad thing since you basically need just an event showing the source of the rebuild and the statement used.
XEvents for the Save
Without further ado, let’s see what we can do to find these online index build operations so we can hopefully target the additional maintenance that is causing us grief.
-- Create the Event Session IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='OnlineIXOps') DROP EVENT SESSION OnlineIXOps ON SERVER; GO CREATE EVENT SESSION OnlineIXOps ON SERVER --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ /* The online index operations */--ADD EVENT sqlserver.progress_report_online_index_operation --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ADD EVENT sqlserver.progress_report_online_index_operation( ACTION (sqlserver.database_name,sqlserver.client_hostname,sqlserver.client_app_name, sqlserver.sql_text, sqlserver.session_id ) --Change this to match the database in question, WHERE sqlserver.database_id=5 ) ADD TARGET package0.ring_buffer WITH (MAX_DISPATCH_LATENCY = 5SECONDS) GO -- Start the Event Session ALTER EVENT SESSION OnlineIXOps ON SERVER STATE = START; GO
Notice that in this session I have decided to target a very specific database. For me, this database happens to be my ReportServer database. And for testing purposes, this works wonderfully. All I need to do is something like the following rebuild operation.
alter index IX_PolicyUserRole on PolicyUserRole REBUILD PARTITION = ALL WITH (online = on)
This is a rather innocuous index for me. In addition to it being on my test instance, it is in a database that is used lightly and a table that is used even more lightly. If you are looking to test, I recommend you find a similar scenario and definitely test somewhere other than on a production box.
Having rebuilt that index online, I can then query the Event session with TSQL using a query similar to the following:
SELECT event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name, event_data.value('(event/@timestamp)[1]', 'varchar(50)') AS timestamp, event_data.value('(event/action[@name="database_name"]/value)[1]', 'varchar(max)') as DBName ,event_data.value('(event/data[@name="object_name"]/value)[1]', 'varchar(max)') as ObjName ,event_data.value('(event/data[@name="index_name"]/value)[1]', 'varchar(max)') as index_name ,event_data.value('(event/data[@name="partition_number"]/value)[1]', 'varchar(max)') as PartitionNumber ,event_data.value('(event/action[@name="session_id"]/value)[1]', 'varchar(max)') as SessionID ,event_data.value('(event/data[@name="build_stage"]/value)[1]', 'varchar(max)') as Build_Stage ,event_data.value('(event/data[@name="build_stage"]/text)[1]', 'varchar(max)') as BuildStage_Description ,event_data.value('(event/action[@name="client_hostname"]/value)[1]', 'varchar(max)') as Client_hostName, event_data.value('(event/action[@name="client_app_name"]/value)[1]', 'varchar(max)') as Client_AppName, event_data.value('(event/action[@name="sql_text"]/value)[1]', 'varchar(max)') AS sql_text ,event_data.value('(event/data[@name="duration"]/value)[1]', 'Decimal(18,2)')/1000 as Duration_ms ,event_data.value('(event/data[@name="rows_inserted"]/value)[1]', 'varchar(max)') as rows_inserted FROM( SELECT evnt.query('.') AS event_data FROM ( SELECT CAST(target_data AS xml) AS TargetData FROM sys.dm_xe_sessions AS s INNER JOIN sys.dm_xe_session_targets AS t ON s.address = t.event_session_address WHERE s.name = 'OnlineIXOps' AND t.target_name = 'ring_buffer' ) AS tab CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS split(evnt) ) AS evts(event_data) Order by timestamp, Build_Stage
Now I have an abundance of data at my fingertips to determine how this index continues to get rebuilt. And since this is likely happening in the middle of the night, and that I have to rely on the ring_buffer, I might decide to set an agent job that will run this query every hour or two, to see if I can trap the necessary data about how my indexes are getting rebuilt without my knowledge.
This is just one more tool that can be thrown into your utility belt or tool shed to help find those little things that cause us pain from time to time. I hope it helps you, whether in the same scenario I just described or otherwise. This could even be of help when trying to determine timings of defrag operations when trying to plan for maintenance windows.