Introduction
Performance, performance, performance. While performance in SQL Server can be achieved with code improvement and proper infrastructure maintenance, both tasks require an effort. Code reviews and deep index analysis would take time, but sometimes the obvious is right in front of you. Let's talk about HEAP tables, aka tables with no clustered indexes that somebody (hopefully not you) left you to maintain and use.
Problem
I'm in no way an expert on indexes, but when I think about tables and indexes, I always draw a parallel to books.
Virtually all books have a table of contents (TOC) at the beginning of the book (see Fig #1) and index at the end of the book (see Fig #2) to make searching for content as quick and efficient as possible. Additionally, the TOC and index also allow the book author to better organize his/her thoughts when writing a book, which is very similar to making SQL Server storage engine more efficient.
Obviously, while books and tables have similarities, they have some distinctive differences - while book index can co-exists with the book, clustered index is a part of the table. Of course, there are plenty of other differences.
A book with an index is akin to a table with clustered index while a book with a table of contents is akin to table with a non-clustered index. Having just a TOC without an index in a book is not efficient enough. I mean, you can find your stuff, but it will take much longer to find it.
That is also true for tables with no clustered indexes. For the most part, heap tables are bad for performance and rarely a good idea. This is perhaps why when I think about adding a clustered index to a HEAP table, I think about enabling a Turbo button. Tables with clustered indexes will help SQL Server to locate the data quickly.
Since we are in agreement that HEAP tables are usually bad and tables with clustered are usually good, let's try to find and fix those heap tables, shall we?
Solution
The following script will detect all (tables with less than 100 records are excluded) the HEAP tables in a database and create a script to inject identity column into every table.
select objs.name as table_name , pars.row_count , CASE WHEN ISNULL(stas.last_user_scan, '01-01-1900')>ISNULL(stas.last_user_seek, '01-01-1900') AND ISNULL(stas.last_user_scan, '01-01-1900')>ISNULL(stas.last_user_lookup, '01-01-1900') AND ISNULL(stas.last_user_scan, '01-01-1900')>ISNULL(stas.last_user_update, '01-01-1900') THEN stas.last_user_scan WHEN ISNULL(stas.last_user_seek, '01-01-1900')>ISNULL(stas.last_user_scan, '01-01-1900') AND ISNULL(stas.last_user_seek, '01-01-1900')>ISNULL(stas.last_user_lookup, '01-01-1900') AND ISNULL(stas.last_user_seek, '01-01-1900')>ISNULL(stas.last_user_update, '01-01-1900') THEN stas.last_user_seek WHEN ISNULL(stas.last_user_lookup, '01-01-1900')>ISNULL(stas.last_user_scan, '01-01-1900') AND ISNULL(stas.last_user_lookup, '01-01-1900')>ISNULL(stas.last_user_seek , '01-01-1900') AND ISNULL(stas.last_user_lookup, '01-01-1900')>ISNULL(stas.last_user_update, '01-01-1900') THEN stas.last_user_lookup WHEN ISNULL(stas.last_user_update, '01-01-1900')>=ISNULL(stas.last_user_scan, '01-01-1900') AND ISNULL(stas.last_user_update, '01-01-1900')>=ISNULL(stas.last_user_lookup, '01-01-1900') AND ISNULL(stas.last_user_update, '01-01-1900')>ISNULL(stas.last_user_lookup, '01-01-1900') THEN stas.last_user_update ELSE '01-02-1900' END as latest_access , stas.user_lookups + stas.user_scans + stas.user_seeks + stas.user_updates as total_access , 'ALTER TABLE ' + objs.name + ' ADD ID INT NOT NULL IDENTITY (1,1) ' as addid_script from sys.indexes as inds inner join sys.objects as objs on objs.object_id = inds.object_id inner join sys.dm_db_partition_stats as pars on pars.object_id = objs.object_id inner join sys.dm_db_index_usage_stats as stas on stas.object_id = inds.object_id AND stas.index_id = inds.index_id where inds.type_desc = 'heap' and pars.index_id=0 and pars.row_count>100 order by row_count desc, objs.name
All you need to do is to run the script above to discover HEAP tables (see Fig #3), take a ready to use script from addid_script column and run it to create an extra column that by default will add a clustered index. I also included last_access and total_access columns to add some context on which frequency and recency of indexes usage.
Disclaimer
This blog post is partially based on the following resources: