September 13, 2019 at 8:02 pm
Hello experts,
I am running a CREATE INDEX statement for a large table (more than a billion rows). I am checking the progress with this query.
SELECT session_id as SPID, command,
a.text AS Query, start_time,
cast(percent_complete as decimal(38,38)),
dateadd(second,estimated_completion_time/1000, getdate()) as estimated_completion_time
FROM sys.dm_exec_requests r CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) a
WHERE r.command in ('CREATE INDEX')
I noticed that the a.text/Query column contains: insert [dbo].[Table] select * from [dbo].[Table]
Why is that? Does the index creation literally re-insert every row in the table into itself?
Thanks for any help!
webrunner
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
September 13, 2019 at 8:51 pm
Why is that? Does the index creation literally re-insert every row in the table into itself?
Not exactly, but it re-inserts every row into the new index (unless it's a filtered index).
September 14, 2019 at 10:02 pm
Is the index in question a "Clustered Index"?
--Jeff Moden
Change is inevitable... Change for the better is not.
September 16, 2019 at 3:12 pm
Hi Jeff,
Yes, it is a clustered index.
Thanks,
webrunner
-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
September 16, 2019 at 7:17 pm
I noticed that the a.text/Query column contains: insert [dbo].[Table] select * from [dbo].[Table]
Why is that? Does the index creation literally re-insert every row in the table into itself?
The answer to your question is, yes... for anything over 128 extents (which is only 1,024 pages or 8MB), the original object is preserved as is and a brand new copy of the object is created with the new index in place. That means that it copies every bloody row and, in the FULL Recovery Model, it's a fully logged evolution. It's minimally logged in the BULK LOGGED and SIMPLE recovery models.
The problem with creating a new index (Clustered or NOT) or REBUILDing one is the fact that the original object does remain until the new object is committed. For really large Clustered Indexes and depending on how much freespace the MDF/NDF file has, that can cause a huge expansion of the MDF/NDF file so that it can accommodate the size of both the original an new objects. Of course, the space occupied by the original object is returned as freespace to the file but that can be a whopping amount that's just not needed especially in the short term.
Yes, it is a clustered index.
The way I handle such large CREATEs or REBUILDs is to create a new file group and create the index there. From there you can either do another CREATE (WITH DROP_EXISTING=ON) to move it back to the PRIMARY file group (and then drop the temporary file group you made) of leave it in the new file group and do the bit of work necessary to shrink the PRIMARY file group and rebuild the indexes that were fragmented by the shrink. The latter has the advantage for future index maintenance (REBUILDs) on the large table by creating another file group and doing the CREATE (WITH DROP_EXISTING=ON) and then drop the original (not the PRIMARY). You can go back and forth like that on a regular basis to "rebuild" the large CLUSTERED index without ending up with a shedload of wasted freespace.
Of course, it's much better to build a Clustered Index that will only fragment 1 or 2% every half decade or so.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply