July 16, 2017 at 5:27 am
Hi all.
I've wrote a blog post to ask a question on how statistics are created for new indexes. Essentially I found a clustered index which is composed of two varchar columns, when I looked at the statistics the columns were the opposite way round.
I did a bit of testing on this which I've shown in the post but to summarise it seems that when I modify a column order when creating a clustered index the statistic column order remains the same, as it were basically. For a non clustered index however the statistics reflect the index after the column order modification.
So, why the difference!? 🙂 Is it the way stats are created at index creation for the different types?
http://sqlclarity.blogspot.co.uk/2017/07/a-question-on-index-and-statistic.html
As always, many thanks!
David
'Only he who wanders finds new paths'
July 16, 2017 at 8:45 am
I think you're just seeing another symptom of the oddity/quirk/bug with sys.stats_columns reported in this Connect item: https://connect.microsoft.com/SQLServer/feedback/details/1163126/unexpected-stats-column-id-behavior
If you run DBCC SHOW_STATISTICS for the statistics in question, you should see the change in order correctly reflected in both the density vector and the histogram (i.e., the density vector should have a row for LastName, and a row for Lastname, FirstName after the change, while the histogram should show LastName values).
The statistics themselves have changed as expected. It's the stats_column_id in sys.stats_columns that is incorrect, and that's what the GUI uses to display order for the columns.
It's a bit annoying, but at least the statistics themselves are correct; it would be nice if MS responded to the Connect item, though.
As mentioned in one of the threads linked from the Connect item, it seems that for a clustered index's stats, the stats_column_id just reports ordinal position of the columns with respect to their column_id; on that note, I see this behavior even if the first clustered index creation is on LastName,FirstName; it doesn't seem tied to recreating the clustered index.
It does seem limited to clustered index statistics (whether inline with table creation or not). If I leave the table a heap, and create a multi-column statistics object on the two columns, the stats_column_id behaves as documented.
As the workarounds on that Connect item note, for querying and reporting purposes, you could just substitute key_ordinal from sys.index_columns for stats on clustered indexes. Again, annoying, but fortunately not a massive problem 🙂
We can hope that if we start upvoting that Connect item MS will maybe pay some attention, but with zero responses in the 27 months it's been open, I'm not holding my breath 🙂
Cheers!
July 19, 2017 at 11:50 pm
Yes, that's exactly it - thank you!
'Only he who wanders finds new paths'
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply