I was reading a blog by Pinal Dave that was about using catalog views to obtain information about stats for the entire database. While reading the blog, I was reminded about an interesting tidbit of information I had seen once upon a time concerning statistics naming in SQL Server.
This got me looking for that information and burrowing down the rabbit hole. I found the reference for the naming convention concerning auto generated statistics. That reference is from Paul Randal and can be found here. In that article, Paul outlines the five parts of the name of an auto-created statistic – with each part being separated by an underscore. This got the wheels churning a bit more.
I had to go do a bit more research in order to put all the pieces together. Based on the naming convention, I knew I needed to convert Hex to Integer, so I could see the object id (column or table). You might think this would be straight forward, but the value in the name of the statistic is not a true Hex Value. That value is a hex string and needs a bit of manipulation in order to convert from string to Hex.
After some searching, I found the solution. Peter Debetta created this handy little function for this very purpose. I found that script here. After some testing, and comparing results I was pleased with how well it worked. Now that I had the pieces necessary, I could venture further down that little rabbit hole.
First though, there is one more piece that I want to discuss. Remember that naming scheme for that auto created statistics? I noted that the name is separate by an underscore – each element of the name anyway. I decided that I would use that underscore as my string delimiter and implement a string split function to break out the name parts. Any number of splitter functions would work. I chose to use the same function that was written about by Jeff Moden here.
Now all of the pieces have been referenced and are in play. Let’s take a look at the query and some of those catalog views.
[codesyntax lang=”tsql”]
with autostatnames as ( SELECT split.itemnumber as Statnum,S.name, split.item as Statnamepart,s.object_id From sys.stats S CROSS APPLY dbo.DelimitedSplit8K(s.name,'_') split Where s.auto_created = 1 ), statpart1 as ( Select asn.name,CONVERT(INT,master.dbo.HexStrToVarBin('0x' + asn.Statnamepart)) as StatCol From autostatnames asn Where asn.Statnum = 4 ), statpart2 as ( Select asn.name,CONVERT(INT,master.dbo.HexStrToVarBin('0x' + asn.Statnamepart)) as StatObj From autostatnames asn Where asn.Statnum = 5 ) Select DISTINCT s.name as StatName, OBJECT_NAME(s2.StatObj) as TableName,COL_NAME(s2.StatObj,s1.StatCol) as ColName ,s.auto_created ,s.user_created ,s.no_recompute ,STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated From statpart1 s1 Inner Join statpart2 s2 On s1.name = s2.name Inner Join sys.stats s On s.name = s1.name Union All SELECT DISTINCT OBJECT_NAME(s.[object_id]) AS TableName, COL_NAME(s.object_id,sc.column_id) as ColName, s.name AS StatName, s.auto_created, s.user_created, s.no_recompute, STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated FROM sys.stats s Inner Join sys.stats_columns sc ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id Where s.auto_created = 0 Order By s.name desc
[/codesyntax]
I started this little script out with three quick CTEs. The first is simply to gather the auto created stats. It is in this first CTE that the string splitter is utilized. The next two help me to separate out the data so I can work with it. The second CTE is to manage the data related to the column part of the statistics name. In this CTE, you will note that I employ the use of that Hex function. The third CTE is just like the second, but it treats the data related to the table part of the statistics name.
When splitting the data in that first CTE, and due to the naming format, we see that the column is always the fourth part of the name and the table is the fifth part. The first part is really an empty result due to the name leading off with an underscore.
Once the data is grouped into those CTEs, I can then return it in a friendlier result set. Here I use more functions that are built in with SQL server. This reduces the number of catalog views to which I need to JOIN. And since the CTEs are only working with statistics that are auto generated, I need to UNION in another query to pull in those statistics that are not auto created. Note that this query only looks at two catalog views.
What is the point of such an elaborate query in the first part to get this information when I can simply use the second part to do all of it? Not a whole lot of point other than to prove a point. There is consistency between the queries. There is consistency between the naming of those statistics and the object data that can be found in those catalog views.
Any of these methods will return the desired results concerning statistics in the database. You have essentially two methods displayed by me, and then the method employed by Pinal. Have fun with it!!