Why is trivial plan creating statistics

  • Hi all

    As far as I understand a trivial plan is a plan that there is no need or a way to optimize it. The most known example for a trivial plan is running a simple select statement that has no group by or order by and the table has no indexes at all. What I don't understand is why is the server creating statistics if a query is a trivial query and the server can not optimize it in any case. Bellow is a script that shows the the server does create statistics for a trivial plan.

    --creating a new table and inserting data into it

    WITH MyCTE as (

    SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY message_id) as RowNum

    FROM sys.messages

    )

    SELECT RowNum

    INTO MyTable

    FROM MyCTE

    go

    --check that there are no statistics for this table

    SELECT * FROM sys.stats WHERE object_id = OBJECT_ID('dbo.MyTable')

    go

    --running a query on the new table. Be sure to include the actual query plan

    --and check in the plan's properties. You'll see that this plan is a trivial plan

    SELECT RowNum

    FROM MyTable

    WHERE RowNum = 1

    go

    --Now the table does have statistics

    SELECT * FROM sys.stats WHERE object_id = OBJECT_ID('dbo.MyTable')

    go

    DROP TABLE dbo.MyTable

    Thank you for your answers

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I'm not an expert by any means but, IIRC, it's not actually a "Trivial" query. It has a WHERE clause and the optimizer will typically build stats if they don't already exist whenever a WHERE clause is involved even if no indexes are in place.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think that it is a trivial plan. If I understand correctly (and I admit that that I might not understand it correctly:-D), a trivial plan is for queries that have only one possible plan. If I query a table that has no index at all and the query is very simple, then the server has only one option - do a table scan. Even if I'm wrong about it, I checked the query's plan's properties, and according to it, it is a trivial plan. I do agree with you that with out the where clause it wouldn't create the statistics

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Adi Cohn-120898 (12/20/2015)


    I think that it is a trivial plan. If I understand correctly (and I admit that that I might not understand it correctly:-D), a trivial plan is for queries that have only one possible plan. If I query a table that has no index at all and the query is very simple, then the server has only one option - do a table scan. Even if I'm wrong about it, I checked the query's plan's properties, and according to it, it is a trivial plan. I do agree with you that with out the where clause it wouldn't create the statistics

    Adi

    Heh... to borrow from Star Trek, that's the trouble with trivials. 😛 Trivial or not, if there's a WHERE clause, it's most likely going to create some statistics to support it. Gail Shaw or someone like her would be able to explain much better.

    Me? I'm having to rewrite some junk code that we have that my predecessor brought down from the internet. If you think you have problems with statistics, imagine code that automatically generates statistics for every bloody table column in every bloody database and then regenerates them all twice a week just due to age instead of whether the table has actually taken on any new rows or not and whether the code or the data is trivial or not.

    Don't get me started on the index maintenance routines in that code either. We hit some sort of a tipping point in the decision making that code makes and I now have 90GB log files (and a lot longer restore if something happens) where I never needed more than 10GB before.

    I spotted this crud software shortly after I came on board and wanted to rewrite it way back then but have always been told "we've got bigger fish for you to fry". I'm thinking it's time for some beef and I'm now in the process of rewriting it. With all due respect to some of the more popular solutions, they just don't do it the way I want to.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/20/2015)


    If you think you have problems with statistics, imagine code that automatically generates statistics for every bloody table column in every bloody database and then regenerates them all twice a week just due to age instead of whether the table has actually taken on any new rows or not and whether the code or the data is trivial or not.

    Don't get me started on the index maintenance routines in that code either. We hit some sort of a tipping point in the decision making that code makes and I now have 90GB log files (and a lot longer restore if something happens) where I never needed more than 10GB before.

    I spotted this crud software shortly after I came on board and wanted to rewrite it way back then but have always been told "we've got bigger fish for you to fry". I'm thinking it's time for some beef and I'm now in the process of rewriting it.

    You can't be serious.

    Reading this has made me weary.

    My condolences.

    .............. have you communicated this to the offending parties? It can only help them and whoever they work for in future if you do.

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • MadAdmin (12/21/2015)


    Jeff Moden (12/20/2015)


    If you think you have problems with statistics, imagine code that automatically generates statistics for every bloody table column in every bloody database and then regenerates them all twice a week just due to age instead of whether the table has actually taken on any new rows or not and whether the code or the data is trivial or not.

    Don't get me started on the index maintenance routines in that code either. We hit some sort of a tipping point in the decision making that code makes and I now have 90GB log files (and a lot longer restore if something happens) where I never needed more than 10GB before.

    I spotted this crud software shortly after I came on board and wanted to rewrite it way back then but have always been told "we've got bigger fish for you to fry". I'm thinking it's time for some beef and I'm now in the process of rewriting it.

    You can't be serious.

    Reading this has made me weary.

    My condolences.

    .............. have you communicated this to the offending parties? It can only help them and whoever they work for in future if you do.

    Heh... it's tempting to send some very high velocity pork chops their way but it wouldn't do any good. The folks that downloaded the code had the best intentions and are mostly ignorant of SQL Server. They were desperate before I came along. I've not tried to find the folks that originally wrote the code and published on the web because I'd be tempted to make a road trip in my favorite postal uniform. Once I fix things, then I might try to find them. I have, in the past, found that a lot of people simply won't respond. After all, you do get what you paid for it and that's nothing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/20/2015)


    Adi Cohn-120898 (12/20/2015)


    I think that it is a trivial plan. If I understand correctly (and I admit that that I might not understand it correctly:-D), a trivial plan is for queries that have only one possible plan. If I query a table that has no index at all and the query is very simple, then the server has only one option - do a table scan. Even if I'm wrong about it, I checked the query's plan's properties, and according to it, it is a trivial plan. I do agree with you that with out the where clause it wouldn't create the statistics

    Adi

    Heh... to borrow from Star Trek, that's the trouble with trivials. 😛 Trivial or not, if there's a WHERE clause, it's most likely going to create some statistics to support it. Gail Shaw or someone like her would be able to explain much better.

    Just a guess!

    If sql does a table scan, then its going to have the opportunity to generate a histogram by virtue of seeing the column values as a part of the predicate. Its also possible to note that this query could have used an index and the system might put the appropriate index information into the "missing indexes" dmv. Given all that, the sql server designers might determine that doing the histogram makes sense, since you're already doing most of the work doing the table scan, and that there is now evidence that you might run this query again and that it would benefit from the index being created. On the other hand, if the statistics run determine that the column isn't very selective, then it makes sense NOT to put this into the "missing indexes" dmv. But the only way sql server knows if the index is "missing" is to determine the cardinality of the column, and looks like you'll want the statistics / histogram to do so.

    This is obviously just a guess on my part, Gail or any other experts will obviously know better!

  • The fact that a column-level statistics object gets created for the RowNum column means nothing more nor less than that you have auto_create_statistics on for database in question.

    When that is on, column-level statistics will get created for columns used in query predicates (see "When to Create Statistics" at https://msdn.microsoft.com/en-us/library/ms190397.aspx).

    As far as I'm aware, I've never run across anything (official documentation or otherwise) that suggests that auto_create_statistics only applies to non-trivial plans, so that may just be a red herring of mysterious origin 🙂

    Also, with auto_create_statistics on, the statistics object gets created when the plan is generated, whether the query is executed or not. Just generate the estimated plan for that query without running it, and you'll see that the statistics get created anyway.

    If you turn off auto_create_statistics for the database and run through the exercise again, you'll see a warning in the plan for a column without statistics.

    Cheers!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply