Non clustered Column store index

  • Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.

    Thanks in advance.

  • SQL_Surfer (4/24/2016)


    Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.

    Thanks in advance.

    1) How is it performing? Why do you think it should be faster?

    2) Have you done a file IO stall and wait stats analysis while it is running? Try sp_whoisactive with it's delay parameter.

    3) Why are you using column store index with a paltry 5M row table?!?! I don't know if I would bother with that.

    4) The estimated row count is 1 coming out of the fact table. If that estimate is off by a lot the query plan is probably suboptimal.

    5) The estimated cost of the plan is 3.84 yet it shows parallelism. That is crazy. What is your cost threshold for parallelism setting?? I note that the default of 5 is WAY to low for modern systems, and yours seems to be even lower.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • SQL_Surfer (4/24/2016)


    Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.

    Thanks in advance.

    You ask about a non-columnstore index, however the query plan shows that a columnstore index is used.

    I agree with all the points Kevin makes. But I have some additional points as well:

    1. Look into the foreign key constraints between fact table and dimension tables. They should exist, be enabled, and be trusted. They currently fail at least one of those three points.

    2. Please post a repro script that includes CREATE TABLE statements, a few sample rows of data and an indication of the actual table size for each table, and the query. You can simplify if this is the typical 300+-column datawarehouse query - the pattern will probably not change if you reduce it to just a handful of columns per table used.

    3. Please post the *actual* execution plan instead of the *estimated* plan you posted now.

    4. Can you also try what happens if you run the same query with the added hint OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)? Please post actual plan for that case and tell us how performance is affected.

    5. Which SQL Server version are you running?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • SQL_Surfer (4/24/2016)


    Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.

    Thanks in advance.

    You've attached an Estimated Plan, how about sending us an Actual Plan?

    Also, what is the actual versus expected runtime?

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • In addition to what everyone else has already asked, I'd be curious about a couple of things about this query. The estimates are that it's processing 22 rows out of the tables and that it's ultimately returning a single row. There is no aggregation going on that I can see. Columnstore tables aren't generally as good at small batch (and singleton) processing as traditional indexes. You might not be seeing the best behavior because you've got the wrong choices on the indexing. Why are using a columnstore index if you're not doing any kind of aggregation to take advantage of the pivoted storage that it offers?

    However, as everyone else has pointed out, getting an actual plan would help. Then we can see if the estimated number of rows and the actual number of rows match. The table cardinality in the plan is 5 million, so it's likely they will, but no way to know without checking.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • TheSQLGuru (4/24/2016)


    5) The estimated cost of the plan is 3.84 yet it shows parallelism. That is crazy. What is your cost threshold for parallelism setting?? I note that the default of 5 is WAY to low for modern systems, and yours seems to be even lower.

    One point on that, I have found that even with small data sets (10s of thousands of rows), columnstore can benefit aggregate queries in a marvelous way. Giant improvements in performance... but only if they go parallel and you get batch mode processing. I don't recommend dropping the cost threshold even lower than 5 (if anything, it still needs to be bumped up), but I do recommend using traceflag 8649 to force parallel execution on queries that are using the columnstore appropriately.

    I'm not sure we are here though.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (4/26/2016)


    Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

    I doubt it, but all the people i know who are doing columnstore are using the traceflag or Adam's CPU trick.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (4/26/2016)


    TheSQLGuru (4/26/2016)


    Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

    I doubt it, but all the people i know who are doing columnstore are using the traceflag or Adam's CPU trick.

    IIRC there have been quite a few improvements to the column-store-index subsystem since 2012 RTM too.

    Hey, OP, what build of SQL Server are you on??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Grant Fritchey (4/26/2016)


    One point on that, I have found that even with small data sets (10s of thousands of rows), columnstore can benefit aggregate queries in a marvelous way. Giant improvements in performance...

    And it's even hard to imagine how gigantic would be improvements in performance from normalising the data...

    Not to mention appropriate indexing.

    No, no, not gonna mention that. Because it would throw any columnstore index ouit of the window.

    _____________
    Code for TallyGenerator

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply