Non clustered Column store index

Question

Non clustered Column store index

SQL_Surfer

SSCrazy Eights

Points: 8320
More actions
April 24, 2016 at 4:48 pm

#313444

Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.
Thanks in advance.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 1

SQL_Surfer (4/24/2016)
Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.
Thanks in advance.

1) How is it performing? Why do you think it should be faster?

2) Have you done a file IO stall and wait stats analysis while it is running? Try sp_whoisactive with it's delay parameter.

3) Why are you using column store index with a paltry 5M row table?!?! I don't know if I would bother with that.

4) The estimated row count is 1 coming out of the fact table. If that estimate is off by a lot the query plan is probably suboptimal.

5) The estimated cost of the plan is 3.84 yet it shows parallelism. That is crazy. What is your cost threshold for parallelism setting?? I note that the default of 5 is WAY to low for modern systems, and yours seems to be even lower.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Hugo Kornelis SSC Guru Points: 64790 More actions · Answer 2

SQL_Surfer (4/24/2016)
Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.
Thanks in advance.

You ask about a non-columnstore index, however the query plan shows that a columnstore index is used.

I agree with all the points Kevin makes. But I have some additional points as well:

1. Look into the foreign key constraints between fact table and dimension tables. They should exist, be enabled, and be trusted. They currently fail at least one of those three points.

2. Please post a repro script that includes CREATE TABLE statements, a few sample rows of data and an indication of the actual table size for each table, and the query. You can simplify if this is the typical 300+-column datawarehouse query - the pattern will probably not change if you reduce it to just a handful of columns per table used.

3. Please post the *actual* execution plan instead of the *estimated* plan you posted now.

4. Can you also try what happens if you run the same query with the added hint OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)? Please post actual plan for that case and tell us how performance is affected.

5. Which SQL Server version are you running?

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Eric M Russell SSC Guru Points: 125561 More actions · Answer 3

SQL_Surfer (4/24/2016)
Anyway, why my non-column store index is not performing well as it should? I'm querying a 5 million row table. I've attached a query plan. Any tips to speed this up would be greatly appreciated.
Thanks in advance.

You've attached an Estimated Plan, how about sending us an Actual Plan?

Also, what is the actual versus expected runtime?

"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Grant Fritchey SSC Guru Points: 398890 More actions · Answer 4

In addition to what everyone else has already asked, I'd be curious about a couple of things about this query. The estimates are that it's processing 22 rows out of the tables and that it's ultimately returning a single row. There is no aggregation going on that I can see. Columnstore tables aren't generally as good at small batch (and singleton) processing as traditional indexes. You might not be seeing the best behavior because you've got the wrong choices on the indexing. Why are using a columnstore index if you're not doing any kind of aggregation to take advantage of the pivoted storage that it offers?

However, as everyone else has pointed out, getting an actual plan would help. Then we can see if the estimated number of rows and the actual number of rows match. The table cardinality in the plan is 5 million, so it's likely they will, but no way to know without checking.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 398890 More actions · Answer 5

TheSQLGuru (4/24/2016)
5) The estimated cost of the plan is 3.84 yet it shows parallelism. That is crazy. What is your cost threshold for parallelism setting?? I note that the default of 5 is WAY to low for modern systems, and yours seems to be even lower.

One point on that, I have found that even with small data sets (10s of thousands of rows), columnstore can benefit aggregate queries in a marvelous way. Giant improvements in performance... but only if they go parallel and you get batch mode processing. I don't recommend dropping the cost threshold even lower than 5 (if anything, it still needs to be bumped up), but I do recommend using traceflag 8649 to force parallel execution on queries that are using the columnstore appropriately.

I'm not sure we are here though.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Grant Fritchey SSC Guru Points: 398890 More actions · Answer 7

TheSQLGuru (4/26/2016)
Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx

I doubt it, but all the people i know who are doing columnstore are using the traceflag or Adam's CPU trick.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 8

Grant Fritchey (4/26/2016)
TheSQLGuru (4/26/2016)
Is TF 8649 supported now? If not (or even if so) it could be better to use Adam Machanic's spiffy high-CPU-estimate trick: http://sqlblog.com/blogs/adam_machanic/archive/2013/07/11/next-level-parallel-plan-porcing.aspx
I doubt it, but all the people i know who are doing columnstore are using the traceflag or Adam's CPU trick.

IIRC there have been quite a few improvements to the column-store-index subsystem since 2012 RTM too.

Hey, OP, what build of SQL Server are you on??

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Sergiy SSC Guru Points: 110208 More actions · Answer 9

Grant Fritchey (4/26/2016)
One point on that, I have found that even with small data sets (10s of thousands of rows), columnstore can benefit aggregate queries in a marvelous way. Giant improvements in performance...

And it's even hard to imagine how gigantic would be improvements in performance from normalising the data...

Not to mention appropriate indexing.

No, no, not gonna mention that. Because it would throw any columnstore index ouit of the window.

_____________
Code for TallyGenerator