Columnstore indexes are fascinating and really cool. Unfortunately, they’re adding an interesting new wrinkle to an old problem.
What’s the Cost Threshold for Parallelism set to on your server? If you just said “The whatsis of whositz?” then the value is 5. The cost threshold is the point at which the estimated cost of an execution plan goes from definitely serial to possibly parallel. This default was set for SQL Server 2000 and hasn’t been changed since. I’ve long argued, loudly, that it’s too low. I’ve suggested changing it to a much higher value; my advice has ranged from 35 to 50 and several places in between. You could also look at the median or the mode of the plan costs on your system and use the higher of those values as a starting point. Getting a cost higher than 5 in a query is insanely simple, which means that queries that are absolutely not going to benefit from parallelism go parallel. This causes unnecessary CPU load on your system and slows those queries down. So, listen to my advice and run out there and change this value immediately.
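If you’ve never touched the setting, here’s a minimal sketch of checking and changing it with sp_configure (the value of 50 is just the kind of starting point I suggest above; test against your own workload):

```sql
-- Cost threshold is an advanced option, so make it visible first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Check the current value (look at the run_value column)
EXEC sp_configure 'cost threshold for parallelism';

-- Raise it to a more reasonable starting point, e.g. 50
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
```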
Or…
Columnstore indexes have two methods of processing. The first is row mode, which is actually the same method used by all your standard indexes and tables. You can write a query that does aggregate calculations against a table that has a Columnstore index and you’ll see a performance benefit. However, check the properties on the Columnstore operator in the execution plan. Does it say “Actual/Estimated Execution Mode = Row”? Then you’re getting row-mode processing. That can be fine, and you can see a performance improvement using row mode. There is another level of processing though.
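For example (the table, column, and index names here are hypothetical, purely for illustration), this is the kind of setup where you’d open the plan, click the Columnstore Index Scan operator, and check its Execution Mode property:

```sql
-- Hypothetical nonclustered columnstore index on a sales table
CREATE NONCLUSTERED COLUMNSTORE INDEX ix_SalesOrders_cs
ON dbo.SalesOrders (OrderDate, ProductID, Quantity, LineTotal);

-- An aggregate query that can take advantage of the columnstore index
SELECT  ProductID,
        SUM(Quantity)  AS TotalQuantity,
        SUM(LineTotal) AS TotalSales
FROM    dbo.SalesOrders
GROUP BY ProductID;
```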
Batch mode processing takes rows in batches (roughly 900–1,000 at a time) and processes the aggregations across multiple threads. Wait, multiple threads? Yes. You have to have parallelism in order to get batch mode. Joe Sack talks about it here, and you can read more about it in the SQL Server documentation. Batch mode processing is generally faster (with the usual “it depends” caveats). Especially if you’re dealing with larger data sets (more than tens of thousands of rows), you’re going to want to see Batch as the Execution Mode in the Columnstore operators.
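One way to see that dependence on parallelism for yourself, reusing the hypothetical table from above: run the same aggregate with parallelism suppressed and then without the hint, and compare the Execution Mode property on the Columnstore operator in each plan.

```sql
-- Serial plan forced: expect Execution Mode = Row on SQL Server 2012/2014
SELECT ProductID, SUM(LineTotal) AS TotalSales
FROM dbo.SalesOrders
GROUP BY ProductID
OPTION (MAXDOP 1);

-- No hint: if the plan goes parallel, expect Execution Mode = Batch
SELECT ProductID, SUM(LineTotal) AS TotalSales
FROM dbo.SalesOrders
GROUP BY ProductID;
```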
There is an issue here though. That’s right, we just bumped our Cost Threshold for Parallelism up to 5000 (if 50 is good, 500 is better, and we’ll add a zero just to be sure). OK, 50. How high does the cost have to be in order to see benefits from parallelism and batch mode processing? Ah, well, there’s the rub. When I was testing this, I had about 100,000 rows in a table. Nothing large. Execution times went from 100ms to 50ms. Yep, getting parallel execution and batch mode processing of the nonclustered columnstore index cut my execution time in half.
Next question should be, oh, well, what was the estimated cost of that query? 30? 20? 15?
Try 3.7.
Yeah, I got a 50% improvement in performance by changing the Cost Threshold from 50 to 3.
In other tests I had a plan with an estimated cost of 0.77. Execution time went from 75ms to 57ms and reads dropped from 115 to 62 because of batch mode processing.
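If you want to gather comparable numbers on your own queries, one simple way (not necessarily how I captured mine) is the standard session statistics, shown here against the hypothetical table from earlier:

```sql
-- Report elapsed/CPU time and logical reads for the query below
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT ProductID, SUM(LineTotal) AS TotalSales
FROM dbo.SalesOrders
GROUP BY ProductID;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```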
Now, I am not advocating dropping everyone’s Cost Threshold to 3 (or less). What I am absolutely saying is that you have a new wrinkle that you must add to your calculations. Are you using Columnstore indexes anywhere? Then you need to examine those execution plans. What are their costs? If you adjust your Cost Threshold below them, do you see a benefit? If so, how low did you just go? Now, is that negatively impacting your other queries? You could leave your Cost Threshold as is and use trace flag 8649 as a query hint via QUERYTRACEON (documented by the Sensei Paul White) on your queries against Columnstore indexes, but that comes with a list of caveats as long as your arm. The other alternative is, again, to leave the Cost Threshold alone and use Adam Machanic’s method of forcing parallelism just for the specified queries. These options can be unattractive, but you have to deal with getting parallelism for Columnstore indexes somehow.
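For completeness, here’s a sketch of what the trace flag approach looks like on the hypothetical query from earlier. Trace flag 8649 is undocumented and unsupported, and QUERYTRACEON needs elevated permissions, which is part of that list of caveats:

```sql
-- Nudge the optimizer toward a parallel plan for this query only
-- (undocumented trace flag; use at your own risk)
SELECT ProductID, SUM(LineTotal) AS TotalSales
FROM dbo.SalesOrders
GROUP BY ProductID
OPTION (QUERYTRACEON 8649);
```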
Welcome to the quandary. Tuning in and around parallelism just got harder because of Columnstore Indexes.
Want to talk query tuning some more? I’m doing a live, all-day session at Connections on query tuning using execution plans this September 2015. We’ll go over examples with columnstore and lots more. Click here now to sign up.