Columnstore index question

Question

Columnstore index question

ManicStar

SSCoach

Points: 17992
More actions
October 9, 2017 at 1:14 pm

#325791

I have some large ETL tables that i've created to snapshot production OLTP data that is sourced from more than one server and product.
The tables are wide, about 70 columns, and queried mostly by the date the rows were created in the warehouse, what i call its snapshot date. Each table has millions of rows.
Would a column store index on the most commonly queried column, that of snapshot date, help performance? Most of the queries are just returning large chunks of the result table into spreadsheets.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

bartedgerton SSCrazy Points: 2062 More actions · Answer 1

Column Store Index is on all columns, not just a single column, and a date column isn't really that good of a key for a column store because it has many unique values.

I'm guessing your best bet would be to remove any columns that are unnecessary and partition by your date key (year or year/month).

ScottPletcher SSC Guru Points: 101235 More actions · Answer 2

It would very likely help performance. Certainly if:

1) you created a nonclustered columnstore index that covered the query entirely.
Or
2) the total rows you're retrieving were not a significant enough percentage of the table

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

ManicStar SSCoach Points: 17992 More actions · Answer 3

Cool, i think its worth testing out then. Is there any formula to calculate how much more memory the BI box may need to implement this on the big ones? like width of columns by # of rows etc?

ManicStar SSCoach Points: 17992 More actions · Answer 4

Seeing about a 40% increase in performance on a larger dataset, smaller ones are actually running a bit slower, but the gain on the larger ones may outweigh that effect.

Interesting exercise.

Phil Parkin SSC Guru Points: 247222 More actions · Answer 5

ManicStar - Wednesday, October 11, 2017 11:09 AM
Seeing about a 40% increase in performance on a larger dataset, smaller ones are actually running a bit slower, but the gain on the larger ones may outweigh that effect.
Interesting exercise.

Add OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX) to the queries which run slower and have the best of both worlds.

ScottPletcher SSC Guru Points: 101235 More actions · Answer 6

Interesting on the smaller ones. Probably just SQL generating a non-optimal query plan. You can probably resolve that; worst case, you'd have to use a "hint" to force SQL to get the best plan.

But overall it'd probably be more useful to review indexing across the board to check for possible improvements, including looking at missing index stats, index usage stats, index op stats, and cardinality of values, as the most important factors.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

ManicStar SSCoach Points: 17992 More actions · Answer 7

ScottPletcher - Wednesday, October 11, 2017 11:24 AM
Interesting on the smaller ones. Probably just SQL generating a non-optimal query plan. You can probably resolve that; worst case, you'd have to use a "hint" to force SQL to get the best plan.
But overall it'd probably be more useful to review indexing across the board to check for possible improvements, including looking at missing index stats, index usage stats, index op stats, and cardinality of values, as the most important factors.

Removed the rowstore non-clustered indexes for this exercise. Gunna put one of the back as another experiment.

Mike Good SSCertifiable Points: 7415 More actions · Answer 8

ManicStar, thanks for posting your findings. Good info.