February 28, 2012 at 10:14 pm
I am pretty much sold on the following strategy, but given that my history is mostly in OLTP environments where a large number of indexes can be bad, I am looking for a sanity check.
We are working on the rollout of a new phase of our DW. There is an SSAS cube on top of the relational structure. Processing the cube produces the same repeatable, documented workload each night. My goal is to ensure that cube processing is as efficient as possible on the relational side.
We will use one dimension as an example. This dimension contains 14 attributes, so during processing I see a SELECT DISTINCT <attribute/column> FROM TABLE for each of the 14 attributes. The most efficient execution per query is via a covering index on the single attribute column, so the SELECT DISTINCT results in an index scan and a stream aggregate. I can easily prove this is, by far, the most efficient method per query.
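To illustrate the pattern with placeholder names (dbo.DimCustomer and Region are made up, not our actual schema), this is roughly what each of the 14 queries and its covering index looks like:

-- One of the 14 per-attribute queries SSAS issues during dimension processing
SELECT DISTINCT Region
FROM dbo.DimCustomer;

-- A single-column nonclustered index covers that query, so the plan is a
-- narrow index scan feeding a stream aggregate rather than a clustered index scan
CREATE NONCLUSTERED INDEX IX_DimCustomer_Region
    ON dbo.DimCustomer (Region);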
Taking this to the next level, that would mean 14 non-clustered indexes on this dimension table. While this initially left a bad taste in my mouth, the result is exactly what I need it to be. Am I missing something obvious? Is this a crazy approach?
The only downside I could come up with is on INSERTs during the daily load, but we typically disable all NC indexes during the load anyway. I have plenty of time to rebuild the NC indexes between the load and cube processing.
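Roughly the pattern I have in mind around the load window (again using the placeholder index and table names from above):

-- Before the daily load: disable the nonclustered index so inserts don't have to maintain it
ALTER INDEX IX_DimCustomer_Region ON dbo.DimCustomer DISABLE;

-- ... daily load runs here ...

-- Before cube processing: rebuild the index, which repopulates and re-enables it
ALTER INDEX IX_DimCustomer_Region ON dbo.DimCustomer REBUILD;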
Should I check myself into an institution?
March 1, 2012 at 10:28 am
Do you have the drive space to support those indexes? Will you in five years, given data growth?
Do you have the time to create and/or maintain those indexes? Will you in five years, given data growth?
Are there other queries that happen later that might benefit from wider multi-column indexes (i.e. augment the single-column indexes for other uses)?
If the answers to those questions point you toward doing this, and nothing else argues against it, then do it!
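For the space question, something along these lines gives a rough per-index footprint on the dimension table (dbo.DimCustomer is just a placeholder object name):

-- Approximate size in MB of each index on the dimension table
SELECT  i.name            AS index_name,
        i.type_desc,
        SUM(ps.used_page_count) * 8 / 1024.0 AS used_mb
FROM    sys.indexes AS i
JOIN    sys.dm_db_partition_stats AS ps
        ON ps.object_id = i.object_id
       AND ps.index_id  = i.index_id
WHERE   i.object_id = OBJECT_ID(N'dbo.DimCustomer')
GROUP BY i.name, i.type_desc
ORDER BY used_mb DESC;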