February 16, 2009 at 9:20 am
I have a fact table with multi-part clustered PK on date_key, store_key, sku_key, and a non-clustered index on sku_key, store_key, date_key. I want to run a query filtered by date and sku, but I'm geting very poor performance.
Why doesn't the optimizer lookup the sku_key first and then run the (second) query shown below which performs LOTS better? FYI this is on SQL 2005 Standard. I've read that SQL 2008 has something called star-schema optimization. I this something that would help here?
SELECT * FROM [dbo].[FactSalesFY2009] f
JOIN DimDate d ON d.date_key = f.date_key
JOIN DimStore s ON s.store_key = f.store_key
JOIN DimSku cs ON cs.sku_key = f.sku_key
WHERE d.full_date > '1/13/09' and cs.sku = '88116'
|--Parallelism(Gather Streams)
|--Merge Join(Inner Join, MERGE:(.[store_key])=([f].[store_key]), RESIDUAL:([VMIData].[dbo].[FactSalesFY2009].[store_key] as [f].[store_key]=[VMIData].[dbo].[DimStore].[store_key] as .[store_key]))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:(.[store_key]), ORDER BY:(.[store_key] ASC))
| |--Clustered Index Scan(OBJECT:([VMIData].[dbo].[DimStore].[PK_DimStore] AS ), ORDERED FORWARD)
|--Sort(ORDER BY:([f].[store_key] ASC))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([f].[store_key]))
|--Hash Match(Inner Join, HASH:([cs].[sku_key])=([f].[sku_key]))
|--Parallelism(Distribute Streams, Broadcast Partitioning)
| |--Nested Loops(Inner Join, OUTER REFERENCES:([cs].[sku_key]))
| |--Index Seek(OBJECT:([VMIData].[dbo].[DimSku].[_dta_index_DimSku_10_1701581100__K4_K1] AS [cs]), SEEK:([cs].[sku]='88116') ORDERED FORWARD)
| |--Clustered Index Seek(OBJECT:([VMIData].[dbo].[DimSku].[PK_DimSku] AS [cs]), SEEK:([cs].[sku_key]=[VMIData].[dbo].[DimSku].[sku_key] as [cs].[sku_key]) LOOKUP ORDERED FORWARD)
|--Nested Loops(Inner Join, OUTER REFERENCES:([d].[date_key], [Expr1008]) OPTIMIZED WITH UNORDERED PREFETCH)
|--Clustered Index Seek(OBJECT:([VMIData].[dbo].[DimDate].[PK_DimDate] AS [d]), SEEK:([d].[date_key] >= (5748) AND [d].[date_key] '2009-01-13 00:00:00.000') ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([VMIData].[dbo].[FactSalesFY2009].[PK_SalesFY2009] AS [f]), SEEK:([f].[date_key]=[VMIData].[dbo].[DimDate].[date_key] as [d].[date_key]) ORDERED FORWARD)
SELECT * FROM [dbo].[FactSalesFY2009] f
JOIN DimDate d ON d.date_key = f.date_key
JOIN DimStore s ON s.store_key = f.store_key
JOIN DimSku cs ON cs.sku_key = f.sku_key
WHERE d.full_date > '1/13/09' and f.sku_key = 3362
|--Nested Loops(Inner Join)
|--Clustered Index Seek(OBJECT:([VMIData].[dbo].[DimSku].[PK_DimSku] AS [cs]), SEEK:([cs].[sku_key]=(3362)) ORDERED FORWARD)
|--Hash Match(Inner Join, HASH:([d].[date_key])=([f].[date_key]))
|--Clustered Index Seek(OBJECT:([VMIData].[dbo].[DimDate].[PK_DimDate] AS [d]), SEEK:([d].[date_key] >= (5748) AND [d].[date_key] '2009-01-13 00:00:00.000') ORDERED FORWARD)
|--Merge Join(Inner Join, MERGE:(.[store_key])=([f].[store_key]), RESIDUAL:([VMIData].[dbo].[FactSalesFY2009].[store_key] as [f].[store_key]=[VMIData].[dbo].[DimStore].[store_key] as .[store_key]))
|--Clustered Index Scan(OBJECT:([VMIData].[dbo].[DimStore].[PK_DimStore] AS ), ORDERED FORWARD)
|--Nested Loops(Inner Join, OUTER REFERENCES:([f].[date_key], [f].[store_key], [f].[sku_key], [Expr1008]) WITH ORDERED PREFETCH)
|--Index Seek(OBJECT:([VMIData].[dbo].[FactSalesFY2009].[IX_FactSalesFY2009_3_2_1] AS [f]), SEEK:([f].[sku_key]=(3362)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([VMIData].[dbo].[FactSalesFY2009].[PK_SalesFY2009] AS [f]), SEEK:([f].[date_key]=[VMIData].[dbo].[FactSalesFY2009].[date_key] as [f].[date_key] AND [f].[store_key]=[VMIData].[dbo].[FactSalesFY2009].[store_key] as [f].[store_key] AND [f].[sku_key]=[VMIData].[dbo].[FactSalesFY2009].[sku_key] as [f].[sku_key]) LOOKUP ORDERED FORWARD)
February 17, 2009 at 2:20 pm
welcome to the wonderful world of query tuning! You'll have to make sure you have all the correct indexes in place with columns in the correct order - you may have to have a few tries to get it right.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
February 17, 2009 at 2:35 pm
SELECT
* -- really should specify the columns you need, even if it is all of them, if not a covering index may be of benefit
FROM
[dbo].[FactSalesFY2009] f
INNER JOIN DimDate d
ON (d.date_key = f.date_key)
INNER JOIN DimStore s
ON (s.store_key = f.store_key)
INERR JOIN DimSku cs
ON (cs.sku_key = f.sku_key)
WHERE
d.full_date > '1/13/09'
and cs.sku = '88116' -- An index on f.sku_key, f.date_key would probably help
Just my 2 cents worth.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply