August 21, 2012 at 3:25 am
Hi
I have a fact table with around 100 million rows. Below is my query to pull top 10 records. ( i understand that this is fact table and I should be talking cubes, but I have the requirement to do it using a tsql query):
SELECT TOP 10 (DLDest.CityCode) AS 'DestCityCode',Count(DLDest.CityCode) AS 'CountDestCityCode',
(DL.CityCode) AS 'OriginCityCode',
(DA.AirlineName) AS 'AirlineName'
FROM dbo.FactFares AS FAF
LEFT OUTER JOIN DimLocation AS DLDest ON FAF.Destination = DLDest.Location_Key
LEFT OUTER JOIN DimLocation AS DL ON FAF.Origin = DL.Location_Key
LEFT OUTER JOIN DimAirlines AS DA ON FAF.Airline = DA.AirlineCode
WHERE (FAF.Tax >= 0)
GROUP BY DLDest.CityCode , DL.CityCode , DA.AirlineName
ORDER BY Count(DLDest.CityCode) DESC
So if i create an indexed view and simply do : select * from view. will that give a performance boost and correct result? Currently the query is taking ages for obvious reason of sorting entire table.(FAF.Tax >= 0 doesn't filter out much records)
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
August 21, 2012 at 3:47 am
Just an addition that i will have to remove top, left joins etc to make an indexed view here.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
August 21, 2012 at 4:01 am
-- The query requires a scan of the whole 100M rows to evaluate the ORDER BY.
-- An index with Destination, Origin, Airline and Tax will help.
-- You *probably* don't need to join the lookups until after the aggregate -
-- this help will speed up the query.
;WITH Preaggregate AS (
SELECT TOP 10
FAF.Destination,
[CountDestCityCode] = Count(FAF.Destination),
FAF.Origin,
FAF.Airline
FROM dbo.FactFares AS FAF
WHERE (FAF.Tax >= 0)
GROUP BY FAF.Destination, FAF.Origin, FAF.Airline
ORDER BY Count(FAF.Destination) DESC
)
SELECT
DLDest.CityCode AS [DestCityCode],
FAF.[CountDestCityCode],
DL.CityCode AS [OriginCityCode],
DA.AirlineName AS [AirlineName]
FROM Preaggregate FAF
LEFT OUTER JOIN DimLocation AS DLDest ON FAF.Destination = DLDest.Location_Key
LEFT OUTER JOIN DimLocation AS DL ON FAF.Origin = DL.Location_Key
LEFT OUTER JOIN DimAirlines AS DA ON FAF.Airline = DA.AirlineCode
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 21, 2012 at 4:31 am
Thanks so much for your thoughts. I'll give it a try.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
August 21, 2012 at 9:32 am
But the results of this and previous query doesn't match. Seems ordering on different column changed the results.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
August 21, 2012 at 9:41 am
S_Kumar_S (8/21/2012)
But the results of this and previous query doesn't match. Seems ordering on different column changed the results.
The results will differ if any of the DIM... tables introduce more than one row per row in dbo.FactFares. Without data to analyse, it's impossible for me to say. You will have to look into this. If you need some help, try picking up say the first 50 rather than the first 10, from both queries, and posting here.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 22, 2012 at 9:59 am
thanks Chris,I'll look into it myself.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
August 22, 2012 at 12:02 pm
Just on a side note, be careful with indexed views, particularly top 10 views and the like. On paper it doesn't look like they'll cost much overhead, they're only monitoring 10 rows. In reality, they're not. Almost every update/insert/delete on any connected table will cause this view to refresh.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
August 22, 2012 at 11:10 pm
indexed view works fine with static data (non changing data).
if your dimension changes frequently drop and create index .
August 22, 2012 at 11:47 pm
Evil Kraig F (8/22/2012)
Just on a side note, be careful with indexed views, particularly top 10 views and the like. On paper it doesn't look like they'll cost much overhead, they're only monitoring 10 rows. In reality, they're not.
Views that contain TOP cannot be indexed directly, so this specific issue does not really arise in these terms.
Almost every update/insert/delete on any connected table will cause this view to refresh.
Maintenance to an indexed view is built in to the query plan of the statement that causes a change to the view. In general, the maintenance is incremental, affecting only those rows in the indexed view that are directly affected by the change. I'm sure you knew this Craig, but I wanted to make it clear for other readers that it is not the whole view that is refreshed whenever a change occurs.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
August 23, 2012 at 3:45 am
This was a good point Paul.
Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply