June 7, 2007 at 8:39 pm
I have a query with many subqueries that use the same filters. Is there a way I can list the filters only once?
SELECT MetroCity AS City, COUNT(*) AS CountMonthOne
, (SELECT COUNT(*) FROM Property a WHERE a.MetroCity = s.MetroCity
AND SaleDate Between '2005-08-01' AND '2005-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS CountMonthTwo
, AVG(SalePrice) AS AverageMonthOne
, (SELECT AVG(SalePrice) FROM Property b WHERE b.MetroCity = s.MetroCity
AND SaleDate Between '2005-08-01' AND '2005-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS AverageMonthTwo
, (SELECT COUNT(*) FROM Property c WHERE c.MetroCity = s.MetroCity
AND SaleDate Between '2006-01-01' AND '2006-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS CountYTDOne
, (SELECT COUNT(*) FROM Property d WHERE d.MetroCity = s.MetroCity
AND SaleDate Between '2005-01-01' AND '2005-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS CountYTDTwo
, (SELECT AVG(SalePrice) FROM Property e WHERE e.MetroCity = s.MetroCity
AND SaleDate Between '2006-01-01' AND '2006-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS AverageYTDOne
, (SELECT AVG(SalePrice) FROM Property f WHERE f.MetroCity = s.MetroCity
AND SaleDate Between '2005-01-01' AND '2005-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) AS AverageYTDTwo
FROM dbo.Property s
WHERE SaleDate Between '2006-08-01' AND '2006-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
GROUP BY MetroCity, s.MetroCity
ORDER BY MetroCity
June 7, 2007 at 10:34 pm
Create table UDF and supply changed values as parameters to it.
Do your aggregations from the values returned by this UDF.
_____________
Code for TallyGenerator
June 8, 2007 at 7:22 am
I couldn't test it but the following should work and makes the code a little less redundant.
select metrocity
,saledate
,saleprice
into #temp_
from property
where saledate between '2005-01-01' and '2005-08-31'
or saledate between '2006-01-01' and '2006-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
select metrocity as city
,count(*) as CountMonthOne
,(SELECT COUNT(*) FROM #temp t WHERE t.MetroCity = s.MetroCity and saledate between '2005-08-01' and '2005-08-31') as CountMonthTwo
,AVG(SalePrice) AS AverageMonthOne
,(SELECT AVG(SalePrice) FROM #temp t WHERE t.MetroCity = s.MetroCity and saledate between '2005-08-01' and '2005-08-31') as AverageMonthTwo
--,etc......
FROM dbo.Property s
WHERE SaleDate Between '2006-08-01' AND '2006-08-31'
AND County = 'King'
AND MetroCity IS NOT NULL
AND PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
GROUP BY MetroCity, s.MetroCity
ORDER BY MetroCity
--James
June 11, 2007 at 12:24 pm
Here's a solution that gets rid of all subqueries. It could be written simpler with CTE if you have SS2005. If you use the temporary table, I would still use the unions in a derived table rather than the subqueries--and the where clause in the unions would just be "WHERE SaleDate between...". BTW, I got much better performance out of both our queries by creating one index on County, PropertyUseCode and SaleDate.
select MetroCity as City, Count(MonthOne) as CountMonthOne, Count(MonthTwo) as CountMonthTwo, Avg(MonthOne) as AvgMonthOne, Avg(MonthTwo) as AvgMonthTwo, Count(YTDOne) as CountYTDOne, Count(YTDTwo) as CountYTDTwo, Avg(YTDOne) as AvgYTDOne, Avg(YTDTwo) as AvgYTDTwo from ( select MetroCity, SalePrice as MonthOne, null as YTDOne, null as MonthTwo, null as YTDTwo from dbo.Property where MetroCity IS NOT NULL AND County = 'King' and PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 and SaleDate Between '2006-08-01' AND '2006-08-31' union all select MetroCity, null as MonthOne, SalePrice as YTDOne, null as MonthTwo, null as YTDTwo from dbo.Property where MetroCity IS NOT NULL AND County = 'King' and PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 and SaleDate Between '2006-01-01' AND '2006-08-31' union all select MetroCity, null as MonthOne, null as YTDOne, SalePrice as MonthTwo, null as YTDTwo from dbo.Property where MetroCity IS NOT NULL AND County = 'King' and PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 and SaleDate Between '2005-08-01' AND '2005-08-31' union all select MetroCity, null as MonthOne, null as YTDOne, null as MonthTwo, SalePrice as YTDTwo from dbo.Property where MetroCity IS NOT NULL AND County = 'King' and PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 and SaleDate Between '2005-01-01' AND '2005-08-31' ) as P order by MetroCity group by MetroCity;
Tomm Carr
--
Version Normal Form -- http://groups.google.com/group/vrdbms
June 11, 2007 at 12:41 pm
If Tomm's solution works then this might be a bit faster because it should only do one pass thru the table. Since I have no data I could not test this...
select
MetroCity as City,
Count(MonthOne) as CountMonthOne,
Count(MonthTwo) as CountMonthTwo,
Avg(MonthOne) as AvgMonthOne,
Avg(MonthTwo) as AvgMonthTwo,
Count(YTDOne) as CountYTDOne,
Count(YTDTwo) as CountYTDTwo,
Avg(YTDOne) as AvgYTDOne,
Avg(YTDTwo) as AvgYTDTwo
from
(
select
MetroCity,
CASE
WHEN SaleDate Between '2006-08-01' AND '2006-08-31' THEN SalePrice
ELSE NULL
END AS MonthOne,
CASE
WHEN SaleDate Between '2006-01-01' AND '2006-08-31' THEN SalePrice
ELSE NULL
END AS YTDOne,
CASE
WHEN SaleDate Between '2005-08-01' AND '2005-08-31' THEN SalePrice
ELSE NULL
END AS MonthTwo,
CASE
WHEN SaleDate Between '2005-01-01' AND '2005-08-31' THEN SalePrice
ELSE NULL
END AS YTDTwo
from
dbo.Property
where MetroCity IS NOT NULL
AND County = 'King'
and PropertyUseCode = 'Single Family'
AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5
) as P
order by MetroCitygroup by MetroCity;
---------------------------------------------
[font="Verdana"]Nothing is impossible.
It is just a matter of time and money.[/font]
June 11, 2007 at 1:27 pm
Thanks everyone for your input. These were all great suggestions. They all work good and when comparing speed only, here are the results:
The one I posted takes about 4 seconds
JLK: 1 second to create the temp table and then 1 second to get the result set. I combined both sets into one stored procedure and then got the result set in 2 seconds and then in 1 second in subsequent runs.
Tomm Carr: 1st run: 23 seconds, 2nd run: 5 seconds; Subsequent runs: 1-2 seconds
JacekO: 1st run: 22 seconds, 2nd run: 4 seconds; Subsequent runs: 1-2 seconds
There are 1,338,224 records in the table. Thanks again!
Note: I did need to make one correction to JLK's query: where (saledate between '2005-01-01' and '2005-08-31' or saledate between '2006-01-01' and '2006-08-31'). Need to confine the "OR." For the other two, needed to put GROUP BY before ORDER BY.
June 11, 2007 at 2:24 pm
Wow. I generally do all I can to avoid temp tables but with almost 1.5M records, apparently the initial overhead averages out. Just for grins, how good is the hybrid of the best. Test this out:
select metrocity, saledate, saleprice into #temp from property where MetroCity IS NOT NULL AND County = 'King' AND PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 AND (saledate between '2005-01-01' and '2005-08-31' or saledate between '2006-01-01' and '2006-08-31'); select MetroCity as City, Count(MonthOne) as CountMonthOne, Count(MonthTwo) as CountMonthTwo, Avg(MonthOne) as AvgMonthOne, Avg(MonthTwo) as AvgMonthTwo, Count(YTDOne) as CountYTDOne, Count(YTDTwo) as CountYTDTwo, Avg(YTDOne) as AvgYTDOne, Avg(YTDTwo) as AvgYTDTwo from ( select MetroCity, CASE WHEN SaleDate Between '2006-08-01' AND '2006-08-31' THEN SalePrice ELSE NULL END AS MonthOne, CASE WHEN SaleDate Between '2006-01-01' AND '2006-08-31' THEN SalePrice ELSE NULL END AS YTDOne, CASE WHEN SaleDate Between '2005-08-01' AND '2005-08-31' THEN SalePrice ELSE NULL END AS MonthTwo, CASE WHEN SaleDate Between '2005-01-01' AND '2005-08-31' THEN SalePrice ELSE NULL END AS YTDTwo from #Temp) as P group by MetroCity order by MetroCity;
What are the numbers on that?
Tomm Carr
--
Version Normal Form -- http://groups.google.com/group/vrdbms
June 11, 2007 at 3:19 pm
Tom,
I ran both steps together and got the results in about 1 to 2 seconds!
June 12, 2007 at 4:04 am
And since there's only one SELECT from #temp...
select MetroCity as City, Count(MonthOne) as CountMonthOne, Count(MonthTwo) as CountMonthTwo, Avg(MonthOne) as AvgMonthOne, Avg(MonthTwo) as AvgMonthTwo, Count(YTDOne) as CountYTDOne, Count(YTDTwo) as CountYTDTwo, Avg(YTDOne) as AvgYTDOne, Avg(YTDTwo) as AvgYTDTwo from ( select MetroCity, CASE WHEN SaleDate Between '2006-08-01' AND '2006-08-31' THEN SalePrice ELSE NULL END AS MonthOne, CASE WHEN SaleDate Between '2006-01-01' AND '2006-08-31' THEN SalePrice ELSE NULL END AS YTDOne, CASE WHEN SaleDate Between '2005-08-01' AND '2005-08-31' THEN SalePrice ELSE NULL END AS MonthTwo, CASE WHEN SaleDate Between '2005-01-01' AND '2005-08-31' THEN SalePrice ELSE NULL END AS YTDTwo from (select metrocity, saledate, saleprice from property where MetroCity IS NOT NULL AND County = 'King' AND PropertyUseCode = 'Single Family' AND SalePrice >= (ApprLandVal + ApprImpsVal) * .5 AND (saledate between '2005-01-01' and '2005-08-31' or saledate between '2006-01-01' and '2006-08-31') ) t
) as P group by MetroCity order by MetroCity;
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 12, 2007 at 6:27 am
David: Glad I/we could help, and yes I did forget the parenthesis to group the "OR" clause. I'm also glad to see you got a lot of help/ideas on how to best write the query. I'm suprised that the time with the temp table and all the coorelated sub-queries was faster than a few of the others posted. Just goes to show that sometimes, simplier is better (though like most folks I think try to avoid them if possible) Thanks for the feed back on the posts.
Good Luck, James.
June 12, 2007 at 6:36 am
Instead of using #temp table use the table variable. It should work faster and is not creating any locks on the master db. And yes, sometimes when the initial record set is large I also preload the subset I need into a table variable in order to maximize performance.
And thanks for posting back, with the results.
---------------------------------------------
[font="Verdana"]Nothing is impossible.
It is just a matter of time and money.[/font]
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply