October 28, 2010 at 4:11 am
Hi Guys,
Just a small one this time.
I have a table which contains household insurance quotations.
As is normal with insurance, each customer can do multiple quotes for each policy; they change little things like their excess, the number of days of unoccupancy they have, etc., to get a good balance between low premium and good cover.
In order to report on conversion rates (the percentage of customers who took out an insurance policy after quoting), we need to count up the number of quotes that have been done.
Now, if we say that each one of those individual quotes is one quote, our conversion rate will be horrifically low. As such, we want to group those quotes up, so if a customer phones in to get a quote on Tuesday, thinks about it, does another quote on Wednesday, then finally takes out a policy on Thursday, this is counted as one quote and not three.
We have a file of 'all' quotes (imaginatively named) tblHhiQuotes.
In order to get the desired 'unique' quotes, we are grouping the quotes by customer surname, customer postcode and insurance cover type (e.g. contents, buildings or integrated).
All I want to get is an integer count... I can achieve this in the following manner:
DECLARE @count AS TABLE (one bit)

INSERT INTO @count
SELECT 1
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname
,cPostcode
,cCoverType

--output...
SELECT COUNT(*)
FROM @count
this works and returns a value of 2173
Initially I tried SELECT COUNT(*) FROM... but this of course returns 2173 rows of small numbers (those being the numbers of 'duplicates').
I'm happy that my query works and returns a value of 2173, but I'm thinking there MUST be a more efficient way of doing this without resorting to inserting records into a temporary table. I may well be being very, very stupid and forgetting some extremely simple function, but I can't think of any...
Is there a built-in function of COUNT or GROUP BY that I'm missing here?
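(For readers following along: the group-then-count idea in this post can be sketched with an in-memory SQLite database. The schema and sample data below are made up for illustration and are not the real tblHhiQuotes.)

```python
# Sketch of "group then count the groups", using an in-memory SQLite
# database with an assumed, simplified schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (surname TEXT, postcode TEXT, covertype TEXT)")
# Three quotes by Smith for the same cover plus one by Jones:
# 4 rows, but only 2 unique (surname, postcode, covertype) groups.
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?)",
    [
        ("Smith", "AB1 2CD", "Contents"),
        ("Smith", "AB1 2CD", "Contents"),
        ("Smith", "AB1 2CD", "Contents"),
        ("Jones", "EF3 4GH", "Buildings"),
    ],
)

# COUNT(*) with GROUP BY returns one count PER GROUP (the "2173 rows
# of small numbers" problem described above)...
per_group = conn.execute(
    "SELECT COUNT(*) FROM quotes GROUP BY surname, postcode, covertype"
).fetchall()

# ...so to count the groups themselves, count the rows of the grouped
# result via a derived table.
(unique_quotes,) = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT 1 FROM quotes GROUP BY surname, postcode, covertype
    )
""").fetchone()
print(unique_quotes)  # 2
```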
Thanks!
Ben
^ Thats me!
----------------------------------------
01010111011010000110000101110100 01100001 0110001101101111011011010111000001101100011001010111010001100101 01110100011010010110110101100101 011101110110000101110011011101000110010101110010
----------------------------------------
October 28, 2010 at 4:21 am
;WITH cteSource(Data)
AS (
SELECT 1
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname,
cPostcode,
cCoverType
)
SELECT SUM(Data)
FROM cteSource
N 56Β°04'39.16"
E 12Β°55'05.25"
October 28, 2010 at 4:25 am
That's brilliant, thanks.
I'm surprised there isn't an alternative version of COUNT in SQL Server that returns the number of rows, like a numrows() function or whatever.
Thanks
Ben
October 28, 2010 at 5:12 am
BenWard (10/28/2010)
That's brilliant, thanks. I'm surprised there isn't an alternative version of COUNT in SQL Server that returns the number of rows, like a numrows() function or whatever.
Thanks
Like this version?
SELECT COUNT(*)
FROM (
SELECT 1
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN
'01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname
,cPostcode
,cCoverType
) d
:-)
For better assistance in answering your questions, please read this.
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop (Jeff Moden)
October 28, 2010 at 5:18 am
haha yes
which is (I think) exactly the same as
;WITH cteSource(Data)
AS (
SELECT 1
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname,
cPostcode,
cCoverType
)
SELECT COUNT(*)
FROM cteSource
isn't it?!
I'm sure we could spend many many hours attempting to determine if we can save 13 milliseconds of processing time by redoing this that or the other here!
Ben
October 28, 2010 at 5:39 am
BenWard (10/28/2010)
haha yes, which is (I think) exactly the same as
;WITH cteSource(Data)
AS (
SELECT 1
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname,
cPostcode,
cCoverType
)
SELECT COUNT(*)
FROM cteSource
isnt it?!?
Heh, yep, you're absolutely right; they're functionally equivalent, though I wouldn't go so far as to say they are the same.
I haven't got an instance to play with today - I wonder if any of the extensions to GROUP BY (e.g. WITH ROLLUP) would do the trick in a single statement?
I'm sure we could spend many many hours attempting to determine if we can save 13 milliseconds of processing time by redoing this that or the other here!
There's a big, current topic on this, "Speed Phreakery" or something like that. Well worth a read.
October 28, 2010 at 6:47 am
I did try ROLLUP; that just gave extra records.
I'll give that a read...
Ben
October 28, 2010 at 8:16 am
BenWard (10/28/2010)
That's brilliant, thanks. I'm surprised there isn't an alternative version of COUNT in SQL Server that returns the number of rows, like a numrows() function or whatever.
Thanks
You mean like Count(DISTINCT vSurname + cPostcode + cCoverType ) ?
You may need to do conversions to get the datatypes to match.
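The conversion caveat matters for another reason too: plain concatenation can make two different groups collide, and a NULL in any column makes the whole key NULL, which COUNT(DISTINCT ...) then ignores. A sketch in SQLite (column names and data invented for illustration; SQLite uses || for concatenation where T-SQL uses +):

```python
# Demonstrates two pitfalls of COUNT(DISTINCT a + b + c):
# key collisions and NULL propagation.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (surname TEXT, postcode TEXT, covertype TEXT)")
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?)",
    [
        ("AB", "1",  "X"),  # concatenates to 'AB1X'
        ("A",  "B1", "X"),  # ALSO concatenates to 'AB1X' -- collision!
        ("C",  None, "X"),  # NULL postcode makes the whole key NULL
    ],
)

# Naive concatenation: the first two rows collide into one key, and the
# NULL-keyed row is excluded from COUNT entirely.
(naive,) = conn.execute(
    "SELECT COUNT(DISTINCT surname || postcode || covertype) FROM quotes"
).fetchone()
print(naive)  # 1, even though there are really 3 distinct groups

# A delimiter plus COALESCE keeps the keys apart (assuming the
# delimiter character never appears in the data itself).
(safer,) = conn.execute("""
    SELECT COUNT(DISTINCT
        surname || '|' || COALESCE(postcode, '') || '|' || covertype)
    FROM quotes
""").fetchone()
print(safer)  # 3
```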
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 28, 2010 at 8:49 am
you
are
a ...
:-)
Ben
October 28, 2010 at 9:01 am
Hahaha that's too funny!
I bet it's the same performance too.
Ben, if you're interested and have the time, you could assess all three queries for performance using the simplest method:
SET STATISTICS TIME ON
SET STATISTICS IO ON
<<statements>>
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
Run each statement three or four times and take the average; the first run of each statement is likely to be slowest, because subsequent runs benefit from caching.
If you're on a dev box on your own, I think you can use something like DBCC FREEPROCCACHE before each run, in which case a single run per statement is enough.
Measuring the performance of a statement or batch can get quite involved. The method above is quick and reliable enough to use during production coding to tie-break between functionally-equivalent methods.
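The same "discard the cold first run, average the warm ones" method can be sketched outside SQL Server too. Here it is in Python against an in-memory SQLite table (illustrative only; SET STATISTICS TIME and DBCC FREEPROCCACHE are SQL Server features with no direct SQLite equivalent):

```python
# Rough analogue of the timing method described above: run a query
# several times, drop the first (cold) run, average the rest.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (surname TEXT, postcode TEXT, covertype TEXT)")
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?)",
    [("S%d" % (i % 500), "P%d" % (i % 100), "Contents") for i in range(10000)],
)

QUERY = """
    SELECT COUNT(*) FROM (
        SELECT 1 FROM quotes GROUP BY surname, postcode, covertype
    )
"""

timings = []
for run in range(4):
    start = time.perf_counter()
    conn.execute(QUERY).fetchone()
    timings.append(time.perf_counter() - start)

# Discard the first (cold) run and average the warm runs.
warm_avg = sum(timings[1:]) / len(timings[1:])
print("warm average: %.6fs" % warm_avg)
```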
October 28, 2010 at 12:01 pm
Or maybe this, since PARTITION BY generally seems to be pretty fast in SQL Server:
;WITH cte1 AS (
SELECT ROW_NUMBER() OVER (PARTITION BY vSurname, cPostcode, cCoverType ORDER BY vSurname)
AS rownum
FROM tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
)
SELECT COUNT(*)
FROM cte1
WHERE rownum = 1
I'd be afraid that the concatenation may be a bit slow, since string manipulation is definitely a weak point for SQL.
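The ROW_NUMBER()/PARTITION BY de-duplication above can be sketched in SQLite as well (window functions require SQLite 3.25+, which ships with recent Python builds; schema and data are invented for illustration):

```python
# Number the rows within each group, then count only the first row of
# each group -- an alternative way to count unique groups.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (surname TEXT, postcode TEXT, covertype TEXT)")
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?, ?)",
    [
        ("Smith", "AB1 2CD", "Contents"),
        ("Smith", "AB1 2CD", "Contents"),
        ("Jones", "EF3 4GH", "Buildings"),
    ],
)

(unique_quotes,) = conn.execute("""
    WITH numbered AS (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY surname, postcode, covertype
                   ORDER BY surname
               ) AS rownum
        FROM quotes
    )
    SELECT COUNT(*) FROM numbered WHERE rownum = 1
""").fetchone()
print(unique_quotes)  # 2
```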
Scott Pletcher, SQL Server MVP 2008-2010
October 29, 2010 at 4:05 am
And the results are in!
My query was the slowest (as was expected lol):
most common speed 156ms, with a low of 141 and a high of 196.
3rd place goes to Peso, whose query most commonly executes in 141ms, with a low of 130 and a high of 156.
2nd is Scott; the partitioned query can be very fast but is not reliably so. I get even results of either 125ms or 141ms, with the occasional high of 170.
1st place goes to Drew, with a most common time of 125ms, a low of 110 and a high of 125.
here is your prize:
Thanks, guys! :-)
As always, very impressed by the great community spirit here.
Ben
October 29, 2010 at 4:58 am
BenWard (10/29/2010)
And the results are in! My query was the slowest (as was expected lol):
most common speed 156ms, with a low of 141 and a high of 196.
3rd place goes to Peso, whose query most commonly executes in 141ms, with a low of 130 and a high of 156.
2nd is Scott; the partitioned query can be very fast but is not reliably so. I get even results of either 125ms or 141ms, with the occasional high of 170.
1st place goes to Drew, with a most common time of 125ms, a low of 110 and a high of 125.
Ahem... how did the derived table compare with the CTE? :-)
October 29, 2010 at 5:30 am
erm....
Msg 8155, Level 16, State 2, Line 6
No column was specified for column 1 of 'd'.
:Whistling:
Ben
October 29, 2010 at 5:38 am
SELECT SUM(1) AS Items
FROM dbo.tblHhiQuotes
WHERE sGlDate <= '28 Oct 2010'
AND sEffectiveDate BETWEEN '01 Nov 2010' AND '30 Nov 2010'
AND cTransType = 'FQ'
GROUP BY vSurname,
cPostcode,
cCoverType
WITH ROLLUP
HAVING GROUPING(vSurname) = 1