May 7, 2008 at 4:04 am
How about using recursion :hehe: to replace loops? (One of the great new functions in SQL2005)
No need for a tally or temporary tables.
Heres a params example:
DECLARE @params varchar(8000)
SET @params = '1,2,3,4,5,6,7,8,9,10';
WITH Params AS (
-- start with last item
SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value], len(@Params) as start
UNION ALL
-- base case
SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value], charindex(',', @params) + 1 as start
UNION ALL
-- recursive case
SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value], charindex(',', @params, start) + 1 as start
FROM Params
WHERE charindex(',', @params, start) > 0
)
SELECT *
FROM Params
OPTION (MAXRECURSION 999); -- must be greater than max number of items
May 7, 2008 at 4:06 am
Another great article ! Thanks
The first reference for a tally table that I can recall is in "Guide to Sybase and SQL Server" by C. J. Date and D.McGoveran published in June of 1992.
Does anyone know of an earlier reference ?
SQL = Scarcely Qualifies as a Language
May 7, 2008 at 4:12 am
This is a super cool article. Let me repeat, this is a super cool article. What a refreshing read!
I look back on all the TSQL I have written using loops or cursors - arrrgh! Oh, and all the arguments/discussions on cursors, loops, variables of type table....
I am hooked on Tally tables. It is an extremely obvious in-your-face concept and I can't imagine why I haven't used this technique before.
Thanks again.
-Mike
May 7, 2008 at 4:14 am
:smooooth: This is slick...... Great work Jeff!
This is a great example of implementing set-based programming, three thumbs up from me.
Now to apply this to problems going forward!
Thanks for the excellent article and insight into this subject.
May 7, 2008 at 4:39 am
From the article, you wrote " Tally table .. starting at 0 or 1 (mine start at 1)"
There are advantages for having the Tally table include 0 and the "Freight by Date" problem becomes easier if 0 is included.
The original SQL solution contains:
SELECT t.N-1+@DateStart AS ShippedDate
FROM dbo.Tally t
WHERE t.N-1+@DateStart <= @DateEnd
This SQL snippet has "N - 1", which derives the tally to start with 0. Wouldl it not be simplier to have the tally table already contain 0?
The original SQL solution has a problem that if the tally table does have a row for zero, the results are different (there is an extra day before the oldest shipment).
The original SQL solution has been modified to eliminate all variables so that the solution can be used in a view, does not assume that the tally table has 0 but allows the tally table to contain 0:
SELECT dates.ShippedDate
, COALESCE( SUM(o.Freight) , 0) AS TotalFreight
FROM dbo.Orders o
RIGHT OUTER JOIN
(SELECT t.N - 1 + Shipments.DateStart AS ShippedDate
FROM dbo.Tally t
JOIN(SELECTMIN(ShippedDate) AS DateStart
,MAX(ShippedDate) AS DateEnd
FROMdbo.Orders
) AS Shipments
ON t.N BETWEEN 1 AND ( Shipments.DateEnd - Shipments.DateStart + 1 )
) AS dates
ON o.ShippedDate = dates.ShippedDate
GROUP BY dates.ShippedDate
ORDER BY dates.ShippedDate
Here is revised SQL that assumes that the tally table has 0. The solution is slight simplier with the original SQL as comments and bold.
SELECT dates.ShippedDate
, COALESCE( SUM(o.Freight) , 0) AS TotalFreight
FROM dbo.Orders o
RIGHT OUTER JOIN
-- (SELECT t.N - 1 + Shipments.DateStart AS ShippedDate
(SELECT t.N + Shipments.DateStart AS ShippedDate
FROMdbo.Tally t
JOIN(SELECT MIN(ShippedDate) AS DateStart
,MAX(ShippedDate) AS DateEnd
FROMdbo.Orders
) AS Shipments
-- ON t.N BETWEEN 1 AND ( Shipments.DateEnd - Shipments.DateStart + 1 )
ON t.N BETWEEN 0 AND (Shipments.DateEnd - Shipments.DateStart )
) AS dates
ON o.ShippedDate = dates.ShippedDate
GROUP BY dates.ShippedDate
ORDER BY dates.ShippedDate
SQL = Scarcely Qualifies as a Language
May 7, 2008 at 6:05 am
Fantastic article, Jeff.
Many thanks.
May 7, 2008 at 6:44 am
Bravo - Very good article. It is bookmarked and printed.
-- Cory
May 7, 2008 at 6:45 am
Jeff you have done it again! Faaaantastic!:P
:-PManie Verster
Developer
Johannesburg
South Africa
I can do all things through Christ who strengthens me. - Holy Bible
I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)
May 7, 2008 at 6:45 am
The CTE solution, while elegent, does suffer from a higher resource utilization as the number of element increases. Here is an example with 887 elements.
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @params varchar(8000)
SET @params =
'1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,509,510,511,512,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,632,633,634,635,636,637,638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,749,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887';
WITH Params AS (
SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value], len(@Params) as start
UNION ALL
SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value], charindex(',', @params) + 1 as start
UNION ALL
SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value], charindex(',', @params, start) + 1 as start
FROM Params
WHERE charindex(',', @params, start) > 0
)
select value FROM Params ORDER BY value OPTION (MAXRECURSION 999);
PRINT 'CTE Done'
SELECT substring(',' + @params + ',', Tally.N + 1
,charindex(',', ',' + @params + ',', Tally.N + 1) - Tally.N - 1)
AS Value
FROM tempdb.dbo.Tally AS TAlly
WHERE Tally.N <= len(',' + @params + ',') - 1
AND substring(',' + @params + ',', Tally.N, 1) = ','
ORDER BY value
PRINT 'Tally Done'
CPU and I/O statistics are:
CTE Solution:
Table 'Worktable'. Scan count 2, logical reads 5321, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 32 ms, elapsed time = 240 ms.
Tally Solution:
Table 'Tally'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 23 ms.
SQL = Scarcely Qualifies as a Language
May 7, 2008 at 6:54 am
Hugo:
Just SP is fine (though I personally dislike the habit - as if anything other than a stored procedure might follow an EXEC keyword). It's SP_ (with an underscore directly following the SP) that turns it into the special prefix reserved for system stored procedures.
Actually, you can execute scalar UDFs.
create function ExecTest()
returns int
as
begin
return (1)
end
go
declare @a int
exec @a = dbo.exectest
select @a
Will select 1
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 7, 2008 at 7:04 am
Awesome article. Alot of useful examples even though it's just the tip of the iceberg
For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
May 7, 2008 at 7:07 am
🙂 Great work Jeff! Five stars!
May 7, 2008 at 7:27 am
chris (5/7/2008)
How about using recursion :hehe: to replace loops? (One of the great new functions in SQL2005)No need for a tally or temporary tables.
Heres a params example:
DECLARE @params varchar(8000)
SET @params = '1,2,3,4,5,6,7,8,9,10';
WITH Params AS (
-- start with last item
SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value],
len(@Params) as start
UNION ALL
-- base case
SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],
charindex(',', @params) + 1 as start
UNION ALL
-- recursive case
SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],
charindex(',', @params, start) + 1 as start
FROM Params
WHERE charindex(',', @params, start) > 0
)
SELECT *
FROM Params
OPTION (MAXRECURSION 999); -- must be greater than max number of items
I ran some speed and load tests on this code.
DECLARE @params varchar(8000)
--SET @params = '1,2,3,4,5,6,7,8,9,10';
select @params = coalesce(@params + ',' + cast(number as varchar(10)), cast(number as varchar(10)))
from dbo.numbers
where number between 1 and 1820 -- 1820 was most I could fit in varchar(8000) with this pattern
;WITH Params AS (
-- start with last item
SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params))))
as [value], len(@Params) as start
UNION ALL
-- base case
SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],
charindex(',', @params) + 1 as start
UNION ALL
-- recursive case
SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],
charindex(',', @params, start) + 1 as start
FROM Params
WHERE charindex(',', @params, start) > 0
)
SELECT *
FROM Params
OPTION (MAXRECURSION 2000);
Average runtime = 90 ms, CPU time = 47 ms, 2 scans & 10920 reads for 'Worktable'.
Numbers version 1 (uses UDF):
select parsed
from common.dbo.stringparser(@params, ',')
/*
/****** Object: UserDefinedFunction [dbo].[StringParser] Script Date: 05/07/2008 09:10:53 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE function [dbo].[StringParser]
(@String_in varchar(max),
@Delimiter_in char(1))
returns table
as
return(
SELECT top 100 percent
SUBSTRING(@String_in+@Delimiter_in, number,
CHARINDEX(@Delimiter_in, @String_in+@Delimiter_in, number) - number)
as Parsed,
row_number() over (order by number) as Row
FROM numbers
WHERE number <= LEN(@String_in)
AND SUBSTRING(@Delimiter_in + @String_in, number, 1) = @Delimiter_in
ORDER BY number
);
*/
@params defined and populated with the same code.
Avg runtime = 64 ms, 16 ms CPU time, 1 scan & 15 reads from dbo.Numbers.
Removing the UDF overhead and querying against Numbers directly in the code (this also eliminates the Order By and RowNumber columns from the UDF):
SELECT
SUBSTRING(@params+',', number,
CHARINDEX(',', @params+',', number) - number) as Parsed
FROM dbo.numbers
WHERE number <= LEN(@params)
AND SUBSTRING(',' + @params, number, 1) = ','
Same method of populating @params.
Avg runtime = 54 ms, 0 ms CPU time (too short to measure), 1 scan & 15 reads from dbo.Numbers.
The recursive CTE is definitely fast, but both versions of the Numbers table parser were faster, and required less IO. On the Numbers version that has the code directly in the query (instead of a UDF), the only real time spent on the thing is displaying the results.
If I change it to:
DECLARE @params varchar(8000), @Res varchar(10)
--SET @params = '1,2,3,4,5,6,7,8,9,10';
select @params = coalesce(@params + ',' + cast(number as varchar(10)),
cast(number as varchar(10)))
from dbo.numbers
where number between 1 and 1820
SELECT @res =
SUBSTRING(@params+',', number,
CHARINDEX(',', @params+',', number) - number) --as Parsed
FROM dbo.numbers
WHERE number <= LEN(@params)
AND SUBSTRING(',' + @params, number, 1) = ','
Which eliminates returning the results to the client (as if it were an internal query instead of a user query), the total run-time goes down to 4 ms.
If I change the CTE as follows:
DECLARE @params varchar(8000), @Res varchar(10)
--SET @params = '1,2,3,4,5,6,7,8,9,10';
select @params = coalesce(@params + ',' + cast(number as varchar(10)),
cast(number as varchar(10)))
from dbo.numbers
where number between 1 and 1820
;WITH Params AS (
-- start with last item
SELECT REVERSE(SUBSTRING(REVERSE(@Params), 0, charindex(',', REVERSE(@Params)))) as [value],
len(@Params) as start
UNION ALL
-- base case
SELECT SUBSTRING(@Params, 0, charindex(',', @params)) as [value],
charindex(',', @params) + 1 as start
UNION ALL
-- recursive case
SELECT SUBSTRING(@Params, start, charindex(',', @params, start) - start) as [value],
charindex(',', @params, start) + 1 as start
FROM Params
WHERE charindex(',', @params, start) > 0
)
SELECT @res = [value]
FROM Params
OPTION (MAXRECURSION 2000);
CPU time is 47 ms and run-time is reduced to 54 ms.
Summary: While the CTE works, and is fast by any normal standard, the Numbers version is even faster, and requires less IO.
Note: All tests run 5 or more times on an isolated box running no concurrent queries.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 7, 2008 at 7:27 am
As always, a great article from Jeff.
As for performance testing of various techniques by anyone, I am continually amazed as to the incorrect methods for generating sample data.
Example (taken from this thread)
SET @params =
'1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
For the use of a tally table or other mechanism, populating a variable with sample data like the above is completely wrong! Those delimited values would never be passed by an application. The proper approach for generating sample data is to use RANDOM values.
May 7, 2008 at 7:33 am
Excellent article, Jeff. Good introduction.
I've been using Numbers tables for a couple of years now, and I just keep finding more uses for them all the time.
Examples include: String and Column parsing, Dates/Times lists, generating test data, cleaning up strings, random number generation, finding missing rows in ID columns
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 15 posts - 16 through 30 (of 511 total)
You must be logged in to reply to this topic. Login to reply