December 14, 2009 at 6:49 am
Why should I use this method versus using a good old fashioned Numbers table?
December 14, 2009 at 6:59 am
msaleem-583379 (12/14/2009)
I've used a function similar to MiddleDatestoTable in my apps, however I followed a different approach....
You do realize since your routine is a multi-statement TVF that the Quesry Optimizer will always treat the table returned y this function as if it has only one row regardless of the nuumber of rows actually returned?
This means if your function is returning more than a few hundred rows you won't be getting an optimal execution plan.
I discovered this with some AsOF multi-statement TVF's I developed. I had a query linking seven of this TVF together, and after 30 minutes, cancelled the query and modified (actually drop and create) the TVF from multi-statement to in-line TVF. Amazingly, the query then returned in four minutes.
I have to agree with Paul White, here we go with the discussions regarding what is the best string splitting functions. The use of a recursive CTE, however, is not very scalable. If you don't have a CLR TVF, which Paul indicates is the best, the next best is one based on a tally table. I have a function that actually contains its own tally table should you not have one.
December 14, 2009 at 7:23 am
Rob Fisk (12/14/2009)
Pretty certain I found this based on another article here somewhere but cannot find it to give credit.
I believe that must be a Jeff Moden article, unless I'm mistaken?
December 14, 2009 at 7:37 am
I'd like to do a poll and find out what most people call it - a "tally" table, a "numbers" table, a "helper" table or what. It would be interesting to know.
Maybe there's a geographic trend. a "tally" table sounds like something out of Great Britain - where they ride around on the trolley - while "helper" comes from Texas/Colorado/Tennessee with some cities rated as the friendliest in the U.S. and then maybe straight up "numbers," an ambiguous and somewhat unhelpful name for L.A. and New York City where the people are rated by some as the most unhelpful in the U.S. (Of course we all know these are generalizations that don't hold true for everybody - and where do they come up with these stats anyway? ).
Are there any more names you've heard for the "tally" table?
Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
December 14, 2009 at 7:56 am
I called mine the 'Integers' table. Because it's a table of INTs. Also, I used to write code for the lumber industry, where 'Tally' has a specific meaning.
December 14, 2009 at 8:06 am
I have some great bit of difficulty in believing that people are still using While Loops and the "Hidden RBAR" method of recursive CTE's for such a thing.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 8:09 am
Jonathan Melo (12/14/2009)
Rob Fisk (12/14/2009)
Pretty certain I found this based on another article here somewhere but cannot find it to give credit.I believe that must be a Jeff Moden article, unless I'm mistaken?
First saw the Tally table mentioned by Jeff but it was shortly after and by someone else that I saw an article that carried the string splitting example.
I call it a Tally because that what it was referred to primarily when I was first introduced to it. Never heard it called a helper table which, I must say, is the most ambiguous of the names since it does not say what it helps with. Numbers table, fairly straightforward but lacking in descriptiveness since it does indeed contain numbers and only numbers.
If you look at the dictionary definition of Tally then it comes closest to the mark as a description though none quite fit the bill. Besides it's nicely alliterative.
[font="Arial"]_______________________________________________________
Change is inevitable... Except from a vending machine.[/font]
December 14, 2009 at 8:13 am
Paul White, I am certain this is not correct.
"THE" fastest way to achieve string splitting is a 2 stage approach
1. use a CLR scalar function to convert a CSV or delimitted string in to a fixed width column string eg 10 chars per id.
122_______12345324__1234567___12________
2. with the result string, use a tally/numbers/etc table to split the fixed width string using substring. substring(@fixedlist, n * 20-19, n * 20)
The problem with solely using CLR is that the marshalling between SQL and .NET is very slow per row.
So it actually turns out more efficient this way.
And in the specific case of this article, it is even easier, you can skip the CLR part by initially constructing the argument list as fixed width separation.
December 14, 2009 at 8:17 am
Sweet bit of TSQL code. As stated by Amit, he used this to parse out parameters passed into a stored proc, so practically the number of params normally would not hit the Max Recursion boundaries. I however do feel a few minor tweaks are needed in order to make it really general purpose.
1. return a varchar instead of an int, if the values passed in the array are integers, then simply convert the resultant table in the caller
2. Add a check for at least on delimiter or else the function may fail
3. And lastly check for empty strings as these are generated when the array list has only 1 item.
The changed code and examples are below
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:Amit Gaur
-- Create date: July 25th 2008
-- Description:Convert a string to a table
-- Modified by: Pratap Prabhu, 12/14/2009
-- 1. Changed returned table to return varchar instead of int
-- 2. Added trailing Delimiter if no delimiter exists
-- 3. Added check for Blank item
-- =============================================
CREATE FUNCTION [dbo].[strToTable]
(
@array varchar(max),
@del char(1)
)
RETURNS
@listTable TABLE
(
Item varchar(max) --changed from int to varchar(max)
)
AS
BEGIN
--Added check for at least 1 delimiter
if(CHARINDEX(@del,@array,1)=0)
set @array=@array+@del
WITH rep (item,list) AS
(
SELECT SUBSTRING(@array,1,CHARINDEX(@del,@array,1) - 1) as item,
SUBSTRING(@array,CHARINDEX(@del,@array,1) + 1, LEN(@array)) + @del list
UNION ALL
SELECT SUBSTRING(list,1,CHARINDEX(@del,list,1) - 1) as item,
SUBSTRING(list,CHARINDEX(@del,list,1) + 1, LEN(list)) list
FROM rep
WHERE LEN(rep.list) > 0
)
INSERT INTO @listTable
SELECT item FROM rep where item<>'' --added check that item is not blank
RETURN
END
GO
select * from strToTable('String1',',')
go
select * from strToTable('String1,String2',',')
go
select Item=convert(int,Item) from strToTable('1,2,3,4',',')
go
select Item=convert(datetime,Item) from strToTable('10/12/2001,10/13/2001,10/14/2001,10/15/2001',',')
go
December 14, 2009 at 8:26 am
We use the following function and it works quite well. It will return 'empty' items in the string array as well as handle single item strings without the delimiter. However, this won't handle delimiters with lengths > 1.
CREATE FUNCTION udf_StrList2Table (
@List NVARCHAR(MAX) = '',
@Delimiter NCHAR(1) = N','
)
RETURNS TABLE
AS
RETURN
SELECT SUBSTRING( @Delimiter + @List + @Delimiter, n + 1,
CHARINDEX( @Delimiter, @Delimiter + @List + @Delimiter, n + 1 ) - n - 1 ) AS Value,
ROW_NUMBER() OVER ( ORDER BY n ) AS ListPos
FROM MSLC.dbo.Tally
WHERE SUBSTRING( @Delimiter + @List + @Delimiter, n, 1 ) = @Delimiter
AND n < LEN( @Delimiter + @List + @Delimiter )
/* Anything is possible but is it worth it? */
December 14, 2009 at 8:46 am
Please don't flame me for suggesting a poorly performing technique. The code below works great with a minimum of programming... but only when the two initial assumptions are correct!
If the list contains only integer primary keys, and the primary keys are defined by a reference table that doesn't take too long to scan, then I like to delimit every entry in the string list with a non-integer character such as '|' and then build the list table in this very simple way for a reference table named MyTable having a primary key named MyID:
INSERT INTO @listTable (item)
SELECT MyID
FROM MyTable
WHERE CharIndex('|' + Cast(MyID AS varchar(10)) + '|',@array) > 0
Here, @listTable and @array are the same as in Amit's example, but the @array string must begin and end with the delimiter ('|' in this case).
December 14, 2009 at 8:52 am
I second Roland's comment.
I was involved in a similar situation before and since we were working with sql server 2005 I had the developer pass a xml value to the SProc.
December 14, 2009 at 9:19 am
I originally stipulated
Paul White, I am certain this is not correct.
"THE" fastest way to achieve string splitting is a 2 stage approach
1. use a CLR scalar function to convert a CSV or delimitted string in to a fixed width column string eg 10 chars per id.
122_______12345324__1234567___12________
2. with the result string, use a tally/numbers/etc table to split the fixed width string using substring. substring(@fixedlist, n * 20-19, n * 20)
The problem with solely using CLR is that the marshalling between SQL and .NET is very slow per row.
So it actually turns out more efficient this way.
And in the specific case of this article, it is even easier, you can skip the CLR part by initially constructing the argument list as fixed width separation.
i apologise I should really put some code in place to demonstrate
-- first build the fixed padded width string with 10000 items!! @fixed
-- this simluates what the application would pass through
declare @fixed varchar(max)
declare @CT int
select @CT=0,@fixed = ''
select @fixed = @fixed + cast(n as char(10))
from dbo.Tally where n < 10000
-- here is the REAL code, now split it
select starttime = getdate()
select substring(@fixed, (n*10)+1, 10)
from dbo.tally
where n <=(len(@fixed)-1)/10
select endtime = getdate()
runs in only a few hundred ms!! on my local pc.
I suspect it would be a lot less without having to send the list back to the client mgmt studio.
Infact, I just checked and it is more like 20ms.
December 14, 2009 at 9:44 am
This helps the original query handle the case of a single value passed in? Not sure if it's the most elegant solution since I'm checking charindex many times, but...
DECLARE@array varchar(max)
DECLARE @del char(1)
SET @array = '1' -- Works for '1','1,2,3','1,,2,3' etc.
SET @del = ',';-- Semicolon needed for cte expression below
WITH rep (item,list) AS
(
SELECTCASE WHEN CHARINDEX(@del,@array,1) > 0 THEN
SUBSTRING(@array,1,CHARINDEX(@del,@array,1) - 1)
ELSE @array END AS item,
CASE WHEN CHARINDEX(@del,@array,1) > 0 THEN
SUBSTRING(@array,CHARINDEX(@del,@array,1) + 1, LEN(@array)) + @del
ELSE '' END AS list
UNION ALL
SELECT SUBSTRING(list,1,CHARINDEX(@del,list,1) - 1) as item,
SUBSTRING(list,CHARINDEX(@del,list,1) + 1, LEN(list)) list
FROM rep
WHERE LEN(rep.list) > 0
)
-- Use this to retrieve all items including empty values like 1,,,3 or 1,2,3, which has a trailing empty item
SELECT * FROM rep
-- Use this to remove empty items from string
-- SELECT item, list FROM rep WHERE LEN(item) > 0
December 14, 2009 at 10:03 am
Pratap Prabhu (12/14/2009)
I however do feel a few minor tweaks are needed ...
The best minor tweek you could make to the code is to write it so it doesn't use either recursion or explicit loops. Both are quite slow compared to a set based solution.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 16 through 30 (of 73 total)
You must be logged in to reply to this topic. Login to reply