December 14, 2009 at 10:25 am
As mentioned, it's for parsing parameters passed in to a stored proc. In a real-world scenario, an app would let a user select multiple items in a list, most probably via some checkboxes, accumulate the selected items in a delimited string, and pass them in to the proc. So let's say a user selects 30-odd items, or make it 50. What's the difference in time? I suggest you look at it from a practical standpoint. In theory you are quite right, but in practical terms, for the usage described, the routine is perfect.
December 14, 2009 at 10:43 am
Pratap Prabhu (12/14/2009)
As mentioned, it's for parsing parameters passed in to a stored proc. In a real-world scenario, an app would let a user select multiple items in a list, most probably via some checkboxes, accumulate the selected items in a delimited string, and pass them in to the proc. So let's say a user selects 30-odd items, or make it 50. What's the difference in time? I suggest you look at it from a practical standpoint. In theory you are quite right, but in practical terms, for the usage described, the routine is perfect.
Okay, so you have a split function that works great for ten to thirty items. Along comes a requirement to split strings that are even longer, over several hundred thousand records. Oh, great, I have this nifty function I use to parse strings from an application; I'll just use it here. Problem is, the function doesn't scale well, so you end up with a routine that runs for an extended period of time, and you don't really know why. You don't bother to look at your string-parsing routine because, hey, it works great over here; it can't be the problem.
See a problem?
One thing I have learned here: code for scalability and performance. You never know how your nifty little functions may get used by others.
December 14, 2009 at 10:44 am
write it so it doesn't use either recursion or explicit loops. Both are quite slow compared to a set based solution.
Following along with Jeff's comment, doing a set-based solution is good for developing the mind-set for thinking in sets in addition to the performance benefits.
Here's a great resource by Robyn Page and Phil Factor over at Simple-Talk called The Helper Table Workbench, which includes a set-based technique for splitting strings into table rows based on a specified delimiter.
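For reference, a minimal sketch of that kind of set-based splitter looks something like the following. This assumes a `dbo.Tally` table of sequential integers starting at 1 already exists; the table and function names here are illustrative, not taken from the article:

```sql
-- A set-based splitter: no loops, no recursion. Requires a dbo.Tally
-- table containing sequential integers starting at 1.
CREATE FUNCTION dbo.udf_TallySplit
(@valueStr varchar(8000), @delimiter char(1))
RETURNS TABLE
AS
RETURN
    -- Bound the string with delimiters on both ends, then return the
    -- substring that follows each delimiter position.
    SELECT
         ROW_NUMBER() OVER (ORDER BY t.N) AS RowID
        ,SUBSTRING(@delimiter + @valueStr + @delimiter, t.N + 1,
                   CHARINDEX(@delimiter, @delimiter + @valueStr + @delimiter, t.N + 1) - t.N - 1) AS Item
    FROM dbo.Tally t
    WHERE t.N < LEN(@delimiter + @valueStr + @delimiter)
      AND SUBSTRING(@delimiter + @valueStr + @delimiter, t.N, 1) = @delimiter;
```

Usage would be, e.g., `SELECT Item FROM dbo.udf_TallySplit('Huey|Dewey|Louie', '|');`. Because it's an inline table-valued function driven by a single set-based query, the optimizer can treat it like a view rather than a row-by-row loop.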
With that said, I think a helpful discussion we can have here is this: generalizing about problems, when is it a good idea to use a recursive CTE approach (an implicit looping technique), given that we can usually use a set-based technique to get the same job done? Any opinions on that? Perhaps traversing hierarchies is one.
Bill Nicolich: www.SQLFave.com.
Daily tweet of what's new and interesting: AppendNow
December 14, 2009 at 11:11 am
Traversing hierarchies is about the only one I can think of, although there are methods that will blow the doors off that as well.
Also, since we're dropping names and articles, here's one on how a Tally table actually works. I know the author personally 😉 and, except for a minor error in one of the example scripts, he did a pretty good job of explaining how a Tally (or Numbers) table works to replace RBAR.
http://www.sqlservercentral.com/articles/T-SQL/62867/
He also wrote an article on passing parameters as 1-, 2-, and 3-dimensional "arrays"... without clogging the pipe and without the errors that come from not being able to handle single, blank, or null parameters.
http://www.sqlservercentral.com/articles/T-SQL/63003/
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 11:15 am
Pratap Prabhu (12/14/2009)
As mentioned, it's for parsing parameters passed in to a stored proc. In a real-world scenario, an app would let a user select multiple items in a list, most probably via some checkboxes, accumulate the selected items in a delimited string, and pass them in to the proc. So let's say a user selects 30-odd items, or make it 50. What's the difference in time? I suggest you look at it from a practical standpoint. In theory you are quite right, but in practical terms, for the usage described, the routine is perfect.
It's not a matter of time for a single usage... it's a matter of time if you have 10,000 users all passing parameters within very short periods of each other. Also, if some poor slob faced with an impossible schedule finds the performance-challenged RBAR of the solution in this article (and of many of the solutions offered in this discussion), and (s)he needs it to split a substantial number of rows in a table, (s)he is dead meat.
NEVER justify poorly performing code based on its low frequency of usage, because that frequency may not stay the same if someone uses the method for something else.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 11:18 am
Aaron Gonzalez-394690 (12/14/2009)
I second Roland's comment. I was involved in a similar situation before, and since we were working with SQL Server 2005, I had the developer pass an XML value to the sproc.
Aaron, do some "load" testing using the XML method... compared to the more straightforward set-based methods, XML shredding of parameters is relatively slow, and the extra overhead in the form of tags and the like serves as nothing but a pipe clogger. Seriously, don't use XML for this type of thing.
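For context, the XML approach under discussion typically shreds the parameter with the `nodes()`/`value()` methods, roughly like this (the element names here are illustrative, not from the original poster):

```sql
-- Shred an XML parameter into rows (SQL Server 2005+).
DECLARE @items xml =
    '<items><item>Huey</item><item>Dewey</item><item>Louie</item></items>';

-- Each /items/item node becomes one row; value() extracts its text.
SELECT i.x.value('.', 'varchar(100)') AS Item
FROM @items.nodes('/items/item') AS i(x);
```

It works, and it handles nested structures that a flat delimited string cannot, but the shredding itself and the tag overhead are the costs being debated in this thread.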
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 11:20 am
Roland Howard Boorman (12/14/2009)
I simply use XML. Using XML, whole complex structures can be passed to SQL and treated as tables.
You can use functions with the XML to render these into virtual tables and treat them as regular tables!
See above.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 11:22 am
Jeff Moden (12/14/2009)
Aaron Gonzalez-394690 (12/14/2009)
I second Roland's comment. I was involved in a similar situation before, and since we were working with SQL Server 2005, I had the developer pass an XML value to the sproc.
Aaron, do some "load" testing using the XML method... compared to the more straightforward set-based methods, XML shredding of parameters is relatively slow, and the extra overhead in the form of tags and the like serves as nothing but a pipe clogger. Seriously, don't use XML for this type of thing.
Speaking of performance, you can take a look at some of the results I got when testing each method.
http://jahaines.blogspot.com/2009/12/splitting-delimited-string-part-2.html
December 14, 2009 at 11:23 am
I'd suggest taking a look at Erland Sommarskog's page on passing arrays (list).
http://www.sommarskog.se/arrays-in-sql-2005.html or the older article for SQL 2000:
http://www.sommarskog.se/arrays-in-sql-2000.html
Good coverage of the various possibilities (with performance results).
John
December 14, 2009 at 11:45 am
Adam Haines (12/14/2009)
Jeff Moden (12/14/2009)
Aaron Gonzalez-394690 (12/14/2009)
I second Roland's comment. I was involved in a similar situation before, and since we were working with SQL Server 2005, I had the developer pass an XML value to the sproc.
Aaron, do some "load" testing using the XML method... compared to the more straightforward set-based methods, XML shredding of parameters is relatively slow, and the extra overhead in the form of tags and the like serves as nothing but a pipe clogger. Seriously, don't use XML for this type of thing.
Speaking of performance, you can take a look at some of the results I got when testing each method.
http://jahaines.blogspot.com/2009/12/splitting-delimited-string-part-2.html
Nice article/blog, and I'll have to revisit XML based on what you said. However, you posted...
the XML methods were just as performant as the numbers table methods
... but not so for the "pipe". No matter what you do, XML will pass more bytes than a simple tab-delimited parameter. As you know, one of the biggest performance problems between the GUI and the database is the clogging of the pipe, mostly due to unnecessary bytes being passed in one form or another.
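To put a rough number on the point about bytes on the wire, compare the payload sizes of the two representations (an illustration with made-up sample values, not a benchmark):

```sql
-- The same three values as a delimited string and as an XML fragment.
DECLARE @csv varchar(100) = 'Huey,Dewey,Louie';
DECLARE @xml varchar(200) =
    '<items><item>Huey</item><item>Dewey</item><item>Louie</item></items>';

-- The XML form carries the same data in roughly four times the bytes,
-- entirely due to the tag overhead.
SELECT DATALENGTH(@csv) AS csv_bytes, DATALENGTH(@xml) AS xml_bytes;
```

The gap only grows as the item names get shorter relative to the fixed per-item tag cost.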
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 11:52 am
Based on my earlier question
Fatal Exception Error (12/14/2009)
Why should I use this method versus using a good old fashioned Numbers table?
Is it safe to say that there is no reason, except when the requirements are so different that it does not even represent the same problem?
December 14, 2009 at 1:03 pm
Fatal Exception Error (12/14/2009)
Based on my earlier question:
Fatal Exception Error (12/14/2009)
Why should I use this method versus using a good old fashioned Numbers table?
Is it safe to say that there is no reason, except when the requirements are so different that it does not even represent the same problem?
I'm thinking that's a pretty safe thing to say.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 14, 2009 at 2:06 pm
This is a useful function, even with the limitations mentioned in the thread. Here is a minor revision for consideration, which addresses the single-item point and a couple of other minor issues. Also, for clarity, comments have been added and some minor naming changes are offered.
/* ========================================================================
Script Name: CREATE_udf_StringToTable.sql
Description: Convert a delimited string to a table
Parameters: (1) A delimited string of values
            (2) The character used to delimit values
Note:       This uses a recursive CTE inside a function, where the
            OPTION (MAXRECURSION) hint is not allowed, so lists of more
            than 100 items will fail at the default recursion limit.
Test Calls:
DECLARE @listTable TABLE (RowID int, Item varchar(100));
INSERT INTO @listTable
SELECT * FROM dbo.[_udf_StringToTable]('Huey|Dewey|Louie','|');
=========================================================================*/
CREATE FUNCTION [dbo].[_udf_StringToTable]
(@valueStr varchar(max),
 @delimiter char(1)
)
RETURNS
@listTable TABLE (RowID int IDENTITY, Item varchar(100))
AS
BEGIN
    IF (LTRIM(RTRIM(@valueStr)) <> SPACE(0)) --Ensure the input string is not empty
    BEGIN
        --If the delimiter cannot be found in @valueStr, append it so that
        --a single, undelimited item is still returned.
        IF (CHARINDEX(@delimiter,@valueStr,1) < 1)
            SET @valueStr = @valueStr + @delimiter;

        WITH rep (tableItem, listRemainder) AS
        (
            -- Anchor: get the portion of @valueStr that precedes the first
            -- delimiter, along with the remainder. Since the recursive member
            -- needs a delimiter for its CHARINDEX call, @delimiter is always
            -- appended to the remainder.
            SELECT
                 SUBSTRING(@valueStr, 1, CHARINDEX(@delimiter,@valueStr,1) - 1) -- extracted item
                ,SUBSTRING(@valueStr, CHARINDEX(@delimiter,@valueStr,1) + 1, LEN(@valueStr)) + @delimiter -- remainder
            UNION ALL
            -- Recursive member: split the remainder produced by the previous
            -- step. When the remainder (listRemainder) is empty, we're done.
            SELECT
                 SUBSTRING(listRemainder, 1, CHARINDEX(@delimiter,listRemainder,1) - 1) -- extracted item
                ,SUBSTRING(listRemainder, CHARINDEX(@delimiter,listRemainder,1) + 1, LEN(listRemainder)) -- remainder
            FROM rep
            WHERE LEN(rep.listRemainder) > 0
        )
        INSERT INTO @listTable
        SELECT tableItem FROM rep WHERE tableItem <> SPACE(0);
    END
    RETURN;
END
GO
December 14, 2009 at 3:02 pm
sam.walker (12/14/2009)
Paul White, I am certain this is not correct.
One of the longest threads ever on SSC was devoted to this: http://www.sqlservercentral.com/Forums/Topic695508-338-17.aspx
There is a handy summary of the results on Florian Reischl's blog. That complements the SQLBlog article from Adam Machanic I mentioned before.
December 14, 2009 at 3:19 pm
Paul White (12/14/2009)
sam.walker (12/14/2009)
Paul White, I am certain this is not correct. One of the longest threads ever on SSC was devoted to this: http://www.sqlservercentral.com/Forums/Topic695508-338-17.aspx
There is a handy summary of the results on Florian Reischl's blog. That complements the SQLBlog article from Adam Machanic I mentioned before.
Flo has a kid on his team who wrote a splitter CLR that blows everything away and, if memory serves, the code for that CLR is included somewhere in that thread.
For things under 8K, the Tally table was definitely worthwhile performance-wise, as were a couple of other methods... recursive CTEs certainly weren't one of them.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 31 through 45 (of 73 total)