December 8, 2007 at 4:14 pm
Hello,
create procedure [dbo].[_test]
@list varchar(50)
as
select * from Listings where cat_id in (@list)
I run the SP like exec _test '1,2,3' and want to retrieve the results where cat_id is 1, 2 or 3, but IN (@list) treats the parameter as one single string.
How do I do something like that, preferably using static queries and as little code as possible?
Thanks,
Ham
December 8, 2007 at 6:49 pm
There are a couple of ways to tackle this. Probably the least coding is to build a dynamic SQL string.
something like
Declare @sql nvarchar(4000) -- sp_executesql requires nvarchar, not varchar
Set @sql = N'Select * From table Where value in(' + @CSVList + ')'
exec sp_executesql @sql
-- beware: concatenating the list straight into the string is open to SQL injection
OR you could write some code to parse out the list, insert the values into a temp table, then join to the temp table. There were some other recent articles here on using XML, but perhaps the first solution is the least coding.
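A minimal sketch of that temp table approach, using the original poster's Listings table (the loop here is just one way to do the parse):

DECLARE @CSVList varchar(50), @pos int
SET @CSVList = '1,2,3'

CREATE TABLE #ids (cat_id int)

-- peel one value off the front of the string per iteration
WHILE LEN(@CSVList) > 0
BEGIN
    SET @pos = CHARINDEX(',', @CSVList + ',')
    INSERT INTO #ids (cat_id) VALUES (CAST(LEFT(@CSVList, @pos - 1) AS int))
    SET @CSVList = SUBSTRING(@CSVList, @pos + 1, LEN(@CSVList))
END

SELECT l.* FROM Listings l INNER JOIN #ids i ON l.cat_id = i.cat_id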
Hope this helps.
December 9, 2007 at 3:07 am
Using a numbers/sequence table
create procedure [dbo].[_test]
@list varchar(50)
as
WITH CTE AS (
-- Number lands on each position in @list where an element starts
-- (the comma prepended in the WHERE clause marks the start of the first one)
SELECT CAST(SUBSTRING(@list,
Number,
CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@list)+1
AND SUBSTRING(','+@list,Number,1)=','
)
select * from Listings where cat_id in (select Val from CTE)
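This assumes you already have a dbo.Numbers table of sequential integers, with at least as many rows as the longest list you'll ever pass in. If you don't, a one-off setup something like this will build one:

SELECT TOP 10000 IDENTITY(int, 1, 1) AS Number
INTO dbo.Numbers
FROM master.dbo.syscolumns a
CROSS JOIN master.dbo.syscolumns b

ALTER TABLE dbo.Numbers
ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)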
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
December 9, 2007 at 3:32 am
I wrote a function that splits a delimited varchar into a table with an ElementID, and an element column (http://www.sqlservercentral.com/scripts/Miscellaneous/30225/)
You could do something like this in your stored proc to get the items in the delimited parameter:
create procedure [dbo].[_test]
@list varchar(50)
as
select * from Listings
where cat_id in
(
SELECT CAST(Element as int) FROM dbo.Split(@list, ',')
)
Since the function returns a table you could also do something like this
select l.* from Listings l
INNER JOIN dbo.Split(@list, ',') l2
ON l.cat_id = CAST(l2.Element as int)
Karen Gayda
MCP, MCSD, MCDBA
gaydaware.com
December 10, 2007 at 6:34 am
If it's possible, format the delimited list as XML and use XQuery to retrieve the values from there. We've started to find this a more efficient approach than our old one of using a table-valued function to pivot the list.
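Something along these lines (just a sketch, assuming SQL Server 2005 and the original poster's Listings table; the element name i is arbitrary):

DECLARE @list varchar(50), @x xml
SET @list = '1,2,3'
-- turn '1,2,3' into <i>1</i><i>2</i><i>3</i>
SET @x = CAST('<i>' + REPLACE(@list, ',', '</i><i>') + '</i>' AS xml)

SELECT l.*
FROM Listings l
WHERE l.cat_id IN
(
    SELECT n.value('.', 'int')
    FROM @x.nodes('/i') AS t(n)
)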
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 11, 2007 at 7:41 am
A very common problem and use case, even within and between stored procedures.
The easy way is a User-Defined Function (UDF) that parses and splits the string into a table. Then you can reference the table anywhere, similar to the "split" function presented in an earlier reply.
One thing to add to a function is to return the ordinal position of each value in the string. It is very handy when the ordering of the final result set needs to be based on the order of the "IN" list values. Not all cases will need it, but by writing a common function, it is there when you need it.
Personally, I don't like the "numbers" table approach due to limitations on the values that can be passed in the delimited string along with duplicate handling.
Sample function:
[font="Courier New"]CREATE FUNCTION Foo
(@p_integer_list VARCHAR(MAX))
RETURNS @return_table TABLE (ROW_NUMBER int IDENTITY(1,1), INTEGER_ID int)
AS
BEGIN;
DECLARE @v_length bigint;
DECLARE @v_start bigint;
DECLARE @v_end bigint;
DECLARE @v_integer int;
SET @v_length = LEN(@p_integer_list);
SET @v_start = 1;
SET @v_end = 1;
WHILE @v_start <= @v_length
BEGIN
SET @v_end = CHARINDEX (',', @p_integer_list, @v_start);
IF @v_end = 0 -- Delimiter not found
SET @v_end = @v_length + 1;
SET @v_integer = SUBSTRING(@p_integer_list, @v_start, @v_end-@v_start);
INSERT INTO @return_table (integer_id) VALUES (@v_integer);
SET @v_start = @v_end + 1;
END;
RETURN;
END;
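Usage against the original poster's Listings table would then look something like this:

DECLARE @list varchar(max)
SET @list = '1,2,3'

SELECT l.*
FROM Listings l
INNER JOIN dbo.Foo(@list) f
    ON l.cat_id = f.INTEGER_ID
ORDER BY f.ROW_NUMBER -- keeps the rows in the order the values were passed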
December 11, 2007 at 9:01 am
> Personally, I don't like the "numbers" table approach due to limitations on
> the values that can be passed in the delimited string along with duplicate handling.
John,
Can you expand on this please.
I haven't had a problem with the "numbers" table approach and have found it performs well.
FYI there's an interesting and detailed analysis here
http://www.sommarskog.se/arrays-in-sql-2005.html
http://www.sommarskog.se/arrays-in-sql-perftest.html
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
December 11, 2007 at 10:34 am
Re: "numbers" table usage
I'm not stating that a "numbers" table is not a good thing. Just be cautious with the related logic (code) that utilizes a numbers table.
To wit: Given the delimited string value of
' 1, 3, 123456789, 5, 7, 7, 9, 14, 36, 395, 7 '
(note the blanks and duplicate values)
the code presented under "Parsing a string" in http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html will:
1. be painfully slow (with a number table of 512,000 rows)
2. return an empty answer (it can't handle the blanks)
3. with the blanks removed, return an INCORRECT answer (the value 123456789 will not be returned)
4. fail to handle duplicate values in the list
Granted, the issue is not with the "numbers" table per se, but with the associated logic.
To give credit: The CTE logic utilizing a numbers table does return the correct results.
I also had one or more of the following business logic requirements in certain circumstances (i.e., sometimes, but not always):
1. to preserve the order of the entries passed
2. have an ordinal number associated with each entry
3. support duplicate values
4. preserve the order of the duplicate value(s)
5. support (rarely) a delimited string value exceeding 8,000 characters
Requirements 1 and 2 would, in my business case, preclude using the CTE example as:
1. No ordinal position is returned
2. The order of the values is not guaranteed (not without an ORDER BY clause)
This forces at least some IDENTITY value or equivalent.
Therefore, my table-valued function works for me.
December 11, 2007 at 11:18 am
You can quite easily get ordinal numbers by doing this.
CREATE FUNCTION Foo
(@p_integer_list VARCHAR(MAX))
RETURNS @return_table TABLE (ROW_NUMBER int IDENTITY(1,1), INTEGER_ID int)
AS
BEGIN;
INSERT INTO @return_table (INTEGER_ID)
SELECT CAST(SUBSTRING(@p_integer_list,
Number,
CHARINDEX(',',@p_integer_list+',',Number)-Number) AS INT)
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@p_integer_list)+1
AND SUBSTRING(','+@p_integer_list,Number,1)=','
ORDER BY Number;
RETURN;
END;
Also the CTE can generate an ordinal via ROW_NUMBER() OVER(ORDER BY Number)
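For example:

WITH CTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY Number) AS Ordinal,
CAST(SUBSTRING(@list, Number, CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@list)+1 AND SUBSTRING(','+@list,Number,1)=','
)
SELECT Ordinal, Val FROM CTE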
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
December 11, 2007 at 11:49 am
The function I presented was initially written for SQL Server 2000. Unfortunately, at that time I had to "chunk" the incoming TEXT datatype into VARCHAR(8000) "chunks" as CHARINDEX and SUBSTRING would silently fail after 8000 characters. The function has since been streamlined for SQL Server 2005.
The ranking function ROW_NUMBER() OVER() didn't exist back then.
Note: I'm not so sure your ROW_NUMBER (ordinal position logic) is correct as you're ordering on the Numbers, not their position in the string.
Anyway, you've taken my plain parsing function and placed a physical table lookup (number table) within it. Now you're making a trip to a table vs. just code. And you still haven't eliminated the need for CHARINDEX and SUBSTRING.
In a "quick" performance test (on my dedicated development machine running SQL Server 2005 SP2) between my function and the CTE using a 512,000 row "numbers" table (both using VARCHAR(MAX)) I get the following elapsed times:
CTE: ~80-100 ms
Function: 0 ms
And my function is doing more! So although I technically "can" use an associated "numbers" table within the function to "parse and split" the values, I don't think that it is the "best" solution to the problem. I could also use XML logic to "pivot" the values into the table. However, I think that, although "techie", it would lose on performance.
An interesting "head to head" challenge tempered by both performance and simplicity (code and usage)
Test code:
[font="Courier New"]set nocount on
declare @start_time datetime
DECLARE @list varchar(max)
SET @list = ' 1 , 3 , 123456789 , 5 , 7 , 7 , 9 , 14 , 36 , 395 , 7 ';
set @start_time = getdate();
SELECT * FROM Foo(@list)
print datediff(ms, @start_time, getdate());
set @start_time = getdate();
WITH CTE AS
(
SELECT CAST(SUBSTRING(@list, Number, CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@list)+1 AND SUBSTRING(','+@list,Number,1)=','
)
SELECT * FROM cte
print datediff(ms, @start_time, getdate())
December 11, 2007 at 12:35 pm
> Note: I'm not so sure your ROW_NUMBER (ordinal position logic) is correct as you're ordering on the > Numbers, not their position in the string.
It does work. Try it.
As for the performance tests, I've no idea what's happening there. The timings seem inconsistent between runs.
If you increase the length of the CSV by doing this
set @list=@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list
set @list=@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list
the CTE runs faster (sometimes)
Also changing varchar(max) to varchar(8000) changes the results.
I don't have the time to look into this, but I would suggest following the link I posted,
there's a good analysis on the performance of the available methods.
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
December 11, 2007 at 12:55 pm
Mark,
I'll wrap up this thread by stating the following:
1. Although (sometimes) faster, changing from VARCHAR(MAX) to VARCHAR(8000) breaks my business logic (use case) requirement.
2. In the performance testing, just look at the actual query plan. You'll see "clustered index seeks" to the "numbers" table whereas the function doesn't have to go to any table. That speaks for itself.
3. I'm aware of the work by Erland Sommarskog. As with all things computer related, there is never a pure "right" and "wrong" way to do things.
4. As I stated, my method satisfied all of my business logic requirements, is flexible, is fast, and doesn't require another physical table in the database. Therefore, it is correct (for me).
5. The original poster was looking for a way to accomplish a task. My suggestion of a pure function that I use was provided along with suggestions by others. I'll leave it up to the individual to digest all of this thread and make a decision.
December 11, 2007 at 1:40 pm
The problem of the numbers table not out-performing the straight function has nothing to do with the numbers table itself; it is the CTE that is dragging it down. Consider this function:
CREATE FUNCTION dbo.udf_ListTable (@parmString VARCHAR(8000), @parmDelim varchar(10))
RETURNS TABLE
AS
RETURN (
SELECT Number,
RTRIM(LTRIM(SUBSTRING(@parmString,
CASE Number
WHEN 1 THEN 1
ELSE Number + 1
END,
CASE CHARINDEX(@parmdelim, @parmString, Number + 1)
WHEN 0 THEN LEN(@parmString) - Number + 1
ELSE CHARINDEX(@parmdelim, @parmString, Number + 1) - Number -
CASE
WHEN Number > 1 THEN 1
ELSE 0
END
END
))) AS ListItem
FROM dbo.Numbers
WHERE Number <= LEN(@parmString)
AND (SUBSTRING(@parmString, Number, 1) = @parmdelim
OR Number = 1)
)
Now, let's add another test call for our new UDF that uses the numbers table and re-run the test. Make sure you run the test more than once! You'll notice that the UDF using the numbers table always returns in 0 ms, whereas the original UDF containing procedural logic to parse is not as consistent and often returns times greater than those of our new function.
set nocount on
declare @start_time datetime
DECLARE @list varchar(max)
SET @list = ' 1 , 3 , 123456789 , 5 , 7 , 7 , 9 , 14 , 36 , 395 , 7 ';
set @start_time = getdate();
SELECT * FROM Foo(@list)
print datediff(ms, @start_time, getdate());
set @start_time = getdate();
WITH CTE AS
(
SELECT CAST(SUBSTRING(@list, Number, CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND LEN(@list)+1 AND SUBSTRING(','+@list,Number,1)=','
)
SELECT * FROM cte
print datediff(ms, @start_time, getdate())
set @start_time = getdate();
SELECT *
FROM dbo.udf_ListTable(@list,',')
print datediff(ms, @start_time, getdate())
By the way, my number table contains 100,000 rows. 512,000 is a bit overkill.
December 11, 2007 at 2:10 pm
True, but what have I gained other than an exercise? We're nit-picking. The logic in the function is nice, maybe for someone else, but I can't use it for the following reasons:
1. The ordinal values are not correct. I get back 1, 6, 11, 24, 29, 34, 39, 44, 51, 58, 65 rather than 1 through 11 inclusive.
2. I can't pass in strings longer than 8000 characters due to use of VARCHAR(8000)
3. Performance isn't any better. 0ms = 0ms
4. The number table must have as many entries as the longest string that is passed into the function, due to "LEN(@parmString)". That is an artificial constraint and a potential hidden "time bomb" that would silently return an incorrect answer (see the guard sketch below). So in my opinion, 512,000 rows in the number table is NOT overkill, but probably not enough. Note: I used the large number of rows to ensure that performance results would not be skewed by a small table.
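Since RAISERROR isn't allowed inside a function, any guard against that time bomb has to live in the calling code; a minimal sketch:

-- fail loudly instead of silently dropping values past the end of dbo.Numbers
IF (SELECT MAX(Number) FROM dbo.Numbers) < LEN(@list)
    RAISERROR('dbo.Numbers has too few rows to split this string.', 16, 1)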
For my business logic requirements, I'll stick with what I've got as it works.
December 11, 2007 at 2:17 pm
I guess my question would be... why does anyone pass more than 8000 bytes of parameters to a stored procedure... what is the purpose?
--Jeff Moden
Change is inevitable... Change for the better is not.