passing csv list into stored procedure

  • Hello,

    create procedure [dbo].[_test]

    @list varchar(50)

    as

    select * from Listings where cat_id in (@list)

    I run the SP like exec _test '1,2,3' and want to retrieve the results where the cat_id = 1,2,3

    How do I do something like that, preferably using static queries and using as few code as possible?

    Thanks,

    Ham

  • There are a couple of ways to tackle this. Maybe the least coding is to do a dynamic sql string.

    something like

    Declare @SQL varchar(4000)

    Set @SQL = 'Select * From table Where value in(' + @CSVList + ')'

    exec sp_executesql @SQL

    OR you could write some code to parse out the list and insert into a temp table, then join to the temp table. There were some other recent articles here on using XML, but perhaps the first solution is the least coding.

    Hope this helps.

  • Using a numbers/sequence table

    http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html

    create procedure [dbo].[_test]

    @list varchar(50)

    as

    WITH CTE AS (

    SELECT CAST(SUBSTRING(@list,

    Number,

    CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val

    FROM dbo.Numbers

    WHERE Number BETWEEN 1 AND LEN(@list)+1

    AND SUBSTRING(','+@list,Number,1)=','

    )

    select * from Listings where cat_id in (select Val from CTE)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • I wrote a function that splits a delimited varchar into a table with an ElementID, and an element column (http://www.sqlservercentral.com/scripts/Miscellaneous/30225/)

    You could do something like this in your stored proc to get the items in the delimited parameter:

    create procedure [dbo].[_test]

    @list varchar(50)

    as

    select * from Listings

    where cat_id in

    (

    SELECT CAST(Element as int) FROM dbo.Split(@list, ',')

    )

    Since the function returns a table you could also do something like this

    select l.* from Listings l

    INNER JOIN dbo.Split(@list, ',') l2

    ON l.cat_id = CAST(l2.Element as int)


    Karen Gayda
    MCP, MCSD, MCDBA

    gaydaware.com

  • If it's possible, format the delimited list as XML and use XQuery to retreive it from there. We've started to find this a more optimal approach than the old one we had of using a table valued function to pivot the list.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • A very common problem and use case. Even within stored procedures or between stored procedures.

    The easy way is a User-Defined Function (UDF) that parses and splits the string into a table. Then you reference the table anywhere. Similar to the "split" function use as presented in an earlier reply.

    One thing to add to a function is to return the ordinal position of the value in the string. It is very handy when ordering of the final result set needs to be based on the order of the "IN" list values. Not all cases will need it, but by writing a common function, it is there when you need it.

    Personally, I don't like the "numbers" table approach due to limitations on the values that can be passed in the delimited string along with duplicate handling.

    Sample function:

    [font="Courier New"]CREATE FUNCTION Foo

    (@p_integer_list VARCHAR(MAX))

    RETURNS @return_table TABLE (ROW_NUMBER int IDENTITY(1,1), INTEGER_ID int)

    AS

    BEGIN;

    DECLARE @v_length bigint;

    DECLARE @v_start bigint;

    DECLARE @v_end bigint;

    DECLARE @v_integer int;

    SET @v_length = LEN(@p_integer_list);

    SET @v_start = 1;

    SET @v_end = 1;

    WHILE @v_start <= @v_length

    BEGIN

    SET @v_end = CHARINDEX (',', @p_integer_list, @v_start);

    IF @v_end = 0 -- Delimiter not found

    SET @v_end = @v_length + 1;

    SET @v_integer = SUBSTRING(@p_integer_list, @v_start, @v_end-@v_start);

    INSERT INTO @return_table (integer_id) VALUES (@v_integer);

    SET @v_start = @v_end + 1;

    END;

    RETURN;

    END;[/font]


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • > Personally, I don't like the "numbers" table approach due to limitations on

    > the values that can be passed in the delimited string along with duplicate handling.

    John,

    Can you expand on this please.

    I haven't had a problem with the "numbers" table approach and have found it performs well.

    FYI there's an interesting and detailed analysis here

    http://www.sommarskog.se/arrays-in-sql-2005.html

    http://www.sommarskog.se/arrays-in-sql-perftest.html

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Re: "numbers" table usage

    I'm not stating that a "numbers" table is not a good thing. Just be cautious with the related logic (code) that utilizes a numbers table.

    To wit: Given the delimited string value of

    ' 1, 3, 123456789, 5, 7, 7, 9, 14, 36, 395, 7 '

    (note the blanks and duplicate values)

    the code presented under "Parsing a string" in http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html will:

    1. Is painfully slow (with a number table of 512,000 rows)

    2. Return an empty answer (can't handle the blanks)

    3. With the blanks removed, return an INCORRECT answer (value 123456789 will not be returned)

    4. Can't handle duplicate values in the list

    Granted, the issue is not with the "numbers" table per se, but with the associated logic.

    To give credit: The CTE logic utilizing a numbers table does return the correct results.

    I also had one or more of the following business logic requirements in certain circumstances (i.e., sometimes, but not always):

    1. to preserve the order of the entries passed

    2. have an ordinal number associated with each entry

    3. support duplicate values

    4. preserve the order of the duplicate value(s)

    5. support (rarely) a delimited string value exceeding 8,000 characters

    Requirements 1 and 2 would, in my business case, preclude using the CTE example as:

    1. No ordinal position is returned

    2. The order of the values is not guaranteed (NOTwithout an ORDER BY clause)

    This forces at least some IDENTITY value or equivalent.

    Therefore, my table-valued function works for me.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • You can quite easily get ordinal numbers by doing this.

    CREATE FUNCTION Foo

    (@p_integer_list VARCHAR(MAX))

    RETURNS @return_table TABLE (ROW_NUMBER int IDENTITY(1,1), INTEGER_ID int)

    AS

    BEGIN;

    INSERT INTO @return_table (INTEGER_ID)

    SELECT CAST(SUBSTRING(@p_integer_list,

    Number,

    CHARINDEX(',',@p_integer_list+',',Number)-Number) AS INT)

    FROM dbo.Numbers

    WHERE Number BETWEEN 1 AND LEN(@p_integer_list)+1

    AND SUBSTRING(','+@p_integer_list,Number,1)=','

    ORDER BY Number;

    RETURN;

    END;

    Also the CTE can generate an ordinal via ROW_NUMBER() OVER(ORDER BY Number)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • The function I presented was initially written for SQL Server 2000. Unfortunately, at that time I had to "chunk" the incoming TEXT datatype into VARCHAR(8000) "chunks" as CHARINDEX and SUBSTRING would silently fail after 8000 characters. The function has since been streamlined for SQL Server 2005.

    The ranking function ROW_NUMBER OVER() didn't exist.

    Note: I'm not so sure your ROW_NUMBER (ordinal position logic) is correct as you're ordering on the Numbers, not their position in the string.

    Anyway, you've taken my plain parsing function and placed a physical table lookup (number table) within it. Now you're making a trip to a table vs. just code. And you still haven't eliminated the need for CHARINDEX and SUBSTRING.

    In a "quick" performance test (on my dedicated development machine running SQL Server 2005 SP2) between my function and the CTE using a 512,000 row "numbers" table (both using VARCHAR(MAX)) I get the following elapsed times:

    CTE: ~80-100 ms

    Function: 0 ms

    And my function is doing more! So although I technically "can" use an associated "numbers" table within the function to "parse and split" the values, I don't think that it is the "best" solution to the problem. I could also use XML logic to "pivot" the values into the table. However, I think that, although "techie", it would lose on performance.

    An interesting "head to head" challenge tempered by both performance and simplicity (code and usage)

    Test code:

    [font="Courier New"]set nocount on

    declare @start_time datetime

    DECLARE @list varchar(max)

    SET @list = ' 1 , 3 , 123456789 , 5 , 7 , 7 , 9 , 14 , 36 , 395 , 7 ';

    set @start_time = getdate();

    SELECT * FROM Foo(@list)

    print datediff(ms, @start_time, getdate());

    set @start_time = getdate();

    WITH CTE AS

    (

    SELECT CAST(SUBSTRING(@list, Number, CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val

    FROM dbo.Numbers

    WHERE Number BETWEEN 1 AND LEN(@list)+1 AND SUBSTRING(','+@list,Number,1)=','

    )

    SELECT * FROM cte

    print datediff(ms, @start_time, getdate())[/font]


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • > Note: I'm not so sure your ROW_NUMBER (ordinal position logic) is correct as you're ordering on the > Numbers, not their position in the string.

    It does work. Try it.

    As for performance tests, I've no idea what happening there. The timings seem inconsistent between runs.

    If you increase the length of the CSV by doing this

    set @list=@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list

    set @list=@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list+','+@list

    the CTE runs faster (sometimes)

    Also changing varchar(max) to varchar(8000) changes the results.

    I don't have the time to look into this, but I would suggest following the link I posted,

    there's a good analysis on the performance of the available methods.

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Mark,

    I'll wrap up this thread by stating the following:

    1. Although (sometimes) faster, changing from VARCHAR(MAX) to VARCHAR(8000) breaks my business logic (use case) requirement.

    2. In the performance testing, just look at the actual query plan. You'll see "clustered index seeks" to the "numbers" table whereas the function doesn't have to go to any table. That speaks for itself.

    3. I'm aware of the work by Erland Sommarskog. As with all things computer related, there is never a pure "right" and "wrong" way to do things.

    4. As I stated, my method satisfied all of my business logic requirements, is flexible, is fast, and doesn't require another physical table in the database. Therefore, it is correct (for me).

    5. The original poster was looking for a way to accomplish a task. My suggestion of a pure function that I use was provided along with suggestions by others. I'll leave it up to the individual to digest all of this thread and make a decision.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • The problem with the numbers table not out-performing the straight function has nothing to do with the actual numbes table; it is the CTE that is dragging it down. Consider this function:

    CREATE FUNCTION dbo.udf_ListTable (@parmString VARCHAR(8000), @parmDelim varchar(10))

    RETURNS TABLE

    AS

    RETURN (

    SELECT Number,

    RTRIM(LTRIM(SUBSTRING(@parmString,

    CASE Number

    WHEN 1 THEN 1

    ELSE Number + 1

    END,

    CASE CHARINDEX(@parmdelim, @parmString, Number + 1)

    WHEN 0 THEN LEN(@parmString) - Number + 1

    ELSE CHARINDEX(@parmdelim, @parmString, Number + 1) - Number -

    CASE

    WHEN Number > 1 THEN 1

    ELSE 0

    END

    END

    ))) AS ListItem

    FROM dbo.Numbers

    WHERE Number <= LEN(@parmString)

    AND (SUBSTRING(@parmString, Number, 1) = @parmdelim

    OR Number = 1)

    )

    Now, let's add in another test call to our new UDF that used the numbers table and re-run the test. Make sure you run the test more than once! You'll notice that the UDF using the numbers table always returns in 0 ms whereas the original UDF containing procedureal logic to parse is not as consistent and many times returns greater than that of our new function.

    set nocount on

    declare @start_time datetime

    DECLARE @list varchar(max)

    SET @list = ' 1 , 3 , 123456789 , 5 , 7 , 7 , 9 , 14 , 36 , 395 , 7 ';

    set @start_time = getdate();

    SELECT * FROM Foo(@list)

    print datediff(ms, @start_time, getdate());

    set @start_time = getdate();

    WITH CTE AS

    (

    SELECT CAST(SUBSTRING(@list, Number, CHARINDEX(',',@list+',',Number)-Number) AS INT) AS Val

    FROM dbo.Numbers

    WHERE Number BETWEEN 1 AND LEN(@list)+1 AND SUBSTRING(','+@list,Number,1)=','

    )

    SELECT * FROM cte

    print datediff(ms, @start_time, getdate())

    set @start_time = getdate();

    SELECT *

    FROM dbo.udf_ListTable(@list,',')

    print datediff(ms, @start_time, getdate())

    By the way, my number table contains 100,000 rows. 512,000 is a bit overkill.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • True, but what have I gained other than an exercise. We're nit picking. The logic in the function is nice, maybe for someone else, but I can't use it for the following reasons:

    1. The ordinal values are not correct. I get back 1, 6, 11, 24, 29, 34, 39, 44, 51, 58, 65 rather than 1 through 11 inclusive.

    2. I can't pass in strings longer than 8000 characters due to use of VARCHAR(8000)

    3. Performance isn't any better. 0ms = 0ms

    4. The number table must have as many entries as the longest string that is passed into the function due to "LEN(@parmString)". An artificial constraint and potential hidden "time bomb" that would silently return an incorrect answer. So in my opinion, 512,000 rows in the number table is NOT overkill, but probably not enough. Note: I used the large number of rows to ensure that performance results would not be skewed with a small table.

    For my business logic requirements, I'll stick with what I've got as it works.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • I guess my question would be... why does anyone pass more than 8000 bytes of parameters to a stored procedure... what is the purpose?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply