passing csv list into stored procedure

  • Jeff,

    Because SQL Server doesn't allow passing in arrays of values into a stored procedure. This [long overdue] defect is being corrected in SQL Server 2008.

    In our web-based (i.e., stateless) application, the middle tier needs to pass to the database layer an array of values (hypothetically a list of PK values) to process (as a transaction).

    Using integers, a comma delimited list is limited to about 1500 +/- entries depending upon the number of the digits in each entry.

    So, being "typical" programmers with the thinking that VARCHAR(8000) will be "way more than enough", we coded it that way. Naturally, it failed in a very rare actual use case by a customer. So do you tell the customer to "not do that"? Or do you fix your code?

    For us, we had no choice but to fix the code for those rare "edge" cases.

    So when someone says "that'll never happend" or "the customer will never do that", start coding for exactly those conditions.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Who did you say was nit-picking?? 😀 We all know that your solution works for you cuz you told us so more than once. I am simply displaying yet another technique to solve the users initial problems. My main motivation to do so was to show that using a numbers table to parse a string is NOT less efficient that using procedural logic. It is rather the opposite. Did you run your test more than once? Did your solution come back 0ms every time? It did not on my system.

    1. Your ordinal value requirement did not call for sequencial numbers did it? I recall the reason for the ordinal value to be for sorting. My solution does that just fine.

    2. Well then, change the funcction to varchar(MAX) like yours (my function was taken off a SS2000 box).

    3. Let's run several itterations of a performance test between the two. Everone knows that 0=0. What I was saying is that when run multiple times on my system, your function returned > 0 about 1/2 of the times where my function returned 0ms EVERY time.

    4. It's not a timebomb at all as soon as you fix the varchar(MAX) parameter declaration.

    OK, so let's blow this up and test it. Here's what I've done. I've changed the ListTable function to use varchar(MAX) as I probably should have up front. I've increased my numbers table to be 1 million rows. Let's run through a test with a huge list and see what happens. You'll notice in the code that I am using the tally table to create the @list values but then I'm dropping the buffers so my function has to read from disk and not memory. Here's my test syntax:

    set nocount on

    declare @list varchar(max),

    @start_time datetime

    set @list = ''

    --use numbers table to create list variable

    select @list = @list + cast(number as varchar) + ','

    from numbers

    where number < 50000

    --start from scratch and make SQL Server get numbers rows from disk (worse case)

    dbcc freeproccache

    dbcc dropcleanbuffers

    --procedural parse function

    set @start_time = getdate();

    SELECT * FROM Foo(@list)

    print 'Procedural logic: ' + CAST(datediff(ms, @start_time, getdate()) as varchar)

    --tally table parse

    set @start_time = getdate();

    select * from dbo.udf_listtable(@list,',')

    print 'Tally table logic: ' + CAST(datediff(ms, @start_time, getdate()) as varchar)

    Drumroll please:

    1st run

    Procedural logic: 4170

    Tally table logic: 2623

    2nd run

    Procedural logic: 3483

    Tally table logic: 1920

    3rd run

    Procedural logic: 2903

    Tally table logic: 1940

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John,

    You and I are presenting similar, albeit different, solutions to the initial poster's problem. Note that I stated that in my earlier reply.

    As for the performance test: yes, I did exercise the function more than once. All iterations of my function, as well as yours, returned 0ms (maybe I need a slower machine for the testing). I'm well aware of proper performance testing techniques.

    So again, I'll close my participation in this thread with the following:

    1. I'll let others decide what is correct for them.

    2. I always have an open mind as to solutions to computer logic problems. Therefore, a "numbers" table may be utilized in a solution.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Because SQL Server doesn't allow passing in arrays of values into a stored procedure

    Thanks, but still doesn't answer my question... what type of data would those "arrays of values" contain?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You are correct. There are 10 ways to skin a cat. I've just skinned enough cats to prefer the fastest way because their guts stink real bad. Speed is all relative to what you are used to.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Thanks, but still doesn't answer my question... what type of data would those "arrays of values" contain?

    Jeff,

    I did answered your specific question which was why would the length of the delimited string be greater than 8,000 characters.

    As to what would be in the delimited list I'll try to summarize without getting too deep into the application and/or its architecture. Suffice to say that it is 3-tiered.

    First, off we actually need a 2 dimensional array (i.e, a table) to be passed to the stored procedure. Our application allows the user the "assemble" the contents of what has been placed into a "tray". The tray's contents (i.e., identifiers) is maintained by the application layer as an array (table). This is what needs to passed to the database layer (stored procedure). Similar to the requirement in an earlier post to you about the same subject (passing arrays of values as parameters). See http://www.sqlservercentral.com/Forums/Topic389423-356-3.aspx#bm400359.

    Therefore, since SQL Server does not allow arrays to be passed as parameters, the middle tier pivots the arrays values as delimited strings. These are then pivoted back as table variables within the procedure.

    An XML snippet, which I'm utilizing in some new development, could also be passed as a parameter. Then the XML gets transformed (parsed) into an array for use by the SQL statements within the procedure. However, the initial code was written for SQL Server 2000.

    Both kludgy, but suitable workarounds for a major SQL Server functionality defect. Note that Oracle supported passing arrays as parameters to stored procedures back in version 7.2 (circa 1995).


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • JohnG (12/12/2007)


    Thanks, but still doesn't answer my question... what type of data would those "arrays of values" contain?

    Jeff,

    I did answered your specific question which was why would the length of the delimited string be greater than 8,000 characters.

    As to what would be in the delimited list I'll try to summarize without getting too deep into the application and/or its architecture. Suffice to say that it is 3-tiered.

    First, off we actually need a 2 dimensional array (i.e, a table) to be passed to the stored procedure. Our application allows the user the "assemble" the contents of what has been placed into a "tray". The tray's contents (i.e., identifiers) is maintained by the application layer as an array (table). This is what needs to passed to the database layer (stored procedure). Similar to the requirement in an earlier post to you about the same subject (passing arrays of values as parameters). See http://www.sqlservercentral.com/Forums/Topic389423-356-3.aspx#bm400359.

    Therefore, since SQL Server does not allow arrays to be passed as parameters, the middle tier pivots the arrays values as delimited strings. These are then pivoted back as table variables within the procedure.

    An XML snippet, which I'm utilizing in some new development, could also be passed as a parameter. Then the XML gets transformed (parsed) into an array for use by the SQL statements within the procedure. However, the initial code was written for SQL Server 2000.

    Both kludgy, but suitable workarounds for a major SQL Server functionality defect. Note that Oracle supported passing arrays as parameters to stored procedures back in version 7.2 (circa 1995).

    No, no... not what I meant... I understand that you're passing arrays of information... what is the information? Parts lists? Lists of purchased items? What?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • No, no... not what I meant... I understand that you're passing arrays of information... what is the information? Parts lists? Lists of purchased items? What?

    Jeff,

    Sorry. Our SaaS application (inciteKnowledge) stores and manages content (Word document snippets, PowerPoint slides, etc.) which are then "assembled" (ad-hoc or pre-packaged) into a finished document (Word) or presentation (PowerPoint). The content items are identified by their ID (a GUID). Things also have a "type" classification (int). So the arrays that the code (application) works with contain internal identifiers (surrogate keys). The exact same stuff that I elaborated on a few months ago (in the referenced posting).


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Crud... missed your reference where you explained it before. Sorry, John.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply