Interesting Way to Parse Comma Separated Values

  • Hey Guys,

    This is one of those unorthodox requests and I wanted to rub minds with the community before I proceed with it.

    Background

    I have a huge stored procedure with several parameters that is called by an application. There are various combinations of parameters the application can call the sproc with.

    Question

    I have "profiler traced" the execution of the sprocs and exported the results to a sql table. I want to do an analysis on the sproc and it's combination of parameters to see what combination sets are called the most. What is the best way to go about this? I have I know it will involve some sort of parsing the comma separated values, turn the rows into a column etc.

    What is the best way to go about this sort of thing?

    Sample Sproc Results from Trace Table

    RowNo, TextData
    1, exec sproc @param1 = 2, @param2 = 'yes', @param3 = 'no'
    2, exec sproc @param1 = 2, @param2 = 'yes', @param3 = 'no'
    3, exec sproc @param1 = 1, @param2 = 'no', @param3 = 'no'
    4, exec sproc @param1 = 1, @param2 = 'no', @param3 = 'yes'
    5, exec sproc @param1 = 2, @param2 = 'yes', @param3 = 'no'

    In this sample trace table data, the parameter combination in rows 1, 2 and 5 is called the most, and the combinations in rows 3 and 4 are each called once. How can I go about determining this when I have thousands of rows instead of the 5 in this example?
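
    If it helps to experiment, here is a rough script that recreates the sample above in a temp table (the names are made up; my real trace table is different):

    -- Hypothetical recreation of the sample trace data above
    CREATE TABLE #TraceData (RowNo int, TextData nvarchar(max));
    INSERT INTO #TraceData (RowNo, TextData)
    VALUES (1, 'exec sproc @param1 = 2, @param2 = ''yes'', @param3 = ''no'''),
           (2, 'exec sproc @param1 = 2, @param2 = ''yes'', @param3 = ''no'''),
           (3, 'exec sproc @param1 = 1, @param2 = ''no'', @param3 = ''no'''),
           (4, 'exec sproc @param1 = 1, @param2 = ''no'', @param3 = ''yes'''),
           (5, 'exec sproc @param1 = 2, @param2 = ''yes'', @param3 = ''no''');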

  • Can you change the Sproc? If it were me I would, temporarily, add some logging to the Sproc.

    Even rename the SProc and write a wrapper, with original name, that logs the parameters and then EXECs the Sproc
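
    Something along these lines, purely as a sketch - the parameter names and types are guesses based on your sample, and it assumes the real proc has been renamed to dbo.sproc_inner:

    -- Hypothetical wrapper: the real proc has been renamed to dbo.sproc_inner,
    -- and this wrapper (with the original name) logs each call before passing it on
    CREATE TABLE dbo.SprocCallLog
    (
        LogId    int IDENTITY(1,1) PRIMARY KEY,
        CalledAt datetime NOT NULL DEFAULT GETDATE(),
        Param1   int,
        Param2   varchar(10),
        Param3   varchar(10)
    );
    GO
    CREATE PROCEDURE dbo.sproc
        @param1 int = NULL,
        @param2 varchar(10) = NULL,
        @param3 varchar(10) = NULL
    AS
    BEGIN
        -- Log the parameters ...
        INSERT INTO dbo.SprocCallLog (Param1, Param2, Param3)
        VALUES (@param1, @param2, @param3);

        -- ... then pass the call straight through to the original proc
        EXEC dbo.sproc_inner @param1 = @param1, @param2 = @param2, @param3 = @param3;
    END;
    GO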

    If the data is in a table and you want to split it then this perhaps:

    SELECT T.MyKey AS [Key],
           X.SplitItem AS [Item],
           LTrim(RTrim(X.SplitValue)) AS [Value]
    FROM MyTraceTable AS T
         CROSS APPLY dbo.MyStringSplitterFunction(T.MyDelimitedData, ',') AS X
    ORDER BY [Key], [Item]

    but it will barf if any of the parameters contain a comma - e.g. @param2 = 'y,s'
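
    For completeness, dbo.MyStringSplitterFunction above is just a placeholder name - a minimal sketch of one possible splitter (returning an item number and a value), assuming the list fits in nvarchar(4000) and has well under 100 items, might look like this:

    -- Minimal, hypothetical splitter: one row per delimited item,
    -- with its position (SplitItem) and its value (SplitValue)
    CREATE FUNCTION dbo.MyStringSplitterFunction (@List nvarchar(4000), @Delim nchar(1))
    RETURNS TABLE
    AS
    RETURN
    WITH Pieces (SplitItem, StartPos, EndPos) AS
    (
        SELECT 1, 1, CHARINDEX(@Delim, @List)
        UNION ALL
        SELECT SplitItem + 1, EndPos + 1, CHARINDEX(@Delim, @List, EndPos + 1)
        FROM Pieces
        WHERE EndPos > 0
    )
    SELECT SplitItem,
           SUBSTRING(@List, StartPos,
                     CASE WHEN EndPos > 0 THEN EndPos - StartPos
                          ELSE LEN(@List) - StartPos + 1 END) AS SplitValue
    FROM Pieces;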

  • I'm not sure whether the parameters are displayed in the proc definition order or the call order. If it's the proc definition order, then you do not need to split the string at all. Just group on the string.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (9/15/2015)


    I'm not sure whether the parameters are displayed in the proc definition order or the call order. If it's the proc definition order, then you do not need to split the string at all. Just group on the string.

    If I'm reading your question correctly, Drew's right. Here's an example of what he's saying:

    SELECT TextData, COUNT(*)
    FROM #TraceData
    GROUP BY TextData
    ORDER BY COUNT(*) DESC;

    We could go down the road of splitting the string, but I don't know that it's really necessary.

  • Thanks for the responses. You make a great point about just grouping it. The issue however is that since the sproc is so huge, the text data is an ntext column datatype and we can't group those.

  • johnsonchase7 (9/15/2015)


    Thanks for the responses. You make a great point about just grouping it. The issue however is that since the sproc is so huge, the text data is an ntext column datatype and we can't group those.

    You're looking at the calls to the procedure, right? Unless I'm missing something obvious (which may very well be the case), I don't see what the size of the procedure itself has to do with it.

    If you have your procedure calls in an ntext column, how large is the actual data? The calls to the procedure shouldn't require that wide of a column. If they're less than 8000 in actual length, then you could do something like this:

    SELECT CONVERT(Varchar(8000), TextData), COUNT(*)
    FROM #TraceData
    GROUP BY CONVERT(Varchar(8000), TextData)
    ORDER BY COUNT(*) DESC;

    Yes, it's going to take longer to run due to the conversion, but it should run and give you what you're after. Will this work for you?

  • This actually works for me. Thanks guys.

  • johnsonchase7 (9/15/2015)


    This actually works for me. Thanks guys.

    Excellent. Thanks for the feedback.

  • I wouldn't think you'd want to force the parameter values to be the same to group them. That is, as long as they specified params 1, 3 and 5, that would be a similar pattern, even if the actual param values were different.

    Maybe something like below to pull out the param names. You can then group / count them as you see fit.

    SELECT tt.rowno, LEFT(ds.Item, PATINDEX('%[ =]%', ds.Item) - 1) AS param_name
    FROM trace_table tt
    CROSS APPLY (
        SELECT SUBSTRING(tt.TextData, CHARINDEX('@', tt.TextData + '@'), 8000) AS params
    ) AS assign_alias_names
    CROSS APPLY dbo.DelimitedSplit8K ( params, '@' ) AS ds
    WHERE ds.Item > ''
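
    As a rough follow-up - purely a sketch, assuming you dump that query's output into a temp table #param_names (rowno, param_name) - you could then collapse each call's parameter names into a single string and count the combinations:

    -- Hypothetical: #param_names holds the (rowno, param_name) pairs produced above
    SELECT pn.param_combo, COUNT(*) AS call_count
    FROM (
        SELECT DISTINCT p1.rowno,
               STUFF((SELECT ',' + p2.param_name
                      FROM #param_names p2
                      WHERE p2.rowno = p1.rowno
                      ORDER BY p2.param_name
                      FOR XML PATH('')), 1, 1, '') AS param_combo
        FROM #param_names p1
    ) AS pn
    GROUP BY pn.param_combo
    ORDER BY COUNT(*) DESC;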

    SQL DBA, SQL Server MVP (07, 08, 09)
    "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." - "Norm", on "Cheers".
    Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

    Since you're on SQL 2012, you should convert your ntext columns to nvarchar(max) columns, which you can group and sort on.
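
    For example (assuming the trace data landed in a permanent table called dbo.TraceData; adjust the names to suit):

    -- Hypothetical: switch the ntext column to nvarchar(max) in place
    ALTER TABLE dbo.TraceData ALTER COLUMN TextData nvarchar(max) NULL;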

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • johnsonchase7 (9/15/2015)


    This actually works for me

    Is the syntax of all the parameters identical across all the calls (sequence, space padding, etc.)? No quoting of numbers in some calls and bare digits in others, or any other variations on a theme?

    If so I would count myself very lucky!
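
    If not, a little normalisation before grouping might get you close enough - something like this, purely as a sketch (it assumes case and whitespace are the only differences, and it will mangle any parameter values that themselves contain spaces):

    -- Hypothetical: collapse case/whitespace differences before counting
    SELECT REPLACE(LOWER(CONVERT(varchar(8000), TextData)), ' ', '') AS NormalisedCall,
           COUNT(*) AS Calls
    FROM #TraceData
    GROUP BY REPLACE(LOWER(CONVERT(varchar(8000), TextData)), ' ', '')
    ORDER BY COUNT(*) DESC;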
