SQL SERVER Comma Separated values XML PATH

  • Hi

    I Have a very critical issue in Comma separated values using XML Path

    I have table "MENU" storing data as mentioned below

    MENU Table:

    -----------------

    CallStartDateTime MenuNames

    2014-09-18 srinivas;saipu;vasu;krishna;srinivas;saipu;krishna

    2014-09-17 srinivas;saipu;vasu;krishna;srinivas;saipu;krishna;vasu;saipu

    2014-09-18 krishna;srinivas;saipu;vasu

    2014-09-18 saipu;vasu;a1;a2;a3;.......a100

    2014-09-16 saipu;vasu;a1;a2;a3;.......a100

    2014-09-18 a100;a99;a34;saipu;vasu;krishna;

    Output required:Table Name as "TEMP"

    TEMP TABLE:

    ---------------

    CallStartDateTime MenuNames

    2014-09-18 srinivas;saipu;vasu;krishna;srinivas;saipu;krishna; krishna;srinivas;saipu;

    vasu;saipu;vasu;a1;a2;a3;.......a100; a100;a99;a34;saipu;vasu;krishna;

    Everytime I need to get getdate-1 records to temp table.Single row output.

    Please help me.

    Thanks

    Srini

  • Here is a suggestion

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @SAMPLE_DATA TABLE

    (

    CallStartDateTime DATE NOT NULL

    ,MenuNames VARCHAR(100) NOT NULL

    );

    INSERT INTO @SAMPLE_DATA(CallStartDateTime,MenuNames)

    VALUES

    ('2014-09-18','srinivas;saipu;vasu;krishna;srinivas;saipu;krishna')

    ,('2014-09-17','srinivas;saipu;vasu;krishna;srinivas;saipu;krishna;vasu;saipu')

    ,('2014-09-18','krishna;srinivas;saipu;vasu')

    ,('2014-09-18','saipu;vasu;a1;a2;a3;.......a100')

    ,('2014-09-16','saipu;vasu;a1;a2;a3;.......a100')

    ,('2014-09-18','a100;a99;a34;saipu;vasu;krishna;');

    ;WITH DISTINCT_DATES AS

    (

    SELECT DISTINCT

    SD.CallStartDateTime

    FROM @SAMPLE_DATA SD

    )

    SELECT

    DD.CallStartDateTime

    ,STUFF((SELECT CHAR(44) + SD.MenuNames

    FROM @SAMPLE_DATA SD

    WHERE DD.CallStartDateTime = SD.CallStartDateTime

    FOR XML PATH(''),TYPE).value('.[1]','VARCHAR(MAX)'),1,1,'') AS MenuNames

    FROM DISTINCT_DATES DD;

    Results

    CallStartDateTime MenuNames

    ----------------- ------------------------------------------------------------------------------------------------------------------------------------------------

    2014-09-16 saipu;vasu;a1;a2;a3;.......a100

    2014-09-17 srinivas;saipu;vasu;krishna;srinivas;saipu;krishna;vasu;saipu

    2014-09-18 srinivas;saipu;vasu;krishna;srinivas;saipu;krishna,krishna;srinivas;saipu;vasu,saipu;vasu;a1;a2;a3;.......a100,a100;a99;a34;saipu;vasu;krishna;

  • Hi

    Thanks for your help.

    This is working fine if the records are less

    And not working if you have lakhs of records.

    So in my production server it is not working.

    It is not fetching all the comma separated values till the end of rows.

    Please help me to resolve this issue.

    Thanks

    Srinivas.

  • Not certain what is causing the problem, nothing in the code should limit the length apart from the maximum 2Gb limit of the varchar(max). Consider the following example

    😎

    GO

    SET NOCOUNT ON;

    /* Create a 24000 character string variable

    Replicate has a limit of 8000 bytes, so we fill one variable

    and then concatenate into another three times the value

    */

    DECLARE @TSTR_1 VARCHAR(MAX) = REPLICATE('1234567890',800)

    DECLARE @TSTR_2 VARCHAR(MAX) = @TSTR_1 + @TSTR_1 + @TSTR_1;

    DECLARE @SAMPLE_DATA TABLE

    (

    CallStartDateTime DATE NOT NULL

    ,MenuNames VARCHAR(MAX) NOT NULL

    );

    INSERT INTO @SAMPLE_DATA(CallStartDateTime,MenuNames)

    VALUES

    ('2014-09-18','srinivas;saipu;vasu;krishna;srinivas;saipu;krishna')

    ,('2014-09-17','srinivas;saipu;vasu;krishna;srinivas;saipu;krishna;vasu;saipu')

    ,('2014-09-18','krishna;srinivas;saipu;vasu')

    ,('2014-09-18','saipu;vasu;a1;a2;a3;.......a100')

    ,('2014-09-16','saipu;vasu;a1;a2;a3;.......a100')

    ,('2014-09-18',@TSTR_2 );

    ;WITH DISTINCT_DATES AS

    (

    SELECT DISTINCT

    SD.CallStartDateTime

    FROM @SAMPLE_DATA SD

    )

    SELECT

    DD.CallStartDateTime

    ,STUFF((SELECT CHAR(44) + SD.MenuNames

    FROM @SAMPLE_DATA SD

    WHERE DD.CallStartDateTime = SD.CallStartDateTime

    FOR XML PATH(''),TYPE).value('.[1]','VARCHAR(MAX)'),1,1,'') AS MenuNames

    ,LEN(STUFF((SELECT CHAR(44) + SD.MenuNames

    FROM @SAMPLE_DATA SD

    WHERE DD.CallStartDateTime = SD.CallStartDateTime

    FOR XML PATH(''),TYPE).value('.[1]','VARCHAR(MAX)'),1,1,'')) AS Menu_Len

    FROM DISTINCT_DATES DD;

    Results (chopped)

    CallStartDateTime MenuNames Menu_Len

    ----------------- ----------------------------- ---------

    2014-09-16 saipu;vasu;a1;a2;a3;..... 31

    2014-09-17 srinivas;saipu;vasu;krish 61

    2014-09-18 srinivas;saipu;vasu;krish.. 24111

  • HI

    Thanks for your help

    Still facing the same problem.Actually the original query is mentioned below.

    changed the column MenuName as NodeTraversed.

    First Iam going to mentioned below function which splits comma separated values before going to actual query

    Using Function :

    DelimitedSplit8K_T1

    --------------------------------------------------------------------------------------------------------------------

    USE [UIDAI_IVR_DB]

    GO

    /****** Object: UserDefinedFunction [dbo].[DelimitedSplit8K_T1] Script Date: 09/22/2014 21:45:20 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    ALTER FUNCTION [dbo].[DelimitedSplit8K_T1]

    --===== Define I/O parameters

    (@pString VARCHAR(8000), @pDelimiter CHAR(1))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    ),

    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT 1 UNION ALL -- does away with 0 base CTE, and the OR condition in one go!

    SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter

    ),

    cteLen(N1,L1) AS(--==== Return start and length (for use in substring)

    SELECT s.N1,

    ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)

    FROM cteStart s

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),

    Item = SUBSTRING(@pString, l.N1, l.L1)

    FROM cteLen l

    ;

    ----------------------------------------------------------------------------------------------------------------------------

    Original Query:

    ---------------

    select * into #temp

    from

    (

    SELECT convert(varchar,t1.CallStartDateTime,121) as date,

    NodeTraversed=STUFF(

    (SELECT ';' + NodeTraversed

    FROM Call_Detail_Report_Repository t2

    WHERE convert(varchar,t1.CallStartDateTime,121) = convert(varchar,t2.CallStartDateTime,121)

    FOR XML PATH(''),TYPE).value('.[1]','VARCHAR(MAX)'),1,1,'')

    FROM Call_detail_Report_Repository t1 where convert(varchar,CallStartDateTime,112) = convert(varchar,getdate()-1,112)

    GROUP BY convert(varchar,t1.CallStartDateTime,121)

    )t

    select * from #temp

    drop table #temp

    IF DATEPART (HH,GETDATE()) in

    (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)

    INSERT INTO [UIDAI_IVR_DB].[dbo].[Node_Traversed_Summary_Report]

    select t2.date as CallStartDate,t2.item as MenuName,count(t2.item)as NoOfHits

    FROM

    (

    SELECT * from #temp

    cross apply dbo.DelimitedSplit8K_T1(#temp.NodeTraversed,';')

    )t2

    group by t2.date,t2.item

    order by t2.date

    ELSE IF DATEPART (HH,GETDATE()) in (0)

    INSERT INTO [UIDAI_IVR_DB].[dbo].[Node_Traversed_Summary_Report]

    select t2.date as CallStartDate,t2.item as MenuName,count(t2.item)as NoOfHits

    FROM

    (

    SELECT * from #temp

    cross apply dbo.DelimitedSplit8K_T1(#temp.NodeTraversed,';')

    )t2

    group by t2.date,t2.item

    order by t2.date

    drop table #temp

    I need to get the output as

    CallStartDate NodeTraversed Noofhits(count of each NodeTraversed)

    2014-09-21 vasu 111110

    2014-09-21 srinivas 232345

    2014-09-21 krish 23234

    Please help me in this.If u need more to elaborate. I will send you in mail

    Please provide your mail ID.

    Thanks

    Srinivas

  • The split function you cited returns table of VARCHAR(8000) because this is what Item = SUBSTRING(@pString, l.N1, l.L1) returns due to it's first argument type. Provided Call_Detail_Report_Repository.NodeTraversed is the result of the split function, you need to CAST it to VARCHAR(MAX) explicitly.

  • cnu1252 (9/22/2014)


    HI

    Thanks for your help

    Still facing the same problem.Actually the original query is mentioned below.

    changed the column MenuName as NodeTraversed.

    First Iam going to mentioned below function which splits comma separated values before going to actual query

    ...

    This is quite different to your first post. Can you supply some sample data to work with and a table of expected output? Thanks.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Firstly, Using 8K limited functions will of course truncate larger values, does your production data exceed that limit? Secondly, please comply to Chris's request on DDL, sample and output.

    😎

  • Hi

    The below query is working fine but it is taking more time to execute 2-3lakhs records per day

    It has taken 2:30min in my production server today.Please check the below query.I have more than 50k comma separated items in each row and total rows in that table is around 2-3lakhs per day.

    How to make to execute faster??

    -- table definition and data

    declare @CallDetailReport table

    (

    CallStartDateTime datetime,

    NodeTraversed nvarchar(MAX)

    )

    insert into @CallDetailReport (CallStartDateTime, NodeTraversed)

    select '2014-09-22 03:44:33', 'Srinivas;vasu;lakshmi;srini;srini'

    union all select '2014-09-20 09:42:00', 'vasu;kumar;raj;'

    union all select '2014-09-21 23:43:11', 'Srinivas;srini'

    union all select '2014-09-22 12:33:44', 'krishh;raj;kumar;Srinivas;srini;'

    union all select '2014-09-22 01:33:33', 'vasu;srini;lakshmi;raj;krishh;Srinivas;'

    union all select '2014-09-21 05:11:09', 'krishh;raj;srini'

    --=====================================================================--=============================================================================

    -- Create and populate a Tally table

    --===== Conditionally drop

    IF OBJECT_ID('dbo.Tally') IS NOT NULL

    DROP TABLE dbo.Tally

    --===== Create and populate the Tally table on the fly

    SELECT TOP 11000

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --=============================================================================

    -- Main query to group by date (and remove nodes visited only once for the date)

    select convert(char(10), X.CallStartDateTime, 120) as CallStartDateTime, X.Node as MenuName, count(*) as NoOfHits

    from (

    -- Find words separated by ";" in NodeTraversed

    select R.CallStartDateTime, substring(R.NodeTraversed, T.N, MIN(T2.N-2) - T.N + 1) AS Node

    from CallDetailReport R

    inner join dbo.Tally T

    on substring(';' + R.NodeTraversed, T.N, 1) = ';'

    and T.N < len(R.NodeTraversed) -- ignore when ";" is last character

    inner join dbo.Tally T2

    on substring(';'+ R.NodeTraversed + ';', T2.N, 1) = ';'

    and T2.N > T.N WHERE convert(varchar,CallStartDateTime,101) = convert(varchar,GETDATE()-1,101)

    group by CallStartDateTime, R.NodeTraversed, T.N

    ) X

    group by convert(char(10), CallStartDateTime, 120), Node

    order by 1, 2

    output:

    CallStartDateTimeMenuName NoOfHits(/*count of each item*/)

    2014-09-23 Srinivas 5725343

    2014-09-23 vasu 3434

    2014-09-23 Kumar 192934

    2014-09-23 raj 2400343

    2014-09-23 srini 1525454

    2014-09-23 Krishh 189534

  • Two main issues with the original: the date filter isn't SARGable whereas CAST(datetimecolumn AS DATE) is, and you're referencing the tally table twice. Try this. You may have to make one or two adjustments but it won't be far off:

    SELECT

    CallStartDateTime = CAST(CallStartDateTime AS DATE),

    d.item AS MenuName,

    COUNT(*) as NoOfHits

    FROM @CallDetailReport R

    CROSS APPLY ( -- d

    SELECT

    ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),

    Item = SUBSTRING(R.NodeTraversed, l.N1, l.L1)

    FROM ( -- l

    SELECT

    s.N1,

    L1 = ISNULL(NULLIF(CHARINDEX(';',R.NodeTraversed,s.N1),0)-s.N1,8000) -- change this?

    FROM ( -- s

    SELECT n1 = 1

    UNION ALL

    SELECT t.N+1

    FROM ( -- t

    SELECT TOP (ISNULL(DATALENGTH(R.NodeTraversed),0)) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM dbo.Tally

    ) t

    WHERE SUBSTRING(R.NodeTraversed,t.N,1) COLLATE Latin1_General_BIN = ';'

    ) s

    ) l

    ) d

    WHERE CAST(CallStartDateTime AS DATE) = DATEADD(DAY,-1,CAST(GETDATE() AS DATE))

    AND d.item > ''

    GROUP BY CAST(CallStartDateTime AS DATE), d.item

    ORDER BY CAST(CallStartDateTime AS DATE), d.item

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks a lot.As of now it is working fine.Its just taking 2min for executing 2lakhs records.

    Thank you very much for your help.

    Regards

    Srinivas

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply