help optimizing a very complex query (join issues?)

  • I need help optimizing a query. I'm trying to create a result set of 312 rows, even if they are filled with zeros. I using a 'date lookup' table and cross joining to a flat list of char(1) district codes. I think this might be slowing me down, but if I don't some of the rows drop out for lack of data. In this example, I'm looking for 312 rows -- 13 codes * 24 months, 'with cube' will give me 350. Can anyone give me some suggestions?

    
    SELECT do.[itdo],DATEDIFF(mm,0,ud.[date]) AS MonthGroup,
    COUNT(r.[rapintakeid]) AS [Total],
    SUM(CASE r.[rapfinalrec] WHEN 'Y' THEN 1 END) AS [Accepted_Total],
    SUM(CASE WHEN r.[rapfinalrec]='Y' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Accepted_Chap55],
    SUM(CASE WHEN r.[rapfinalrec]='Y' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Accepted_Chap49],
    SUM(CASE d.[sd_spvsrrev] WHEN 'S' THEN 1 END) AS [Substant_Total],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Substant_Chap55],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Substant_Chap49],
    SUM(CASE d.[sd_spvsrsub] WHEN 'O' THEN 1 END) AS [Open_Total],
    SUM(CASE
    WHEN d.[sd_spvsrsub]='O' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL')
    THEN 1 END
    ) AS [Open_Chap55],
    SUM(CASE
    WHEN d.[sd_spvsrsub]='O' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL))
    THEN 1 END
    ) AS [Open_Chap49]
    FROM (SELECT [date] FROM dbo.[utility_date] WHERE [date]>='1/1/2003' AND [date]<'1/1/2005') ud
    CROSS JOIN (SELECT 'T'as [itdo] UNION SELECT 'M' UNION SELECT 'N' UNION SELECT 'Z' UNION SELECT 'S' UNION SELECT 'L' UNION SELECT 'A' UNION SELECT 'B' UNION SELECT 'Y' UNION SELECT 'R' UNION SELECT 'J' UNION SELECT 'V' UNION SELECT 'H') do
    LEFT JOIN 
    (SELECT [itid],[itdo],[itdate],[itdvsa],[itfamviol],[itinciddesc] FROM dbo.[intake] WHERE [itde]<>[itdc] AND [itdate]>='1/1/2003' AND [itdate]<'1/1/2005')
    i ON (CONVERT(varchar(10),ud.[date],110)=CONVERT(varchar(10),i.[itdate],110) AND i.[itdo]=do.[itdo])
    LEFT JOIN dbo.[reportaccpri] r ON i.[itid]=r.[rapintakeid]
    LEFT JOIN dbo.[family_service_disp] d ON i.[itid]=d.[sd_itid]
    
    GROUP BY do.[itdo],DATEDIFF(mm,0,ud.[date]) WITH CUBE
    
    ORDER BY do.[itdo],MonthGroup
    
  • Can you give some textual output data?

    Some things that might improve:

    Union -> Union All

    Rewrite CONVERT(varchar(10),ud.[date],110)=CONVERT(varchar(10),i.[itdate],110) to use BETWEEN instead of performing a function on a column

  • They 'union all' seemed useful, but it's hard to measure. As for not using a 'between', the trouble is, I'm trying to compare within a month, so I have to use a function somewhere.

    Here's a couple of the output lines.

    
    A2003-01-01 00:00:004017NULL17NULLNULLNULLNULLNULLNULL
    A2003-02-01 00:00:003718NULL18NULLNULLNULLNULLNULLNULL
    A2003-03-01 00:00:0053NULL3NULLNULLNULLNULLNULLNULL
    A2004-10-01 00:00:0092315268NULL82NULL2
    A2004-11-01 00:00:00733553014113211
    A2004-12-01 00:00:00813372612NULL124NULL4
    B2003-01-01 00:00:00173NULL3NULLNULLNULLNULLNULLNULL
    B2003-02-01 00:00:0031NULL1NULLNULLNULLNULLNULLNULL
    B2003-03-01 00:00:001NULLNULLNULLNULLNULLNULLNULLNULLNULL
    B2003-04-01 00:00:0012345NULL4511NULL116NULL6
    B2004-10-01 00:00:0026660114913112817
    B2004-11-01 00:00:0025862144817116918
    B2004-12-01 00:00:002635875118NULL188NULL8
    H2003-01-01 00:00:000NULLNULLNULLNULLNULLNULLNULLNULLNULL
    H2003-02-01 00:00:000NULLNULLNULLNULLNULLNULLNULLNULLNULL
    H2003-03-01 00:00:000NULLNULLNULLNULLNULLNULLNULLNULLNULL
    H2003-04-01 00:00:00439NULL92NULL22NULL2
    

    Also, I've since refined it further, which I'll post separately (the unabridged version is long).

  • Here's my more complete version. I've dropped use of the lookup table, and combined it with the district codes for joining. My speed is greatly improved, but only if I don't grab the execution plan 🙁

    
    DECLARE @filterDateBegin smalldatetime,
    @filterDateEnd smalldatetime,
    @districtList varchar(80),
    @counterDate smalldatetime
    DECLARE @dList TABLE(
    [itdo] varchar(1) COLLATE database_default
    )
    DECLARE @seed TABLE(
    [itdo] char(1),
    [monthGroup] smalldatetime,
    [dateDiff] int
    )
    SET @filterDateBegin='1/1/2003'
    SET @filterDateEnd='12/31/2004'
    SET @counterDate=@filterDateBegin
    SET @districtList=null
    SET @filterDateEnd=DATEADD(d,1,@filterDateEnd)
    IF @districtList IS NOT NULL INSERT @dList SELECT * FROM dbo.fn_parseList(@districtList,default)
    WHILE DATEDIFF(m,@counterDate,@filterDateEnd)>0
    BEGIN
    INSERT @seed
    SELECT *
    FROM (SELECT 'T' AS [itdo] UNION ALL SELECT 'M' UNION ALL SELECT 'N' UNION ALL SELECT 'Z' UNION ALL SELECT 'S' UNION ALL SELECT 'L' UNION ALL SELECT 'A' UNION ALL SELECT 'B' UNION ALL SELECT 'Y' UNION ALL SELECT 'R' UNION ALL SELECT 'J' UNION ALL SELECT 'V' UNION ALL SELECT 'H') ds
    CROSS JOIN (SELECT DATEADD(m,DATEDIFF(m,0,@counterDate),0)AS [monthGroup],DATEDIFF(m,0,@counterDate)AS [dateDiff]) dt
    SET @counterDate=DATEADD(m,1,@counterDate)
    END
    --==: END DECLARATIONS :==--
    
    
    SELECT ud.[itdo],ud.[monthGroup],
    COUNT(r.[rapintakeid]) AS [Total],
    SUM(CASE r.[rapfinalrec] WHEN 'Y' THEN 1 END) AS [Accepted_Total],
    SUM(CASE WHEN r.[rapfinalrec]='Y' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Accepted_Chap55],
    SUM(CASE WHEN r.[rapfinalrec]='Y' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Accepted_Chap49],
    SUM(CASE d.[sd_spvsrrev] WHEN 'S' THEN 1 END) AS [Substant_Total],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Substant_Chap55],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Substant_Chap49],
    SUM(CASE d.[sd_spvsrsub] WHEN 'O' THEN 1 END) AS [Open_Total],
    SUM(CASE
    WHEN d.[sd_spvsrsub]='O'
    AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL')
    THEN 1 END
    ) AS [Open_Chap55],
    SUM(CASE
    WHEN d.[sd_spvsrsub]='O'
    AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL))
    THEN 1 END
    ) AS [Open_Chap49]
    FROM @seed ud
    LEFT JOIN (SELECT [itid],[itdo],[itdate],[itdvsa],[itfamviol],[itinciddesc] FROM dbo.[intake] WHERE [itde][itdc] AND [itdate]>=@filterDateBegin AND [itdate]<@filterDateEnd) i ON (DATEDIFF(m,0,i.[itdate])=ud.[dateDiff] AND i.[itdo]=ud.[itdo])
    LEFT JOIN dbo.[reportaccpri] r ON i.[itid]=r.[rapintakeid]
    LEFT JOIN dbo.[family_service_disp] d ON i.[itid]=d.[sd_itid]
    GROUP BY ud.[itdo],ud.[monthGroup]
    HAVING CASE
    WHEN @districtList IS NULL THEN 1
    WHEN @districtList IS NOT NULL AND (ud.[itdo] IN (SELECT * FROM @dList) OR ud.[itdo] IS NULL) THEN 1
    ELSE 0
    END =1
    ORDER BY ud.[itdo],ud.[monthGroup]
    
  • It's a big piece of code, glad that I'm not responsible for it

    If I was the one who faced this query I would open the connection option "SET STATISTICS IO ON" and try to see what tables are scanned most. If "reportaccpri" table is scanned too much and you have an index on column "rapsrchap55" then you might consider seperating the script into two parts and union them all. Just to explain what I said; (part of it to make it easier to read)

    SUM(CASE WHEN r.[rapfinalrec]='Y' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Accepted_Chap49],
    SUM(CASE d.[sd_spvsrrev] WHEN 'S' THEN 1 END) AS [Substant_Total],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Substant_Chap55],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Substant_Chap49],
    UNION ALL
    SUM(CASE WHEN r.[rapfinalrec]='Y' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Accepted_Chap49],
    SUM(CASE d.[sd_spvsrrev] WHEN 'S' THEN 1 END) AS [Substant_Total],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND r.rapsrchap55 IN ('CHB','UNM','ENV','TRU','DEL') THEN 1 END) AS [Substant_Chap55],
    SUM(CASE WHEN d.[sd_spvsrrev]='S' AND (r.rapsrchap55='INA' OR (r.rapsrchap55 IS NULL AND r.[rapintakeid] IS NOT NULL)) THEN 1 END) AS [Substant_Chap49],

    I included the striked lines for referance. Just copy&paste your code and place a "UNION ALL" between. Then remove the striked lines and hopefully find a good increase in performance. (If the above conditions meet otherwise don't bother yourself)

    Good Luck

    Zubeyir

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply