performance issue, need to tune a query

  • WITH MaxD AS (select t1.* from D as t1

    inner join (

    SELECT ID, CN, TB, MAX(BD) as MaxBD

    FROM (

    SELECT ID, CN, TB, BD FROM [D]

    UNION ALL

    SELECT ID, CN, TB, BD FROM [M]

    ) AS t2

    GROUP BY ID, CN, TB)t3

    on t1.ID = t3.ID

    and t1.cn = t3.cn

    and t1.tb = t3.tb

    and t1.BD = t3.MaxBD

    )

    select * from (

    (SELECT '2' AS CI, t3.*

    FROM [M] t3

    LEFT JOIN MaxD t4 ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB

    WHERE t4.ID IS NULL

    UNION ALL

    SELECT  t4.*

    FROM MaxD t4

    right JOIN [M] t3 ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB )

    ) I

    where  I.CI != '1';

    I have optimized the logic to the above one. But still i am facing performance issues.

    Table M and Table D contains close to 150 million records in each table and each table has column store index.each table has 364(M table) and 365(D table) columns.

    I should not create indexes on any other columns. what is the best solution to tune the above query.

    To get the count for the above query it is taking 30 secs. if we can reduce down to atleast 15 secs by tuning it that would be great.

    Please help me on the above issue , thanks in advance.

     

  • having explain plan would be the best thing for us to help - there are a few things on your query that don't look right.

     

    -- original query formatted so we can read it easily.
    WITH MaxD AS (select t1.* from D as t1

    inner join (SELECT ID, CN, TB, MAX(BD) as MaxBD
    FROM (SELECT ID, CN, TB, BD FROM [D]
    UNION ALL
    SELECT ID, CN, TB, BD FROM [M]
    ) AS t2
    GROUP BY ID, CN, TB
    )t3
    on t1.ID = t3.ID
    and t1.cn = t3.cn
    and t1.tb = t3.tb
    and t1.BD = t3.MaxBD
    )
    select *
    from (SELECT '2' AS CI, t3.*
    FROM [M] t3
    LEFT JOIN MaxD t4
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    WHERE t4.ID IS NULL
    UNION ALL
    SELECT t4.*
    FROM MaxD t4
    right JOIN [M] t3
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    ) I
    where I.CI != '1';


    try this instead -

    drop table if exists #tempvals;

    -- create temp table definition and add index to it
    SELECT top (0)
    ID, CN, TB, MAX(BD) as MaxBD
    FROM (SELECT ID, CN, TB, BD FROM [D]
    UNION ALL
    SELECT ID, CN, TB, BD FROM [M]
    ) AS t2
    GROUP BY ID, CN, TB
    ;

    create clustered index ci_#tempvals on #tempvals
    (ID
    , CN
    , TB
    )

    -- now add the data to the temp table
    insert into #tempvals with (tablock)
    SELECT ID, CN, TB, MAX(BD) as MaxBD
    FROM (SELECT ID, CN, TB, BD FROM [D]
    UNION ALL
    SELECT ID, CN, TB, BD FROM [M]
    ) AS t2
    GROUP BY ID, CN, TB


    WITH MaxD AS (select t1.* from D as t1

    inner join #tempvals t3
    on t1.ID = t3.ID
    and t1.cn = t3.cn
    and t1.tb = t3.tb
    and t1.BD = t3.MaxBD
    )
    select *
    from (SELECT '2' AS CI, t3.*
    FROM [M] t3
    LEFT JOIN MaxD t4
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    WHERE t4.ID IS NULL
    UNION ALL
    SELECT t4.*
    FROM MaxD t4
    right JOIN [M] t3
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    ) I
    where I.CI != '1';



     

    I do have to query part your logic

    select *
    from (SELECT '2' AS CI, t3.*
    FROM [M] t3
    LEFT JOIN MaxD t4
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    WHERE t4.ID IS NULL
    UNION ALL
    SELECT t4.*
    FROM MaxD t4
    right JOIN [M] t3
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    ) I
    where I.CI != '1';

    you have T4.* on the union all right side with a right join - so there will be lots of rows being retrieved (or not if SQL is smart enough to completely ignore the join to M )

    all the rows that do NOT have a match on MaxD will be ignored due to the outside filter of I.CI != '1' as the value, according to the current query, will be null.

    is your original query correct, and working?

  • Somewhat confusing, the right join looks odd, but, anyway, you might try this and see if it produces the result you need:

    ;WITH DUniques AS (
    SELECT DISTINCT ID, CN, TB
    FROM [D]
    )
    SELECT *
    FROM (
    SELECT [D].*, ROW_NUMBER() OVER(PARTITION BY [D].ID, [D].CN, [D].TB ORDER BY [D].BD DESC) AS row_num
    FROM [D]
    ) AS MaxD
    WHERE row_num = 1 AND CI != '1'
    UNION ALL
    SELECT *
    FROM (
    SELECT '2' AS CI, [M].*, ROW_NUMBER() OVER(PARTITION BY [M].ID, [M].CN, [M].TB ORDER BY [M].BD DESC) AS row_num
    FROM [M]
    LEFT OUTER JOIN DUniques DU ON DU.ID = [M].ID AND DU.CN = [M].CN AND DU.TB = [M].TB
    WHERE DU.ID IS NULL
    ) AS MaxM
    WHERE row_num = 1

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Yes, my original code is working.

    One more thing i forgot to mention that the above query is a "view" as we cant use #temp tables in that.

  • that was a very important aspect to omit.

    try this - without the current explain plan its impossible to know if it will help

    WITH MaxD AS (select t1.* from D as t1

    inner join (SELECT ID, CN, TB, MAX(BD) as MaxBD
    FROM (SELECT ID, CN, TB, MAX(BD) AS BD FROM [D] group by ID, CN, TB -- if not doing it already this may push aggregation to vertipaq engine
    UNION ALL
    SELECT ID, CN, TB, MAX(BD) AS BD FROM [M] group by ID, CN, TB -- if not doing it already this may push aggregation to vertipaq engine
    ) AS t2
    GROUP BY ID, CN, TB
    )t3
    on t1.ID = t3.ID
    and t1.cn = t3.cn
    and t1.tb = t3.tb
    and t1.BD = t3.MaxBD
    )
    select *
    from (SELECT '2' AS CI, t3.*
    FROM [M] t3
    LEFT JOIN MaxD t4
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    WHERE t4.ID IS NULL
    UNION ALL
    SELECT t4.*
    FROM MaxD t4
    INNER JOIN [M] t3 -- changed to inner join - the where I.CI != '1' below turns this into a inner join as only match rows will have a value on column CI
    ON t4.ID = t3.ID AND t4.CN = t3.CN AND t4.TB = t3.TB
    ) I
    where I.CI != '1';
  • The above query you provided is good, but the performance is not up to the mark.

    Actually below is the logic i am looking for.

    M intersection D' (M∩D')

    conditon -- retrieve dataset based on Max(date) is nothing but Max(BD) by clubbing  M and D tables group by ID,CN,TB

    M table Doesnt have CI column, we are hard coding 2 as CI in M table

    M' union D (M'∪D)

    conditon - retrieve dataset based on Max(date) is nothing but Max(BD) by clubbing  M and D tables group by ID,CN,TB

    Final result needed -- M union D  --> (M∩D') ∪ (M'∪D)

    condition - CI !='1'

    can you please provide optimized t-sql query . Thanks in advance.

     

  • kris6363 wrote:

    can you please provide optimized t-sql query . Thanks in advance.

    Like has already been stated, it's pretty much impossible to help with performance issues just by looking at code.  Please see the article at the second link in my signature line for what and how to post for performance problems.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply