How Can I set the right value according to comparison conditition

  • Hello,

    i have 2 tables , the first one is a referential of the quality services and the second is a historical tables as bellow the DDL :

    drop table if exists #oderstatus_quality
    create table #oderstatus_quality (
    id int identity primary key,
    label varchar(100),
    orderstatus_src int,
    orderstatus_src_label varchar(100),
    orderstatus_tgt int,
    orderstatus_tgt_label varchar(100)
    )
    insert into #oderstatus_quality values
    ('refunded vs Ordered',36,'Ordered',37,'refunded')
    ,('Cancelled vs Ordered',36,'Ordered',35,'Cancelled')
    ,('received vs Ordered',36,'Ordered',9,'received')
    ,('navigator Closed vs Cancelled',4,'Cancelled',11,'navigator Closed')
    ,('Banking Authorization refused vs Cancelled',4,'Cancelled',13,'Banking Authorization refused')
    drop table if exists #ordersStatusHistory
    Create table #ordersStatusHistory(
    id int identity primary key,
    order_number varchar(255),
    order_status int,
    order_date datetime
    )
    insert into #ordersStatusHistory
    values
    ('XN75',36,'2022-01-27 15:29:22.000')
    ,('XN75',9,'2022-02-01 07:15:53.000')
    ,('XN75',37,'2022-02-03 07:50:29.000')
    ,('XN76',36,'2022-03-01 07:15:53.000')
    ,('XN76',9,'2022-03-05 07:15:53.000')
    ,('XN77',36,'2022-02-03 07:15:53.000')
    ,('XN77',36,'2022-02-03 08:15:53.000')
    ,('XN77',9,'2022-02-04 17:30:00.000')
    ,('XN78',36,'2022-02-05 07:15:53.000')
    ,('XN78',4,'2022-02-05 07:20:53.000')
    ,('XN79',4,'2022-02-06 07:20:53.000')
    ,('XN79',11,'2022-02-06 07:21:53.000')
    ,('XN80',36,'2022-04-06 09:15:53.000')
    ,('XN80',4,'2022-04-06 09:20:53.000')
    ,('XN80',11,'2022-04-06 09:21:53.000')

    For the reporting purpose , i want to calculate the quality of my delivery for that i need two measures Comparaison_Status and All_Status

    the comparaison status it's the number of status 1 vs status 2 for example i have 'refunded vs Ordered' That mean i want to calcuate the number of orders that was orded but also was refunded

    The All_Status , it's the number of the status 2 for example i have 'refunded vs Ordered' that mean i want to see only the the orders that was ordered

    I tried the query as bellow , it's give me the right result but if someone have another solution and better for performance because my table is huge arround 30 millions rows , it will better

    ;with src as (
    select *,ROW_NUMBER() over(partition by order_number,order_status order by order_date asc) as grp from #ordersStatusHistory o
    where o.order_status in(select distinct orderstatus_src from #oderstatus_quality)
    )
    ,tgt as(
    select *,ROW_NUMBER() over(partition by order_number,order_status order by order_date asc) as grp from #ordersStatusHistory o
    where o.order_status in(select distinct orderstatus_tgt from #oderstatus_quality)
    )
    ,numberofAllstatus as(
    select *, 1 AS AllValues from src
    )

    select
    s.order_number
    ,s.order_date
    ,q.label
    ,count(s.order_status) as Comparaison_Status
    ,max(s.AllValues) as All_Status
    from numberofAllstatus s
    inner join tgt t
    on s.order_number = t.order_number and s.grp = t.grp
    inner join #oderstatus_quality q
    on s.order_status =q.orderstatus_src and t.order_status = q.orderstatus_tgt
    group by
    s.order_number
    ,s.order_date
    ,q.label
    union
    select
    s.order_number
    ,s.order_date
    ,q.label
    ,0 as Comparaison_Status
    ,max(s.AllValues) as All_Status
    from #oderstatus_quality q
    left join numberofAllstatus s
    on s.order_status =q.orderstatus_src
    left join tgt t
    on s.order_number = t.order_number and s.grp = t.grp and t.order_status = q.orderstatus_tgt
    where
    1=1
    --and s.order_number='XN75'
    and t.id is null
    group by
    s.order_number
    ,s.order_date
    ,q.label

    the expected result that i want :

    Screenshot 2024-03-23 005536

    Thanks for help !

     

    • This topic was modified 10 months ago by  MrRobot.
  • You have 15 rows that got expanded to 24 and you say you have 30 million original rows, which would be expanded to 48 million rows according the 24/15 ratio of the original data, and you want that to be fast?

    I'm thinking that there needs to be some aggregation/analysis somewhere along the line to seriously reduce the output even if the output is going to a table.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • One thought, which Jeff implied, on performance improvement would be to pre-calculate the data. What I mean is rather than calculating the data on demand, do a nightly (or hourly) dump of the data to a table and then have the report pull the data from the new table.

    BUT, if like what Jeff said, you are going to have 48 million rows, Excel is going to barf. Excel can't handle that much data. Plus, if Excel cannot handle 48 million rows, how do you expect your end user to consume 48 million rows of data?

    Nobody wants a report with 48 million rows in it. It is too much data. I would go back to the report requester and see if they can define the problem they are trying to solve and then you can work on giving them a report that is useful to them rather than a huge dump.

    I am also a bit confused about your data result as the date per order seems to be the same on every row, yet the date per order is different in your history table. Is this a typo on your part or are you expecting the date to be the minimum date for that order?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • This query produces the exact same results as your query with far fewer reads.

    WITH Order_Statuses AS
    (
    SELECT h.order_number
    , h.order_date
    , q.[label]
    , q.orderstatus_src
    , q.orderstatus_tgt
    , h.order_status
    , ROW_NUMBER() OVER(PARTITION BY h.order_number, q.[label], h.order_status ORDER BY h.order_date) AS rn
    FROM #oderstatus_quality AS q
    INNER JOIN #ordersStatusHistory AS h
    ON h.order_status IN (q.orderstatus_src, q.orderstatus_tgt)
    )
    SELECT os.order_number
    , MIN(os.order_date) AS order_date
    , os.[label]
    --, os.rn
    , SUM(CASE WHEN os.orderstatus_tgt = order_status THEN 1 ELSE 0 END) AS Comparison_Status
    , 1 AS All_Status
    FROM Order_Statuses AS os
    GROUP BY os.order_number, os.rn, os.[label]
    ORDER BY os.order_number, os.rn;

    Here are the read stats.

    /*  Original query  */
    Table 'Worktable'. Scan count 2, logical reads 86, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table '#oderstatus_quality_00000003A926'. Scan count 6, logical reads 128, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table '#ordersStatusHistory_00000003A927'. Scan count 4, logical reads 8, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    /* Drew's query */
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table '#ordersStatusHistory_00000003A927'. Scan count 1, logical reads 11, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
    Table '#oderstatus_quality_00000003A926'. Scan count 1, logical reads 2, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply