March 22, 2024 at 11:58 pm
Hello,
i have 2 tables , the first one is a referential of the quality services and the second is a historical tables as bellow the DDL :
drop table if exists #oderstatus_quality
create table #oderstatus_quality (
id int identity primary key,
label varchar(100),
orderstatus_src int,
orderstatus_src_label varchar(100),
orderstatus_tgt int,
orderstatus_tgt_label varchar(100)
)
insert into #oderstatus_quality values
('refunded vs Ordered',36,'Ordered',37,'refunded')
,('Cancelled vs Ordered',36,'Ordered',35,'Cancelled')
,('received vs Ordered',36,'Ordered',9,'received')
,('navigator Closed vs Cancelled',4,'Cancelled',11,'navigator Closed')
,('Banking Authorization refused vs Cancelled',4,'Cancelled',13,'Banking Authorization refused')
drop table if exists #ordersStatusHistory
Create table #ordersStatusHistory(
id int identity primary key,
order_number varchar(255),
order_status int,
order_date datetime
)
insert into #ordersStatusHistory
values
('XN75',36,'2022-01-27 15:29:22.000')
,('XN75',9,'2022-02-01 07:15:53.000')
,('XN75',37,'2022-02-03 07:50:29.000')
,('XN76',36,'2022-03-01 07:15:53.000')
,('XN76',9,'2022-03-05 07:15:53.000')
,('XN77',36,'2022-02-03 07:15:53.000')
,('XN77',36,'2022-02-03 08:15:53.000')
,('XN77',9,'2022-02-04 17:30:00.000')
,('XN78',36,'2022-02-05 07:15:53.000')
,('XN78',4,'2022-02-05 07:20:53.000')
,('XN79',4,'2022-02-06 07:20:53.000')
,('XN79',11,'2022-02-06 07:21:53.000')
,('XN80',36,'2022-04-06 09:15:53.000')
,('XN80',4,'2022-04-06 09:20:53.000')
,('XN80',11,'2022-04-06 09:21:53.000')
For the reporting purpose , i want to calculate the quality of my delivery for that i need two measures Comparaison_Status and All_Status
the comparaison status it's the number of status 1 vs status 2 for example i have 'refunded vs Ordered' That mean i want to calcuate the number of orders that was orded but also was refunded
The All_Status , it's the number of the status 2 for example i have 'refunded vs Ordered' that mean i want to see only the the orders that was ordered
I tried the query as bellow , it's give me the right result but if someone have another solution and better for performance because my table is huge arround 30 millions rows , it will better
;with src as (
select *,ROW_NUMBER() over(partition by order_number,order_status order by order_date asc) as grp from #ordersStatusHistory o
where o.order_status in(select distinct orderstatus_src from #oderstatus_quality)
)
,tgt as(
select *,ROW_NUMBER() over(partition by order_number,order_status order by order_date asc) as grp from #ordersStatusHistory o
where o.order_status in(select distinct orderstatus_tgt from #oderstatus_quality)
)
,numberofAllstatus as(
select *, 1 AS AllValues from src
)
select
s.order_number
,s.order_date
,q.label
,count(s.order_status) as Comparaison_Status
,max(s.AllValues) as All_Status
from numberofAllstatus s
inner join tgt t
on s.order_number = t.order_number and s.grp = t.grp
inner join #oderstatus_quality q
on s.order_status =q.orderstatus_src and t.order_status = q.orderstatus_tgt
group by
s.order_number
,s.order_date
,q.label
union
select
s.order_number
,s.order_date
,q.label
,0 as Comparaison_Status
,max(s.AllValues) as All_Status
from #oderstatus_quality q
left join numberofAllstatus s
on s.order_status =q.orderstatus_src
left join tgt t
on s.order_number = t.order_number and s.grp = t.grp and t.order_status = q.orderstatus_tgt
where
1=1
--and s.order_number='XN75'
and t.id is null
group by
s.order_number
,s.order_date
,q.label
the expected result that i want :
Thanks for help !
March 23, 2024 at 9:08 pm
You have 15 rows that got expanded to 24 and you say you have 30 million original rows, which would be expanded to 48 million rows according the 24/15 ratio of the original data, and you want that to be fast?
I'm thinking that there needs to be some aggregation/analysis somewhere along the line to seriously reduce the output even if the output is going to a table.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 25, 2024 at 7:52 pm
One thought, which Jeff implied, on performance improvement would be to pre-calculate the data. What I mean is rather than calculating the data on demand, do a nightly (or hourly) dump of the data to a table and then have the report pull the data from the new table.
BUT, if like what Jeff said, you are going to have 48 million rows, Excel is going to barf. Excel can't handle that much data. Plus, if Excel cannot handle 48 million rows, how do you expect your end user to consume 48 million rows of data?
Nobody wants a report with 48 million rows in it. It is too much data. I would go back to the report requester and see if they can define the problem they are trying to solve and then you can work on giving them a report that is useful to them rather than a huge dump.
I am also a bit confused about your data result as the date per order seems to be the same on every row, yet the date per order is different in your history table. Is this a typo on your part or are you expecting the date to be the minimum date for that order?
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
March 26, 2024 at 3:21 pm
This query produces the exact same results as your query with far fewer reads.
WITH Order_Statuses AS
(
SELECT h.order_number
, h.order_date
, q.[label]
, q.orderstatus_src
, q.orderstatus_tgt
, h.order_status
, ROW_NUMBER() OVER(PARTITION BY h.order_number, q.[label], h.order_status ORDER BY h.order_date) AS rn
FROM #oderstatus_quality AS q
INNER JOIN #ordersStatusHistory AS h
ON h.order_status IN (q.orderstatus_src, q.orderstatus_tgt)
)
SELECT os.order_number
, MIN(os.order_date) AS order_date
, os.[label]
--, os.rn
, SUM(CASE WHEN os.orderstatus_tgt = order_status THEN 1 ELSE 0 END) AS Comparison_Status
, 1 AS All_Status
FROM Order_Statuses AS os
GROUP BY os.order_number, os.rn, os.[label]
ORDER BY os.order_number, os.rn;
Here are the read stats.
/* Original query */
Table 'Worktable'. Scan count 2, logical reads 86, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#oderstatus_quality_00000003A926'. Scan count 6, logical reads 128, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#ordersStatusHistory_00000003A927'. Scan count 4, logical reads 8, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
/* Drew's query */
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#ordersStatusHistory_00000003A927'. Scan count 1, logical reads 11, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table '#oderstatus_quality_00000003A926'. Scan count 1, logical reads 2, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply