April 7, 2017 at 9:50 am
hi
i have a query i want to improve speed of, the problem part of the query is
left join
(select a.SalesOrder,TrnTime,a.SalesOrderLine,MStockCode,MWarehouse,
row_number() over(partition by a.SalesOrder Order By TrnTime desc)as rn from SysproCompanyW.dbo.SorAdditions a
join SysproCompanyW.dbo.SorDetail d on d.SalesOrder = a.SalesOrder and d.SalesOrderLine = a.SalesOrderLine)as d
on d.SalesOrder = s.SalesOrder and rn = 1
basically i need to get the last entry per sales order in the table SorAdditions, the first way i learn to do this was by using above method of partitioning the data and assigning a rownumber per record, but i am guessing there is a cleaner way to achieve this?
anyone any tips for me?
mal
April 7, 2017 at 10:08 am
Could you provide a full SELECT statement and consumable Sample data please (normal drill). That JOIN on it's own doesn't really mean a lot to us.
At a guess, however, this might be the kind of thing you are after though:CREATE TABLE #Customer
(CustomerID int IDENTITY(1,1),
CustomerName varchar(20));
GO
CREATE TABLE #Sale
(SaleID int IDENTITY(1,1),
CustomerID int,
ItemName varchar(20),
SaleDateTime datetime);
GO
INSERT INTO #Customer
VALUES
('Thom'),
('Steve'),
('Jane');
GO
INSERT INTO #Sale
VALUES
(1, 'Banana', '20170401 12:43:00'),
(1, 'Cheese', '20170401 17:19:00'),
(2, 'Bacon', '20170402 07:49:21'),
(2, 'Bacon', '20170405 07:43:02'),
(2, 'Bacon', '20170407 08:21:59'),
(3, 'Milk', '20170403 09:49:18');
GO
WITH CTE AS (
SELECT C.CustomerID, S.SaleID,
C.CustomerName,
S.ItemName, S.SaleDateTime,
ROW_NUMBER() OVER (PARTITION BY C.CustomerID
ORDER BY S.SaleDateTime DESC) AS RN
FROM #Customer C
JOIN #Sale S ON C.CustomerID = S.CustomerID)
SELECT CustomerName, ItemName, SaleDateTime
FROM CTE
WHERE RN = 1;
GO
DROP TABLE #Sale;
DROP TABLE #Customer;
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
April 7, 2017 at 10:18 am
Hi Thom
full query is
select distinct s.Customer
,(cASE when OrderType <> 'E' then OrderType else ProductGroup end),d.MWarehouse
,OrderDate
,left(datename(dw,s.OrderDate),3)
,left (TrnTime, len (TrnTime)-4) , Cut_Off_Day,Cut_Off_Time,
ISNULL(datediff(day,OrderDate,ReqShipDate),0),ReqShipDate,
Forward_Del_Day,
s.SalesOrder
from SysproCompanyW.dbo.SorMaster s
left join
(select a.SalesOrder,TrnTime,a.SalesOrderLine,MStockCode,MWarehouse,
row_number() over(partition by a.SalesOrder Order By TrnTime desc)as rn from SysproCompanyW.dbo.SorAdditions a
join SysproCompanyW.dbo.SorDetail d on d.SalesOrder = a.SalesOrder and d.SalesOrderLine = a.SalesOrderLine)as d
on d.SalesOrder = s.SalesOrder and rn = 1
left join SysproCompanyW.dbo.InvMaster i on i.StockCode = d.MStockCode
join mal.dbo.Delivery_SLAs da on da.Customer = s.Customer and da.Division = (cASE when OrderType <> 'E' then OrderType else ProductGroup end) and da.Warehouse = d.MWarehouse
where
OrderStatus in ('1','F','2') and
CancelledFlag <> 'Y'
and OrderDate > = '2015-01-01'
--and convert(varchar,OrderDate,111) = convert(varchar,getdate(),111)
and not exists
(select KeyField from AdmFormData ad where FormType = 'ORD' AND FieldName = 'SLA001'
and ad.KeyField = s.SalesOrder)
and (MWarehouse in ('AF','PE')
or SentToMessagedb = '1')
April 7, 2017 at 1:04 pm
dopydb - Friday, April 7, 2017 9:50 AMhii have a query i want to improve speed of, the problem part of the query is
left join
(select a.SalesOrder,TrnTime,a.SalesOrderLine,MStockCode,MWarehouse,
row_number() over(partition by a.SalesOrder Order By TrnTime desc)as rn from SysproCompanyW.dbo.SorAdditions a
join SysproCompanyW.dbo.SorDetail d on d.SalesOrder = a.SalesOrder and d.SalesOrderLine = a.SalesOrderLine)as d
on d.SalesOrder = s.SalesOrder and rn = 1basically i need to get the last entry per sales order in the table SorAdditions, the first way i learn to do this was by using above method of partitioning the data and assigning a rownumber per record, but i am guessing there is a cleaner way to achieve this?
anyone any tips for me?
mal
The ROW_NUMBER() approach is usually fairly efficient, so I'm curious how you determined that that was where your problems lay.
If your data is dense enough (lots of TrnTime values for each SalesOrder) and there is a supporting index, the following MAY perform betterOUTER APPLY
(
select TOP(1) a.SalesOrder,TrnTime,a.SalesOrderLine,MStockCode,MWarehouse
from SysproCompanyW.dbo.SorAdditions a
join SysproCompanyW.dbo.SorDetail d on d.SalesOrder = a.SalesOrder and d.SalesOrderLine = a.SalesOrderLine
WHERE d.SalesOrder = s.SalesOrder
ORDER BY TrnTime DESC
)as d
Just from the field names, I suspect that your data won't be dense enough.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
April 7, 2017 at 1:15 pm
Sorry thom obviously didnt read your request very well! Will try your suggestion when back in the office
Drew, i am basing purely off execution plan, i think 80% cost was here. Not sure what classifies as dense, but i could have anything from 1 to 30 records in general in trntime.
April 7, 2017 at 1:27 pm
Generally speaking, you're going to take a performance hit when applying ranking at runtime, especially if you're joining on a ranked sub-query. Optimal indexing helps, but I'd suggest adding a new column (let's call it TopAdditionRank for now) to [SorAdditions] that indicates something like a value of (1) for the most recent row that needs to be joined with the related sales order. Once done, you can do away with the sub-query on [SorAdditions] and just straight join on SalesOrder and TopAdditionRank = 1. This column can perhaps be populated when the sales order is first created, or if it needs to be more dynamic (ex: line detail is added after the sales order is created), then have a job to update the indicator column across all sales orders on a daily or hourly schedule. To reduce the load, you would perhaps only need to update [SorAdditions] for sales orders that havn't been finalized, etc.
Alternatively, if this ranking really needs to still be performed at runtime, or you can't modify the existing schema, then experiment with this alternate approach.
select SalesOrder, max(TrnTime)MaxTrnTime
into #SorAdd
from SysproCompanyW.dbo.SorAdditions
group by SalesOrder;
...
...
join SysproCompanyW.dbo.SorAdditions a
join SysproCompanyW.dbo.SorDetail d
on d.SalesOrder = a.SalesOrder
and d.SalesOrderLine = a.SalesOrderLine
join #SorAdd on #SorAdd.MaxTrnTime = d.TrnTime
...
...
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
April 7, 2017 at 3:58 pm
dopydb - Friday, April 7, 2017 1:15 PMSorry thom obviously didnt read your request very well! Will try your suggestion when back in the officeDrew, i am basing purely off execution plan, i think 80% cost was here. Not sure what classifies as dense, but i could have anything from 1 to 30 records in general in trntime.
The only really way to be sure is to test both approaches and to see which one works better. The idea is that the ROW_NUMBER is likely to do an index scan and that the CROSS APPLY is likely to do a bunch of index seeks. The index scan will be roughly linear, but overall the index seeks will cost approximately n*LOG(n), so a better approximation of the density is COUNT(*)/(COUNT(DISTINCT SalesOrder) * LOG(COUNT(DISTINCT SalesOrder)). The larger this number is, the more likely the CROSS APPLY approach is going to work.
Drew
PS: I have only a passing knowledge of how indexes work behind the scenes. My analysis may be completely off.
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply