June 28, 2019 at 3:30 pm
My real example doesn't use orders or employees but I've adapted it to use this common nomenclature for ease of understanding.
I basically want to find all the Employees who processed a given number of orders within a given period. I want to see those orders, not just the Employee Reference. In my example below, I've used 7 orders in a 365 day period. Now, this isn't orders in the last 365 days, that would be easy. It's any employee who processed 7 or more orders in any consecutive 365 day period. So we need to measure time between each order per employee.
I've got some working code below, it does work, but it feels clunky and if I change the number of orders, I need to add more columns using the Window functions to look forward or backwards from each order.
I'm looking for a more elegant solution and one where I can change the order count without rewriting the query. Ideally without DSQL.
Each row in my Orders table is a unique order. EmployeeRef is a foreign key. Employees can make many orders. Order ID is unique but cannot be used to infer if orders were placed after or before each other, OrderDate must be used.
Table Create Query
CREATE TABLE [dbo].[Orders](
[OrderID] [int] NOT NULL,
[OrderLabel] [nvarchar](15) NOT NULL,
[OrderDate] [datetime] NULL,
[EmployeeRef] [nvarchar](15) NULL
)
GO
Current Query:
DECLARE @NumOrders INT = 7
DECLARE @ConsecutivePeriodDays INT = 365
;WITH Orders AS (SELECT ROW_NUMBER() OVER (partition by o.EmployeeRef ORDER BY o.OrderDate) AS RowNum, o.OrderID,o.OrderLabel,o.EmployeeRef, o.OrderDate
FROM dbo.Orders o with(nolock)
WHERE o.OrderDate IS NOT NULL
AND o.EmployeeRef IS NOT NULL
AND o.EmployeeRef <> ''
)
,Orders2 (RowNum,OrderID,OrderLabel,EmployeeRef,OrderDate)
AS
(
select Orders.RowNum,Orders.OrderID,Orders.OrderLabel,Orders.EmployeeRef,Orders.OrderDate
from Orders
JOIN
(SELECT Orders.EmployeeRef FROM Orders GROUP BY EmployeeRef HAVING COUNT(Orders.OrderID) >=@NumOrders) SevenOrders on SevenOrders.EmployeeRef = Orders.EmployeeRef
)
SELECT
Orders3.EmployeeRef,
Orders3.OrderID,
Orders3.OrderLabel,
Orders3.OrderDate
FROM
(
SELECT
t1.rownum
,t1.OrderID
,t1.EmployeeRef
,t1.OrderLabel
,t1.OrderDate
,ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0) AS daysSinceLastOrder
,last6 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 6 PRECEDING and CURRENT ROW)
,last5next1 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 5 PRECEDING and 1 FOLLOWING)
,last4next2 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 4 PRECEDING and 2 FOLLOWING)
,last3next3 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 3 PRECEDING and 3 FOLLOWING)
,last2next4 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 2 PRECEDING and 4 FOLLOWING)
,last1next5 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between 1 PRECEDING and 5 FOLLOWING)
,next6 = SUM(ISNULL(DATEDIFF(dd, t2.OrderDate, t1.OrderDate), 0)) OVER (PARTITION BY t1.EmployeeRef ORDER BY t1.OrderDate ROWS between CURRENT ROW and 6 FOLLOWING)
FROM Orders2 t1
LEFT JOIN Orders2 t2 ON t1.rownum = t2.rownum + 1 and t1.EmployeeRef = t2.EmployeeRef
)Orders3
WHERE Orders3.last6 <= @ConsecutivePeriodDays
AND Orders3.last5next1 <= @ConsecutivePeriodDays
AND Orders3.last4next2 <= @ConsecutivePeriodDays
AND Orders3.last3next3 < @ConsecutivePeriodDays
AND Orders3.last2next4 < @ConsecutivePeriodDays
AND Orders3.last1next5 < @ConsecutivePeriodDays
AND Orders3.next6 < @ConsecutivePeriodDays
ORDER BY 1,4
June 28, 2019 at 3:45 pm
Could you also post sample data and expected results based on the sample data? If the number of "orders" can change you may also want to post a couple of sets of sample data and expected results based on each sample set. This will give us something with which to work and test against.
June 28, 2019 at 4:00 pm
Here's a script to create some sample data.
Here we have 19 orders made by 3 employees.
My script and any new script should only show Employee 3. Employee 1 only made 2 orders so is under the 7 required.
Employees 2 and 3 both made 7 or more orders, but only 6 of Employee 2's orders were made inside 365 days. There is a two year gap between order 6 and the last one.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Orders](
[OrderID] [int] NOT NULL,
[OrderLabel] [nvarchar](15) NOT NULL,
[OrderDate] [datetime] NULL,
[EmployeeRef] [nvarchar](15) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (1, N'order1', CAST(N'2010-01-01T00:00:00.000' AS DateTime), N'Emp1')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (2, N'order2', CAST(N'2010-01-10T00:00:00.000' AS DateTime), N'Emp1')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (3, N'order3', CAST(N'2010-01-01T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (4, N'order4', CAST(N'2010-01-02T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (5, N'order5', CAST(N'2010-01-03T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (6, N'order6', CAST(N'2010-01-04T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (7, N'order7', CAST(N'2010-01-05T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (8, N'order8', CAST(N'2010-01-06T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (9, N'order9', CAST(N'2012-01-07T00:00:00.000' AS DateTime), N'Emp2')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (10, N'order10', CAST(N'2011-01-01T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (11, N'order11', CAST(N'2011-01-02T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (12, N'order12', CAST(N'2011-01-02T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (13, N'order13', CAST(N'2011-01-02T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (14, N'order14', CAST(N'2011-02-03T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (15, N'order15', CAST(N'2011-02-04T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (16, N'order16', CAST(N'2011-02-05T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (17, N'order17', CAST(N'2011-02-06T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (18, N'order18', CAST(N'2011-06-01T00:00:00.000' AS DateTime), N'Emp3')
GO
INSERT [dbo].[Orders] ([OrderID], [OrderLabel], [OrderDate], [EmployeeRef]) VALUES (19, N'order19', CAST(N'2011-07-01T00:00:00.000' AS DateTime), N'Emp3')
GO
June 28, 2019 at 4:10 pm
You didn't provide sample data. Also, what do you want to do with people who have two or more sets that overlap and is it different if the overlap is in a single or multiple 365 day windows?
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
June 28, 2019 at 4:43 pm
You posted one set of sample data but you failed to provide your expected results based on that data. Also, what if the number of "orders" during the span changes? Do you want to handle this dynamically or do you want to have to write new code each time? Posting sample data AND expected results for multiple possibilities will allow us to look at these as well.
June 28, 2019 at 5:08 pm
Here is a query that matches the results for your query. NOTE: I changed your permanent table to a temporary table. WHY DO PEOPLE INSIST ON PROVIDING SAMPLE DATA IN PERMANENT TABLES?
WITH order_ranges AS
(
SELECT *
,CASE
WHEN DATEADD(DAY, -365, o.OrderDate) < LAG(o.OrderDate, 6) OVER(PARTITION BY o.EmployeeRef ORDER BY o.OrderDate)
THEN 1
END AS in_range
FROM #Orders AS o
)
, order_groups AS
(
SELECT *, MAX(o.in_range) OVER(PARTITION BY o.EmployeeRef ORDER BY o.OrderDate ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING) AS grp
FROM order_ranges AS o
)
SELECT o.EmployeeRef, o.OrderID, o.OrderLabel, o.OrderDate
FROM order_groups o
WHERE o.grp = 1
ORDER BY o.OrderID
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
July 1, 2019 at 8:24 am
Thanks Drew.
I'm still testing this against my full production data-set, but it looks good. In fact, you solution appears far superior to mine as it handles overlapping groups and is simpler.
Much appreciated.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply