February 28, 2019 at 1:12 am
I need to build a solution and for brevity I'm going to use the example of customers buying products.
I want to return, for the current full year (in this case 2018) customer records in the year (2018) that bought a product in the previous 12 months of that purchase.
This means I'm looking at a rolling 12 month (potentially 24 months of data 2017-01-01- 2018-12-31).
example:
Customer A buys a product at 2018-08-10. If customer has bought a product before this in the previous 12 months from 2017-08-10- 2018-08-10, Return the previous purchase record and so on. If customer A buys a product on 2018-01-01 (first day of current full year), then I'm looking as far back as 2017-01-01.
Customer | purchasedate | Lastpurchasedate | DaysDifference | repeat count | RecStatus |
c1 | 20/12/2018 | 04/12/2018 | 16 | 8 | |
c1 | 04/12/2018 | 25/07/2018 | 132 | 7 | |
c1 | 25/07/2018 | 01/03/2018 | 146 | 6 | |
c1 | 01/03/2018 | 05/02/2018 | 24 | 5 | |
c1 | 05/02/2018 | 25/01/2018 | 11 | 4 | |
c1 | 25/01/2018 | 05/01/2018 | 20 | 3 | |
c1 | 05/01/2018 | 01/05/2017 | 249 | 2 | |
c1 | 01/05/2017 | 01/01/2017 | 120 | 1 | |
c1 | 01/01/2017 | null | 0 | delete | |
c2 | 20/12/2018 | 04/12/2018 | 16 | 6 | |
c2 | 04/12/2018 | 25/07/2018 | 132 | 5 | |
c2 | 01/08/2018 | 05/02/2018 | 177 | 4 | |
c2 | 25/07/2018 | 01/03/2018 | 146 | 3 | |
c2 | 05/07/2018 | 25/01/2018 | 161 | 2 | |
c2 | 05/05/2018 | 01/05/2017 | 369 | 1 | |
c2 | 01/02/2017 | 10/01/2017 | 22 | 0 | delete |
c2 | 10/01/2017 | null | 0 | delete |
You can see that for Customer C1, the first date of purchase in the current year is 2018-05-01. there was a record made in the previous 12 months from this date which is 2018-05-01. This means 2018-05-01 is included and is the first purchase. As long as the subsequent purchases are in the current year and within 12 months of each other the repeat count increments.
Customer C2s first record in the current month is 2018-05-05. However the last purchase record is 2017-02-01 which is more than 12 months so 2018-05-05 is the first record since it's in the current year.
I hope this all makes sense. Ideally I'd like to have a similar result set to the table above. I've tried the following as an example but this is not working as i can't delete the records the records mentioned above.
Any help would be very appreciated.
FROM (
SELECT Row_number() OVER( partition BY customer ORDER BY purchasedate) AS rn, * FROM data
WHERE purchasedate between '2017-01-01' and '2019-01-01'
) a
LEFT JOIN
(
SELECT Row_number() OVER( partition BY customer ORDER BY purchasedate) AS rn, * FROM data
WHERE purchasedate between '2018-01-01' and '2019-01-01'
) b
ON b.rn + 1 = a.rn
AND a.customer = b.customer
February 28, 2019 at 3:30 pm
I'm not sure I understand. Are you looking for only 1 or 2 records per customer? Meaning a purchase within the last year (or calendar year) and possibly a purchase in the 12 months before that. Are you returning all purchases in 2018?
I would tackle this initially in a series of CTEs to organize the logic and before you're getting the results you want. I'd get a specific set of test data and results you want, especially at the boundary cases.
I think that you would get a list of customers that have had a purchase in the last year (or 2018). It's not quite clear if you mean calendar for the last purchase. From there, I'd use that list of customers and orders and search for an order that is within a dateadd(year, -1) of that date. That lets you get two lists of things. You could order those, but depending on how deeply you filter (all orders, 1, 2, etc), it may be more complex.
Posting some DDL and test data would be good. Show specific results with your explanation.
February 28, 2019 at 4:15 pm
First, this is not the best way to post sample data. I've created a quick and dirty method of doing it, because I'm a bit short on time.
SELECT *
INTO #data
FROM (
VALUES
('c1', CAST('20181220' AS DATE), CAST('20181204' AS DATE), 16, 8, NULL),
('c1', '20181204', '20180725', 132, 7, NULL),
('c1', '20180725', '20180301', 146, 6, NULL),
('c1', '20180301', '20180205', 24, 5, NULL),
('c1', '20180205', '20180125', 11, 4, NULL),
('c1', '20180125', '20180105', 20, 3, NULL),
('c1', '20180105', '20170501', 249, 2, NULL),
('c1', '20170501', '20170101', 120, 1, NULL),
('c1', '20170101', null, NULL, 0, 'delete'),
('c2', '20181220', '20181204', 16, 6, NULL),
('c2', '20181204', '20180725', 132, 5, NULL),
('c2', '20180801', '20180205', 177, 4, NULL),
('c2', '20180725', '20180301', 146, 3, NULL),
('c2', '20180705', '20180125', 161, 2, NULL),
('c2', '20180505', '20170501', 369, 1, NULL),
('c2', '20170201', '20170110', 22, 0, 'delete'),
('c2', '20170110', null, NULL, 0, 'delete')
)v (Customer, purchasedate, Lastpurchasedate, DaysDifference, repeat_count, RecStatus)
I also wasn't sure exactly what you were looking for with your ranges. I interpreted it as customers who hadn't visited in 12 months as having "lapsed" and being considered "new" customers.
WITH new_returning AS
(
SELECT *,
CASE WHEN LEAD(purchasedate, 1, '9999-12-30') OVER(PARTITION BY Customer ORDER BY purchasedate) > DATEADD(YEAR, 1, purchasedate) THEN 1 ELSE 0 END AS new_returning_customer
FROM #data
)
, customer_groups AS
(
SELECT *, SUM(new_returning_customer) OVER(PARTITION BY Customer ORDER BY purchasedate DESC) AS grp
FROM new_returning
)
SELECT *
FROM customer_groups
WHERE grp = 1
I'm also not quite getting the results you are expecting, but it is close. Looking at the data, it matches what I expect.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply