September 8, 2018 at 1:03 pm
Hello:
I have 2 tables with data as below:
create table #product(product_id int, product_name varchar(50), regular_price money, start_dt datetime)
create table #product_promotions(product_id int, promo_price money, start_dt datetime, end_dt datetime)
insert into #product (product_id, product_name, regular_price, start_dt)
VALUES ( 1, 'Calculator', $50, '1/1/2010'),
( 2, 'Cabinet', $100, '1/1/2011' ),
( 3, 'Chair', $200, '1/1/2012');
insert into #product_promotions(product_id, promo_price, start_dt, end_dt)
values
(1, $40, '5/1/2010', '5/31/2010'),
(1, $30, '10/1/2011', '10/31/2011'),
(1, $20, '1/1/2018', '1/31/2018')
insert into #product_promotions(product_id, promo_price, start_dt, end_dt)
values
(2, $90, '5/1/2010', '5/31/2010'),
(2, $80, '10/1/2011', '10/31/2011'),
(2, $70, '1/1/2018', '1/31/2018')
select * from #product
select * from #product_promotions
As you see there are products with regular price and every now and then there are promo prices for the products. Now, I need to get one table with all the prices for the product with the date ranges as shown in the result below.
What's the best way to get the above data using, CTE or LEAD (if that's better). I'm using SQL Server 2016.
September 10, 2018 at 7:40 am
The method needed is to first generate a list of dates and prices, and then group those values by product and the fact that the price changed between a pair of dates. The latter can be represented by subtracting a ROW_NUMBER() from another wider ROW_NUMBER() to develop a group value. Not sure if I have it set up perfectly, but it does appear to work. Would highly recommend this be thoroughly tested.CREATE TABLE #product (
product_id int,
product_name varchar(50),
regular_price money,
start_dt datetime
);
CREATE TABLE #product_promotions (
product_id int,
promo_price money,
start_dt datetime,
end_dt datetime
);
INSERT INTO #product (product_id, product_name, regular_price, start_dt)
VALUES ( 1, 'Calculator', $50, '1/1/2010'),
( 2, 'Cabinet', $100, '1/1/2011' ),
( 3, 'Chair', $200, '1/1/2012');
INSERT INTO #product_promotions (product_id, promo_price, start_dt, end_dt)
VALUES (1, $40, '5/1/2010', '5/31/2010'),
(1, $30, '10/1/2011', '10/31/2011'),
(1, $20, '1/1/2018', '1/31/2018'),
(2, $90, '5/1/2010', '5/31/2010'),
(2, $80, '10/1/2011', '10/31/2011'),
(2, $70, '1/1/2018', '1/31/2018');
DECLARE @MIN_DATE AS date = (
SELECT MIN(S.start_dt)
FROM (
SELECT start_dt
FROM #product
UNION ALL
SELECT start_dt
FROM #product_promotions
) AS S
);
DECLARE @MAX_DATE AS date = (
SELECT MAX(SE.start_dt)
FROM (
SELECT start_dt
FROM #product
UNION ALL
SELECT start_dt
FROM #product_promotions
UNION ALL
SELECT end_dt AS start_dt
FROM #product_promotions
UNION ALL
SELECT GETDATE() AS start_dt
) AS SE
);
WITH E1 AS (
SELECT 1 AS N UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
TALLY AS (
SELECT TOP(DATEDIFF(day, @MIN_DATE, @MAX_DATE) + 1)
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 AS NumDays
FROM E1 AS A
CROSS APPLY E1 AS B
CROSS APPLY E1 AS C
CROSS APPLY E1 AS D
),
ALL_DATES AS (
SELECT DATEADD(day, T.NumDays, @MIN_DATE) AS THE_DATE
FROM TALLY AS T
),
CURRENT_PRICES AS (
SELECT
AD.THE_DATE,
P.product_id,
P.product_name,
CASE
WHEN PP.product_id IS NOT NULL THEN PP.promo_price
ELSE P.regular_price
END AS CurrentPrice,
ROW_NUMBER() OVER(PARTITION BY P.product_id, P.product_name ORDER BY AD.THE_DATE) -
ROW_NUMBER() OVER(
PARTITION BY P.product_id, P.product_name,
CASE
WHEN PP.product_id IS NULL THEN 1
ELSE 0
END
ORDER BY AD.THE_DATE) AS GrpNum
FROM #product AS P
INNER JOIN ALL_DATES AS AD
ON P.start_dt <= AD.THE_DATE
LEFT OUTER JOIN #product_promotions AS PP
ON P.product_id = PP.product_id
AND AD.THE_DATE BETWEEN PP.start_dt AND PP.end_dt
--ORDER BY
-- P.product_id,
-- AD.THE_DATE;
)
SELECT
CP.product_id,
CP.product_name,
CP.CurrentPrice,
MIN(CP.THE_DATE) AS MIN_DATE,
MAX(CP.THE_DATE) AS MAX_DATE
FROM CURRENT_PRICES AS CP
GROUP BY
CP.product_id,
CP.product_name,
CP.CurrentPrice,
CP.GrpNum
ORDER BY
CP.product_id,
CP.product_name,
MIN(CP.THE_DATE);
DROP TABLE #product;
DROP TABLE #product_promotions;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
September 10, 2018 at 7:55 am
I'd certainly use a calendar table here to help set ranges. A tally table in a CTE will help build this. I'd then have to put in CASEs that would get the regular price except when there are intervals with matching values.
This is kind of an islands and gaps problem. We have an article on the site for that: http://www.sqlservercentral.com/search/?q=islands+and+gaps&t=a&sort=relevance
September 10, 2018 at 8:14 am
sgmunson - Monday, September 10, 2018 7:40 AMThe method needed is to first generate a list of dates and prices, and then group those values by product and the fact that the price changed between a pair of dates. The latter can be represented by subtracting a ROW_NUMBER() from another wider ROW_NUMBER() to develop a group value. Not sure if I have it set up perfectly, but it does appear to work. Would highly recommend this be thoroughly tested.CREATE TABLE #product (
product_id int,
product_name varchar(50),
regular_price money,
start_dt datetime
);
CREATE TABLE #product_promotions (
product_id int,
promo_price money,
start_dt datetime,
end_dt datetime
);
INSERT INTO #product (product_id, product_name, regular_price, start_dt)
VALUES ( 1, 'Calculator', $50, '1/1/2010'),
( 2, 'Cabinet', $100, '1/1/2011' ),
( 3, 'Chair', $200, '1/1/2012');INSERT INTO #product_promotions (product_id, promo_price, start_dt, end_dt)
VALUES (1, $40, '5/1/2010', '5/31/2010'),
(1, $30, '10/1/2011', '10/31/2011'),
(1, $20, '1/1/2018', '1/31/2018'),
(2, $90, '5/1/2010', '5/31/2010'),
(2, $80, '10/1/2011', '10/31/2011'),
(2, $70, '1/1/2018', '1/31/2018');DECLARE @MIN_DATE AS date = (
SELECT MIN(S.start_dt)
FROM (
SELECT start_dt
FROM #product
UNION ALL
SELECT start_dt
FROM #product_promotions
) AS S
);DECLARE @MAX_DATE AS date = (
SELECT MAX(SE.start_dt)
FROM (
SELECT start_dt
FROM #product
UNION ALL
SELECT start_dt
FROM #product_promotions
UNION ALL
SELECT end_dt AS start_dt
FROM #product_promotions
UNION ALL
SELECT GETDATE() AS start_dt
) AS SE
);WITH E1 AS (
SELECT 1 AS N UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
TALLY AS (SELECT TOP(DATEDIFF(day, @MIN_DATE, @MAX_DATE) + 1)
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1 AS NumDays
FROM E1 AS A
CROSS APPLY E1 AS B
CROSS APPLY E1 AS C
CROSS APPLY E1 AS D
),
ALL_DATES AS (SELECT DATEADD(day, T.NumDays, @MIN_DATE) AS THE_DATE
FROM TALLY AS T
),
CURRENT_PRICES AS (SELECT
AD.THE_DATE,
P.product_id,
P.product_name,
CASE
WHEN PP.product_id IS NOT NULL THEN PP.promo_price
ELSE P.regular_price
END AS CurrentPrice,
ROW_NUMBER() OVER(PARTITION BY P.product_id, P.product_name ORDER BY AD.THE_DATE) -
ROW_NUMBER() OVER(
PARTITION BY P.product_id, P.product_name,
CASE
WHEN PP.product_id IS NULL THEN 1
ELSE 0
END
ORDER BY AD.THE_DATE) AS GrpNum
FROM #product AS P
INNER JOIN ALL_DATES AS AD
ON P.start_dt <= AD.THE_DATE
LEFT OUTER JOIN #product_promotions AS PP
ON P.product_id = PP.product_id
AND AD.THE_DATE BETWEEN PP.start_dt AND PP.end_dt
--ORDER BY
-- P.product_id,
-- AD.THE_DATE;
)
SELECT
CP.product_id,
CP.product_name,
CP.CurrentPrice,
MIN(CP.THE_DATE) AS MIN_DATE,
MAX(CP.THE_DATE) AS MAX_DATE
FROM CURRENT_PRICES AS CP
GROUP BY
CP.product_id,
CP.product_name,
CP.CurrentPrice,
CP.GrpNum
ORDER BY
CP.product_id,
CP.product_name,
MIN(CP.THE_DATE);DROP TABLE #product;
DROP TABLE #product_promotions;
Steve,
This looks promising.. I also have a calendar table (with all the dates filled in for 30 years). I'm guessing in this case, I will simply use that calendar table instead of the "tally" table?
September 10, 2018 at 9:47 am
ganeshmuthuvelu - Monday, September 10, 2018 8:14 AMSteve,
This looks promising.. I also have a calendar table (with all the dates filled in for 30 years). I'm guessing in this case, I will simply use that calendar table instead of the "tally" table?
Exactly. Most "gap and island" types of problems become solvable when you're able to subtract a narrower ROW_NUMBER() from a wider range ROW_NUMBER(). In this case, the key event is that within a date sequence, the price changes. Thus I needed to generate both of those row numbers and do the subtraction. That's easiest when you have a complete date range, such as a calendar table, or CTE Tally table to generate the list of dates.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
September 10, 2018 at 10:48 am
sgmunson - Monday, September 10, 2018 9:47 AMExactly. Most "gap and island" types of problems become solvable when you're able to subtract a narrower ROW_NUMBER() from a wider range ROW_NUMBER(). In this case, the key event is that within a date sequence, the price changes. Thus I needed to generate both of those row numbers and do the subtraction. That's easiest when you have a complete date range, such as a calendar table, or CTE Tally table to generate the list of dates.
;
WITH product_dates AS
(
SELECT p.product_id, CAST(d.dt AS DATE) AS dt
FROM #product p
LEFT OUTER JOIN #product_promotions pp
ON pp.product_id = p.product_id
AND pp.end_dt > p.start_dt
CROSS APPLY ( VALUES(p.start_dt), (pp.start_dt), (DATEADD(DAY, 1, pp.end_dt))) d(dt)
WHERE d.dt IS NOT NULL
)
, product_intervals AS
(
SELECT pd.product_id, pd.dt AS start_dt, LEAD(pd.dt, 1, '9999-12-30') OVER(PARTITION BY pd.product_id ORDER BY pd.dt) AS end_dt
FROM product_dates pd
)
SELECT p.product_id, p.product_name, cp.current_price, i.start_dt, DATEADD(DAY, -1, NULLIF(i.end_dt, '9999-12-30')) AS end_dt
FROM #product p
INNER JOIN product_intervals AS i
ON p.product_id = i.product_id
CROSS APPLY
(
SELECT MIN(current_price) AS current_price
FROM
(
SELECT p.regular_price AS current_price
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply