October 22, 2018 at 8:43 am
I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Editing to add data:
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
--===== Create the test table with
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME
)
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON
--===== Insert the test data into the test table
INSERT INTO #mytable
(ID, custID, DateValue)
SELECT '1','1','09/01/2018 12:00 AM' UNION ALL
SELECT '2','1','09/02/2018 12:00 AM' UNION ALL
SELECT '3','1','09/05/2018 12:00 AM' UNION ALL
SELECT '4','1','10/01/2018 12:00 AM' UNION ALL
SELECT '5','2','09/08/2018 12:00 AM' UNION ALL
SELECT '6','2','09/09/2018 12:00 AM' UNION ALL
SELECT '7','3','01/01/2018 12:00 AM' UNION ALL
SELECT '8','3','01/04/2018 12:00 AM' UNION ALL
SELECT '9','3','01/08/2018 12:00 AM' UNION ALL
SELECT '10','4','05/01/2018 12:00 AM' UNION ALL
SELECT '11','4','05/08/2018 12:00 AM' UNION ALL
SELECT '12','4','06/01/2018 12:00 AM' UNION ALL
SELECT '13','4','06/15/2018 12:00 AM' UNION ALL
SELECT '14','5','10/01/2018 12:00 AM' UNION ALL
SELECT '15','6','07/01/2018 12:00 AM' UNION ALL
SELECT '16','6','07/02/2018 12:00 AM' UNION ALL
SELECT '17','6','07/04/2018 12:00 AM' UNION ALL
SELECT '18','6','07/05/2018 12:00 AM' UNION ALL
SELECT '19','7','10/01/2018 12:00 AM' UNION ALL
SELECT '20','7','10/05/2018 12:00 AM' UNION ALL
SELECT '21','8','04/10/2018 12:00 AM' UNION ALL
SELECT '22','8','04/11/2018 12:00 AM' UNION ALL
SELECT '23','8','04/13/2018 12:00 AM' UNION ALL
SELECT '24','9','05/01/2018 12:00 AM' UNION ALL
SELECT '25','9','05/05/2018 12:00 AM' UNION ALL
SELECT '26','9','05/19/2018 12:00 AM' UNION ALL
SELECT '27','10','01/01/2018 12:00 AM' UNION ALL
SELECT '28','10','01/04/2018 12:00 AM' UNION ALL
SELECT '29','10','01/05/2018 12:00 AM' UNION ALL
SELECT '30','11','01/05/2018 12:00 AM' UNION ALL
SELECT '31','12','03/17/2018 12:00 AM' UNION ALL
SELECT '32','12','03/18/2018 12:00 AM' UNION ALL
SELECT '33','12','03/20/2018 12:00 AM' UNION ALL
SELECT '34','13','02/19/2018 12:00 AM' UNION ALL
SELECT '35','13','02/20/2018 12:00 AM' UNION ALL
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
SELECT '40','14','07/15/2018 12:00 AM'
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF
October 22, 2018 at 8:52 am
Sounds like a classic gaps and islands problem. I don't have a tailor-made solution, I'm afraid, but the page I linked to should provide a few pointers.
John
October 22, 2018 at 8:53 am
timothy.m.phillips - Monday, October 22, 2018 8:43 AMI have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/1410/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this – you'll soon have a working solution if you take the time to post the test data.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
October 22, 2018 at 9:22 am
Phil Parkin - Monday, October 22, 2018 8:53 AMtimothy.m.phillips - Monday, October 22, 2018 8:43 AMI have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/1410/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this – you'll soon have a working solution if you take the time to post the test data.
And make sure your data includes edge cases such as
10/1
10/4
10/8
Where each of the dates is within five of the previous date, but the third date is not within five days of the first date.
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 23, 2018 at 2:52 pm
Phil Parkin - Monday, October 22, 2018 8:53 AMtimothy.m.phillips - Monday, October 22, 2018 8:43 AMI have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/1410/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this – you'll soon have a working solution if you take the time to post the test data.
Thanks. I have added data to the original post following the format on the link in your signature.
October 24, 2018 at 9:29 am
timothy.m.phillips - Tuesday, October 23, 2018 2:52 PMPhil Parkin - Monday, October 22, 2018 8:53 AMtimothy.m.phillips - Monday, October 22, 2018 8:43 AMI have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/1410/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this – you'll soon have a working solution if you take the time to post the test data.Thanks. I have added data to the original post following the format on the link in your signature.
Here you go:--===== If the test table already exists, drop it
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
--===== Create the test table with
CREATE TABLE #mytable (
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME
);
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON;
--===== Insert the test data into the test table
INSERT INTO #mytable (ID, custID, DateValue)
SELECT '1','1','09/01/2018 12:00 AM' UNION ALL
SELECT '2','1','09/02/2018 12:00 AM' UNION ALL
SELECT '3','1','09/05/2018 12:00 AM' UNION ALL
SELECT '4','1','10/01/2018 12:00 AM' UNION ALL
SELECT '5','2','09/08/2018 12:00 AM' UNION ALL
SELECT '6','2','09/09/2018 12:00 AM' UNION ALL
SELECT '7','3','01/01/2018 12:00 AM' UNION ALL
SELECT '8','3','01/04/2018 12:00 AM' UNION ALL
SELECT '9','3','01/08/2018 12:00 AM' UNION ALL
SELECT '10','4','05/01/2018 12:00 AM' UNION ALL
SELECT '11','4','05/08/2018 12:00 AM' UNION ALL
SELECT '12','4','06/01/2018 12:00 AM' UNION ALL
SELECT '13','4','06/15/2018 12:00 AM' UNION ALL
SELECT '14','5','10/01/2018 12:00 AM' UNION ALL
SELECT '15','6','07/01/2018 12:00 AM' UNION ALL
SELECT '16','6','07/02/2018 12:00 AM' UNION ALL
SELECT '17','6','07/04/2018 12:00 AM' UNION ALL
SELECT '18','6','07/05/2018 12:00 AM' UNION ALL
SELECT '19','7','10/01/2018 12:00 AM' UNION ALL
SELECT '20','7','10/05/2018 12:00 AM' UNION ALL
SELECT '21','8','04/10/2018 12:00 AM' UNION ALL
SELECT '22','8','04/11/2018 12:00 AM' UNION ALL
SELECT '23','8','04/13/2018 12:00 AM' UNION ALL
SELECT '24','9','05/01/2018 12:00 AM' UNION ALL
SELECT '25','9','05/05/2018 12:00 AM' UNION ALL
SELECT '26','9','05/19/2018 12:00 AM' UNION ALL
SELECT '27','10','01/01/2018 12:00 AM' UNION ALL
SELECT '28','10','01/04/2018 12:00 AM' UNION ALL
SELECT '29','10','01/05/2018 12:00 AM' UNION ALL
SELECT '30','11','01/05/2018 12:00 AM' UNION ALL
SELECT '31','12','03/17/2018 12:00 AM' UNION ALL
SELECT '32','12','03/18/2018 12:00 AM' UNION ALL
SELECT '33','12','03/20/2018 12:00 AM' UNION ALL
SELECT '34','13','02/19/2018 12:00 AM' UNION ALL
SELECT '35','13','02/20/2018 12:00 AM' UNION ALL
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
SELECT '40','14','07/15/2018 12:00 AM';
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF;
WITH ORDERED_DATA AS (
SELECT *,
CASE
WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
THEN 1
ELSE 0
END AS IS_THIRD
FROM #mytable AS MT
)
SELECT
OD.custID,
MIN(OD.DateValue)
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
GROUP BY OD.custID
ORDER BY OD.custID;
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 24, 2018 at 9:37 am
Okay, let's revisit the performance on this:--===== If the test table already exists, drop it
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
--===== Create the test table with
CREATE TABLE #mytable (
ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME,
);
CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable
(
custID ASC,
DateValue ASC
);
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON;
--===== Insert the test data into the test table
INSERT INTO #mytable (ID, custID, DateValue)
SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL
SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL
SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL
SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL
SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL
SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL
SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL
SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL
SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL
SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL
SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL
SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL
SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL
SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL
SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL
SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL
SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL
SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL
SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL
SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL
SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL
SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL
SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL
SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL
SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL
SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL
SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL
SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL
SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL
SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL
SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL
SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL
SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL
SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL
SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL
SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL
SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL
SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL
SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL
SELECT 40, '14', '07/15/2018 12:00 AM';
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF;
WITH ORDERED_DATA AS (
SELECT *,
CASE
WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL
THEN 1
ELSE 0
END AS IS_THIRD
FROM #mytable AS MT
)
SELECT
OD.custID,
OD.DateValue
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
ORDER BY OD.custID;
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 24, 2018 at 5:44 pm
Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
That is the tricky part that really has me stuck.
October 24, 2018 at 5:45 pm
sgmunson - Wednesday, October 24, 2018 9:37 AMOkay, let's revisit the performance on this:--===== If the test table already exists, drop it
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;--===== Create the test table with
CREATE TABLE #mytable (
ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME,
);
CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable
(
custID ASC,
DateValue ASC
);--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON;--===== Insert the test data into the test table
INSERT INTO #mytable (ID, custID, DateValue)
SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL
SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL
SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL
SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL
SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL
SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL
SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL
SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL
SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL
SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL
SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL
SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL
SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL
SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL
SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL
SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL
SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL
SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL
SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL
SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL
SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL
SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL
SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL
SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL
SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL
SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL
SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL
SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL
SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL
SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL
SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL
SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL
SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL
SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL
SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL
SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL
SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL
SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL
SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL
SELECT 40, '14', '07/15/2018 12:00 AM';--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF;WITH ORDERED_DATA AS (
SELECT *,
CASE
WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL
THEN 1
ELSE 0
END AS IS_THIRD
FROM #mytable AS MT
)
SELECT
OD.custID,
OD.DateValue
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
ORDER BY OD.custID;IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
That is the tricky part that really has me stuck.
October 25, 2018 at 12:42 am
WITH ORDERED_DATA AS (
SELECT *,Should do it, I think
October 25, 2018 at 7:00 am
timothy.m.phillips - Wednesday, October 24, 2018 5:45 PMsgmunson - Wednesday, October 24, 2018 9:37 AMOkay, let's revisit the performance on this:--===== If the test table already exists, drop it
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;--===== Create the test table with
CREATE TABLE #mytable (
ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME,
);
CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable
(
custID ASC,
DateValue ASC
);--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON;--===== Insert the test data into the test table
INSERT INTO #mytable (ID, custID, DateValue)
SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL
SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL
SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL
SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL
SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL
SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL
SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL
SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL
SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL
SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL
SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL
SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL
SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL
SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL
SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL
SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL
SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL
SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL
SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL
SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL
SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL
SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL
SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL
SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL
SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL
SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL
SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL
SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL
SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL
SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL
SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL
SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL
SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL
SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL
SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL
SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL
SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL
SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL
SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL
SELECT 40, '14', '07/15/2018 12:00 AM';--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF;WITH ORDERED_DATA AS (
SELECT *,
CASE
WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL
THEN 1
ELSE 0
END AS IS_THIRD
FROM #mytable AS MT
)
SELECT
OD.custID,
OD.DateValue
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
ORDER BY OD.custID;IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALLThat is the tricky part that really has me stuck.
Just realized where the problem was.... Try this instead:--===== If the test table already exists, drop it
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
--===== Create the test table with
CREATE TABLE #mytable (
ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME,
);
CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable
(
custID ASC,
DateValue ASC
);
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON;
--===== Insert the test data into the test table
INSERT INTO #mytable (ID, custID, DateValue)
SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL
SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL
SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL
SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL
SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL
SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL
SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL
SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL
SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL
SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL
SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL
SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL
SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL
SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL
SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL
SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL
SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL
SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL
SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL
SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL
SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL
SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL
SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL
SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL
SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL
SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL
SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL
SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL
SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL
SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL
SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL
SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL
SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL
SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL
SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL
SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL
SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL
SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL
SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL
SELECT 40, '14', '07/15/2018 12:00 AM';
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF;
WITH ORDERED_DATA AS (
SELECT *,
CASE
WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
AND (
LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL
OR
DATEADD(day, 5, LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) < MT.DateValue
) THEN 1
ELSE 0
END AS IS_THIRD
FROM #mytable AS MT
)
SELECT
OD.custID,
OD.DateValue
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
ORDER BY OD.custID;
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL
BEGIN
DROP TABLE #mytable;
END;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
October 26, 2018 at 3:16 pm
>> I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. <<
Why do you think that the non-relational table property could ever be a key in a valid relational model? Why do you think it identifier can ever be a numeric? What math were you going to do on it? Let's try and give you a decent table.
CREATE TABLE Customer_Contacts
(customer_id CHAR(10) NOT NULL
REFERENCES Customers(customer_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
contact_date DATE DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(customer_id, contact_date)
);
see how we use actual columns to construct a valid relational key. See the references clause the ties this occurrence of the customer identifier to the customer’s table. See how the default plugs a date automatically.
Why are you still using the original proprietary Sybase “SELECT ..UNION ALL” syntax? Microsoft is supported the ANSI/ISO standard “INSERT INTO .. VALUES (<tuple list>)” table constructor for quite a few years now. You also don’t seem to know that we have a date data type now. Nor do you know that the only format allowed in the ANSI ISO standards is “yyyyyyyy-mm-dd” instead of the local dialect you used. Basically, your writing SQL code that you should ofof stopped writing over 20 years ago.
Let’s try this approach; build a table of 50 or hundred years of time frames of five days. If your actual business rules allow for holidays versus business days, then you can adjust these frames. I’m not going to discuss any tricks for that this post. You can play with it. You might also want to download the PDF of the Rick Snodgrass book on temporal queries in SQL from the University of Arizona website.
CREATE TABLE Timeframes
(time_frame_start_date DATE NOT NULL PRIMARY KEY,
time_frame_end_date DATE NOT NULL
CHECK (DATEADD(5, DAY, time_frame_start_date)
= time_frame_end_date)
);
you can now write a query that looks at each timeframe, finds contact dates that fall within the start and end of each frame, and counts them by customer_id. Any grouping of (customer_id, time_frame_start_date) that has less than three rows in it can be rejected. Any timeframe that ended before the current date, can be rejected. Within a timeframe, you can use a ROW_NUMBER OVER(PARTITION BY time_frame_start_date) function and look for the row number = 3. I’m not sure what the performance will be, but it should be pretty adjustable
Please post DDL and follow ANSI/ISO standards when asking for help.
October 26, 2018 at 11:22 pm
jcelko212 32090 - Friday, October 26, 2018 3:16 PM>> I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. <<
Why do you think that the non-relational table property could ever be a key in a valid relational model? Why do you think it identifier can ever be a numeric? What math were you going to do on it? Let's try and give you a decent table.
CREATE TABLE Customer_Contacts
(customer_id CHAR(10) NOT NULL
REFERENCES Customers(customer_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
contact_date DATE DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(customer_id, contact_date)
);see how we use actual columns to construct a valid relational key. See the references clause the ties this occurrence of the customer identifier to the customer’s table. See how the default plugs a date automatically.
Why are you still using the original proprietary Sybase “SELECT ..UNION ALL†syntax? Microsoft is supported the ANSI/ISO standard “INSERT INTO .. VALUES (<tuple list>)†table constructor for quite a few years now. You also don’t seem to know that we have a date data type now. Nor do you know that the only format allowed in the ANSI ISO standards is “yyyyyyyy-mm-dd†instead of the local dialect you used. Basically, your writing SQL code that you should ofof stopped writing over 20 years ago.
Let’s try this approach; build a table of 50 or hundred years of time frames of five days. If your actual business rules allow for holidays versus business days, then you can adjust these frames. I’m not going to discuss any tricks for that this post. You can play with it. You might also want to download the PDF of the Rick Snodgrass book on temporal queries in SQL from the University of Arizona website.
CREATE TABLE Timeframes
(time_frame_start_date DATE NOT NULL PRIMARY KEY,
time_frame_end_date DATE NOT NULL
CHECK (DATEADD(5, DAY, time_frame_start_date)
= time_frame_end_date)
);you can now write a query that looks at each timeframe, finds contact dates that fall within the start and end of each frame, and counts them by customer_id. Any grouping of (customer_id, time_frame_start_date) that has less than three rows in it can be rejected. Any timeframe that ended before the current date, can be rejected. Within a timeframe, you can use a ROW_NUMBER OVER(PARTITION BY time_frame_start_date) function and look for the row number = 3. I’m not sure what the performance will be, but it should be pretty adjustable
So you've converted the Customer_ID from 4 bytes to 10 bytes. There's a part of the math you speak of. If you have non-clustered indexes, you'll find more math involved. Also, what will you actually use to uniquely identify the customer within those 10 bytes?
Also, have you tested your 5 day time frame approach for performance? If you were to actually write the code for your suggestion, I'd be happy to setup a test to compare it to the other methods suggested.
And, since you cited it, do you have a link for the PDF that you suggested? Mr. Snodgrass has written a whole bunch of material so it would be nice to know the specific material you speak of. Is it the following? If not, please provide the link.
https://cs.ulb.ac.be/public/_media/teaching/infoh415/teradata_temporal_case_study.pdf
While you're at it, you should try your "What math were you going to do on it?" rhetoric on Mr. Snodgrass and see how far you get because in the article that I provided the link for above, he creates the following table to demonstrate with in the upper right corner of page 3...
CREATE MULTISET TABLE Prop_Owner (
customer_number INTEGER,
property_number INTEGER,
property_VT PERIOD(DATE) NOT NULL AS VALIDTIME,
property_TT PERIOD (TIMESTAMP(6) WITH TIME ZONE)
NOT NULL AS TRANSACTIONTIME)
PRIMARY INDEX(property_number);
Also, if the link I provided IS the article that you cite, notice that Mr. Snodgrass specifically states that the SQL Language typically has inadequate temporal functionality and goes on to state that he's using Teradata, which does have the necessary functionality. The problem there is that SQL is not SQL and this is not a Teradata forum. I suspect the same will hold true if you cite a different link from Mr. Snodgrass.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply