Identifying the 3rd qualifying record

Question

Identifying the 3rd qualifying record

timothy.m.phillips

SSC Enthusiast

Points: 110
More actions
October 22, 2018 at 8:43 am

#371975

I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Editing to add data:
--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL
DROP TABLE #mytable
--===== Create the test table with
CREATE TABLE #mytable
(
ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table
custID INT,
DateValue DATETIME
)
--===== All Inserts into the IDENTITY column
SET IDENTITY_INSERT #mytable ON
--===== Insert the test data into the test table
INSERT INTO #mytable
(ID, custID, DateValue)
SELECT '1','1','09/01/2018 12:00 AM' UNION ALL
SELECT '2','1','09/02/2018 12:00 AM' UNION ALL
SELECT '3','1','09/05/2018 12:00 AM' UNION ALL
SELECT '4','1','10/01/2018 12:00 AM' UNION ALL
SELECT '5','2','09/08/2018 12:00 AM' UNION ALL
SELECT '6','2','09/09/2018 12:00 AM' UNION ALL
SELECT '7','3','01/01/2018 12:00 AM' UNION ALL
SELECT '8','3','01/04/2018 12:00 AM' UNION ALL
SELECT '9','3','01/08/2018 12:00 AM' UNION ALL
SELECT '10','4','05/01/2018 12:00 AM' UNION ALL
SELECT '11','4','05/08/2018 12:00 AM' UNION ALL
SELECT '12','4','06/01/2018 12:00 AM' UNION ALL
SELECT '13','4','06/15/2018 12:00 AM' UNION ALL
SELECT '14','5','10/01/2018 12:00 AM' UNION ALL
SELECT '15','6','07/01/2018 12:00 AM' UNION ALL
SELECT '16','6','07/02/2018 12:00 AM' UNION ALL
SELECT '17','6','07/04/2018 12:00 AM' UNION ALL
SELECT '18','6','07/05/2018 12:00 AM' UNION ALL
SELECT '19','7','10/01/2018 12:00 AM' UNION ALL
SELECT '20','7','10/05/2018 12:00 AM' UNION ALL
SELECT '21','8','04/10/2018 12:00 AM' UNION ALL
SELECT '22','8','04/11/2018 12:00 AM' UNION ALL
SELECT '23','8','04/13/2018 12:00 AM' UNION ALL
SELECT '24','9','05/01/2018 12:00 AM' UNION ALL
SELECT '25','9','05/05/2018 12:00 AM' UNION ALL
SELECT '26','9','05/19/2018 12:00 AM' UNION ALL
SELECT '27','10','01/01/2018 12:00 AM' UNION ALL
SELECT '28','10','01/04/2018 12:00 AM' UNION ALL
SELECT '29','10','01/05/2018 12:00 AM' UNION ALL
SELECT '30','11','01/05/2018 12:00 AM' UNION ALL
SELECT '31','12','03/17/2018 12:00 AM' UNION ALL
SELECT '32','12','03/18/2018 12:00 AM' UNION ALL
SELECT '33','12','03/20/2018 12:00 AM' UNION ALL
SELECT '34','13','02/19/2018 12:00 AM' UNION ALL
SELECT '35','13','02/20/2018 12:00 AM' UNION ALL
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
SELECT '40','14','07/15/2018 12:00 AM'
--===== Set the identity insert back to normal
SET IDENTITY_INSERT #mytable OFF

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 1

Sounds like a classic gaps and islands problem. I don't have a tailor-made solution, I'm afraid, but the page I linked to should provide a few pointers.

John

Phil Parkin SSC Guru Points: 246990 More actions · Answer 2

timothy.m.phillips - Monday, October 22, 2018 8:43 AM
I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.

Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this â€“ you'll soon have a working solution if you take the time to post the test data.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

drew.allen SSC Guru Points: 76997 More actions · Answer 3

Phil Parkin - Monday, October 22, 2018 8:53 AM
timothy.m.phillips - Monday, October 22, 2018 8:43 AM
I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this â€“ you'll soon have a working solution if you take the time to post the test data.

And make sure your data includes edge cases such as

10/1
10/4
10/8

Where each of the dates is within five of the previous date, but the third date is not within five days of the first date.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

timothy.m.phillips SSC Enthusiast Points: 110 More actions · Answer 4

Phil Parkin - Monday, October 22, 2018 8:53 AM
timothy.m.phillips - Monday, October 22, 2018 8:43 AM
I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this â€“ you'll soon have a working solution if you take the time to post the test data.

Thanks. I have added data to the original post following the format on the link in your signature.

sgmunson SSC Guru Points: 110640 More actions · Answer 5

timothy.m.phillips - Tuesday, October 23, 2018 2:52 PM
Phil Parkin - Monday, October 22, 2018 8:53 AM
timothy.m.phillips - Monday, October 22, 2018 8:43 AM
I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. So if we had the following:
10/11
10/12
10/14
10/14 would be the successful record and I would need to grab that date. However we could have
10/1
10/2
10/10
10/11
10/13
And now the 5th attempt (10/13) would now be the third qualifying attempt. I am a little stumped at the best way to approach this.
Would you be able to create some test data in a consumable (ie, ready to paste into SSMS) format, please? Please see the link in my signature if you are unsure how to do this.
People here enjoy solving puzzles like this â€“ you'll soon have a working solution if you take the time to post the test data.
Thanks. I have added data to the original post following the format on the link in your signature.

Here you go:
--===== If the test table already exists, drop it IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

--===== Create the test table with CREATE TABLE #mytable ( ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Is an IDENTITY column on real table custID INT, DateValue DATETIME );

--===== All Inserts into the IDENTITY column SET IDENTITY_INSERT #mytable ON;

--===== Insert the test data into the test table INSERT INTO #mytable (ID, custID, DateValue) SELECT '1','1','09/01/2018 12:00 AM' UNION ALL SELECT '2','1','09/02/2018 12:00 AM' UNION ALL SELECT '3','1','09/05/2018 12:00 AM' UNION ALL SELECT '4','1','10/01/2018 12:00 AM' UNION ALL SELECT '5','2','09/08/2018 12:00 AM' UNION ALL SELECT '6','2','09/09/2018 12:00 AM' UNION ALL SELECT '7','3','01/01/2018 12:00 AM' UNION ALL SELECT '8','3','01/04/2018 12:00 AM' UNION ALL SELECT '9','3','01/08/2018 12:00 AM' UNION ALL SELECT '10','4','05/01/2018 12:00 AM' UNION ALL SELECT '11','4','05/08/2018 12:00 AM' UNION ALL SELECT '12','4','06/01/2018 12:00 AM' UNION ALL SELECT '13','4','06/15/2018 12:00 AM' UNION ALL SELECT '14','5','10/01/2018 12:00 AM' UNION ALL SELECT '15','6','07/01/2018 12:00 AM' UNION ALL SELECT '16','6','07/02/2018 12:00 AM' UNION ALL SELECT '17','6','07/04/2018 12:00 AM' UNION ALL SELECT '18','6','07/05/2018 12:00 AM' UNION ALL SELECT '19','7','10/01/2018 12:00 AM' UNION ALL SELECT '20','7','10/05/2018 12:00 AM' UNION ALL SELECT '21','8','04/10/2018 12:00 AM' UNION ALL SELECT '22','8','04/11/2018 12:00 AM' UNION ALL SELECT '23','8','04/13/2018 12:00 AM' UNION ALL SELECT '24','9','05/01/2018 12:00 AM' UNION ALL SELECT '25','9','05/05/2018 12:00 AM' UNION ALL SELECT '26','9','05/19/2018 12:00 AM' UNION ALL SELECT '27','10','01/01/2018 12:00 AM' UNION ALL SELECT '28','10','01/04/2018 12:00 AM' UNION ALL SELECT '29','10','01/05/2018 12:00 AM' UNION ALL SELECT '30','11','01/05/2018 12:00 AM' UNION ALL SELECT '31','12','03/17/2018 12:00 AM' UNION ALL SELECT '32','12','03/18/2018 12:00 AM' UNION ALL SELECT '33','12','03/20/2018 12:00 AM' UNION ALL SELECT '34','13','02/19/2018 12:00 AM' UNION ALL SELECT '35','13','02/20/2018 12:00 AM' UNION ALL SELECT '36','14','01/15/2018 12:00 AM' UNION ALL SELECT '37','14','06/01/2018 12:00 AM' UNION ALL SELECT '38','14','06/04/2018 12:00 AM' UNION ALL SELECT '39','14','06/05/2018 12:00 AM' UNION ALL SELECT '40','14','07/15/2018 12:00 AM';

--===== Set the identity insert back to normal SET IDENTITY_INSERT #mytable OFF;

WITH ORDERED_DATA AS (

SELECT *, CASE WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue THEN 1 ELSE 0 END AS IS_THIRD FROM #mytable AS MT ) SELECT OD.custID, MIN(OD.DateValue) FROM ORDERED_DATA AS OD WHERE OD.IS_THIRD = 1 GROUP BY OD.custID ORDER BY OD.custID;

IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

sgmunson SSC Guru Points: 110640 More actions · Answer 6

Okay, let's revisit the performance on this:
--===== If the test table already exists, drop it IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

--===== Create the test table with CREATE TABLE #mytable ( ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table custID INT, DateValue DATETIME, ); CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable ( custID ASC, DateValue ASC );

--===== All Inserts into the IDENTITY column SET IDENTITY_INSERT #mytable ON;

--===== Insert the test data into the test table INSERT INTO #mytable (ID, custID, DateValue) SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL SELECT 40, '14', '07/15/2018 12:00 AM';

--===== Set the identity insert back to normal SET IDENTITY_INSERT #mytable OFF;

WITH ORDERED_DATA AS (

SELECT *, CASE WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL THEN 1 ELSE 0 END AS IS_THIRD FROM #mytable AS MT ) SELECT OD.custID, OD.DateValue FROM ORDERED_DATA AS OD WHERE OD.IS_THIRD = 1 ORDER BY OD.custID;

IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

timothy.m.phillips SSC Enthusiast Points: 110 More actions · Answer 7

Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL

That is the tricky part that really has me stuck.

timothy.m.phillips SSC Enthusiast Points: 110 More actions · Answer 8

sgmunson - Wednesday, October 24, 2018 9:37 AM
Okay, let's revisit the performance on this:
--===== If the test table already exists, drop it IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END; --===== Create the test table with CREATE TABLE #mytable ( ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table custID INT, DateValue DATETIME, ); CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable ( custID ASC, DateValue ASC ); --===== All Inserts into the IDENTITY column SET IDENTITY_INSERT #mytable ON; --===== Insert the test data into the test table INSERT INTO #mytable (ID, custID, DateValue) SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL SELECT 40, '14', '07/15/2018 12:00 AM'; --===== Set the identity insert back to normal SET IDENTITY_INSERT #mytable OFF; WITH ORDERED_DATA AS ( SELECT *, CASE WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL THEN 1 ELSE 0 END AS IS_THIRD FROM #mytable AS MT ) SELECT OD.custID, OD.DateValue FROM ORDERED_DATA AS OD WHERE OD.IS_THIRD = 1 ORDER BY OD.custID;
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL

That is the tricky part that really has me stuck.

andycadley SSCertifiable Points: 5296 More actions · Answer 9

WITH ORDERED_DATA AS (

   SELECT *,
   CASE
   WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
   AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue
   --AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL
   THEN 1
   ELSE 0
   END AS IS_THIRD
    FROM #mytable AS MT
)
SELECT
   OD.custID,
   Min(OD.DateValue) As DateValue
FROM ORDERED_DATA AS OD
WHERE OD.IS_THIRD = 1
GROUP By OD.custID
ORDER BY OD.custID;

Should do it, I think

sgmunson SSC Guru Points: 110640 More actions · Answer 10

timothy.m.phillips - Wednesday, October 24, 2018 5:45 PM
sgmunson - Wednesday, October 24, 2018 9:37 AM
Okay, let's revisit the performance on this:
--===== If the test table already exists, drop it IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END; --===== Create the test table with CREATE TABLE #mytable ( ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table custID INT, DateValue DATETIME, ); CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable ( custID ASC, DateValue ASC ); --===== All Inserts into the IDENTITY column SET IDENTITY_INSERT #mytable ON; --===== Insert the test data into the test table INSERT INTO #mytable (ID, custID, DateValue) SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL SELECT 40, '14', '07/15/2018 12:00 AM'; --===== Set the identity insert back to normal SET IDENTITY_INSERT #mytable OFF; WITH ORDERED_DATA AS ( SELECT *, CASE WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL THEN 1 ELSE 0 END AS IS_THIRD FROM #mytable AS MT ) SELECT OD.custID, OD.DateValue FROM ORDERED_DATA AS OD WHERE OD.IS_THIRD = 1 ORDER BY OD.custID;
IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;
Thank you for your assistance. This is really close. Works perfect when the 3 contacts are 1-3 but does not catch an instance where the 2nd, 3rd & 4th are the qualifying ones such as the last example in the table:
SELECT '36','14','01/15/2018 12:00 AM' UNION ALL
SELECT '37','14','06/01/2018 12:00 AM' UNION ALL
SELECT '38','14','06/04/2018 12:00 AM' UNION ALL
SELECT '39','14','06/05/2018 12:00 AM' UNION ALL
That is the tricky part that really has me stuck.

Just realized where the problem was.... Try this instead:
--===== If the test table already exists, drop it IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

--===== Create the test table with CREATE TABLE #mytable ( ID INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED, --Is an IDENTITY column on real table custID INT, DateValue DATETIME, ); CREATE CLUSTERED INDEX CDX_TEMPDB_mytable_ ON #mytable ( custID ASC, DateValue ASC );

--===== All Inserts into the IDENTITY column SET IDENTITY_INSERT #mytable ON;

--===== Insert the test data into the test table INSERT INTO #mytable (ID, custID, DateValue) SELECT 1, '1', '09/01/2018 12:00 AM' UNION ALL SELECT 2, '1', '09/02/2018 12:00 AM' UNION ALL SELECT 3, '1', '09/05/2018 12:00 AM' UNION ALL SELECT 4, '1', '10/01/2018 12:00 AM' UNION ALL SELECT 5, '2', '09/08/2018 12:00 AM' UNION ALL SELECT 6, '2', '09/09/2018 12:00 AM' UNION ALL SELECT 7, '3', '01/01/2018 12:00 AM' UNION ALL SELECT 8, '3', '01/04/2018 12:00 AM' UNION ALL SELECT 9, '3', '01/08/2018 12:00 AM' UNION ALL SELECT 10, '4', '05/01/2018 12:00 AM' UNION ALL SELECT 11, '4', '05/08/2018 12:00 AM' UNION ALL SELECT 12, '4', '06/01/2018 12:00 AM' UNION ALL SELECT 13, '4', '06/15/2018 12:00 AM' UNION ALL SELECT 14, '5', '10/01/2018 12:00 AM' UNION ALL SELECT 15, '6', '07/01/2018 12:00 AM' UNION ALL SELECT 16, '6', '07/02/2018 12:00 AM' UNION ALL SELECT 17, '6', '07/04/2018 12:00 AM' UNION ALL SELECT 18, '6', '07/05/2018 12:00 AM' UNION ALL SELECT 19, '7', '10/01/2018 12:00 AM' UNION ALL SELECT 20, '7', '10/05/2018 12:00 AM' UNION ALL SELECT 21, '8', '04/10/2018 12:00 AM' UNION ALL SELECT 22, '8', '04/11/2018 12:00 AM' UNION ALL SELECT 23, '8', '04/13/2018 12:00 AM' UNION ALL SELECT 24, '9', '05/01/2018 12:00 AM' UNION ALL SELECT 25, '9', '05/05/2018 12:00 AM' UNION ALL SELECT 26, '9', '05/19/2018 12:00 AM' UNION ALL SELECT 27, '10', '01/01/2018 12:00 AM' UNION ALL SELECT 28, '10', '01/04/2018 12:00 AM' UNION ALL SELECT 29, '10', '01/05/2018 12:00 AM' UNION ALL SELECT 30, '11', '01/05/2018 12:00 AM' UNION ALL SELECT 31, '12', '03/17/2018 12:00 AM' UNION ALL SELECT 32, '12', '03/18/2018 12:00 AM' UNION ALL SELECT 33, '12', '03/20/2018 12:00 AM' UNION ALL SELECT 34, '13', '02/19/2018 12:00 AM' UNION ALL SELECT 35, '13', '02/20/2018 12:00 AM' UNION ALL SELECT 36, '14', '01/15/2018 12:00 AM' UNION ALL SELECT 37, '14', '06/01/2018 12:00 AM' UNION ALL SELECT 38, '14', '06/04/2018 12:00 AM' UNION ALL SELECT 39, '14', '06/05/2018 12:00 AM' UNION ALL SELECT 40, '14', '07/15/2018 12:00 AM';

--===== Set the identity insert back to normal SET IDENTITY_INSERT #mytable OFF;

WITH ORDERED_DATA AS (

SELECT *, CASE WHEN DATEADD(day, 5, LAG(MT.DateValue, 1, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND DATEADD(day, 5, LAG(MT.DateValue, 2, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) >= MT.DateValue AND ( LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue) IS NULL OR DATEADD(day, 5, LAG(MT.DateValue, 3, NULL) OVER(PARTITION BY MT.custID ORDER BY MT.DateValue)) < MT.DateValue ) THEN 1 ELSE 0 END AS IS_THIRD FROM #mytable AS MT ) SELECT OD.custID, OD.DateValue FROM ORDERED_DATA AS OD WHERE OD.IS_THIRD = 1 ORDER BY OD.custID;

IF OBJECT_ID('tempdb..#mytable','U') IS NOT NULL BEGIN DROP TABLE #mytable; END;

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

jcelko212 32090 SSCrazy Eights Points: 9303 More actions · Answer 11

>> I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. <<

Why do you think that the non-relational table property could ever be a key in a valid relational model? Why do you think it identifier can ever be a numeric? What math were you going to do on it? Let's try and give you a decent table.

CREATE TABLE Customer_Contacts (customer_id CHAR(10) NOT NULL REFERENCES Customers(customer_id) ON DELETE CASCADE ON UPDATE CASCADE, contact_date DATE DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(customer_id, contact_date) );

see how we use actual columns to construct a valid relational key. See the references clause the ties this occurrence of the customer identifier to the customer’s table. See how the default plugs a date automatically.

Why are you still using the original proprietary Sybase “SELECT ..UNION ALL” syntax? Microsoft is supported the ANSI/ISO standard “INSERT INTO .. VALUES (<tuple list>)” table constructor for quite a few years now. You also don’t seem to know that we have a date data type now. Nor do you know that the only format allowed in the ANSI ISO standards is “yyyyyyyy-mm-dd” instead of the local dialect you used. Basically, your writing SQL code that you should ofof stopped writing over 20 years ago.

Let’s try this approach; build a table of 50 or hundred years of time frames of five days. If your actual business rules allow for holidays versus business days, then you can adjust these frames. I’m not going to discuss any tricks for that this post. You can play with it. You might also want to download the PDF of the Rick Snodgrass book on temporal queries in SQL from the University of Arizona website.

CREATE TABLE Timeframes (time_frame_start_date DATE NOT NULL PRIMARY KEY, time_frame_end_date DATE NOT NULL CHECK (DATEADD(5, DAY, time_frame_start_date) = time_frame_end_date) );

you can now write a query that looks at each timeframe, finds contact dates that fall within the start and end of each frame, and counts them by customer_id. Any grouping of (customer_id, time_frame_start_date) that has less than three rows in it can be rejected. Any timeframe that ended before the current date, can be rejected. Within a timeframe, you can use a ROW_NUMBER OVER(PARTITION BY time_frame_start_date) function and look for the row number = 3. I’m not sure what the performance will be, but it should be pretty adjustable

Please post DDL and follow ANSI/ISO standards when asking for help.

Jeff Moden SSC Guru Points: 1004425 More actions · Answer 12

jcelko212 32090 - Friday, October 26, 2018 3:16 PM
>> I have a scenario where the client wants to identify the 3rd successful attempt at contacting a customer within 5 days of each other. <<
Why do you think that the non-relational table property could ever be a key in a valid relational model? Why do you think it identifier can ever be a numeric? What math were you going to do on it? Let's try and give you a decent table.
CREATE TABLE Customer_Contacts (customer_id CHAR(10) NOT NULL REFERENCES Customers(customer_id) ON DELETE CASCADE ON UPDATE CASCADE, contact_date DATE DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(customer_id, contact_date) );
see how we use actual columns to construct a valid relational key. See the references clause the ties this occurrence of the customer identifier to the customerâ€™s table. See how the default plugs a date automatically.
Why are you still using the original proprietary Sybase â€œSELECT ..UNION ALLâ€ syntax? Microsoft is supported the ANSI/ISO standard â€œINSERT INTO .. VALUES (<tuple list>)â€ table constructor for quite a few years now. You also donâ€™t seem to know that we have a date data type now. Nor do you know that the only format allowed in the ANSI ISO standards is â€œyyyyyyyy-mm-ddâ€ instead of the local dialect you used. Basically, your writing SQL code that you should ofof stopped writing over 20 years ago.
Letâ€™s try this approach; build a table of 50 or hundred years of time frames of five days. If your actual business rules allow for holidays versus business days, then you can adjust these frames. Iâ€™m not going to discuss any tricks for that this post. You can play with it. You might also want to download the PDF of the Rick Snodgrass book on temporal queries in SQL from the University of Arizona website.
CREATE TABLE Timeframes (time_frame_start_date DATE NOT NULL PRIMARY KEY, time_frame_end_date DATE NOT NULL CHECK (DATEADD(5, DAY, time_frame_start_date) = time_frame_end_date) );
you can now write a query that looks at each timeframe, finds contact dates that fall within the start and end of each frame, and counts them by customer_id. Any grouping of (customer_id, time_frame_start_date) that has less than three rows in it can be rejected. Any timeframe that ended before the current date, can be rejected. Within a timeframe, you can use a ROW_NUMBER OVER(PARTITION BY time_frame_start_date) function and look for the row number = 3. Iâ€™m not sure what the performance will be, but it should be pretty adjustable

So you've converted the Customer_ID from 4 bytes to 10 bytes. There's a part of the math you speak of. If you have non-clustered indexes, you'll find more math involved. Also, what will you actually use to uniquely identify the customer within those 10 bytes?

Also, have you tested your 5 day time frame approach for performance? If you were to actually write the code for your suggestion, I'd be happy to setup a test to compare it to the other methods suggested.

And, since you cited it, do you have a link for the PDF that you suggested? Mr. Snodgrass has written a whole bunch of material so it would be nice to know the specific material you speak of. Is it the following? If not, please provide the link.
https://cs.ulb.ac.be/public/_media/teaching/infoh415/teradata_temporal_case_study.pdf

While you're at it, you should try your "What math were you going to do on it?" rhetoric on Mr. Snodgrass and see how far you get because in the article that I provided the link for above, he creates the following table to demonstrate with in the upper right corner of page 3...

CREATE MULTISET TABLE Prop_Owner ( customer_number INTEGER, property_number INTEGER, property_VT PERIOD(DATE) NOT NULL AS VALIDTIME, property_TT PERIOD (TIMESTAMP(6) WITH TIME ZONE) NOT NULL AS TRANSACTIONTIME) PRIMARY INDEX(property_number);

Also, if the link I provided IS the article that you cite, notice that Mr. Snodgrass specifically states that the SQL Language typically has inadequate temporal functionality and goes on to state that he's using Teradata, which does have the necessary functionality. The problem there is that SQL is not SQL and this is not a Teradata forum. I suspect the same will hold true if you cite a different link from Mr. Snodgrass.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)