How to remove duplicate records in this situation.

  • I am reporting on readmission data in our hospital. The criteria is to pull data from patients that have been readmitted to the hospital within 30 days. I've got it to a certain point. Below is the code to put the results in a table for you to test. If you look at UnitNumber 289816, there are actually only 3 discharges. Because the subsequent admissions fall within 30 days of the IndexDischarge, a record appears for each. I need to remove these "duplicates". Any thoughts?

    create table READMISSIONS

    as

    (

    UnitNumber varchar(6),

    IndexDischargeDate datetime,

    ReAdmitDate datetime,

    LOS int

    )

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '002611','07/16/2014','07/21/2014','5')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '091982','07/26/2014','07/29/2014','3')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '127126','07/22/2014','08/04/2014','13')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '196068','07/15/2014','07/21/2014','6')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '202710','07/03/2014','07/17/2014','14')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '209973','07/18/2014','08/12/2014','25')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '256275','07/03/2014','07/08/2014','5')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '262115','07/23/2014','08/09/2014','17')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '275059','07/28/2014','08/03/2014','6')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '288850','07/08/2014','07/27/2014','19')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','07/08/2014','2')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','07/21/2014','15')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','08/04/2014','29')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/10/2014','07/21/2014','11')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/10/2014','08/04/2014','25')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/22/2014','08/04/2014','13')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '294846','07/14/2014','07/31/2014','17')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '319465','07/02/2014','07/29/2014','27')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '341376','07/01/2014','07/25/2014','24')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '358424','07/17/2014','08/16/2014','30')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '436847','07/31/2014','08/05/2014','5')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '494799','07/16/2014','08/04/2014','19')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '499668','07/25/2014','07/29/2014','4')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '595303','07/06/2014','07/31/2014','25')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/09/2014','07/09/2014','0')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/09/2014','07/24/2014','15')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/17/2014','07/24/2014','7')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '625488','07/01/2014','07/26/2014','25')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '656932','07/17/2014','08/06/2014','20')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '672836','07/02/2014','07/19/2014','17')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '676663','07/21/2014','08/01/2014','11')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '676663','07/21/2014','08/19/2014','29')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '721819','07/18/2014','08/16/2014','29')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '772114','07/18/2014','07/30/2014','12')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '804803','07/07/2014','07/13/2014','6')

    insert into TEST(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '816389','07/10/2014','07/13/2014','3')

  • Your table and columns didn't match between the DDL and DML, so I made a few adjustments to create the sample data. This is so you can see what I worked on to create the solution.

    create table Readmissions (

    UnitNumber varchar(6),

    AdmitDateTime datetime,

    DischargeDateTime datetime,

    Days integer);

    go

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '002611','07/16/2014','07/21/2014','5')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '091982','07/26/2014','07/29/2014','3')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '127126','07/22/2014','08/04/2014','13')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '196068','07/15/2014','07/21/2014','6')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '202710','07/03/2014','07/17/2014','14')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '209973','07/18/2014','08/12/2014','25')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '256275','07/03/2014','07/08/2014','5')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '262115','07/23/2014','08/09/2014','17')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '275059','07/28/2014','08/03/2014','6')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '288850','07/08/2014','07/27/2014','19')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','07/08/2014','2')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','07/21/2014','15')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/06/2014','08/04/2014','29')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/10/2014','07/21/2014','11')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/10/2014','08/04/2014','25')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '289816','07/22/2014','08/04/2014','13')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '294846','07/14/2014','07/31/2014','17')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '319465','07/02/2014','07/29/2014','27')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '341376','07/01/2014','07/25/2014','24')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '358424','07/17/2014','08/16/2014','30')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '436847','07/31/2014','08/05/2014','5')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '494799','07/16/2014','08/04/2014','19')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '499668','07/25/2014','07/29/2014','4')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '595303','07/06/2014','07/31/2014','25')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/09/2014','07/09/2014','0')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/09/2014','07/24/2014','15')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '608752','07/17/2014','07/24/2014','7')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '625488','07/01/2014','07/26/2014','25')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '656932','07/17/2014','08/06/2014','20')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '672836','07/02/2014','07/19/2014','17')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '676663','07/21/2014','08/01/2014','11')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '676663','07/21/2014','08/19/2014','29')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '721819','07/18/2014','08/16/2014','29')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '772114','07/18/2014','07/30/2014','12')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '804803','07/07/2014','07/13/2014','6')

    insert into Readmissions(UnitNumber, AdmitDateTime, DischargeDateTime, Days) values( '816389','07/10/2014','07/13/2014','3')

    Now, given that structure, you can use the ROW_NUMBER windowing function to query each UnitNumber with a counter for each one.

    WITH cte AS (

    SELECT UnitNumber,

    ROW_NUMBER() OVER(PARTITION BY UnitNumber ORDER BY DischargeDateTime DESC) row_num

    FROM Readmissions

    )

    SELECT *

    FROM cte

    WHERE row_num > 1;

    One really cool thing about a CTE is that you can use it to delete from a single table. I made the assumption that you want to keep only the most recent discharge date and delete the earlier ones. If this assumption is wrong, you can adjust the ORDER BY clause to suit your needs.

    WITH cte AS (

    SELECT UnitNumber,

    ROW_NUMBER() OVER(PARTITION BY UnitNumber ORDER BY DischargeDateTime DESC) row_num

    FROM Readmissions

    )

    DELETE FROM cte

    WHERE row_num > 1;

  • ...within 30 days from what ?

    Pls try restate the problem it terms like "To incude this row in a result set there must [not] exsist rows such that..."

    Particulary, what if for the given patient, facility the sequence of dates is 2014-01-30, 2014-02-15, 2014-03-10, 2014-04-05 ?

    As a guess,

    declare @test-2 table

    (

    ID int identity,

    UnitNumber varchar(6),

    AdmitDate datetime,

    DischargeDate datetime,

    days int

    )

    insert into @test-2 values( '002611','2014/07/16 ','2014/07/21 ','5')

    ,( '091982','2014/07/26 ','2014/07/29 ','3')

    ,( '127126','2014/07/22 ','2014/08/04 ','13')

    ,( '196068','2014/07/15 ','2014/07/21 ','6')

    ,( '202710','2014/07/03 ','2014/07/17 ','14')

    ,( '209973','2014/07/18 ','2014/08/12 ','25')

    ,( '256275','2014/07/03 ','2014/07/08 ','5')

    ,( '262115','2014/07/23 ','2014/08/09 ','17')

    ,( '275059','2014/07/28 ','2014/08/03 ','6')

    ,( '288850','2014/07/08 ','2014/07/27 ','19')

    ,( '289816','2014/07/06 ','2014/07/08 ','2')

    ,( '289816','2014/07/06 ','2014/07/21 ','15')

    ,( '289816','2014/07/06 ','2014/08/04 ','29')

    ,( '289816','2014/07/10 ','2014/07/21 ','11')

    ,( '289816','2014/07/10 ','2014/08/04 ','25')

    ,( '289816','2014/07/22 ','2014/08/04 ','13')

    ,( '294846','2014/07/14 ','2014/07/31 ','17')

    ,( '319465','2014/07/02 ','2014/07/29 ','27')

    ,( '341376','2014/07/01 ','2014/07/25 ','24')

    ,( '358424','2014/07/17 ','2014/08/16 ','30')

    ,( '436847','2014/07/31 ','2014/08/05 ','5')

    ,( '494799','2014/07/16 ','2014/08/04 ','19')

    ,( '499668','2014/07/25 ','2014/07/29 ','4')

    ,( '595303','2014/07/06 ','2014/07/31 ','25')

    ,( '608752','2014/07/09 ','2014/07/09 ','0')

    ,( '608752','2014/07/09 ','2014/07/24 ','15')

    ,( '608752','2014/07/17 ','2014/07/24 ','7')

    ,( '625488','2014/07/01 ','2014/07/26 ','25')

    ,( '656932','2014/07/17 ','2014/08/06 ','20')

    ,( '672836','2014/07/02 ','2014/07/19 ','17')

    ,( '676663','2014/07/21 ','2014/08/01 ','11')

    ,( '676663','2014/07/21 ','2014/08/19 ','29')

    ,( '721819','2014/07/18 ','2014/08/16 ','29')

    ,( '772114','2014/07/18 ','2014/07/30 ','12')

    ,( '804803','2014/07/07 ','2014/07/13 ','6')

    ,( '816389','2014/07/10 ','2014/07/13 ','3')

    select t1.* from @test-2 t1

    where not exists (

    select 1 from @test-2 t2

    where t2.UnitNumber = t1.UnitNumber and t1.ID != t2.ID

    and datediff(dd, t2.DischargeDate, t1.AdmitDate) between 1 and 30)

    order by t1.UnitNumber, t1.AdmitDate

    This will select 3 rows for 289816, all with the same AdmitDate. If only one is needed provide more conditions to narrow the result set.

  • Pardon my ignorance but, what would be the correct syntax to see the next set of results after deleting from the CTE?

  • This query will show you what will be deleted:

    WITH cte AS (

    SELECT UnitNumber,

    ROW_NUMBER() OVER(PARTITION BY UnitNumber ORDER BY DischargeDateTime DESC) row_num

    FROM Readmissions

    )

    SELECT *

    FROM cte

    WHERE row_num > 1;

    This will show you what will remain:

    WITH cte AS (

    SELECT UnitNumber,

    ROW_NUMBER() OVER(PARTITION BY UnitNumber ORDER BY DischargeDateTime DESC) row_num

    FROM Readmissions

    )

    SELECT *

    FROM cte

    WHERE row_num = 1;

  • Sorry for the confusion but, the DischargeDateTime should come before the AdmitDateTime. The patient is discharged from the hospital then, readmitted within 30 days. I screwed up the create table!

    So, to restate, I need to report only one pair of DischargeDateTime and AdmitDateTime results for each UnitNumber.

  • For example, UnitNumber, 289816, should have 3 records. Sorry again for the confusion.

    UnitNumber, DischargeDate, AdmitDate

    289816, 07/6/2014, 07/08/2014

    289816, 07/10/2014, 07/21/2014

    289816, 07/22/2014, 08/04/2014

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply