Help comparing a data set within a table.

  • The first data set contains a schedule with ten dates and I would like to pull the ScheduleID for data that matches these dates exactly. The First data set is know and the second data set is stored in a table. I have thought of using INTERSECT however this returns true when the dates match and there are additional dates attached to the ScheduleID. I am looking to only return if all dates for a ScheduleID match exactly. I am looking for a scalable solution, I do not want to set a president with my developers by running a cursor!

    With this example the result set would be:

    ScheduleID

    114

    201

    222

    ---------------------------------------

    ScheduleIDEndDate

    103 2009-01-03 00:00:00.000

    103 2009-01-17 00:00:00.000

    103 2009-01-31 00:00:00.000

    103 2009-02-14 00:00:00.000

    103 2009-02-28 00:00:00.000

    103 2009-03-14 00:00:00.000

    103 2009-03-28 00:00:00.000

    103 2009-04-11 00:00:00.000

    103 2009-04-25 00:00:00.000

    103 2009-05-09 00:00:00.000

    --------------------------------------

    ScheduleTable

    ScheduleIDEndDate

    114 2009-01-03 00:00:00.000

    114 2009-01-17 00:00:00.000

    114 2009-01-31 00:00:00.000

    114 2009-02-14 00:00:00.000

    114 2009-02-28 00:00:00.000

    114 2009-03-14 00:00:00.000

    114 2009-03-28 00:00:00.000

    114 2009-04-11 00:00:00.000

    114 2009-04-25 00:00:00.000

    114 2009-05-09 00:00:00.000

    201 2009-01-03 00:00:00.000

    201 2009-01-17 00:00:00.000

    201 2009-01-31 00:00:00.000

    201 2009-02-14 00:00:00.000

    201 2009-02-28 00:00:00.000

    201 2009-03-14 00:00:00.000

    201 2009-03-28 00:00:00.000

    201 2009-04-11 00:00:00.000

    201 2009-04-25 00:00:00.000

    201 2009-05-09 00:00:00.000

    222 2009-01-03 00:00:00.000

    222 2009-01-17 00:00:00.000

    222 2009-01-31 00:00:00.000

    222 2009-02-14 00:00:00.000

    222 2009-02-28 00:00:00.000

    222 2009-03-14 00:00:00.000

    222 2009-03-28 00:00:00.000

    222 2009-04-11 00:00:00.000

    222 2009-04-25 00:00:00.000

    222 2009-05-09 00:00:00.000

  • You have some nice data there but it's useless to the people that are trying to help you because it's not "readily consumable". You'll get better help quicker if you make it easy for people. Please see the article at the first link in my signature line below for what I'm talking about.

    Here's another way to make readily consumable data using your data. I've included some "negative" data so we can see that the code is capable of excluding non-matching schedules.

    --=====================================================================================================================

    -- Create the test data. Nothing in this section is a part of the solution.

    --=====================================================================================================================

    --===== Conditionally drop the temp tables to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#FindThisSchedule','U') IS NOT NULL DROP TABLE #FindThisSchedule;

    IF OBJECT_ID('tempdb..#Schedule' ,'U') IS NOT NULL DROP TABLE #Schedule;

    --===== Build the table we need to match

    SELECT d.ScheduleID, EndDate = CAST(d.EndDate AS DATETIME)

    INTO #FindThisSchedule

    FROM (

    SELECT 103,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 103,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 103,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 103,'2009-02-14 00:00:00.000' UNION ALL

    SELECT 103,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 103,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 103,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 103,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 103,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 103,'2009-05-09 00:00:00.000'

    ) d (ScheduleID,EndDate)

    ;

    --===== Build the schedule table where we need to find the matches.

    SELECT d.ScheduleID, EndDate = CAST(d.EndDate AS DATETIME)

    INTO #Schedule

    FROM (

    SELECT 114,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 114,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 114,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 114,'2009-02-14 00:00:00.000' UNION ALL

    SELECT 114,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 114,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 114,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 114,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 114,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 114,'2009-05-09 00:00:00.000' UNION ALL

    SELECT 201,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 201,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 201,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 201,'2009-02-14 00:00:00.000' UNION ALL

    SELECT 201,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 201,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 201,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 201,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 201,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 201,'2009-05-09 00:00:00.000' UNION ALL

    SELECT 222,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 222,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 222,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 222,'2009-02-14 00:00:00.000' UNION ALL

    SELECT 222,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 222,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 222,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 222,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 222,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 222,'2009-05-09 00:00:00.000' UNION ALL

    SELECT 333,'2009-01-03 00:00:01.000' UNION ALL --1 Second difference

    SELECT 333,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 333,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 333,'2009-02-14 00:00:00.000' UNION ALL

    SELECT 333,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 333,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 333,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 333,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 333,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 333,'2009-05-09 00:00:00.000' UNION ALL

    SELECT 444,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 444,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 444,'2009-01-31 00:00:00.000' UNION ALL

    --SELECT 444,'2009-02-14 00:00:00.000' UNION ALL --Missing one row

    SELECT 444,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 444,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 444,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 444,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 444,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 444,'2009-05-09 00:00:00.000' UNION ALL

    SELECT 555,'2009-01-03 00:00:00.000' UNION ALL

    SELECT 555,'2009-01-17 00:00:00.000' UNION ALL

    SELECT 555,'2009-01-31 00:00:00.000' UNION ALL

    SELECT 555,'2009-01-31 00:00:00.000' UNION ALL --Duplicate row

    SELECT 555,'2009-02-28 00:00:00.000' UNION ALL

    SELECT 555,'2009-03-14 00:00:00.000' UNION ALL

    SELECT 555,'2009-03-28 00:00:00.000' UNION ALL

    SELECT 555,'2009-04-11 00:00:00.000' UNION ALL

    SELECT 555,'2009-04-25 00:00:00.000' UNION ALL

    SELECT 555,'2009-05-09 00:00:00.000'

    ) d (ScheduleID,EndDate)

    ;

    With that in mind, this is actually a fairly easy problem to solve. The details are in the comments in the code.

    --=====================================================================================================================

    -- Solve the problem

    --=====================================================================================================================

    WITH

    cteEnumerateFind AS

    ( --=== This simply adds sequential numbers to the dates in what we want to find

    SELECT DateNum = ROW_NUMBER() OVER (ORDER BY EndDate),

    EndDate

    FROM #FindThisSchedule

    ),

    cteEnumerateSchedule AS

    ( --=== This also adds sequential numbers to the dates starting at 1 for each ScheduleID

    SELECT DateNum = ROW_NUMBER() OVER (PARTITION BY ScheduleID ORDER BY EndDate),

    ScheduleID,

    EndDate

    FROM #Schedule

    ) --=== This does a join on perfect matches, counts them and only returns those ScheduleID's that

    -- no more and no less than the perfect matches.

    SELECT s.ScheduleID

    FROM cteEnumerateSchedule s

    JOIN cteEnumerateFind f

    ON s.DateNum = f.DateNum

    AND s.EndDate = f.EndDate

    GROUP BY s.ScheduleID

    HAVING COUNT(*) = (SELECT COUNT(*) FROM #FindThisSchedule)

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply