Join table to maintain historical relationships

  • Hi All,

    I have a join table that in which I want to maintain historical relationships. Pardon my lack of keys, I am in proof of concept mode here.

    CREATE TABLE #TestJoin

    (

    Entity1 int,

    Entity2 varchar(1),

    EffectiveDate date

    )

    INSERT INTO #TestJoin VALUES( 1,'a','2012-05-30')

    INSERT INTO #TestJoin VALUES( 1,'b','2012-02-15')

    INSERT INTO #TestJoin VALUES( 1,'c','2012-02-03')

    INSERT INTO #TestJoin VALUES( 1,'d','2012-01-01')

    SELECT * FROM #TestJoin

    DROP TABLE #TestJoin

    In my reporting system I would like to retrieve data based on the relationship at a specific time.

    The following would return multiple records.

    DECLARE @ReportDate DATE

    SET @ReportDate = '2012-02-10'

    SELECT * FROM #TestJoin

    WHERE EffectiveDate > '2012-02-10'

    The desired logic would be to retrieve the record equal to @ReportDate or the most recent record older than @ReportDate.

    I thought a view with TOP 1 and an ORDER BY would work but it is not.

    I would like to use a view if possible since I would reference this table/view table frequently and I'd like to minimize the complexity to my report writers.

    I am stumped here.

  • Are you looking for something like this? Please note that you can't put my query in a view and pass it a parameter, but an inline table valued function can function as a dynamic view in that regard.

    CREATE TABLE dbo.TestJoin

    (

    Entity1 int,

    Entity2 varchar(1),

    EffectiveDate date

    );

    INSERT INTO dbo.TestJoin VALUES( 1,'a','2012-05-30');

    INSERT INTO dbo.TestJoin VALUES( 1,'b','2012-02-15');

    INSERT INTO dbo.TestJoin VALUES( 1,'c','2012-02-03');

    INSERT INTO dbo.TestJoin VALUES( 1,'d','2012-01-01');

    SELECT * FROM dbo.TestJoin;

    go

    CREATE FUNCTION dbo.EffectiveRecord(

    @ReportDate DATE

    )

    RETURNS TABLE WITH SCHEMABINDING

    AS RETURN (

    WITH BaseData AS (

    SELECT

    Entity1,

    Entity2,

    EffectiveDate,

    ROW_NUMBER() OVER (PARTITION BY Entity1 ORDER BY EffectiveDate ASC) rn

    FROM

    dbo.TestJoin

    WHERE

    EffectiveDate >= @ReportDate

    )

    SELECT

    Entity1,

    Entity2,

    EffectiveDate

    FROM

    BaseData

    WHERE

    rn = 1);

    go

    --DROP TABLE #TestJoin

    /*

    In my reporting system I would like to retrieve data based on the relationship at a specific time.

    The following would return multiple records.

    DECLARE @ReportDate DATE

    SET @ReportDate = '2012-02-10'

    SELECT * FROM #TestJoin

    WHERE EffectiveDate > '2012-02-10'

    */

    DECLARE @ReportDate DATE;

    SET @ReportDate = '2012-02-10';

    WITH BaseData AS (

    SELECT

    Entity1,

    Entity2,

    EffectiveDate,

    ROW_NUMBER() OVER (PARTITION BY Entity1 ORDER BY EffectiveDate ASC) rn

    FROM

    dbo.TestJoin

    WHERE

    EffectiveDate >= @ReportDate

    )

    SELECT

    Entity1,

    Entity2,

    EffectiveDate

    FROM

    BaseData

    WHERE

    rn = 1;

    GO

    SELECT * FROM dbo.EffectiveRecord('2012-02-10');

    go

    DROP FUNCTION dbo.EffectiveRecord;

    DROP TABLE dbo.TestJoin;

    GO

  • Thanks to you both. Lynn's code looks like a solution to the problem I posted. I am going to use the data structure suggested by Celko, using start and end dates. This seems more robust than the data model I posted and will simplify the queries I need to write.

    Thanks

  • CELKO (7/16/2012)


    To track the history of, say, Foobars we need to see time as a continuum and model it as (begin_date, end_date) pairs that define when a foobar had a particular value. Here is the skeleton.

    CREATE TABLE Foobar_History

    (foo_id CHAR(9) NOT NULL,

    start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    end_date DATETIME, --null means current

    CHECK (start_date <= end_date),

    foo_status INTEGER NOT NULL,

    PRIMARY KEY (foo_id, start_date));

    When the end_date is NULL, that state of being is still current. You use a simple query for the status on any particular date;

    SELECT *

    FROM Foobar

    WHERE @in_cal_date

    BETWEEN start_date

    AND COALESCE (end_date, CURRENT_TIMESTAMP);

    There are more tricks in the DDL to prevent gaps, etc

    CREATE TABLE Events

    (event_id CHAR(10) NOT NULL,

    previous_event_end_date DATE NOT NULL

    CONSTRAINT Chained_Dates

    REFERENCES Events (event_end_date),

    event_start_date DATE NOT NULL,

    event_end_date DATE UNIQUE, -- null means event in progress

    PRIMARY KEY (event_id, event_start_date),

    CONSTRAINT Event_Order_Valid

    CHECK (event_start_date <= event_end_date),

    CONSTRAINT Chained_Dates

    CHECK (DATEADD(DAY, 1, previous_event_end_date) = event_start_date)

    -- CHECK (previous_event_end_date + INTERVAL '01' DAYS) = event_start_date)

    );

    -- disable the Chained_Dates constraint

    ALTER TABLE Events NOCHECK CONSTRAINT Chained_Dates;

    GO

    -- insert a starter row

    INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)

    VALUES ('Foo Fest', '2010-01-01', '2010-01-02', '2010-01-05');

    GO

    -- enable the constraint in the table

    ALTER TABLE Events CHECK CONSTRAINT Chained_Dates;

    GO

    -- this works

    INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)

    VALUES ('Glob Week', '2010-01-05', '2010-01-06', '2010-01-10');

    -- this fails

    INSERT INTO Events(event_id, previous_event_end_date, event_start_date, event_end_date)

    VALUES ('Snoob', '2010-01-09', '2010-01-11', '2010-01-15');

    Deja Vu

  • Is using BETWEEN to be avoided? I don't expect my criteria to ever have a time component.

    If I use the the DATE datatype in my historical table and then convert a DATETIME variable passed from SSRS to a DATE parameter in my procedure can I then reliably use BETWEEN?

    Any performance considerations?

    Thanks if you would like to offer your thoughts.

    DECLARE @EndDate datetime

    SET @EndDate = '2012-05-31 10:00:00.000' --Shouldn't have a time component but you never know

    DECLARE @MyEndDate date

    SET @MyEndDate = @EndDate

    CREATE TABLE #Test

    (

    StartDate DATE,

    EndDate DATE,

    )

    INSERT INTO #Test (StartDate,EndDate) VALUES ('2012-01-31 00:00:00.000','2012-05-31 00:00:00.000')

    SELECT * FROM #Test

    WHERE @MyEndDate BETWEEN StartDate AND EndDate

    SELECT * FROM #Test

    WHERE @EndDate BETWEEN StartDate AND EndDate

    SELECT * FROM #Test

    WHERE @EndDate >= StartDate AND @EndDate < DATEADD(d,1,EndDate)

    DROP TABLE #Test

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply