Point in time query against dimension

  • I have several dimensions tables with the standard housekeeping columns of RowStartDate, RowEndDate and IsRowCurrent.

    I need to provide users with the ability to return the active record at any point in time.

    A quick fix has been implemented (within a view) but is largely inefficient due to the join not being able use indexes and as it looks like it would materialize all dates whenever the view is called:

    FROM dimension

    INNER JOIN dates

    on dates.date between dimension.RowStartDate and dimension.RowEndDate

    or (dates.date >= dimension.RowStartDate and IsRowCurrent = 'Y'

    I am thinking that users quite often ask to know what a record was like at a certain point in time.

    Does anyone have any ideas of how best to satisfy this query?

  • depending on how big your Dimension tables are, you could implement partitioning by date.

  • Could you post the table and index DDL? It would probably help give you a better answer.

    Something that jumps out is that you have an OR in your select clause, and if you table isn't properly indexed for ORs, you could be doing a table or index scan instead of seeking.

  • The DDL for the dimension table is below - note I have replaced the business field names:

    There is a unique clustered index on RowID, a non clustered on RowStartDate and RowEndDate and a further non clustered on IsRowCurrent

    CREATE TABLE Dimension (

    [FIELD1] [numeric](11, 0) NULL,

    [FIELD2] [varchar](6) NULL,

    [FIELD3] [varchar](2) NULL,

    [FIELD4] [numeric](8, 0) NULL,

    [FIELD5] [numeric](8, 0) NULL,

    [FIELD6] [numeric](8, 0) NULL,

    [FIELD7] [numeric](8, 0) NULL,

    [FIELD8] [numeric](8, 0) NULL,

    [FIELD9] [numeric](8, 0) NULL,

    [FIELD10] [numeric](8, 0) NULL,

    [FIELD11] [numeric](8, 0) NULL,

    [FIELD12] [numeric](8, 0) NULL,

    [FIELD13] [numeric](8, 0) NULL,

    [FIELD14] [varchar](10) NULL,

    [FIELD15] [numeric](7, 0) NULL,

    [FIELD16] [numeric](6, 0) NULL,

    [FIELD17] [varchar](10) NULL,

    [FIELD18] [numeric](7, 0) NULL,

    [FIELD19] [numeric](6, 0) NULL,

    [RowStartDate] [smalldatetime] NULL,

    [RowEndDate] [smalldatetime] NULL,

    [IsRowCurrent] [char](1) NULL,

    [RowId] [int] IDENTITY(1,1) NOT NULL

    ) ON [Primary]

    GO

    The Date table is as follows

    CREATE TABLE [Reference].[DateLookup](

    [DateKey] [int] NOT NULL,

    [Date] [smalldatetime] NOT NULL,

    CONSTRAINT [PK_Date] PRIMARY KEY CLUSTERED

    (

    [Date] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MAIN]

    ) ON [MAIN]

    GO

  • Just by looking at this I would try splitting the query by the OR statement into two separate queries and UNION them together.

    ...

    FROM dimension

    INNER JOIN dates

    ON dates.date BETWEEN dimension.RowStartDate AND dimension.RowEndDate

    UNION

    ...

    FROM dimension

    INNER JOIN dates

    ON dates.date >= dimension.RowStartDate AND IsRowCurrent = 'Y'

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply