February 1, 2012 at 9:46 am
I have several dimensions tables with the standard housekeeping columns of RowStartDate, RowEndDate and IsRowCurrent.
I need to provide users with the ability to return the active record at any point in time.
A quick fix has been implemented (within a view) but is largely inefficient due to the join not being able use indexes and as it looks like it would materialize all dates whenever the view is called:
FROM dimension
INNER JOIN dates
on dates.date between dimension.RowStartDate and dimension.RowEndDate
or (dates.date >= dimension.RowStartDate and IsRowCurrent = 'Y'
I am thinking that users quite often ask to know what a record was like at a certain point in time.
Does anyone have any ideas of how best to satisfy this query?
February 1, 2012 at 11:12 am
depending on how big your Dimension tables are, you could implement partitioning by date.
February 1, 2012 at 11:35 am
Could you post the table and index DDL? It would probably help give you a better answer.
Something that jumps out is that you have an OR in your select clause, and if you table isn't properly indexed for ORs, you could be doing a table or index scan instead of seeking.
February 2, 2012 at 2:22 am
The DDL for the dimension table is below - note I have replaced the business field names:
There is a unique clustered index on RowID, a non clustered on RowStartDate and RowEndDate and a further non clustered on IsRowCurrent
CREATE TABLE Dimension (
[FIELD1] [numeric](11, 0) NULL,
[FIELD2] [varchar](6) NULL,
[FIELD3] [varchar](2) NULL,
[FIELD4] [numeric](8, 0) NULL,
[FIELD5] [numeric](8, 0) NULL,
[FIELD6] [numeric](8, 0) NULL,
[FIELD7] [numeric](8, 0) NULL,
[FIELD8] [numeric](8, 0) NULL,
[FIELD9] [numeric](8, 0) NULL,
[FIELD10] [numeric](8, 0) NULL,
[FIELD11] [numeric](8, 0) NULL,
[FIELD12] [numeric](8, 0) NULL,
[FIELD13] [numeric](8, 0) NULL,
[FIELD14] [varchar](10) NULL,
[FIELD15] [numeric](7, 0) NULL,
[FIELD16] [numeric](6, 0) NULL,
[FIELD17] [varchar](10) NULL,
[FIELD18] [numeric](7, 0) NULL,
[FIELD19] [numeric](6, 0) NULL,
[RowStartDate] [smalldatetime] NULL,
[RowEndDate] [smalldatetime] NULL,
[IsRowCurrent] [char](1) NULL,
[RowId] [int] IDENTITY(1,1) NOT NULL
) ON [Primary]
GO
The Date table is as follows
CREATE TABLE [Reference].[DateLookup](
[DateKey] [int] NOT NULL,
[Date] [smalldatetime] NOT NULL,
CONSTRAINT [PK_Date] PRIMARY KEY CLUSTERED
(
[Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MAIN]
) ON [MAIN]
GO
February 2, 2012 at 11:10 am
Just by looking at this I would try splitting the query by the OR statement into two separate queries and UNION them together.
...
FROM dimension
INNER JOIN dates
ON dates.date BETWEEN dimension.RowStartDate AND dimension.RowEndDate
UNION
...
FROM dimension
INNER JOIN dates
ON dates.date >= dimension.RowStartDate AND IsRowCurrent = 'Y'
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply