June 29, 2016 at 9:23 am
Hi
Should my dates be in my fact tables or dimension tables?
I have information about an application base on dates
- application start date
- application submitted date
- application completed date
- application closed date
there is also an amount field
currently the date fields are in the dimension table and the amount field in the fact.
Once populated these date fields will be constant.
would you advise this to be the best way to organise the data?
I appreciate the question might be vague.
Thanks in advance
June 29, 2016 at 12:42 pm
You need one generic calendar/dates dimension which will be related to different columns in your fact table.
June 30, 2016 at 8:00 am
thanks very much for the reply, do you mind explaining a bit please
thansk
June 30, 2016 at 8:35 am
You just need 1 Time dimension. That dimension can have several hierarchies to represent natural year, fiscal year, etc. The key for that dimension is a date.
In your fact tables, you can have different dates which are different attributes acting as foreign keys. All those columns will make reference to the same dimension
June 30, 2016 at 8:55 am
should the dates be on the fact table or the dimension table?
June 30, 2016 at 9:14 am
Maybe this simplified example from AdventureWorks can make it look simpler. The dates are both in the fact and the dimension.
CREATE TABLE [dbo].[DimDate](
[DateKey] [date] NOT NULL,
[DayNumberOfWeek] [tinyint] NOT NULL,
[DayNameOfWeek] [nvarchar](10) NOT NULL,
[DayNumberOfMonth] [tinyint] NOT NULL,
[DayNumberOfYear] [smallint] NOT NULL,
[WeekNumberOfYear] [tinyint] NOT NULL,
[MonthName] [nvarchar](10) NOT NULL,
[MonthNumberOfYear] [tinyint] NOT NULL,
[CalendarQuarter] [tinyint] NOT NULL,
[CalendarYear] [smallint] NOT NULL,
[CalendarSemester] [tinyint] NOT NULL,
[FiscalQuarter] [tinyint] NOT NULL,
[FiscalYear] [smallint] NOT NULL,
[FiscalSemester] [tinyint] NOT NULL,
CONSTRAINT [PK_DimDate_DateKey] PRIMARY KEY CLUSTERED ([DateKey] ASC),
);
CREATE TABLE [dbo].[FactInternetSales](
[ProductKey] [int] NOT NULL,
[OrderDate] [date] NULL,
[DueDate] [date] NULL,
[ShipDate] [date] NULL,
[CustomerKey] [int] NOT NULL,
[SalesTerritoryKey] [int] NOT NULL,
[OrderQuantity] [smallint] NOT NULL,
[UnitPrice] [money] NOT NULL,
[TotalProductCost] [money] NOT NULL,
[SalesAmount] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
);
ALTER TABLE [dbo].[FactInternetSales] WITH CHECK ADD CONSTRAINT [FK_FactInternetSales_DimDate] FOREIGN KEY([OrderDate])
REFERENCES [dbo].[DimDate] ([DateKey]);
ALTER TABLE [dbo].[FactInternetSales] WITH CHECK ADD CONSTRAINT [FK_FactInternetSales_DimDate1] FOREIGN KEY([DueDate])
REFERENCES [dbo].[DimDate] ([DateKey]);
ALTER TABLE [dbo].[FactInternetSales] WITH CHECK ADD CONSTRAINT [FK_FactInternetSales_DimDate2] FOREIGN KEY([ShipDate])
REFERENCES [dbo].[DimDate] ([DateKey]);
June 30, 2016 at 9:23 am
fantastic mate thanks
July 1, 2016 at 8:27 am
I would suggest that your date key be an small integer rather than a date data type. That would mean that there will be a day that is 1, and 2, and so on. It's smaller, and over millions of rows this isn't inconsequential. This will allow you to more easily manage an Unknown and a N/A member. Also, the fact table you've described looks like an accumulating snapshot, date differences can be handled with straight math.
There are multiple opinions on this and things would certainly work the way described, or with a "smart key" such as 20160701 as an integer. Each has advantages and disadvantages but overall I find my solution works well, and have been using it for more than a decade.
July 1, 2016 at 9:03 am
I don't like that idea because that way you need to join tables for simple queries restricted by dates. Even if you won't have that problem once the cube is processed, the development is a bit harder.
That's a personal opinion and wouldn't say either option is bad. They both have pros and cons.
July 1, 2016 at 9:11 am
I don't like that idea because that way you need to join tables for simple queries restricted by dates.
To that I always reply that no user should be looking at the information in that manner. Only a developer would have to do this, and they should have no trouble making the join. It's just a simple join.
July 1, 2016 at 9:39 am
RonKyle (7/1/2016)
I don't like that idea because that way you need to join tables for simple queries restricted by dates.
To that I always reply that no user should be looking at the information in that manner. Only a developer would have to do this, and they should have no trouble making the join. It's just a simple join.
It's a simple join, but it's still additional work for the developer and the server.
I agree that users shouldn't be querying the tables.
July 1, 2016 at 11:13 am
It's a simple join, but it's still additional work for the developer and the server
Doing the date the other way takes constant work for the servers. It's not as if they are equal alternatives but one takes a join. The one that doesn't require the join has costs that in my view outweigh this one very small advantage.
July 1, 2016 at 11:50 am
Luis Cazares (7/1/2016)
RonKyle (7/1/2016)
I don't like that idea because that way you need to join tables for simple queries restricted by dates.
To that I always reply that no user should be looking at the information in that manner. Only a developer would have to do this, and they should have no trouble making the join. It's just a simple join.
It's a simple join, but it's still additional work for the developer and the server.
I agree that users shouldn't be querying the tables.
Day dimensions are a pretty standard concept in datawarehousing, a developer working in a datawarehouse should be comfortable with the concept. They are also a very powerful tool for users who want to easily dissect data by more than just a date and at that point end up saving either the server a lot of work doing a lot of on the fly calculations or the developers a lot of work explaining to users why they have to manually enter every date range.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply