October 21, 2014 at 1:34 pm
I have a table (FactRetailSales) at the level of an invoice line item. My original design had the Invoice_Number as a Degenerate Dimension (or Fact Dimension) in FactRetailSales.
--Current fact table
CREATE TABLE [dbo].[FactRetailSales](
[StoreKey] [int] NOT NULL,
[InvoiceDateKey] [int] NOT NULL,
[ItemKey] [int] NOT NULL,
[InvoiceNumber] [int] NOT NULL, --<--Degenerate Dimension
[ExtendedCostAmount] [decimal](12, 4) NOT NULL,
[ExtendedSalesAmount] [decimal](12, 4) NOT NULL,
[ExtendedGPAmount] [decimal](12, 4) NOT NULL)
There was a problem with that design, as the Invoice_Number wasn't unique across all stores, so I contemplated separating out into an actual Dimension table (DimInvoice). DimInvoice would hold Invoice_Number, and the Store_Number or StoreKey (Don't know which is better design).
--Proposed: DimInvoice
CREATE TABLE [dbo].[DimInvoice](
[InvoiceKey] [int] NOT NULL,--<--Degenerate Dimension
[StoreNumber] [int] NOT NULL,
[InvoiceNumber] [int] NOT NULL)
--This would be FactRetailSales using DimInvoice surrogate key
CREATE TABLE [dbo].[FactRetailSales](
[StoreKey] [int] NOT NULL,
[InvoiceDateKey] [int] NOT NULL,
[ItemKey] [int] NOT NULL,
[InvoiceKey] [int] NOT NULL,
[ExtendedCostAmount] [decimal](12, 4) NOT NULL,
[ExtendedSalesAmount] [decimal](12, 4) NOT NULL,
[ExtendedGPAmount] [decimal](12, 4) NOT NULL)
Then the design called for a Fact Table that aggregated the FactRetailSales table up to the Invoice_Number (call it FactInvoice). If I keep DimInvoice, it would have a one-to-one relationship with FactInvoice.
So I'm now wondering if it is possible to create a Surrogate Key in FactInvoice and use it as a dimension foreign key in FactRetailSales.
Is this a possibility?
-- or Proposed: FactInvoice
[FactInvoiceKey] [int] identity, --< This surrogate key will act as the
[StoreKey] [int] NOT NULL, --Invoice Dimension Key in FactRetailSales
[InvoiceDateKey] [int] NOT NULL,
[InvoiceNumber] [int] NOT NULL,
[InvoiceTaxAmount] [decimal](12, 4) NOT NULL,
[InvoiceCostAmount] [decimal](12, 4) NOT NULL,
[InvoiceSalesAmount] [decimal](12, 4) NOT NULL,
[InvoiceGPAmount] [decimal](12, 4) NOT NULL)
--This would be FactRetailSales using FactInvoice surrogate key
CREATE TABLE [dbo].[FactRetailSales](
[StoreKey] [int] NOT NULL,
[InvoiceDateKey] [int] NOT NULL,
[ItemKey] [int] NOT NULL,
[FactInvoiceKey] [int] NOT NULL,
[ExtendedCostAmount] [decimal](12, 4) NOT NULL,
[ExtendedSalesAmount] [decimal](12, 4) NOT NULL,
[ExtendedGPAmount] [decimal](12, 4) NOT NULL)
It seems strange that the StoreKey is in both FactRetailSales and FactInvoice. I'm puzzled with the correct way to proceed.
Any help would be greatly appreciated, thanks.
October 24, 2014 at 9:31 am
My first question is why does the design call for keeping aggregated line item totals at an order level? At first glance it would appear to me that you're trying to force into a table what perhaps belongs in a cube or runtime aggregating query on the server or possibly in a pivot table or reporting tool on a client. What would happen if the line items had errors one day or missing data? Once the detail problem is fixed you would have to rerun your aggregator to update that table. bleh. Without knowing more, I would say if it feels strange, it probably is. My bet is on not keeping those aggregations around in table unless you have to. (The purists would say that that is what a cube is essentially doing...storing aggregations... but that was what it was designed for.)
October 30, 2014 at 8:55 am
Thanks for the input.
I do have some data that are only at the Order level (Tax Amount, Payments), but they may not be essential to reporting.
Primarily, I thought I'd have the table to permanently store some relatively complicated calculations on cross-selling/up-selling activity.
I am working through the cube design now and will check out the aggregation functionality.
So in reference to my original dilemma, regarding having InvoiceNumber as a degenerate dimension or putting into its own dimension, Kimball says:
A surrogate key is necessary if the transaction control numbers are not unique across locations or get reused. For example, the retailer’s POS system may not assign unique transaction numbers across stores. The system may wrap back to zero and reuse previous control numbers when its maximum has been reached. --Kimball
In my case, the InvoiceNumber is not unique and requires Store Number. Is the DimInvoice dimension recommended? Even though you can make a degenerate dimension combining InvoiceNumber and StoreKey from the FactRetailSales table?
November 14, 2014 at 4:14 am
Go with your original design;
StoreKey and InvoiceNumber are your member identifier for aggregation.
There is no need based on your stated problem to have an invoice dimension. If you have other data that is specific to the invoice (i.e. not to the lines) then there is a case to put this data into a separate FACT table but the membership would be the same (StoreKey and InvoiceNumber)
I would only create an invoice dimension if there is something else you need to manage (e.g. sub-orders or multiple delivery dates/locations)
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply