July 7, 2003 at 4:21 am
In my source tables, one is the fact table (call it General) and also contains Foreign Keys to the other tables. E.G. The General table will contain a column currency_code, which will link it to the Currency table. However, some rows in the General table may contain NULL values for the currency_code. The problem is, when trying to create a Currency Dimension for my cube, it does not include the NULL row values into the aggregation of the cube, causing data to be invalid (for example, the net_profit values in that NULL value line is not included when summing up the data).
If i create a Currency dimension, only using the currency code from the General table, everything obviously adds up, but if i create a Currency Dimension using the Currency Table (to get the currency name), the NULL's are not included and the total values of all the NULL rows are left out, causing the value to be out by the summ of all rows where currency_code = NULL.
Any suggestions or solutions, please.
Thanks,
Ryan
July 7, 2003 at 8:44 am
Hi Ryan,
Can you add a catch-all level in your dimension table with a value of something like "N/A"? Then you could default all nulls to this level's key value...just a thought...
hth,
Michael
Michael Weiss
Michael Weiss
July 7, 2003 at 8:58 am
Thanks for your time and help, Michael, but, excuse me if i sound a bit dumb here, how do you add a catch-all level? Never heard of or about it before.
Thanks,
Ryan
July 7, 2003 at 9:29 am
You could add a 'Not Defined' value to your currency table and update all 'NULL' values in the fact table to point to this one.
I guess this is about the thing Michael was indicating.
July 7, 2003 at 9:29 am
Well let's say that your dimension table has the following fields:
currency_key, currency_type
With data such as 1, US Dollar
2, Euro
Could you not have a data member that would look like 3, N/A? If you default all of your currency dimension foreign key values to 3 where they are null in your fact table, your data should aggregate correctly and you can always filter out those records where currency_type = "N/A".
hth,
Michael
Michael Weiss
Michael Weiss
July 7, 2003 at 4:32 pm
As a slight twist on Michaels suggestion, we always try to use negative numbers (e.g -1 = n/a , -2 = unknown etc) for these types of placeholder/catch-all values, this way you can see them straight away in the fact table when viewing records without joining to the dimension table. A bigger advantage is that any SQL/DTS you use to load the fact records can know the alternate value to insert instead of NULL *without* having to do a look-up on the dimensional table on the description (ie [some select sql here] ISNULL(mycol1, -1), this is really only an advantage if you use identity fields for your surrogate key.
HTH,
steve
Steve.
July 8, 2003 at 1:27 am
Thanks for all the help. I originally suggested replacing the null with a key, but management said that it was not the ideal situation as they wanted those specific fields to actually contain nothing (as there are actually about 5 or 6 foreign key references with null values in different rows), but it looks like they will have to make do with it.
Thanks again, I appreciate it.
Ryan.
July 8, 2003 at 1:41 am
Ryan,
In my opinion there is no reason at all to not replace NULLs in your fact table by references to a 'N/A' record.
I can see your management's point if you're talking about a table that is used in an OLTP environment, but in an OLAP environment NULL values in the dimension fields just make no sense.
July 8, 2003 at 9:27 am
Maybe I'm being overly simplistic here or this has already been tried, but why not something like this:
SELECT SUM(CurrencyColumn) AS DollarSum
WHERE CurrencyColumn NOT NULL
That way, the NULL columns don't throw off your math.
What that what you were talking about?
July 9, 2003 at 4:49 am
The fact table has many columns, including currency_code as well as, for example gross_profit. One row may have a NULL value for currency_code, but there will be a value for the gross_profit. If I try sum(gross_profit) where currency_code NOT NULL AND date = 'JUNE', for example, as per your suggestion, eljeffo, the value returned for gross_profit in JUNE is not the true value because it is not including those gross_profit_values where curerncy_code is NULL.
Anyway, I have just suggested that we make all NULLS a key value, but management still believes that there is a better way...
Oh well.
Thanks for all the help and suggestion's.
July 9, 2003 at 5:45 am
I've been looking into this a bit closer.
You might be able to solve this using a view as FACT table and/or as the Dimension table for currencies.
For the FACT table you replace the NULL values with a dummy FK to the Dimension table :
ISNULL(Currency_code, -1)
.
The Dimension table you can use the UNION to add the -1 currency code.
Hope you can solve it this way...
July 10, 2003 at 2:29 am
I'm not quite sure that I fully understand you, NPEETERS.
Could you please expand a bit more on your suggestion.
I appreciate your time and effort.
Thanks.
July 10, 2003 at 5:41 am
First, construct a view to substitute all NULL values in the fact table by a value. I am using '-1', since I guess this will not be present in the Currency table yet
CREATE VIEW FactView AS
SELECT IsNull(currency_code, -1),
<All other required fields>
FROM General
Now, create a second view on the Currency table to include the newly created '-1' currency.
CREATE VIEW DimensionView AS
SELECT currency_code, currency_name
FROM Currency
UNION
SELECT -1, 'Null'
Now, build your cube using the FactView as your 'fact table' and the DimensionView as your Currency Dimension source.
You will have an additional member of the dimension that is called 'Null' and that points to all the records in the fact table that don't have a currency code.
July 10, 2003 at 9:02 am
Thank you NPeeters. I tried out your suggestion and it all seems to work exactly as i was wanting, i appreciate it.
Thanks again to all of you for your help.
Ryan.
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply