A dimension with unique member values?

Question

Del Piero

SSCrazy

Points: 2265
August 2, 2005 at 4:15 am

#110794

Hi all,
     Suppose I have the following tables:
PRODUCTS
(prd_id, prd_name, price, supplier, date_of_purchase)
VENDORS
(vendor_name)
I am building a cube with price as the measure and three dimensions: Product_id (from prd_id), date_of_purchase, and supplier.
The supplier field maps to a vendor_name value in the VENDORS table.
To show all vendors, even those with no associated product, I build the Supplier dimension from the vendor_name column of the VENDORS table and join the VENDORS and PRODUCTS tables at the cube level.
The problem is that the VENDORS table contains two records with the same vendor_name (the table has no unique key or primary key). Let's call it "ADIDAS".
Suppose that in the PRODUCTS table the total price of all products with supplier "ADIDAS" is 3,000. When the aggregations are calculated in the cube, the "ADIDAS" vendor shows a value of 6,000, because each PRODUCTS record with supplier ADIDAS joins to both ADIDAS rows in the VENDORS table.
As I am not the designer of the database, I cannot decide to remove the duplicate record. What can I do? I tried setting the unique names/keys properties in the dimension to false, but the result is still the same. Is it possible to do something like placing a "DISTINCT" keyword in the Source Table Filter? I don't know how to do that, since the filter is supposed to be a WHERE clause. Or is there anything I have missed?
Thanks for your help,
delpiero
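
The doubling described in the question can be reproduced outside the cube. The following is a minimal sketch using Python's built-in sqlite3 module rather than the poster's actual database; the sample rows (two products priced 1,000 and 2,000, plus a NIKE vendor with no products) are made up so that the ADIDAS total comes to 3,000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table shapes follow the post; VENDORS has no key at all.
    CREATE TABLE VENDORS (vendor_name TEXT);
    CREATE TABLE PRODUCTS (prd_id INTEGER, prd_name TEXT, price REAL,
                           supplier TEXT, date_of_purchase TEXT);
    -- Made-up sample data: ADIDAS appears twice in VENDORS.
    INSERT INTO VENDORS VALUES ('ADIDAS'), ('ADIDAS'), ('NIKE');
    INSERT INTO PRODUCTS VALUES
        (1, 'shoe',  1000, 'ADIDAS', '2005-07-01'),
        (2, 'shirt', 2000, 'ADIDAS', '2005-07-02');
""")

# Joining on vendor_name matches every ADIDAS product to both ADIDAS rows.
fan_out_total = conn.execute("""
    SELECT SUM(p.price)
    FROM VENDORS v
    JOIN PRODUCTS p ON p.supplier = v.vendor_name
    WHERE v.vendor_name = 'ADIDAS'
""").fetchone()[0]

# The total taken from PRODUCTS alone, with no join.
true_total = conn.execute(
    "SELECT SUM(price) FROM PRODUCTS WHERE supplier = 'ADIDAS'"
).fetchone()[0]

print(fan_out_total, true_total)  # 6000.0 3000.0
```

The join itself produces the extra rows, so the inflated figure exists before any aggregation in the cube takes place.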

stevefromOZ SSC-Forever Points: 43646 · Answer 1

This is a pretty major fault with the DB design. If the two suppliers aren't meant to be the same entity, then there is no way to tell AS (or any other query) which one to assign the sales to.

If you can, I'd write a view over the suppliers table with a DISTINCT clause in it, but as you've noted, this is really an arbitrary removal of one of the dupes.

Are you *sure* there is no other method of joining the transactions to the suppliers? If what you're seeing is truly how the source DB works, then looking through the front end (data entry) application, you'd not be able to differentiate between the two suppliers' sales.

Steve.

robertm Ten Centuries Points: 1096 · Answer 2

In reality this is not an AS issue. It simply comes down to duplicates in your data that would affect any application using this information.

As stevefromOZ rightly points out "If the two suppliers aren't meant to be the same entity, then there is no way to tell" them apart. Therefore any record (in your fact table for example) assigned to that entity will be duplicated for each instance of it in the joining table.
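
To see which vendor names would multiply fact rows in this way, a grouped count is usually enough. This is a small sketch, again using sqlite3 with made-up rows; only the VENDORS table and its vendor_name column come from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDORS (vendor_name TEXT);
    -- Made-up sample data with one duplicated name.
    INSERT INTO VENDORS VALUES ('ADIDAS'), ('ADIDAS'), ('NIKE');
""")

# Every name returned here joins more than once to each matching fact row.
duplicates = conn.execute("""
    SELECT vendor_name, COUNT(*) AS copies
    FROM VENDORS
    GROUP BY vendor_name
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)  # [('ADIDAS', 2)]
```

The `copies` value is also the factor by which that vendor's totals are inflated after the join.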

If these are in fact the same entity, then applying DISTINCT to the data used to populate your vendors dimension is a reasonable solution. However, if this is not the case, then you have a very difficult job ahead! You could potentially get around this by assigning a surrogate key to each member of your vendors dimension and assigning it appropriately in the fact table, depending on which instance of the vendor each record belongs to, i.e. surrogate_key 1 or surrogate_key 2. But this is all pointless if, at the end of the day, you have no way of differentiating the two records, as they will appear to be a single member 'ADIDAS' in your cube.
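
The surrogate key idea might look roughly like the sketch below. The names dim_vendor, fact_products and vendor_key are hypothetical, and the assignment of each fact row to key 1 or key 2 is arbitrary here, which is exactly the decision the source data gives no basis for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDORS (vendor_name TEXT);
    INSERT INTO VENDORS VALUES ('ADIDAS'), ('ADIDAS');

    -- Hypothetical dimension table: each source row gets its own key (1, 2).
    CREATE TABLE dim_vendor (vendor_key INTEGER PRIMARY KEY, vendor_name TEXT);
    INSERT INTO dim_vendor (vendor_name) SELECT vendor_name FROM VENDORS;

    -- Hypothetical fact table; which key a row receives is a manual choice.
    CREATE TABLE fact_products (prd_id INTEGER, price REAL, vendor_key INTEGER);
    INSERT INTO fact_products VALUES (1, 1000, 1), (2, 2000, 2);
""")

# Joining on the key matches each fact row to exactly one dimension row.
rows = conn.execute("""
    SELECT d.vendor_key, d.vendor_name, SUM(f.price)
    FROM dim_vendor d
    JOIN fact_products f ON f.vendor_key = d.vendor_key
    GROUP BY d.vendor_key, d.vendor_name
    ORDER BY d.vendor_key
""").fetchall()

print(rows)  # [(1, 'ADIDAS', 1000.0), (2, 'ADIDAS', 2000.0)]
```

Nothing is double counted, but both members still carry the name ADIDAS, so the split is only meaningful if someone can say which vendor each sale really belongs to.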

Good luck!

tcraig SSC Veteran Points: 253 · Answer 3

I guess I'd join with a subquery like

SELECT DISTINCT vendor_name FROM VENDORS

Or create a similar view in the DB and join with that.
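
A sketch of the view approach, assuming the two ADIDAS rows really are the same vendor. The view name v_vendors_distinct and the sample rows are made up, and sqlite3 stands in for the real source database. The LEFT JOIN keeps vendors that have no products, which was the original reason for building the dimension from VENDORS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDORS (vendor_name TEXT);
    CREATE TABLE PRODUCTS (prd_id INTEGER, prd_name TEXT, price REAL,
                           supplier TEXT, date_of_purchase TEXT);
    INSERT INTO VENDORS VALUES ('ADIDAS'), ('ADIDAS'), ('NIKE');
    INSERT INTO PRODUCTS VALUES
        (1, 'shoe',  1000, 'ADIDAS', '2005-07-01'),
        (2, 'shirt', 2000, 'ADIDAS', '2005-07-02');

    -- Hypothetical view: one row per distinct vendor_name.
    CREATE VIEW v_vendors_distinct AS
        SELECT DISTINCT vendor_name FROM VENDORS;
""")

# Join PRODUCTS to the de-duplicated view instead of the raw table.
totals = dict(conn.execute("""
    SELECT v.vendor_name, SUM(p.price)
    FROM v_vendors_distinct v
    LEFT JOIN PRODUCTS p ON p.supplier = v.vendor_name
    GROUP BY v.vendor_name
""").fetchall())

print(totals["ADIDAS"], totals["NIKE"])  # 3000.0 None
```

Pointing the dimension at such a view instead of the VENDORS table would leave the source data untouched, which matters when you are not allowed to delete the duplicate row.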

Joe Williams-227299 SSC Journeyman Points: 87 · Answer 4

Chances are that the VENDORS table has a vendor_id column that is unique. This is the column that should be joined to the PRODUCTS table -- not the vendor_name column.