Reporting from a denormalized table

  • Hello SQL gurus,

    This is a mess of a problem for me...

    I am trying to write a query to compare product attributes between two different databases. (One is from CRM db and the other on a db backend of the live site.)

    Problem is, in the CRM db, the table 'Order_Variable' is denormalized. It has 4 columns:

    p_category

    row_id

    attribute_id

    attribute_value

    --

    p_category represents different product types like '01'=rental cars, '02'=blind ad, '03'=additional cars

    row_id is a unique number for each product sold.

    attribute_id: these are different product attributes for different p_category. For example p_category '01' may have 5 attributes, then it will have 1, 2, 3, 4, 5 rows in the table.

    attribute_value: if p_category has 5 attributes, their values are stored here: 1) Buyer 2) Registered User 3) Password 4) # of cars 5) Commitment.

    --

    HERE COMES THE PROBLEM!

    p_category '03' (additional cars) has 2 attributes: 1) Related row_id 2) # of additional cars

    So, when a customer rent 5 cars for 3 months, but later add 5 more cars. You will have two row_ids (sold products) in the CRM db, ie:

    row_id 12345, p_category='01' w/ 5 cars

    row_id 23456, p_category='03' w/ 5 cars

    But only ONE row_id on the live site's db

    row_id 12345, with 10 cars!!!

    A mess isn't it? What I have now is I completely ignore p_category '03' and only compare the attributes for p_category '01' with the live site's db. Which means some of the attributes_id=4 '# of cars' are wrong because I am ignoring any additional cars rented. I have no idea how to begin implementing the additional cars in my query. Can someone please help me as to what I can do? Thank you in advance.

    Sincerely,

    Nickel

  • I think you're going to have a big problem unless you can identify that row_id 12345 is the same customer as 23456. If you can do that by joining to a client lookup table, then I suggest you need to do a sub-select which totals the number of cars hired per client(row_id) and then join that to your live DB table.

    Regards

    Simon (not a guru!!)

  • Thanks for replying.

    The only link relating a '03' (Additional Cars) product to a '01' (Rental Cars) is attribute_id=1 (Related row_id) in the '03' additional cars product.

    I thought about self joining the order_variable table like this:

    order_variable ov01

    LEFT OUTER JOIN

    (SELECT p_category, row_id, attribute_id, attribute_value

    FROM order_variable

    WHERE p_category = '03') AS ov03

    ON ov01.row_id = ov03.attribute_value

    WHERE ov03.attribute_id = 1 --Related row_id

    hmm... I'm thinking... then I'll have an additional field "Additional cars" in the result. I guess I can do a ISNULL(ov3.attribute_value, 0) and do a CASE in the SELECT to add any additional cars together.

    But what if there are more than 1 'additional cars' product? How do I get the total amount of cars rented for certain? Is there any way to output the results into a normalized table/temp table/view so that it will be easier to browse and compare with the live site's data?

  • Hi Nickel,

    It depends on your RelatedId field. Does it reference the original ID created of your p_cat='01'? If it always does, then your query will work regardless on how many '03' records you have.

    Now, if your second '03' has a RelatedId that contains the ID of the previous '03', you are in trouble. What you would have to do is to create a temporary table that holds all the IDs that relate. Something like this would do it:

    Declare @iID int

    Set @iID = 123456 -- your record for p_cat='01'

    CREATE TABLE #T (ID int)

    INSERT INTO #T VALUES (@iID)

    WHILE Exists(SELECT ID FROM order_variable WHERE p_category='03' and attribute_value=@iID)

    BEGIN

    SET @iID = (SELECT ID FROM order_variable WHERE p_category='03' and attribute_value=@iID)

    INSERT INTO #T VALUES(@iID)

    END

    Now, your #T table contains the list of all the IDs that relate to the same initial record with p_cat='01'. All you have to do is something like this:

    SELECT * from order_value where ID IN (SELECT ID FROM #T)

    Hope this helps.

    Herve

    Herve Roggero
    hroggero@pynlogic.com
    MCDBA, MCSE, MCSD
    SQL Server Database Proxy/Firewall and Auditing

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply