Running totals

  • Just as a side bar, you do realize than 9758 * 30 days per month will create more than a quarter million rows per month? It's no problem for the Quirky Update to do running totals (about 3 seconds per million rows) but that graphing program is going to have to be pretty darned good to render 9758 graphs over a quarter million rows. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I think I may have misled you. in the text file, there's around 18000 records in total per month. Below is what I wrote previously. However, the last sentence, suggeting 9578 records are not per really per month. More the distinct combination of records for Product and salesman with a months data.

    Sorrry about the confusion

    There is around 180 sales guys per month

    There is around 2702 products sold per month.

    I have done a distinct in Product and sales guy and there is 9578 records, approx per month

  • Understood... If I'm understanding things correctly, you also stated that you wanted one running total for each Sales Person and Product combination by day of month. You have 9578 unique combinations of of Sales Person and Product in a given month. Multiply that number by 30 and you'll see the number of rows we'll eventually end up with to support the graphics program.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • yeah, I think i see what you mean. I guess it will grow exponentially

    :/

  • As a side note, the app we're using wouldn't need to render all rows as a selection would probably be made to limit the amount of data returned.

    Maybe that would help lol

  • cidr (8/31/2011)


    I think I may have misled you. in the text file, there's around 18000 records in total per month. Below is what I wrote previously. However, the last sentence, suggeting 9578 records are not per really per month. More the distinct combination of records for Product and salesman with a months data.

    Sorrry about the confusion

    There is around 180 sales guys per month

    There is around 2702 products sold per month.

    I have done a distinct in Product and sales guy and there is 9578 records, approx per month

    No reflection on you Cidr... but this task no longer makes sense. If we do the simple math of having about 18000 rows for a month and there are 9579 unique combinations of ProductID and SalesManID, then that means that each of those combinations has and average of only 1.87 real plot points per month. ALL of the charts would end up looking (at best) something like the following...

    ______

    /

    __________/

    /

    0 ______/

    111111111122222222223

    123456789012345678901234567890

    I can see doing a running total by SalesManID alone with a increments by day, but that's about it.

    Would you double check with the people who "designed" the requirments and make sure they unmderstand this simple math and how useless the graphics will be if they try to plot running totals by SalesManID and ProductID combined?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi jeff. Do you know if this is still possible?

  • Hi

    Please see below more info about the data and schema. The data I'm using has been extracted from a cube into a text file

    When your refer to 'cube' are you talking about a datawarehouse cube such as SSAS / Cognos etc.

    if so then these cubes are generally pretty adept at providing cumulative totals.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • cidr (9/4/2011)


    Hi jeff. Do you know if this is still possible?

    It absolutely possible. And, no, this is NOT directed at you but someone has really been eating some bad mushrooms on the requirements. According to the stats on the monthly data you've given me, there's 1.8 plot points per available sales person and product combination. All of the rest of the data would be "straight line fluff" considing of ~28 days of duplicated data for each sale person and product to the tune of 270 thousand "fluff" rows per month. It's an insane requirement that would violate the sensibilites of every DBA I know.

    Now, if you told me you wanted running totals just by SalesManID or running totals just by ProductID, I can actually see that. But running totals by SalesMandID AND ProductID just isn't making any sense especially considering the number of duplicated dummy rows that would need to be spawned.

    Part of the reason for me not offering up any code yet is because I think, according to the current requirements compared to the status of the data, that this is such a bad idea that I want to help you protect your servers and your DBA's from this problem.

    PLEASE. Explain the situation to the people giving you this requirement. Even though data-marts/reporting servers frequently work best on denormalized data, this is just an insane amount of what I think is useless denormalization.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    I'll double check with them. I think the logic they had was that they wanted to slice/dice data, by productID, then by SalesmanID etc, not together.

    However, so I'm on the same page as you can you let me know if my math is correct.

    If 180 sales guys sell 2702 products on the first day then that would be 14590800 records in a 30 day month? If so, then yes, it is madness.

    Let me know if I'm correct and I'll use this as amo for the powers at be.

    Cheers

  • cidr (9/6/2011)


    Hi Jeff,

    I'll double check with them. I think the logic they had was that they wanted to slice/dice data, by productID, then by SalesmanID etc, not together.

    However, so I'm on the same page as you can you let me know if my math is correct.

    If 180 sales guys sell 2702 products on the first day then that would be 14590800 records in a 30 day month? If so, then yes, it is madness.

    Let me know if I'm correct and I'll use this as amo for the powers at be.

    Cheers

    Yep... that's the same number I come up with.

    In your previous posts, it sounded like you (they) wanted Product By SalesPerson and, according to the data you provided, that would only give you 1 or 2 plot points per chart.

    If they want sales by Salesman and sales by ProductID separately, then I can see maybe doing this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I don't know enough about the cube product they're using. If it was SSAS then I'm sure I could create running totals quite easily for both ProductID and then SalesmanID etc... surely this would be easier than writing a query.

    So, the guy that's mainly working on the cube side, who asked me the request, is back soon so I'll speak to him and tell him it's not really viable unless they want to run a query all night.

    Thanks for your help Jeff, there's a lot you've made me aware about this issue.

  • You bet. C'mon back when you get some clarification on this. It's actually a fun problem. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply