Triangle Join - Can I do it right or a replacement?

  • codebyo (3/20/2012)


    Evil Kraig F (3/20/2012)


    This is basically a running totals query, and I believe that's where Jeff's going with his solution so I won't work the same process twice. 😉

    However, there's another concern with your query, and that's the non-SARGability of the year.

    WHERE YEAR(DATA) = '2011'

    will not use indexes. However:

    WHERE Data >= '20110101' AND Data < '20120101'

    will use an index on Data properly. Notice how the calculation is not being done on the column being reviewed.

    Yes, you are correct!

    I mentioned that issue in the second post of this topic and I will change the query.

    Thanks a lot for the analysis because it confirmed what I was thinking and allowed me to know that I'm in the right path. 🙂

    Whoops, sorry Andre, I somehow glazed right past that's what you meant in the second post.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Jeff Moden (3/20/2012)


    Example: Table ProductSummary

    Columns: Date, Product, Vendor, Total

    Rule: One row per day at maximum.

    So, one row per date per product per vendor, correct?

    And you want running totals by day and not month?

    Sorry for all the questions but I like to get these things right before I start writing code/

    Yes, exactly. The ProductSummary example will have the running totals by day because data will be imported like that already aggregated by those columns. So except for the total column, all the other values will be unique. A lot of different queries will be made out of that table, showing totals per year, than another one will show totals per month. Finally the user will be able to see a graph result per day.

    All the values must be cumulative. For example, if the table above contains these rows:

    Date: 01/01/2011, Product: Bike, Vendor: X, Total: 200

    Date: 01/01/2011, Product: Car, Vendor: X, Total: 200

    Date: 04/05/2011, Product: Bike, Vendor: X, Total: 225

    Date: 08/13/2011, Product: Bike, Vendor: X, Total: 180

    The final table will look like:

    Month: 01 - Total: 400

    Month: 02 - Total: 400 (because there were no sales)

    Month: 03 - Total: 400 (because there were no sales)

    Month: 04 - Total: 625

    Month: 05 - Total: 625 (because there were no sales)

    etc.

    Sorry for all the trouble.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Evil Kraig F (3/20/2012)


    Whoops, sorry Andre, I somehow glazed right past that's what you meant in the second post.

    No problem. Advices like that one are always great. I appreciate. 🙂

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • codebyo (3/20/2012)


    All the values must be cumulative. For example, if the table above contains these rows:

    Date: 01/01/2011, Product: Bike, Vendor: X, Total: 200

    Date: 01/01/2011, Product: Car, Vendor: X, Total: 200

    Date: 04/05/2011, Product: Bike, Vendor: X, Total: 225

    Date: 08/13/2011, Product: Bike, Vendor: X, Total: 180

    The final table will look like:

    Month: 01 - Total: 400

    Month: 02 - Total: 400 (because there were no sales)

    Month: 03 - Total: 400 (because there were no sales)

    Month: 04 - Total: 625

    Month: 05 - Total: 625 (because there were no sales)

    etc.

    Ummmm... gosh... Ok... I see a couple of things wrong with that idea. Let me first ask what the unit of measure for "Total" is. Is it a quantity or a currency amount?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/20/2012)


    Ummmm... gosh... Ok... I see a couple of things wrong with that idea. Let me first ask what the unit of measure for "Total" is. Is it a quantity or a currency amount?

    It's the count of how many sales, for instance. Most of the reports need to show "how many per day, month, year", etc.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • sorry... haven't abandoned you. Had to get some sleep because I had to be to work very early to swap some drives on a Dev Box. Currently at work. I'll take a look at this tonight and see if we can put this to bed for you.

    I saw something almost exactly the same as this on SSC about a year ago. It takes a bit of thinking to do this correctly and with great performance but it'll all be easy to understand and maintain.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/21/2012)


    sorry... haven't abandoned you. Had to get some sleep because I had to be to work very early to swap some drives on a Dev Box. Currently at work. I'll take a look at this tonight and see if we can put this to bed for you.

    I saw something almost exactly the same as this on SSC about a year ago. It takes a bit of thinking to do this correctly and with great performance but it'll all be easy to understand and maintain.

    Slacker... 😎

    Jared
    CE - Microsoft

  • Jeff Moden (3/21/2012)


    sorry... haven't abandoned you. Had to get some sleep because I had to be to work very early to swap some drives on a Dev Box. Currently at work. I'll take a look at this tonight and see if we can put this to bed for you.

    I saw something almost exactly the same as this on SSC about a year ago. It takes a bit of thinking to do this correctly and with great performance but it'll all be easy to understand and maintain.

    Hi, Jeff. No problem.

    I had to abandon the forum to get some sleep as well.

    I was almost sleeping on the keyboard last night. 😀

    If you feel the question is still too short on detail, please let me know.

    I'm sure I'm going to have to use a cumulative count in many projects so anything will be of great value to me.

    Thank you. 🙂

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Ah... there you are. Had accidently deleted this thread from my email chain and lost track of it.

    I've been thinking about what you say folks are going to do with this data and I'm not sure that storing a running total in the table is going to buy anyone anything especially if only a year's worth of data is going to be stored. If you really want this for high speed reporting purposes, then you probably need to store separate running totals by day for each week, month, and year. If you're going to do that, then you should probably have an aggregate table for each type of data (week, month, and year) each with it's own running total so that people aren't calculating the same informaton over and over and over.

    Or, do you really have your heart set on the table holding just one year with a running total by date, product, and vendor for each row instead of preaggregating the data as I suggested?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Excellent advice. Actually we came with something similar while we we were on a meeting in the company.

    Let me see if I understand what you suggested. A table with running totals per day, for instance, that will have every day as a row and no gaps in dates.

    While I was thinking of storing historical totals like this:

    Day - Product - Total

    2011-01-01 | Car | 1200.00

    2011-01-05 | Car | 2000.00

    I would have all the gaps pre-filled in the table:

    Day - Product - Total

    2011-01-01 | Car | 1200.00

    2011-01-02 | Car | 1200.00

    2011-01-03 | Car | 1200.00

    2011-01-04 | Car | 1200.00

    2011-01-05 | Car | 2000.00

    I mean, even if there's no increase in the car's sell from January, 2 to January, 4 I would still store the values in the table. Then I'd create running total tables for months and years as well. I was thinking about making it more towards a DW model (with dimension and facts) but it's maybe killing a mosquito with a rocket launcher from the requirements of the project. 🙂

    I'm really glad that you took time to read this and help me.

    I will put your advices to good use. Thanks again.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • The thing I'm curious about is do you really intend to report by day? I can see storing weeks and months and their running totals but I'm not so sure about days. I guess it really depends on how many products you actually have. How many are there?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/24/2012)


    The thing I'm curious about is do you really intend to report by day? I can see storing weeks and months and their running totals but I'm not so sure about days. I guess it really depends on how many products you actually have. How many are there?

    Actually when I mentioned "products" I mean any table entity that could be comparable to that example because I'm not responsible for the DB modelling and I depend on the demand to start working. I have no idea what the table structures are going to look like but I know that I'll need a dimension of time per day.

    It's a BI presentation and they're going to show charts showing the results of the year (which can be the total of the products, for instance). When the user clicks the year the charts, the month charts will be shown. When the user clicks the month chart, the week chart will be shown until the granularity reaches the product itself. So it's not just a matter of showing calculated numbers which is the purpose of a DW. I need to be able to reach the final detail which is the product itself. A DW (multidimensional) model wouldn't be perfect for this so I'm mixing parts of the relational model with a DW model to achieve this. I'd love to hear suggestions to this but as of now I can't be sure what it's going to look like so it's mostly a generic question about how to do it properly.

    Thank you for your time. 🙂

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • codebyo (3/24/2012)


    When the user clicks the year the charts, the month charts will be shown. When the user clicks the month chart, the week chart will be shown until the granularity reaches the product itself.

    Then, as surprising as it may sound, there's no reason to store all of the data we were talking about. We just need to write some good, fast SQL against a proper table that stores only the data that actually occurs. Based on "granularity parameters" and the period (Year, Month, Week, Day, whatever), the data should be quickly aggregated and returned to the app responsible for building the chart. My recommendation after that is to have the app "fill in the blanks" and for the app to create the running total from the aggegated data it receives from the proc as a result set.

    Yes, we can very quickly create a cume column in the result set for whatever granularity they want but it really should be done in the app that is rendering the charts.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply