Tally Table

  • GSquared (5/6/2008)


    Dateadd lists of datetimes. For example, if you need a second-by-second list of all times in a month, or a quarter. (10,000 seconds is a little less than 3 hours. 100-million seconds is just over 3 months.) Millisecond lists for a full day (10,000 milliseconds = 10 seconds. 100-million is a little over 2 days.) Creation of test data tables (select whatever from dbo.BigNumbers), to get millions of rows of dates, text, numbers, whatever, with one select statement.

    I kinda figured it would be something like that... but the key question is why do you have to do things for every second over, say, a quarter? Not trying to bust anyone's chops here... I really want to know because I not seen anything of that nature (all seconds in a quarter) before. What is it that has to be reported by the second over such a long span of time? You looking for spikes in something? What's that "something"?

    Thanks, Gus.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/6/2008)


    GSquared (5/6/2008)


    Dateadd lists of datetimes. For example, if you need a second-by-second list of all times in a month, or a quarter. (10,000 seconds is a little less than 3 hours. 100-million seconds is just over 3 months.) Millisecond lists for a full day (10,000 milliseconds = 10 seconds. 100-million is a little over 2 days.) Creation of test data tables (select whatever from dbo.BigNumbers), to get millions of rows of dates, text, numbers, whatever, with one select statement.

    I kinda figured it would be something like that... but the key question is why do you have to do things for every second over, say, a quarter? Not trying to bust anyone's chops here... I really want to know because I not seen anything of that nature (all seconds in a quarter) before. What is it that has to be reported by the second over such a long span of time? You looking for spikes in something? What's that "something"?

    Thanks, Gus.

    I don't know about 100M, but I had to do exactly that - the Marketing dept at a client was looking for activity spikes/activity patterns, and then of course they wanted to do trending of spikes...... In our case that was minute-by-minute over a year, so I actually built Tally up to 1M (and have kept it there ever since).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Holy Moly... that's one heck of a lot of analysis. Did they actually get anything useful out of all that?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/6/2008)


    Holy Moly... that's one heck of a lot of analysis. Did they actually get anything useful out of all that?

    There were definitely some interesting usage patterns (people like to "shop for doctors" at work in the morning, and late late late at night, mostly on Mondays and Thursdays), but I don't know exactly what they got out of it (this was the kind of data that got passed off to someone, who digested and regurgitated it, then presented it to their boss, who digested and regurgitated it.... and on up the food chain....) Of course - we did get a few extra new web servers that year too, so maybe something good came out of it...

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Jeff Moden (5/6/2008)


    GSquared (5/6/2008)


    Dateadd lists of datetimes. For example, if you need a second-by-second list of all times in a month, or a quarter. (10,000 seconds is a little less than 3 hours. 100-million seconds is just over 3 months.) Millisecond lists for a full day (10,000 milliseconds = 10 seconds. 100-million is a little over 2 days.) Creation of test data tables (select whatever from dbo.BigNumbers), to get millions of rows of dates, text, numbers, whatever, with one select statement.

    I kinda figured it would be something like that... but the key question is why do you have to do things for every second over, say, a quarter? Not trying to bust anyone's chops here... I really want to know because I not seen anything of that nature (all seconds in a quarter) before. What is it that has to be reported by the second over such a long span of time? You looking for spikes in something? What's that "something"?

    Thanks, Gus.

    I originally created it for seconds in a quarter, doing a very detailed analysis on various log files, also compared to a table of dates with business days, holidays, weekends, etc. Was data mining and found that minutes wasn't granular enough for the patterns I needed.

    The analysis and results, in statistics, did help resolve some issues for the business, in terms of workflow internal vs workflow external (employee time vs customer time), which did result in a more efficient business process.

    Long-term benefit ended up being zero, because that company went out of business a year and a half later for disrelated reasons. But if the company had been able to sell and market worth a darn, it would have been quite beneficial.

    Since then, I've simply found it periodically useful in various things where 10k numbers just wasn't enough. You've seen me use it in generating test data for large numbers of rows in several of these forum threads.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for the info, guys! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My "essential" Tally table contains not only numbers but column of characters ( NCHAR(N) ), dates ( DATADD(dd, N, 0) and other date related values (DW, day of year, week of year, week of month, week of month another style, etc.).

    To host full set of those values table needs to have 216 rows - 65K.

    _____________
    Code for TallyGenerator

  • Outstanding, Sergiy... you've reminded me of a couple of questions that I've always wanted to ask about such multi-purpose (Numbers/Tally + Date table + other) tables... guess now's as good a time as any...

    1. Which column(s) did you include in the Clustered Index?

    2. Do you have a Primary Key on the table and, if so, which column(s)?

    3. Which columns have you indexed? Any composites?

    4. Are any of the columns simply calculated columns?

    5. Do you maintain a separate Holiday table for multi-national holidays?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 1. and 2. PK Clustered is on N.

    3. All of them. Table is static - so there is no harm in over-indexing.

    Yes, there are composites: WeekOfMonth + Date, etc.

    4. No, I create the table from a view doing all calculations.

    Don't see any point in recalculating of DayOfWeek for 1/01/2008 every time. I doubt it will return different result ever.

    Actually, the whole point of such table is to have precalculated values ready.

    5. Holiday table is a "join" table between Tally, Country and HolidayType.

    I guess no need to explain further.

    _____________
    Code for TallyGenerator

  • Heh... you made my day, Sergiy. I've been trying to convice a DBA on a little side job of those very same points... he's doing things like using calculated columns to "save space", etc. Maybe I can get him to reconsider, now. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Save space?

    Heh...

    Ask him: how much space?

    Then, how much this space will cost?

    Then give him 5 bucks and BUY that space for yourself.

    😎

    _____________
    Code for TallyGenerator

  • Just curious. I like having the utility tables like Tally - but is there a value in having all of those things in a SINGLE table? to do all of the counting? I've always had separate ones, and was intrigued when I saw that you were combining them....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I don't know about Sergiy, but the only big advantage I've found is that I don't have to convince reluctant DBA's into having two helper tables. I haven't tested it (but this is a good reminder that I should), I suspect the separate tables will actually be a small bit faster for the Tally table because it'll have more rows per page when it's a stand-alone table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/6/2008)


    I don't know about Sergiy, but the only big advantage I've found is that I don't have to convince reluctant DBA's into having two helper tables. I haven't tested it (but this is a good reminder that I should), I suspect the separate tables will actually be a small bit faster for the Tally table because it'll have more rows per page when it's a stand-alone table.

    That would have been my reason for doing it (of course - I usually just kind of slip them in, and "forget" to tell them until it's being used a lot...:)). I just didn't think Sergiy knuckled under to office politics...:hehe::w00t:

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (5/6/2008) I just didn't think Sergiy knuckled under to office politics...:hehe::w00t:

    Well, I'm probably lucky, but I'm not THAT lucky.

    I'm not sure there is any scientific proof that such luck exists. :hehe:

    Regarding number of pages to be read, let's do some calculations.

    4 bytes(int) + 2 (nchar) + 4 (smalldatetime) + 1 + 1 + 1 + ...(tinyint for all those "days of week) = less than 20 bytes.

    65K * 20 = 1.3 M

    Plus size of indexes - you'll get 3 or 4 Meg in total.

    How much memory you've got on you oldest and smallest server?;)

    If you chasing for performance just do PINTABLE and you have already read all pages. Once and forever.

    _____________
    Code for TallyGenerator

Viewing 15 posts - 16 through 30 (of 58 total)

You must be logged in to reply to this topic. Login to reply