Calculating Work Days

  • Joe... I don't use formulas for the Holidays nor did I suggest that... that would be stupid.  Go back and look at what I said... I agree that a Holiday calendar would be necessary... you just don't need a full calendar table to do this.

    quoteIf this is the most complex SQL you have seen, then you do not get out much

    Heh... Apparently I get out more than you... I've read your books

    quoteAnd most importantly, it solves them all the SAME WAY. Unlike trying to get people to agree on procedural code.

    I'm thinking that a function accomplishes the same thing

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ETL the "code" to a spreadsheet?  Why would you do that?  You'd ETL the query results to a spreadsheet... not the code.

    As I did before earlier in this thread, I admitted that a Calendar table is a great thing to have and I also agreed that they don't take much room.  In fact, I try to convince clients that they really need a Calendar table (or, at least a Tally table) to handle all the nifty date calculations and addition "fiscal" information that they make so easy.  BUT, there's always those clients that, no matter how much you train them to think (I've been known to take on a bit of a "Celko tone" when it comes to stupid clients), they simply refuse to allow the Calendar table.  In those cases, the weekday algorithm I made can be used in either a function (bit of slothful RBAR there) or in a query directly (set based on columns, not variables).  So saying the approach is totally wrong is incorrect... the approach is right for stupid customers that don't want to learn 

    The other thing that you and other folks need to be made aware of, is that this article was written BEFORE (go look at the original date on the first post, I have no idea why they republished it) I had a real appreciation for the power of Tally and Calendar tables... you're actually preaching to the choir and the posts since then reflect that appreciation for those helper tables. 

    I just like arguing with you, ol' friend and favorite "adversary", because I get better data from you that way

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh yeah... almost forgot this... and speaking of "Thinking in Sets", Joe, you might want to include this in your new book... just because something is sans-loop, doesn't mean it's "Set Based".   The real reason why some of my clients (they're not so stupid after all) won't allow the Calendar table is because Calendar tables absolutely suck when it comes to batch code.  Calendar tables are only good when it comes to the RBAR associated with GUI code and a lot of DBA's worry that such things will be used by developers in batch code.

    As I stated before, "extraordinary claims require extraordinary proof"... and I'm not immune from that requirement...  so here it is... the famous "Jeff Moden Million Row Test"!   I'll test my algorithm ... you test the Calendar table method (actually, anyone can jump in... C'mon! It'll be fun )...

    Note that all dates with times will be treated as if the time component were midnight just to keep life simple...

    Here's the million row test table... feel free to modify it if you need to...

    --===== Create and populate a 1,000,000 row test table.
         -- Column RowNum has a range of 1 to 1,000,000 unique numbers
         -- Column "StartDate" has a range of  >=01/01/2000 and <01/01/2010 non-unique date/times
         -- Column "EndDate" is always greater than start date by some random number of days
         
         -- Takes about 19 seconds to execute.
     SELECT TOP 1000000
            RowNum     = IDENTITY(INT,1,1),
            StartDate  = CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME),
            EndDate    = CAST(NULL AS DATETIME)
       INTO dbo.JBMTest
       FROM Master.dbo.SysColumns sc1,
            Master.dbo.SysColumns sc2 --Lack of join criteria makes this a CROSS-JOIN
    --===== A table is not properly formed unless a Primary Key has been assigned
      ALTER TABLE dbo.JBMTest
            ADD PRIMARY KEY CLUSTERED (RowNum)
    --===== Create the end date so that it's some random number of days greater than
         -- the start data
     UPDATE dbo.JBMTest
        SET EndDate = StartDate + CAST(RAND(CAST(NEWID() AS VARBINARY))*365 AS INT)

    ... and here's a couple of submittals using the algorithm in different ways... see which one you can beat with the use of a Calendar table...

    Method 1 - Direct Set Based (for use when you CAN'T modify the table)

    --===== Do the test
        SET STATISTICS TIME ON
        SET STATISTICS IO ON
     SELECT StartDate,
            EndDate,
            WeekDays = (DATEDIFF(dd, StartDate, EndDate) + 1)
                     - (DATEDIFF(wk, StartDate, EndDate) * 2)
                     - (CASE WHEN DATENAME(dw, StartDate) = 'Sunday' THEN 1 ELSE 0 END)
                     - (CASE WHEN DATENAME(dw, EndDate) = 'Saturday' THEN 1 ELSE 0 END)
       FROM dbo.JBMTest
    
    Table 'JBMTest'. Scan count 1, logical reads 3587, physical reads 0, read-ahead reads 0.
    SQL Server Execution Times:
       CPU time = 3453 ms,  elapsed time = 19144 ms.

    Method 2 - Embedded Set Based (for use when you CAN modify the table... obviously, my favorite for ease of use)

    Right... let's see you do this with a Calendar table

    --===== Add a calculated column to the table to solve for weekdays
      ALTER TABLE dbo.JBMTest
        ADD WEEKDAYS AS (DATEDIFF(dd, StartDate, EndDate) + 1)
                      - (DATEDIFF(wk, StartDate, EndDate) * 2)
                      - (CASE WHEN DATENAME(dw, StartDate) = 'Sunday' THEN 1 ELSE 0 END)
                      - (CASE WHEN DATENAME(dw, EndDate) = 'Saturday' THEN 1 ELSE 0 END)
    --===== Do the test
        SET STATISTICS TIME ON
        SET STATISTICS IO ON
     SELECT * FROM dbo.JBMTest
    Table 'JBMTest'. Scan count 1, logical reads 3587, physical reads 0, read-ahead reads 0.
    SQL Server Execution Times:
       CPU time = 3875 ms,  elapsed time = 21103 ms.

    So, Joe... when you say "This is the totally wrong approach", I'm thinking you're totally wrong

    Need any help "thinking in sets"?

    Serqiy ol' buddy... any idea how I can make the calculated column method deterministic on this one so we can slap an index on it if we need it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You know, I use to have on my local test server Ukrainian collation and German language as default settings.

    Just to make all locale dependant things like DATENAME(dw, StartDate) = 'Sunday' to fail immediately.

    I don't like strings. To the bone.

    This returns me "1" at the end:

    CREATE FUNCTION dbo.TestDateWorkDiff

    (@StartDate datetime, @EndDate datetime)

    RETURNS int

    WITH SCHEMABINDING

    AS

    BEGIN

    DECLARE @ZeroDate datetime

    SET @ZeroDate = CONVERT(datetime, '1900-01-01', 120)

    RETURN (DATEDIFF(dd, @StartDate, @EndDate) + 1)

    - (DATEDIFF(wk, @StartDate, @EndDate) * 2)

    - (CASE WHEN (datediff(dd, @ZeroDate, @StartDate)+1)%7 = 0 THEN 1 ELSE 0 END)

    - (CASE WHEN (datediff(dd, @ZeroDate, @EndDate)+1)%7 = 6 THEN 1 ELSE 0 END)

    END

    GO

    SELECT objectproperty(object_ID('dbo.TestDateWorkDiff'), 'ISDETERMINISTIC')

    ---------------------------

    So, This should work for your table:

    DD WEEKDAYS AS (DATEDIFF(dd, StartDate, EndDate) + 1)

    - (DATEDIFF(wk, StartDate, EndDate) * 2)

    - (CASE WHEN (datediff(dd, CONVERT(datetime, '1900-01-01', 120), StartDate)+1)%7 = 0 THEN 1 ELSE 0 END)

    - (CASE WHEN (datediff(dd, CONVERT(datetime, '1900-01-01', 120), EndDate)+1)%7 = 6 THEN 1 ELSE 0 END)

    But, as for me, using of the function in that column definition is preferable.

    _____________
    Code for TallyGenerator

  • Hi Jeff,

    The database table I am interrogating is called Object which includes fields called StartDATE and EndDATE which are both of data type Float 8.  The table has only one record in it at present with the Start and End dates of Wed 1st May 2002 and Fri 31st May 2002 respectively.  I am using SQL Server 2000 and have added the Start and End dates to the output for you.

    Using the code:

    SELECT StartDATE, EndDATE,

       (DATEDIFF(dd, StartDATE, EndDATE) + 1)

      -(DATEDIFF(wk, StartDATE, EndDATE) * 2)

      -(CASE WHEN DATENAME(dw, StartDATE) = 'Sunday' THEN 1 ELSE 0 END)

      -(CASE WHEN DATENAME(dw, EndDATE) = 'Saturday' THEN 1 ELSE 0 END)

    FROM Object

    I get the output:

    37377.0     37407.0     21 (which I believe should be 23)

    With date range Thu 2nd May to Fri 31st May I get:

    37378.0     37407.0     20 (1 day less than above, as expected)

    With date range Fri 3rd May to Fri 31st May I get:

    37379.0     37407.0     20 (no difference, which is odd)

    With date range Fri 3rd May to Thu 30th May I get:

    37379.0     37406.0     20 (no difference, which is odd)

    With date range Fri 3rd May to Wed 29th May I get:

    37379.0     37405.0     20 (no difference, which is odd)

    With date range Fri 3rd May to Tue 28th May I get:

    37379.0     27404.0     19 (1 day less than above)

    Very strange!

  • Dizzy,

    Cant you include explicit conversion to datetime in this test?

    And display converted dates in line with other values?

    _____________
    Code for TallyGenerator

  • Hi Sergiy,

    Not sure how to do this ... also, not sure of the value of doing this - what would it tell me?  The examples in my post are clearly very odd and I'm not sure that datetime will help.

    Please explain. Thanks

  • SELECT StartDATE, EndDATE,

    CONVERT(datetime, StartDATE) dtStartDate,

    CONVERT(datetime, EndDATE) dtEndDATE,

    (DATEDIFF(dd, CONVERT(datetime, StartDATE), CONVERT(datetime, EndDATE)) + 1)

    -(DATEDIFF(wk, CONVERT(datetime, StartDATE), CONVERT(datetime, EndDATE)) * 2)

    -(CASE WHEN DATENAME(dw, CONVERT(datetime, StartDATE)) = 'Sunday' THEN 1 ELSE 0 END)

    -(CASE WHEN DATENAME(dw, CONVERT(datetime, EndDATE)) = 'Saturday' THEN 1 ELSE 0 END)

    FROM Object

    _____________
    Code for TallyGenerator

  • In point of fact, I actually wrote a sketchy, cumbersome, client-side loop to "calculate" the date that materials needed to be ordered based on the date they were needed for a project and the lead time (as reported in the or database in whole weekdays) for the materials being ordered.  Unlike many of the nay-sayers herein, holidays (jewish, british, Rhode Islandese, or otherwise) are inconsequential for my purposes.

    So how 'bout it?  Can you come up with a function that takes a "Needed By" date and a "Lead Time" in weekdays, and returns an "Order By" date?

  • I was interested by the title since I've spent a large amount of time trying to get a working days calculated correctly in the past year. My requirements meant that I needed a solution which would handle public holidays and weekends for multiple countries. Additionally I needed to know what working day in a month any particular date was in any particular country (e.g. today 10th Sept 2007 is Work Day 6 in the UK, but only Work Day 5 in the US due to Labor Day).

    I went with a calendar table generated from a function which uses a holiday table and has definitions for eight types of holiday. These are:

    1. Fixed date every year.

    2. Days relative to Easter.

    3. Fixed date but will slide to next Monday if on a weekend

    4. Fixed date but slides to Monday if Saturday or Tuesday if Sunday (UK Boxing Day is the only one AFAIK).

    5. Specific day of week after a given date (usually first/last Monday in a month but can be other days, e.g. 1st Thursday after 11/22 = Thanksgiving)

    6. Days relative to Greek Orthodox Easter (not always the same as Western Easter)

    7. Fixed date in Hijri (Muslim) calender - this turns out to only be approximate due to the way the calendar works.

    8. Days relative to previous Winter Solstice! (Chinese holiday of Qing Ming Jie)

    As you can see, some of these are getting a bit esoteric (and inaccurate?).

    I also had to make allowance for the possibility that a weekend was not Saturday/Sunday. For example, I'm told that in Iran the weekend is just Friday!

    I still think the article raises good points about the specifics of the way functions like DATEDIFF work, but I'd also like to point out that the other thing that I discovered while trying to speed up the calculation is that it's very important to get calls to UDFs (if you use them) in the right place.

    I have a test file which returns the same results for 4 queries using a UDF which is basically a single query. The times are significantly different.

    1. Select using a WHERE date > UDF - about 2 minutes.

    2. Select using a JOIN to a subquery selecting from the UDF - about 2 minutes.

    3. Selecting the previous subquery into a temporary table, then doing the select joined to the temporary table - 0.4 seconds!

    4. As 2, but inserting the UDF's query directly into the subquery instead of calling it - 0.03 seconds!

    My conclusions:

    1. Avoid UDFs except when absolutely necessary if you need fast code.

    2.SQL server won't pick up optimizations which may seem obvious since, despite the fact that UDFs must be deterministic, it doesn't seem to make use of the fact and simply calls UDFs with the same arguments multiple times.

    Just my 2 cents worth...

    Derek.

    Derek

  • Hi Sergiy,

    Thanks for the code.  That has now shed some light on the matter.  With the start and end dates reset to 1st May and 31st May respectively, when I use your code it gives me the following output:

    37377.0     37407.0     2002-05-03 00:00:00.000     2002-06-02 00:00:00.000     21

    So for some reason although the application shows the dates as 1st to 31st May, the SQL Server is interpreting those as 3rd May and 2nd June respectively, which then makes the weekday count correct at 21.

    Why on earth would it being this?  Is there some setting in this particular db that has the base date (value 0) offset by 2 days for some reason?

    Thanks

    -----------

    Further information/...

    Having spoken to the application developers the problem I was having is attributed a "known" issue where there is a 2-day difference between the date functionality in the programming environment and the SQL date functionality.  I should therefore use CONVERT(datetime, StartDATE-2)

    Problem solved!  Thank you for helping me solve it.

  • Over the years I've written several variations of work day calculator functions in a variety of different languages. The earliest was written in FORTRAN. The latest was a SQL Server 2000 script that I wrote about a year ago.

    My method would use an offset date to calculate Days-from-offset. Simple math would calculate number of days between dates as well as easily calculating the number of weekends (to be removed from the count of days between dates).

    The offset date had to be a Monday, Jan. 1, making sure it was following a leap year. By knowing this date and the offset you can calculate leap-year(s), day of week and even include a holiday calendar.

    My 2 cents worth.

    Kurt

    DBA

    RHWI

    Poughkeepsie, NY

    Kurt W. Zimmerman
    SR DBA
    Lefrak Organization
    New York, NY

    http://www.linkedin.com/in/kurtwzimmerman

  • Joe, read my post on this subject. I agree... Julian dates work much easier than Gregorian dates.

    Kurt

    DBA

    RHWI

    Poughkeepsie, NY

    Kurt W. Zimmerman
    SR DBA
    Lefrak Organization
    New York, NY

    http://www.linkedin.com/in/kurtwzimmerman

  • > So for some reason although the application shows the dates as 1st to 31st May,

    VB6 has 2 days offset when converting floats to dates.

    Zero date in VB6 is '1899-12-30', not '1900-01-01' as in SQL Server.

    _____________
    Code for TallyGenerator

  • You really need to look at a good calendar for the year 2002...

    2002-05-03 is a Friday... count 1 day

    There are 4 full weeks after that up to 2005-05-31...  count 20 days

    2002-06-01 is a Saturday... count no days

    2002-06-02 is a Sunday... count no days

    Total week days = 21... the algorithm is correct.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 61 through 75 (of 156 total)

You must be logged in to reply to this topic. Login to reply