Function to find whether last days of the month is weekend

  • Jeff Moden (3/12/2012)


    For the most part, I agree but there are some grand exceptions. There are many places where T-SQL code will beat SQLCLR simply because the T-SQL doesn't have to go through an API. An example of such a thing is some simple cases of RegEx. There are also places where Scalar Functions will handily beat Inline code and iTVFs. A good example of such a thing would be a function to replace the first letter of every word with a capitalized version. Oddly enough, not only does the Scalar function win but it also uses a While Loop.

    Rules of thumb are great guides but they should only inspire, not necessarily restrict because for nearly every rule, there's an high performance exception. I've seen lot's of people miss out on some very high performance solutions because their "Rules of thumb" prevented them from even trying alternate ideas.

    Using RegEx? Haven't I said that CLR should be implemented properly?

    "replace the first letter of every word with a capitalized version" - wouldn't a tally-based splitter + concatenating back with FOR XML PATH beat a scalar function with a loop? I have no time to try, but have some strange feelings 🙂

    Totally agree with you on "Rules of thumb" sentence ...

    One more thing about "a function to replace the first letter of every word with a capitalized version". Cannot see, how it can be done in In-line SQL code if I do understand the requirement correctly: replace the first letter of every word in a text with a capitalized version. Actually, I would do it in CLR, or not do it at all - formatting is a client/UI job 😉

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/12/2012)


    Jeff Moden (3/12/2012)


    For the most part, I agree but there are some grand exceptions. There are many places where T-SQL code will beat SQLCLR simply because the T-SQL doesn't have to go through an API. An example of such a thing is some simple cases of RegEx. There are also places where Scalar Functions will handily beat Inline code and iTVFs. A good example of such a thing would be a function to replace the first letter of every word with a capitalized version. Oddly enough, not only does the Scalar function win but it also uses a While Loop.

    Rules of thumb are great guides but they should only inspire, not necessarily restrict because for nearly every rule, there's an high performance exception. I've seen lot's of people miss out on some very high performance solutions because their "Rules of thumb" prevented them from even trying alternate ideas.

    Using RegEx? Haven't I said that CLR should be implemented properly?

    "replace the first letter of every word with a capitalized version" - wouldn't a tally-based splitter + concatenating back with FOR XML PATH beat a scalar function with a loop? I have no time to try, but have some strange feelings 🙂

    Totally agree with you on "Rules of thumb" sentence ...

    One more thing about "a function to replace the first letter of every word with a capitalized version". Cannot see, how it can be done in In-line SQL code if I do understand the requirement correctly: replace the first letter of every word in a text with a capitalized version. Actually, I would do it in CLR, or not do it at all - formatting is a client/UI job 😉

    I recently wrote an article (publication due on march 23) on formatting/parsing dates in SQL Server and I found very interesting results on this topic.

    Parsing a string to find separators, whitespaces and contiguos date part placeholders is somehow similar to the above problem.

    I coded two CLR functions, a scalar UDF (with a WHILE loop) and three different flavors of iTVFs: the clear winner was the scalar CLR (by far), but the scalar UDF outperformed all the iTVFs.

    What I found really surprising was that the scalar CLR function was a lot faster that the new FORMAT function of SQL2012.

    -- Gianluca Sartori

  • Gianluca Sartori (3/12/2012)

    I recently wrote an article (publication due on march 23) on formatting/parsing dates in SQL Server and I found very interesting results on this topic.

    Parsing a string to find separators, whitespaces and contiguos date part placeholders is somehow similar to the above problem.

    I coded two CLR functions, a scalar UDF (with a WHILE loop) and three different flavors of iTVFs: the clear winner was the scalar CLR (by far), but the scalar UDF outperformed all the iTVFs.

    What I found really surprising was that the scalar CLR function was a lot faster that the new FORMAT function of SQL2012.

    That's why I've placed CLR on top of all SQL UDF's (iSVF's and iTVF's).

    I guess in some cases proper-done CLR can outperform some "in-line-SQL" manipulations as well, based on your observations, looks like it will be the case for SQL2012 in-build FORMAT, which most likely is implemented in C# anyway. However, I wouldn't hold for granted that it will be for any formatting cases...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/12/2012)


    ... SQL2012 in-build FORMAT, which most likely is implemented in C# anyway.

    It is.

    In fact, the first time you call it it's slightly slower than subsequent calls, because SQL Server has to load the assembly.

    -- Gianluca Sartori

  • As promised...

    First, here's one version of a Calendar Table that I've used in the past. It' complete with some typical U.S.A. holidays and the two special columns I've talked about. The documentation in the code almost makes an article itself.

    --===== Do this in a nice, safe place that everyone has

    -- because we're going to donditionally drop everything

    -- before we make it and most of it ISN'T temporary stuff.

    USE tempdb

    ;

    GO

    --===== Conditionally drop the Calendar table to make reruns in SSMS easier.

    IF OBJECT_ID('dbo.Calendar') IS NOT

    NULL DROP TABLE dbo.Calendar;

    ;

    go

    WITH

    cteGenDates AS

    (

    --===== Calculates all dates from 2000-01-01 up to and not including 2100-01-01

    SELECT TOP (DATEDIFF(dd,'2000','2100'))

    DT = DATEADD(dd,ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1,'2000')

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ),

    cteGenDateParts AS

    (

    --===== Calculates the rest of the basic columns that we can calculate so far

    -- as well as passing the calculated date along.

    SELECT DT,

    YY = DATEPART(yy,DT),

    MM = DATEPART(mm,DT),

    DD = DATEPART(dd,DT),

    DW = DATEDIFF(dd,0,DT)%7+1

    FROM cteGenDates

    ),

    cteGenDWMonth AS

    (

    --===== Calculates a special column that will make finding things like

    -- the 3rd Wednesday of a month a whole lot easier as well as

    -- passing the other columns we've calculated along.

    SELECT DT, YY, MM, DD, DW,

    DWMonth = ROW_NUMBER() OVER (PARTITION BY YY,MM,DW ORDER BY DT)

    FROM cteGenDateParts

    ),

    cteGenHolidays AS

    (

    --===== Calculate all holidays that don't depend on other dates.

    -- Some of the holidays are "fixed" dates like New Year's day.

    -- Others, like George Washington's Birthday and Thanksgiving occur

    -- on the Nth occurrence of a given weekday within a given month (DWMonth).

    -- Calculate the weekends, as well.

    SELECT DT, YY, MM, DD, DW, DWMonth,

    IsWeekEnd = CASE WHEN DW >= 6 THEN 1 ELSE 0 END, --"6" is Saturday, "7" is Sunday

    HolidayID =

    CASE

    WHEN MM = 1 AND DD = 1 THEN 1 --New Year's Day

    WHEN MM = 1 AND DW = 1 AND DWMonth = 3 THEN 2 --Martin Luther King, Jr. Birthday

    WHEN MM = 2 AND DW = 1 AND DWMonth = 2 THEN 3 --George Washington's Birthday

    WHEN MM = 5 AND DW = 1 AND DATEPART(mm,DATEADD(dd,7,DT)) = 6 THEN 4 --Memorial Day

    WHEN MM = 7 AND DD = 4 THEN 5 --Independence Day

    WHEN MM = 9 AND DW = 1 AND DWMonth = 1 THEN 6 --Labor Day

    WHEN MM = 10 AND DW = 1 AND DWMonth = 2 THEN 7 --Columbus Day

    WHEN MM = 11 AND DD = 11 THEN 8 --Veterans Day

    WHEN MM = 11 AND DW = 1 AND DWMonth = 4 THEN 9 --ThanksGivingDay

    WHEN MM = 12 AND DD = 25 THEN 10 --Christmas Day

    WHEN DT = DATEADD(dd,DATEDIFF(dd,0,CAST(DATENAME(yy,DT)

    + CASE YY % 19

    WHEN 0 THEN '0415'

    WHEN 1 THEN '0404'

    WHEN 2 THEN '0324'

    WHEN 3 THEN '0412'

    WHEN 4 THEN '0401'

    WHEN 5 THEN '0419'

    WHEN 6 THEN '0409'

    WHEN 7 THEN '0329'

    WHEN 8 THEN '0417'

    WHEN 9 THEN '0406'

    WHEN 10 THEN '0326'

    WHEN 11 THEN '0414'

    WHEN 12 THEN '0403'

    WHEN 13 THEN '0323'

    WHEN 14 THEN '0411'

    WHEN 15 THEN '0331'

    WHEN 16 THEN '0418'

    WHEN 17 THEN '0408'

    WHEN 18 THEN '0328'

    ELSE NULL

    END

    AS DATETIME))/7*7,6)

    THEN 11 --Easter Sunday

    ELSE 0

    END

    FROM cteGenDWMonth

    ),

    cteGenFMHolidays AS

    ( --=== Calculate Monday/Friday holidays where the holiday fell on a weekend

    SELECT DT, YY, MM, DD, DW, DWMonth, IsWeekEnd,

    HolidayID = CASE

    WHEN MM = 12 AND DD = 31 AND DW = 5 THEN 12 --Day before NewYears is Friday

    WHEN MM = 1 AND DD = 2 AND DW = 1 THEN 12 --Day after NewYears is a Monday

    WHEN MM = 7 AND DD = 3 AND DW = 5 THEN 13 --Day before Independence Day is Friday

    WHEN MM = 7 AND DD = 5 AND DW = 1 THEN 13 --Day after Independence is a Monday

    WHEN MM = 11 AND DD = 10 AND DW = 5 THEN 14 --Day before Veterans Day is a Friday

    WHEN MM = 11 AND DD = 12 AND DW = 1 THEN 14 --Day after Veterans Day is a Monday

    WHEN MM = 12 AND DD = 24 AND DW = 5 THEN 15 --Day before Christmas is Friday

    WHEN MM = 12 AND DD = 26 AND DW = 1 THEN 15 --Day after Christmas is a Monday

    ELSE HolidayID

    END

    FROM cteGenHolidays

    )

    --===== Make all of the columns NOT NULL and cast each of them

    -- to the smallest datatype possible without adding possible

    -- implicit conversions to the mix.

    -- We also add in two columns that we're not yet ready to fill in.

    SELECT DT = ISNULL(DT,0), --Already a DATETIME

    YY = ISNULL(CAST(YY AS SMALLINT),0),

    MM = ISNULL(CAST(MM AS TINYINT),0),

    DD = ISNULL(CAST(DD AS TINYINT),0),

    DW = ISNULL(CAST(DW AS TINYINT),0),

    DWMonth = ISNULL(CAST(DWMonth AS TINYINT),0),

    IsWeekEnd = ISNULL(CAST(IsWeekEnd AS TINYINT),0),

    HolidayID = ISNULL(CAST(HolidayID AS TINYINT),0),

    IsWorkDay = ISNULL(CAST(0 AS TINYINT),0),

    WorkDayNum = ISNULL(CAST(0 AS INT),0) --Just to be sure

    INTO dbo.Calendar

    FROM cteGenHolidays

    ;

    --===== Update Good Friday and the Friday after Thanksgiving

    UPDATE tgt

    SET HolidayID = d.HolidayID

    FROM dbo.Calendar tgt

    INNER JOIN

    ( --=== Find dates and holiday IDs for Thanksgiving and Easter Sunday

    -- and translate them to dates for Day after Thanksgiving and Good Friday

    SELECT DT = CASE

    WHEN HolidayID = 9 THEN DATEADD(dd, 1,DT) --Day after Thanksgiving

    WHEN HolidayID = 11 THEN DATEADD(dd,-2,DT) --Good Friday

    END,

    HolidayID = CASE

    WHEN HolidayID = 9 THEN 16 --Day after Thanksgiving

    WHEN HolidayID = 11 THEN 17 --Good Friday

    END

    FROM dbo.Calendar

    WHERE HolidayID IN (9,11) --Thanksgiving, Easter

    ) d

    ON tgt.DT = d.DT

    ;

    --===== Update IsWorkday based on weekends and holidays

    UPDATE dbo.Calendar

    SET IsWorkday = CASE

    WHEN IsWeekend = 0 --Isn't a weekend day

    AND HolidayID IN (0,17) --Isn't a holiday or is Good Friday

    THEN 1

    ELSE 0

    END

    ;

    -------------------------------------------------------------

    --===== Add the quintessential clustered index as a PK

    ALTER TABLE dbo.Calendar

    ADD CONSTRAINT PK_Calendar

    PRIMARY KEY CLUSTERED (DT ASC)

    WITH FILLFACTOR = 100

    ;

    --===== Number the work days making the weekends and holidays

    -- have the same workday as the last previous workday.

    DECLARE @Counter INT,

    @Anchor DATETIME,

    @WorkDayNum INT

    ;

    SELECT @Counter = 1,

    @WorkDayNum = 0

    ;

    WITH

    cteQuirkyUpdate AS

    (

    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY DT),

    DT,

    WorkDayNum,

    IsWorkDay

    FROM dbo.Calendar tgt WITH (TABLOCKX)

    )

    UPDATE tgt

    SET @WorkDayNum = WorkDayNum

    = CASE

    WHEN RowNum = @Counter

    THEN @WorkDayNum + IsWorkDay

    ELSE 1/0 --Forces error if rows get out of sync

    END,

    @Anchor = DT,

    @Counter = @Counter + 1

    FROM cteQuirkyUpdate tgt

    OPTION (MAXDOP 1)

    ;

    GO

    --===== Add another very important index for "workday" related searches.

    CREATE UNIQUE INDEX ix_Calendar_WorkDayNum

    ON dbo.Calendar (WorkDayNum ASC, DT DESC)

    WITH FILLFACTOR = 100

    ;

    GO

    SELECT * FROM dbo.Calendar

    ;

    Next, here's some of the fast (and correct) functions from this thread. The "Hybrid" function is new and not only demonstrates the WorkDayNum column, it also uses conditional logic to prevent looking up more than 1 row if that row is actually a work day.

    The first function uses only calculations. It's short coming is that it doesn't observe holidays where the other 3 do. The second function is the one Graham posted with a slight modification to make it work with this particular Calendar Table. The 3rd one is a modification of the second one. It uses TOP with a sort to find the "MIN" workday instead of actually using MIN. The 4th one is the "Hybrid" I spoke of. It's a bit slower than the pure caclulation function but comparatively not by much.

    Again, the code is heavily documented.

    --===== Do this in a nice, safe place that everyone has

    -- because we're going to donditionally drop everything

    -- before we make it and most of it ISN'T temporary stuff.

    USE tempdb

    ;

    GO

    --===== Conditionally drop all of the functions to make reruns in SSMS easier.

    IF OBJECT_ID('dbo.FirstWorkDayAfterPrevMonthCALC') IS NOT

    NULL DROP FUNCTION dbo.FirstWorkDayAfterPrevMonthCALC;

    IF OBJECT_ID('dbo.FirstWorkDayAfterPrevMonthMIN') IS NOT

    NULL DROP FUNCTION dbo.FirstWorkDayAfterPrevMonthMIN;

    IF OBJECT_ID('dbo.FirstWorkDayAfterPrevMonthTOP') IS NOT

    NULL DROP FUNCTION dbo.FirstWorkDayAfterPrevMonthTOP;

    IF OBJECT_ID('dbo.FirstWorkDayAfterPrevMonthHYBRID') IS NOT

    NULL DROP FUNCTION dbo.FirstWorkDayAfterPrevMonthHYBRID;

    GO

    --===== This function uses ONLY calculations.

    -- It does NOT refer to a calendar table.

    -- The problem with this function is that it is

    -- totally blind to holidays.

    CREATE FUNCTION dbo.FirstWorkDayAfterPrevMonthCALC

    (@pDateTime DATETIME)

    RETURNS TABLE AS

    RETURN

    WITH

    ctePrevMonthEnd AS

    (SELECT PrevMonthEnd = DATEADD(dd,-1,DATEADD(mm,DATEDIFF(mm,0,@pDateTime),0)))

    SELECT FirstWorkDayAfterPrevMonth =

    DATEADD(dd,

    CASE DATEDIFF(dd,0,PrevMonthEnd)%7

    WHEN 5 THEN 2

    WHEN 6 THEN 1

    ELSE 0

    END,

    PrevMonthEnd)

    FROM ctePrevMonthEnd

    ;

    GO

    --===== This function is similar to the one Graham made.

    -- The only functional modifications made were to make it

    -- compatible with the Calendar Table created by

    -- the attached code.

    CREATE FUNCTION dbo.FirstWorkDayAfterPrevMonthMIN

    (@pDateTime DATETIME)

    RETURNS TABLE AS

    RETURN

    SELECT FirstWorkDayAfterPrevMonth = MIN(DT)

    FROM dbo.Calendar

    WHERE (IsWorkDay = 1)

    AND (DT >= (DATEADD(mm, DATEDIFF(mm, 0, @pDateTime), 0)) -1)

    -- calendar date greater than or equal to last day of previous month

    GO

    --===== This function is nearly identical to Graham's code

    -- except it has been optimized by using TOP

    -- instead of MIN which removes an aggregate from

    -- the execution plan.

    CREATE FUNCTION dbo.FirstWorkDayAfterPrevMonthTOP

    (@pDateTime DATETIME)

    RETURNS TABLE AS

    RETURN

    SELECT TOP 1

    FirstWorkDayAfterPrevMonth = DT

    FROM dbo.Calendar

    WHERE IsWorkDay = 1

    AND DT >= DATEADD(dd,-1,DATEADD(mm, DATEDIFF(mm, 0, @pDateTime), 0))

    ORDER BY DT

    -- calendar date greater than or equal to last day of previous month

    GO

    --===== This function is a hybrid between the "calculation only"

    -- function and the lookup for a "MIN" first workday after

    -- the desired date except that it uses the lookup ONLY

    -- if the original date found was a non-workday.

    -- It also uses a special "WorkDayNum" column which is

    -- nothing more than a running count of workdays.

    -- Non-workdays will always have the same "WorkDayNum"

    -- as the last previous workday. It makes for some

    -- VERY fast lookups when workdays (business days) are

    -- part of the lookup.

    CREATE FUNCTION dbo.FirstWorkDayAfterPrevMonthHYBRID

    (@pDateTime DATETIME)

    RETURNS TABLE AS

    RETURN

    SELECT FirstWorkDayAfterPrevMonth =

    CASE

    WHEN c1.IsWorkDay = 1

    THEN c1.DT

    ELSE (SELECT TOP 1 DT

    FROM dbo.Calendar c2

    WHERE c2.WorkDayNum = c1.WorkDayNum + 1

    ORDER BY c2.DT DESC)

    END

    FROM dbo.Calendar c1

    WHERE c1.DT = DATEADD(dd,-1,DATEADD(mm,DATEDIFF(mm,0,@pDateTime),0))

    ;

    GO

    As most of the alumni on this forum will tell you, I believe that testing should usually be done on at least a million rows. With that idea in mind, this creates a narrow million row test table consisting of only whole (midnight time) dates just to keep it simple.

    --===== Conditionally drop the test table to make reruns easier in SSMS

    IF OBJECT_ID('tempdb..#ListOfDates','U') IS NOT NULL

    DROP TABLE #ListOfDates

    ;

    --===== Create the test table. The only thing it contains

    -- is a list of randomized "Whole" dates.

    SELECT TOP 1000000

    SomeDate = ABS(CHECKSUM(NEWID())) % DATEDIFF(dd,'2010','2020') + CAST('2010' AS DATETIME)

    INTO #ListOfDates

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    GO

    I ran the following test code several times to allow the Calendar table and the fuctions to cache like they would in the real world. Then I ran it one final time to get the timings using SQL Profiler (I no longer trust SET STATISTICS). Note that each function is executed 3 times in succession. Also note that I simply dump the results of each function into a variable so we're not getting tangled up with any display or I/O times that aren't a part of the test.

    --=============================================================================

    -- The following tests dump their output to a variable

    -- to take display and I/O time of the results out of

    -- the picture so we can measure just the performance

    -- of the actual functions.

    GO

    --===== Calculations only ====================================================='

    DECLARE @Bitbucket DATETIME;

    SELECT @Bitbucket = firstday.FirstWorkDayAfterPrevMonth

    FROM #ListOfDates

    CROSS APPLY dbo.FirstWorkDayAfterPrevMonthCALC(SomeDate) firstday

    ;

    GO 3

    --===== Calendar table with MIN ==============================================='

    DECLARE @Bitbucket DATETIME;

    SELECT @Bitbucket = firstday.FirstWorkDayAfterPrevMonth

    FROM #ListOfDates

    CROSS APPLY dbo.FirstWorkDayAfterPrevMonthMIN(SomeDate) firstday

    ;

    GO 3

    --===== Calendar table with TOP/ORDER BY ======================================'

    DECLARE @Bitbucket DATETIME;

    SELECT @Bitbucket = firstday.FirstWorkDayAfterPrevMonth

    FROM #ListOfDates

    CROSS APPLY dbo.FirstWorkDayAfterPrevMonthTOP(SomeDate) firstday

    ;

    GO 3

    --===== Calendar table with HYBRID lookup ====================================='

    DECLARE @Bitbucket DATETIME;

    SELECT @Bitbucket = firstday.FirstWorkDayAfterPrevMonth

    FROM #ListOfDates

    CROSS APPLY dbo.FirstWorkDayAfterPrevMonthHYBRID(SomeDate) firstday

    ;

    GO 3

    Here are the results from SQL Profiler. The results are pretty easy to read. I'll let you be the judge as to whether or not the WorkDayNum table is worth its salt or not.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 31 through 34 (of 34 total)

You must be logged in to reply to this topic. Login to reply