Is there a Coalesce LIke function that returns the lowest/highest value of a set of params passed to it

  • The Dixie Flatline (10/23/2012)


    p.s. I'm pretty sure that the presence of any NULL dates are going to throw a wrench into the gears.

    Agreed. One fix would be to add COALESCE or ISNULL expressions to work with the date parameters in order to force NULLS to high dates when calculating minimums, and low when calculating maximums.

    P.S. Thanks for running the time trials, Luis. Would you run it again on your machine using Jeff's function? I wouldn't expect any overhead from CROSS APPLYing an itvf, but it would still be nice to see the final results. 🙂

    OK, I've run it again and added NULLs to the test. As expected, my query that was later transformed into a function had to be fixed to show the correct results. With that fix, there was an important change in the time results. However, I wasn't sure on the best way to fix Dixie's V2 so I didn't include it in the test.

    I'm attaching the functions scripts.

    The results on miliseconds for the million row test is here.

    Dixie's V1 and Eugenes are basically the same but they differ on how they construct the table.

    --Average after 50 runs

    Dixie_V1 Luis Eugene

    ----------- ----------- -----------

    570 638 572

    --Detail of the 50 runs

    Dixie_V1 Luis Eugene

    ----------- ----------- -----------

    573 647 590

    571 647 581

    561 643 559

    559 643 599

    558 634 563

    563 638 559

    554 631 561

    562 631 567

    582 631 559

    556 630 566

    563 633 557

    558 629 559

    576 630 561

    560 634 575

    562 634 567

    559 636 588

    565 656 565

    554 637 562

    567 634 580

    566 630 561

    565 654 618

    574 658 588

    619 649 626

    603 651 569

    577 637 569

    605 646 588

    567 636 560

    580 642 559

    600 659 618

    637 642 582

    568 637 573

    571 640 570

    565 637 564

    568 660 566

    561 631 563

    565 632 562

    567 633 554

    565 634 617

    566 632 587

    562 631 565

    567 630 568

    564 637 571

    586 630 565

    563 642 561

    563 631 566

    571 636 561

    570 634 573

    571 633 564

    560 630 560

    564 634 567

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Another possible solution:

    USE ProofOfConcept;

    SET NOCOUNT ON;

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    CREATE TABLE #T

    (ID INT IDENTITY

    PRIMARY KEY,

    Col1 DATE,

    Col2 DATE,

    Col3 DATE,

    Col4 DATE);

    WITH Seeds(Seed)

    AS (SELECT *

    FROM ( VALUES ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1),

    ( 1), ( 1), ( 1) ) AS TVC (Seed)),

    Numbers(Number)

    AS (SELECT ROW_NUMBER() OVER (ORDER BY S1.Seed)

    FROM Seeds AS S1

    CROSS JOIN Seeds AS S2

    CROSS JOIN Seeds AS S3

    CROSS JOIN Seeds AS S4

    CROSS JOIN Seeds AS S5

    CROSS JOIN Seeds AS S6)

    INSERT INTO #T

    (Col1,

    Col2,

    Col3,

    Col4)

    SELECT DATEADD(DAY, CHECKSUM(NEWID()) % 10000, 0),

    DATEADD(DAY, CHECKSUM(NEWID()) % 10000, 0),

    DATEADD(DAY, CHECKSUM(NEWID()) % 10000, 0),

    DATEADD(DAY, CHECKSUM(NEWID()) % 10000, 0)

    FROM Numbers;

    WITH CTE

    AS (SELECT TOP 10000

    *

    FROM #T

    ORDER BY NEWID())

    UPDATE CTE

    SET Col1 = NULL;

    WITH CTE

    AS (SELECT TOP 10000

    *

    FROM #T

    ORDER BY NEWID())

    UPDATE CTE

    SET Col2 = NULL;

    WITH CTE

    AS (SELECT TOP 10000

    *

    FROM #T

    ORDER BY NEWID())

    UPDATE CTE

    SET Col3 = NULL;

    WITH CTE

    AS (SELECT TOP 10000

    *

    FROM #T

    ORDER BY NEWID())

    UPDATE CTE

    SET Col4 = NULL;

    GO

    IF OBJECT_ID(N'tempdb..#T2') IS NOT NULL

    DROP TABLE #T2;

    IF OBJECT_ID(N'tempdb..#T3') IS NOT NULL

    DROP TABLE #T3;

    DECLARE @Start DATETIME = GETDATE();

    SELECT ID,

    MAX([Date]) AS MaxDate

    INTO #T2

    FROM (SELECT *

    FROM #T UNPIVOT ( [Date] FOR Col IN (Col1, Col2, Col3, Col4) ) AS unpvt)

    AS Src

    GROUP BY ID;

    SELECT DATEDIFF(millisecond, @Start, GETDATE());

    SET @Start = GETDATE();

    SELECT ID,

    MaxDate

    INTO #T3

    FROM #T

    CROSS APPLY (SELECT MaxDate

    FROM dbo.FindMinMaxDate4(Col1, Col2, Col3, Col4)) AS MinMax;

    SELECT DATEDIFF(millisecond, @Start, GETDATE());

    GO 5

    (Includes full test harness and test data.)

    UDF modified as follows:

    USE [ProofOfConcept]

    GO

    /****** Object: UserDefinedFunction [dbo].[FindMinMaxDate4] Script Date: 10/24/2012 14:24:13 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE FUNCTION [dbo].[FindMinMaxDate4]

    (@pDateTime1 DATETIME,

    @pDateTime2 DATETIME,

    @pDateTime3 DATETIME,

    @pDateTime4 DATETIME)

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    SELECT MinDate = CASE WHEN @pDateTime1 <= COALESCE(@pDateTime2, @pDateTime3, @pDateTime4, @pDateTime1)

    AND @pDateTime1 <= COALESCE(@pDateTime3, @pDateTime4, @pDateTime1)

    AND @pDateTime1 <= COALESCE(@pDateTime4, @pDateTime1)

    THEN @pDateTime1

    WHEN @pDateTime2 <= COALESCE(@pDateTime3, @pDateTime4, @pDateTime2)

    AND @pDateTime2 <= COALESCE(@pDateTime4, @pDateTime2)

    THEN @pDateTime2

    WHEN @pDateTime3 <= COALESCE(@pDateTime4, @pDateTime3) THEN @pDateTime3

    ELSE @pDateTime4

    END,

    MaxDate = CASE WHEN @pDateTime1 >= COALESCE(@pDateTime2, @pDateTime3, @pDateTime4, @pDateTime1)

    AND @pDateTime1 >= COALESCE(@pDateTime3, @pDateTime4, @pDateTime1)

    AND @pDateTime1 >= COALESCE(@pDateTime4, @pDateTime1)

    THEN @pDateTime1

    WHEN @pDateTime2 >= COALESCE(@pDateTime3, @pDateTime4, @pDateTime2)

    AND @pDateTime2 >= COALESCE(@pDateTime4, @pDateTime2)

    THEN @pDateTime2

    WHEN @pDateTime3 >= COALESCE(@pDateTime4, @pDateTime3) THEN @pDateTime3

    ELSE @pDateTime4

    END;

    GO

    Results on my computer (high-end workstation):

    Unpivot vs UDF

    993 vs 1326

    1006 vs 1016

    946 vs 910

    1006 vs 1020

    1040 vs 1043

    The UDF was slower 3 out of 5, but marginally so.

    When I added (to the bottom of the test, before "GO 5"):

    CHECKPOINT;

    DBCC DROPCLEANBUFFERS;

    Unpivot vs UDF

    910 vs 990

    980 vs 1093

    983 vs 1100

    953 vs 1080

    986 vs 1120

    So, if you can't count on the data being in the buffer already, then the Unpivot option is faster 5 for 5. Again, that's a fraction of a second for 1-million rows of data, which is marginal at best.

    All my tests were run on a quad-core CPU with 16 Gig RAM, @@VERSION = Microsoft SQL Server 2008 R2 (SP1) - 10.50.2550.0 (X64) Jun 11 2012 16:41:53 Copyright (c) Microsoft Corporation Developer Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1).

    I used this to determine whether both were returning the same results:

    SELECT COUNT(*)

    FROM #T2 AS T2

    FULL OUTER JOIN #T3 AS T3

    ON T2.ID = T3.ID

    WHERE T2.ID IS NULL

    OR T3.ID IS NULL

    OR T2.MaxDate != T3.MaxDate;

    Count = 0

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks for all the replies. In case it make a doifference heres the tables DDL . Its a very simple table.

    The thing is a queue of reports called requests. There are 4 date/Time fields indicating the date/Time that the request first entered the queue, when it is scheduled to be processed when it actually started and when its done being processed. Sometimes the Scheduled column is equal to EnteredQueue while at other times its later then Enteredqueue and in some cases its before. I don’t know why it sometimes does this only that it does.

    When the Scheduled date/Time is earlier then the EnteredQueue date/time it messes up my calculations I run on the queue that determine how long on average a request waited in the queue and how long it took to process. I was looking for an easy way to return the lowest or highest value of 2 or more columns in the same row. I can Case it out but I try to avoid Case when possible just out of habit.

    Thanks again

    CREATE TABLE dbo.LE_QUEUE

    (

    hID NUMERIC(18,0) IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,

    RequestedBy VARCHAR(50) NOT NULL,

    Status INT NOT NULL,

    EnteredQueue DATETIME NOT NULL,

    Scheduled DATETIME NOT NULL,

    RequestStart DATETIME NULL,

    RequestEnd DATETIME NULL,

    hRequestDetail NUMERIC(18,0) NOT NULL

    CONSTRAINT PK_LE_QUEUE PRIMARY KEY CLUSTERED ( hID ASC )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    )

    ON [PRIMARY]

    Kindest Regards,

    Just say No to Facebook!
  • GSquared (10/24/2012)


    Another possible solution:

    USE ProofOfConcept;

    SET NOCOUNT ON;.... OR T2.MaxDate != T3.MaxDate;

    Count = 0

    I just went to test out the ProofOfConcept UDf you provided and I realized that I don't see how I can use it so I'm hoping you can point out what it is that I'm not seeing.

    In my prior post I listed the DDL for the table I'm working with which has 4 date columns. What I'm trying to do is to return the Max date of 2 dateTime columns in a table. If I use the UDF I don't see how it can work.

    Its not that I'm not appreciative for the help but I just don't see hwo it can bd used. What is it I'm not seeing with the UDF you provded?

    Thanks

    Kindest Regards,

    Just say No to Facebook!
  • You originally were talking about four dates, so that's how we wrote the functions.

    Inline table valued functions can be implemented with a CROSS APPLY, like so.

    declare @datetimes table (ID int identity(1,1), date1 datetime, date2 datetime, date3 datetime, date4 datetime)

    insert into @datetimes

    select '11/03/2011','10/01/2012','01/01/2012','12/22/2011'

    union all

    select '11/03/2011','10/01/2012','01/01/2011','12/22/2012'

    select ID,ca.*

    from @datetimes

    cross apply dbo.HiLoDate2(date1,date2,date3,date4) ca

    Edited to add_ a second row of data with an ID column to the example.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Or did you mean to do something like this?

    declare @datetimes table (ID int identity(1,1), date1 datetime, date2 datetime)

    insert into @datetimes

    select '11/03/2011','12/01/2012'

    union all

    select '11/03/2011','10/01/2011'

    select MAX(case when Date1 >= Date2 then date1 else date2 end) as LatestDate

    from @datetimes

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • YSLGuru (10/26/2012)


    GSquared (10/24/2012)


    Another possible solution:

    USE ProofOfConcept;

    SET NOCOUNT ON;.... OR T2.MaxDate != T3.MaxDate;

    Count = 0

    I just went to test out the ProofOfConcept UDf you provided and I realized that I don't see how I can use it so I'm hoping you can point out what it is that I'm not seeing.

    In my prior post I listed the DDL for the table I'm working with which has 4 date columns. What I'm trying to do is to return the Max date of 2 dateTime columns in a table. If I use the UDF I don't see how it can work.

    Its not that I'm not appreciative for the help but I just don't see hwo it can bd used. What is it I'm not seeing with the UDF you provded?

    Thanks

    I didn't provide the UDF. I just modified it so it would work with NULL inputs.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • YSLGuru (10/24/2012)


    Thanks for all the replies. In case it make a doifference heres the tables DDL . Its a very simple table.

    The thing is a queue of reports called requests. There are 4 date/Time fields indicating the date/Time that the request first entered the queue, when it is scheduled to be processed when it actually started and when its done being processed. Sometimes the Scheduled column is equal to EnteredQueue while at other times its later then Enteredqueue and in some cases its before. I don’t know why it sometimes does this only that it does.

    When the Scheduled date/Time is earlier then the EnteredQueue date/time it messes up my calculations I run on the queue that determine how long on average a request waited in the queue and how long it took to process. I was looking for an easy way to return the lowest or highest value of 2 or more columns in the same row. I can Case it out but I try to avoid Case when possible just out of habit.

    Thanks again

    CREATE TABLE dbo.LE_QUEUE

    (

    hID NUMERIC(18,0) IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,

    RequestedBy VARCHAR(50) NOT NULL,

    Status INT NOT NULL,

    EnteredQueue DATETIME NOT NULL,

    Scheduled DATETIME NOT NULL,

    RequestStart DATETIME NULL,

    RequestEnd DATETIME NULL,

    hRequestDetail NUMERIC(18,0) NOT NULL

    CONSTRAINT PK_LE_QUEUE PRIMARY KEY CLUSTERED ( hID ASC )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    )

    ON [PRIMARY]

    You're just comparing two columns to see which is earliest? Just use a Case statement and get it done. No reason to avoid them, Case isn't like cursors.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (10/29/2012)


    YSLGuru (10/24/2012)


    Thanks for all the replies. In case it make a doifference heres the tables DDL . Its a very simple table.

    The thing is a queue of reports called requests. There are 4 date/Time fields indicating the date/Time that the request first entered the queue, when it is scheduled to be processed when it actually started and when its done being processed. Sometimes the Scheduled column is equal to EnteredQueue while at other times its later then Enteredqueue and in some cases its before. I don’t know why it sometimes does this only that it does.

    When the Scheduled date/Time is earlier then the EnteredQueue date/time it messes up my calculations I run on the queue that determine how long on average a request waited in the queue and how long it took to process. I was looking for an easy way to return the lowest or highest value of 2 or more columns in the same row. I can Case it out but I try to avoid Case when possible just out of habit.

    Thanks again

    CREATE TABLE dbo.LE_QUEUE

    (

    hID NUMERIC(18,0) IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,

    RequestedBy VARCHAR(50) NOT NULL,

    Status INT NOT NULL,

    EnteredQueue DATETIME NOT NULL,

    Scheduled DATETIME NOT NULL,

    RequestStart DATETIME NULL,

    RequestEnd DATETIME NULL,

    hRequestDetail NUMERIC(18,0) NOT NULL

    CONSTRAINT PK_LE_QUEUE PRIMARY KEY CLUSTERED ( hID ASC )

    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    )

    ON [PRIMARY]

    You're just comparing two columns to see which is earliest? Just use a Case statement and get it done. No reason to avoid them, Case isn't like cursors.

    Thats exactly what I was going for. I originally mentioned 4 date columns as thats how many date/time columns the table has and I was orginally told I would need to look at all 4. As things proressed we eliminated the need to check all 4, narrowing it down to just 2 columns. I have a Case statemtn now but Iwas checking to see if there was something better then CASE. While it may be just 2 columns its millions of rows of data in a top heavy table.

    Thanks

    Kindest Regards,

    Just say No to Facebook!

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply