Very complex SQL Server logic to return multiple rows in a query even if there is only a single row of data, based on report date---will explain in post

  • matter2003 - Friday, March 22, 2019 12:13 PM

    drew.allen - Friday, March 22, 2019 12:04 PM

    matter2003 - Friday, March 22, 2019 11:03 AM

    drew.allen - Friday, March 22, 2019 9:06 AM

    Part of the problem is that your UDF is a scalar function (not a table-valued function) given the way that you are calling it.  I've substituted a Table Value Constructor for your function.
    PS: Scalar functions and table-valued functions are mutually exclusive.  You cannot have a scalar table-valued function.

    As Lynn mentioned, your sample data only covers one case.  I added a second case where the case had not yet been resolved.

    I believe that this gives you what you need.  It is very loosely based on work by Itzik Ben-Gan on packing intervals.

    WITH Case_Statuses AS
    (
        SELECT
            o.Case_ID
        ,    o.Contact_Date
        ,    o.Last_Updated_On AS Update_Status_Begin_Date
        ,    LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt
        FROM #Open_Cases o
        LEFT OUTER JOIN #Resolved_Cases r
            ON o.Case_ID = r.Case_ID
    )

    SELECT *
    FROM Case_Statuses cs
    CROSS APPLY ( VALUES(1), (2), (3), (4), (5), (6) ) CalcAge(Age)
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)
    WHERE cs.Update_Status_Begin_Date <= rd.Report_Date
        AND rd.Report_Date < cs.Update_Status_End_Dt
        AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Like the other post, the date selection range needs to be able to handle a dynamic date range...this date range will change, it is not going to remain the same and will vary by report and/or user entered values...is there a way to make the date range dynamic? (with the caveat it will skip over weekend and holidays and not include those---I can make a version of my calc age function that will return a date perhaps if it is not a weekend/holiday?)

    Drew, this solution works when I make the following change(minus the static date range):

      LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date + 1, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt

    However, I am not sure what #Open_Cases o and #Resolved_Cases r refer to(is this a temp table)?  If so, it was throwing an error...when I changed it to the Open_Cases and Resolved_Cases(the actual physical tables), it created duplicate rows in there(that somehow now can't be deleted??), which is not what I want...

    Any advice on what to do with that issue?

    Other than that it looks like the solution works properly...

    The problem here is that you didn't try to understand what the code is doing before deploying it to your own environment.  Yes, I used temp tables.  I don't want to create permanent tables on my database just to help you with a problem, so I converted all of the permanent tables to temp tables.  I didn't think it was necessary to explicitly tell you to change them back.  So, for future reference, you'll need to change these temp tables back to your own permanent tables.

    To make the dates dynamic you'll need to replace these two lines of code
    /*  First line to replace.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second line to replace. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    with the following code.
    /*  First replacement line.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second replacement line. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Drew,

    Apologies, I actually edited the initial reply to your first post that I had solved the temp table thing.  I also added in "Drop If Exists" logic to it as well so it will recreate the temp tables when it runs.

    I'm a little confused...it appears the original code is the same as the replacement code? 

    From what I am testing, it appears the following line is what is controlling the number of days that return:

    CROSS APPLY ( VALUES(1), (2), (3), (4), (5) ) CalcAge(Age)

    When I add in (6), (7), (8), etc it continues to return an additional day for each one added

    Exactly.  The solution I provided is only meant to guide you in the correct direction.  It is not meant to be the ULTIMATE answer that you can just cut and paste into your code.  You should never, ever simply paste code that you got from the Internet into your code.  You should always understand what it is doing first.

    I also saw your edit after I had posted.  You do not need to do a SELECT INTO to create the temp tables.  You should be using the permanent tables directly.  I only created the temp tables, because it makes it easier for me to clean up when I'm done coding the solution.  I certainly don't need permanent copies of every single set of sample data that I work with.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Friday, March 22, 2019 12:23 PM

    matter2003 - Friday, March 22, 2019 12:13 PM

    drew.allen - Friday, March 22, 2019 12:04 PM

    matter2003 - Friday, March 22, 2019 11:03 AM

    drew.allen - Friday, March 22, 2019 9:06 AM

    Part of the problem is that your UDF is a scalar function (not a table-valued function) given the way that you are calling it.  I've substituted a Table Value Constructor for your function.
    PS: Scalar functions and table-valued functions are mutually exclusive.  You cannot have a scalar table-valued function.

    As Lynn mentioned, your sample data only covers one case.  I added a second case where the case had not yet been resolved.

    I believe that this gives you what you need.  It is very loosely based on work by Itzik Ben-Gan on packing intervals.

    WITH Case_Statuses AS
    (
        SELECT
            o.Case_ID
        ,    o.Contact_Date
        ,    o.Last_Updated_On AS Update_Status_Begin_Date
        ,    LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt
        FROM #Open_Cases o
        LEFT OUTER JOIN #Resolved_Cases r
            ON o.Case_ID = r.Case_ID
    )

    SELECT *
    FROM Case_Statuses cs
    CROSS APPLY ( VALUES(1), (2), (3), (4), (5), (6) ) CalcAge(Age)
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)
    WHERE cs.Update_Status_Begin_Date <= rd.Report_Date
        AND rd.Report_Date < cs.Update_Status_End_Dt
        AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Like the other post, the date selection range needs to be able to handle a dynamic date range...this date range will change, it is not going to remain the same and will vary by report and/or user entered values...is there a way to make the date range dynamic? (with the caveat it will skip over weekend and holidays and not include those---I can make a version of my calc age function that will return a date perhaps if it is not a weekend/holiday?)

    Drew, this solution works when I make the following change(minus the static date range):

      LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date + 1, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt

    However, I am not sure what #Open_Cases o and #Resolved_Cases r refer to(is this a temp table)?  If so, it was throwing an error...when I changed it to the Open_Cases and Resolved_Cases(the actual physical tables), it created duplicate rows in there(that somehow now can't be deleted??), which is not what I want...

    Any advice on what to do with that issue?

    Other than that it looks like the solution works properly...

    The problem here is that you didn't try to understand what the code is doing before deploying it to your own environment.  Yes, I used temp tables.  I don't want to create permanent tables on my database just to help you with a problem, so I converted all of the permanent tables to temp tables.  I didn't think it was necessary to explicitly tell you to change them back.  So, for future reference, you'll need to change these temp tables back to your own permanent tables.

    To make the dates dynamic you'll need to replace these two lines of code
    /*  First line to replace.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second line to replace. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    with the following code.
    /*  First replacement line.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second replacement line. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Drew,

    Apologies, I actually edited the initial reply to your first post that I had solved the temp table thing.  I also added in "Drop If Exists" logic to it as well so it will recreate the temp tables when it runs.

    I'm a little confused...it appears the original code is the same as the replacement code? 

    From what I am testing, it appears the following line is what is controlling the number of days that return:

    CROSS APPLY ( VALUES(1), (2), (3), (4), (5) ) CalcAge(Age)

    When I add in (6), (7), (8), etc it continues to return an additional day for each one added

    Exactly.  The solution I provided is only meant to guide you in the correct direction.  It is not meant to be the ULTIMATE answer that you can just cut and paste into your code.  You should never, ever simply paste code that you got from the Internet into your code.  You should always understand what it is doing first.

    I also saw your edit after I had posted.  You do not need to do a SELECT INTO to create the temp tables.  You should be using the permanent tables directly.  I only created the temp tables, because it makes it easier for me to clean up when I'm done coding the solution.  I certainly don't need permanent copies of every single set of sample data that I work with.

    Drew

    Fair Enough...I have created a table function that will return every date in a date range outside of weekends and holidays, now just need to add it to the cross apply and hopefully it works...

    UPDATE: IT WORKS!

    --CROSS APPLY ( VALUES(1), (2), (3), (4), (5)) CalcAge(Age)

    CROSS APPLY ((SELECT RetVal FROM Get_Date_Range('3/18/2019', '3/28/2019','DD',1))) rd(Report_Date)

    Thanks for the help! You got me 90% of the way there!

  • drew.allen - Friday, March 22, 2019 12:23 PM

    matter2003 - Friday, March 22, 2019 12:13 PM

    drew.allen - Friday, March 22, 2019 12:04 PM

    matter2003 - Friday, March 22, 2019 11:03 AM

    drew.allen - Friday, March 22, 2019 9:06 AM

    Part of the problem is that your UDF is a scalar function (not a table-valued function) given the way that you are calling it.  I've substituted a Table Value Constructor for your function.
    PS: Scalar functions and table-valued functions are mutually exclusive.  You cannot have a scalar table-valued function.

    As Lynn mentioned, your sample data only covers one case.  I added a second case where the case had not yet been resolved.

    I believe that this gives you what you need.  It is very loosely based on work by Itzik Ben-Gan on packing intervals.

    WITH Case_Statuses AS
    (
        SELECT
            o.Case_ID
        ,    o.Contact_Date
        ,    o.Last_Updated_On AS Update_Status_Begin_Date
        ,    LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt
        FROM #Open_Cases o
        LEFT OUTER JOIN #Resolved_Cases r
            ON o.Case_ID = r.Case_ID
    )

    SELECT *
    FROM Case_Statuses cs
    CROSS APPLY ( VALUES(1), (2), (3), (4), (5), (6) ) CalcAge(Age)
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)
    WHERE cs.Update_Status_Begin_Date <= rd.Report_Date
        AND rd.Report_Date < cs.Update_Status_End_Dt
        AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Like the other post, the date selection range needs to be able to handle a dynamic date range...this date range will change, it is not going to remain the same and will vary by report and/or user entered values...is there a way to make the date range dynamic? (with the caveat it will skip over weekend and holidays and not include those---I can make a version of my calc age function that will return a date perhaps if it is not a weekend/holiday?)

    Drew, this solution works when I make the following change(minus the static date range):

      LEAD(o.Last_Updated_On, 1, COALESCE(r.Resolved_Date + 1, '9999-12-31')) OVER(PARTITION BY o.Case_ID ORDER BY o.Last_Updated_On) AS Update_Status_End_Dt

    However, I am not sure what #Open_Cases o and #Resolved_Cases r refer to(is this a temp table)?  If so, it was throwing an error...when I changed it to the Open_Cases and Resolved_Cases(the actual physical tables), it created duplicate rows in there(that somehow now can't be deleted??), which is not what I want...

    Any advice on what to do with that issue?

    Other than that it looks like the solution works properly...

    The problem here is that you didn't try to understand what the code is doing before deploying it to your own environment.  Yes, I used temp tables.  I don't want to create permanent tables on my database just to help you with a problem, so I converted all of the permanent tables to temp tables.  I didn't think it was necessary to explicitly tell you to change them back.  So, for future reference, you'll need to change these temp tables back to your own permanent tables.

    To make the dates dynamic you'll need to replace these two lines of code
    /*  First line to replace.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second line to replace. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    with the following code.
    /*  First replacement line.  */
    CROSS APPLY ( VALUES(DATEADD(DAY, CalcAge.Age - 1, @Report_Start_Date) ) ) rd(Report_Date)

    /* Second replacement line. */
      AND rd.Report_Date BETWEEN @Report_Start_Date AND @Report_End_Date

    Drew

    Drew,

    Apologies, I actually edited the initial reply to your first post that I had solved the temp table thing.  I also added in "Drop If Exists" logic to it as well so it will recreate the temp tables when it runs.

    I'm a little confused...it appears the original code is the same as the replacement code? 

    From what I am testing, it appears the following line is what is controlling the number of days that return:

    CROSS APPLY ( VALUES(1), (2), (3), (4), (5) ) CalcAge(Age)

    When I add in (6), (7), (8), etc it continues to return an additional day for each one added

    Exactly.  The solution I provided is only meant to guide you in the correct direction.  It is not meant to be the ULTIMATE answer that you can just cut and paste into your code.  You should never, ever simply paste code that you got from the Internet into your code.  You should always understand what it is doing first.

    I also saw your edit after I had posted.  You do not need to do a SELECT INTO to create the temp tables.  You should be using the permanent tables directly.  I only created the temp tables, because it makes it easier for me to clean up when I'm done coding the solution.  I certainly don't need permanent copies of every single set of sample data that I work with.

    Drew

    Well, I think I may have spoke too soon...

    For some reason it will not pick up cases that are created on the same day the report_date, when the report_date is the @enddate.

    So for instance:

    @EndDate = '3/26/2019'
    So items with a contact date of 3/26/2019 are not showing, but if I change the @EndDate to '3/27/2019' the item will show, but then it has a report date of '3/27/2019'
    For some reason it seems the BETWEEN code isn't including the last day in there...

    Any idea as to what might be wrong?

  • matter2003 - Wednesday, March 27, 2019 4:19 PM

    Well, I think I may have spoke too soon...

    For some reason it will not pick up cases that are created on the same day the report_date, when the report_date is the @enddate.

    So for instance:

    @EndDate = '3/26/2019'
    So items with a contact date of 3/26/2019 are not showing, but if I change the @EndDate to '3/27/2019' the item will show, but then it has a report date of '3/27/2019'
    For some reason it seems the BETWEEN code isn't including the last day in there...

    Any idea as to what might be wrong?

    Without digging into the code, I believe you may have fallen into the trap of believing that a date represents en entire day in code.
    A date, such as "2019-03-27" is in fact "2019-03-27 00:00:00.000".   It does not cater for all the hours, minutes, seconds, etc in the day.
    So, instead of using WHERE [SomeDate] BETWEEN '2019-03-01' AND '2019-03-31'
    You should be using the following construct, which takes care of all of the time components in the date WHERE [SomeDate] >= '2019-03-01' AND [SomeDate] < '2019-04-01'

  • Please look at the following article:
    https://www.red-gate.com/simple-talk/sql/t-sql.../state-transition-constraints/
      
    instead of doing the attribute splitting that tables for different kinds of cases, you should be using one table that represents a set of all cases and then check the status of a case as it goes through state transitions. Your single table should look something like this:

    CREATE TABLE Cases
    (case_id CHAR(11) NOT NULL,
    current_case_status VARCHAR(8) NOT NULL
    CHECK (case_status IN (‘open’, ‘resolved’, ..))
    prior_case_status VARCHAR(8) NOT NULL
    CHECK (case_status IN (‘open’, ‘resolved’, ..))
    FOREIGN KEY (current_case_status, prior_case_status)
    REFERENCES Status_Changes (current_case_status, prior_case_status)
    status_start_date DATE DEFAULT (CURRENT_TIMESTAMP) NOT NULL,
    status_end_date DATE,
    CHECK (status_start_date <= status_end_date),
    PRIMARY KEY (case_id, status_start_date),
    ..
    );

    you can then use views to find particular types of cases. In the simplest situation, you might have something like this

    CREATE VIEW Resolved_Cases
    AS:
    SELECT case_id, status_end_date
      FROM Cases
    WHERE case_status = ‘resolved’;

    >> This should be everything needed be able to help with this issue. Note CalcAgeCalcAge is a Scalar Table Function returning an INTEGER which gives us the number of days between the 2 dates subtracting Holidays and weekends. <<

    since SQL is declarative language, we hate procedural code such as your function calls. We also don’t like things like local variables, loops, if-then-else constructs, etc. That stuff belongs back in Fortran.

    Build a calendar table with one column for the calendar data and other columns to show whatever your business needs in the way of temporal information. Do not try to calculate holidays in SQL -- Easter alone requires too much math and there are two of them (Orthodox and Catholic).

    The Ordinal business day is a good trick. Number the days from whenever your calendar starts and repeat a number for a weekend or company holiday.

    CREATE TABLE Calendar
    (cal_date DATE NOT NULL PRIMARY KEY,
    ordinal_business_nbr INTEGER NOT NULL,
    ...);

    INSERT INTO Calendar
    VALUES ('2007-04-05', 42);
     ('2007-04-06', 43); -- Good Friday
     ('2007-04-07', 43);
     ('2007-04-08', 43); -- Easter Sunday
     ('2007-04-09', 44);
     ('2007-04-10', 45); -- Tuesday, back to work
    [/code language]
    To compute the business days from '2007-04-05', Thursday before Easter in 2007 to '2007-04-10', the next Tuesday:

    SELECT (C2.ordinal_business_nbr - C1.ordinal_business_nbr -1) AS business_day_cnt
    FROM Calendar AS C1, Calendar AS C2
    WHERE C1.cal_date = '2007-04-05'
     AND C2.cal_date = '2007-04-10';
    [/code language]

    This saves having to scan all of the rows within the range to get a sum. The -1 is to avoid counting the current day as one of the business days, but you might have a different rule in your company

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • It appears the CROSSAPPLY is giving me the wrong ReportDate to the SUB_QUERY_CREATED_ON dates. I am not sure what is going on as I checked the RetVal and its providing the proper date range to it.

    For instance on a report date of 3/28 I would expect a case with an Age of 0 to have a SUB_QUERY_CREATED_ON date of 3/28, but it has a SUB_QUERY_CREATED_ON date  of 3/27, this is the same for all cases in the date range...

    I have tried many things but I cannot get it to match up properly.

    EDIT: Just fixed by doing a Cast to Date with the comparisons

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply