Comparing 2 functions for any differences in the result set.

Question

Comparing 2 functions for any differences in the result set.

Feivel

SSCertifiable

Points: 7524
More actions
October 5, 2017 at 8:47 am

#325694

Recently, we've done some optimizations to the functions and now the BI Manager has tasked me with a request to compare the result of the 2 functions as below making sure the row counts match, along with the exact values of each field. In the same request he is expecting this to be a while loop that will loop all the YTD ranges dating back 20 years. (not sure if this is really practical in order to test the validity). However, the while loop is not a mandate for the same. Is there a way to compare the result of the 2 functions without hard coding the ranges and automating the same like for e.g declaring a variable and then based on the date range calculating the next date range etc without the user intervention.
Last but not the least a script that would validate there are no differences in the results of both the functions which means that the optimization did work. I am thinking of using except clause but I am not sure if it would work with functions. Also, if that won't work what else would be a possible way of doing it.
I would really appreciate for all the help. It is a big impact for the company if I could complete this task. However, I am not even sure if this is something that should be even handled by a DBA.
Functions
Original Function: report.fnReportDealCore_original(@StartDate, @EndDate, -1)
New Optimized Function: report.fnReportDealCore(@StartDate, @EndDate, -1)
Example of Run Dates dating back 20 years
1.       1/1/1997 – 1/31/2997
2. 2/1/1997 – 2/28/1997
3. 3/1/1997 – 3/31/1997
4.       ….
5.       1/1/1998 – 1/31/1998
6.       2/1/1988 – 2/28/1998
7.       …
8.       Through 9/30/2017

Thanks

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply

sgmunson SSC Guru Points: 110640 More actions · Answer 1

That's not entirely outside of a DBA's duties. I guess the question is why would you need to test each individual data range over the last 20 years? Why not just do a one time comparison across a 20 year span e.g. @StartDate = '1997-01-01', @EndDate = '2017-09-30' ? However you do it, there are a couple of ways. You can use EXCEPT or INTERSECT. Both ways mean you'll end up doing something twice. In the case of EXCEPT, you would have to see that the results of SELECT * from the original function with the date range EXCEPT the results of SELECT * from the new function produces zero rows, and doing the same thing just reversing the order of the queries between the EXCEPT keyword also produces zero rows. In the case of INTERSECT, you need to see that the results for both queries as used with EXCEPT but instead used with INTERSECT, return identical results to what comes out of both queries individually, so I suspect that using EXCEPT is more practical.

A calendar table would certainly help immensely if you were going to down the WHILE loop road, however.

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

Luis Cazares SSC Guru Points: 183706 More actions · Answer 2

ffarouqi - Thursday, October 5, 2017 8:47 AM
Recently, we've done some optimizations to the functions and now the BI Manager has tasked me with a request to compare the result of the 2 functions as below making sure the row counts match, along with the exact values of each field. In the same request he is expecting this to be a while loop that will loop all the YTD ranges dating back 20 years. (not sure if this is really practical in order to test the validity). However, the while loop is not a mandate for the same. Is there a way to compare the result of the 2 functions without hard coding the ranges and automating the same like for e.g declaring a variable and then based on the date range calculating the next date range etc without the user intervention.
Last but not the least a script that would validate there are no differences in the results of both the functions which means that the optimization did work. I am thinking of using except clause but I am not sure if it would work with functions. Also, if that won't work what else would be a possible way of doing it.
I would really appreciate for all the help. It is a big impact for the company if I could complete this task. However, I am not even sure if this is something that should be even handled by a DBA.
Functions

Original Function: report.fnReportDealCore_original(@StartDate, @EndDate, -1)
New Optimized Function: report.fnReportDealCore(@StartDate, @EndDate, -1)
Example of Run Dates dating back 20 years
1.       1/1/1997 â€“ 1/31/2997
2. 2/1/1997 â€“ 2/28/1997
3. 3/1/1997 â€“ 3/31/1997
4.       â€¦.
5.       1/1/1998 â€“ 1/31/1998
6.       2/1/1988 â€“ 2/28/1998
7.       â€¦
8.       Through 9/30/2017
Thanks

Are these scalar or table functions?

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Feivel SSCertifiable Points: 7524 More actions · Answer 3

Luis Cazares - Thursday, October 5, 2017 9:44 AM
ffarouqi - Thursday, October 5, 2017 8:47 AM
Recently, we've done some optimizations to the functions and now the BI Manager has tasked me with a request to compare the result of the 2 functions as below making sure the row counts match, along with the exact values of each field. In the same request he is expecting this to be a while loop that will loop all the YTD ranges dating back 20 years. (not sure if this is really practical in order to test the validity). However, the while loop is not a mandate for the same. Is there a way to compare the result of the 2 functions without hard coding the ranges and automating the same like for e.g declaring a variable and then based on the date range calculating the next date range etc without the user intervention.
Last but not the least a script that would validate there are no differences in the results of both the functions which means that the optimization did work. I am thinking of using except clause but I am not sure if it would work with functions. Also, if that won't work what else would be a possible way of doing it.
I would really appreciate for all the help. It is a big impact for the company if I could complete this task. However, I am not even sure if this is something that should be even handled by a DBA.
Functions

Original Function: report.fnReportDealCore_original(@StartDate, @EndDate, -1)
New Optimized Function: report.fnReportDealCore(@StartDate, @EndDate, -1)
Example of Run Dates dating back 20 years
1.       1/1/1997 â€“ 1/31/2997
2. 2/1/1997 â€“ 2/28/1997
3. 3/1/1997 â€“ 3/31/1997
4.       â€¦.
5.       1/1/1998 â€“ 1/31/1998
6.       2/1/1988 â€“ 2/28/1998
7.       â€¦
8.       Through 9/30/2017
Thanks

Are these scalar or table functions?

Table

Feivel SSCertifiable Points: 7524 More actions · Answer 4

sgmunson - Thursday, October 5, 2017 9:08 AM
That's not entirely outside of a DBA's duties. I guess the question is why would you need to test each individual data range over the last 20 years? Why not just do a one time comparison across a 20 year span e.g. @StartDate = '1997-01-01', @EndDate = '2017-09-30' ? However you do it, there are a couple of ways. You can use EXCEPT or INTERSECT. Both ways mean you'll end up doing something twice. In the case of EXCEPT, you would have to see that the results of SELECT * from the original function with the date range EXCEPT the results of SELECT * from the new function produces zero rows, and doing the same thing just reversing the order of the queries between the EXCEPT keyword also produces zero rows. In the case of INTERSECT, you need to see that the results for both queries as used with EXCEPT but instead used with INTERSECT, return identical results to what comes out of both queries individually, so I suspect that using EXCEPT is more practical.
A calendar table would certainly help immensely if you were going to down the WHILE loop road, however.

Thank you for your suggestion. However, do you have a script or anything similar that I can use as a while loop or any other automated way of comparing the range of dates dating 20 years etc. I really appreciate your help on this.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 5

This would generate a test script for all dates from 20 years ago till today. It uses a YTD range.
DECLARE @Start datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);

WITH E(n) AS( SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n) ), E2(n) AS( SELECT a.n FROM E a, E b ), E4(n) AS( SELECT a.n FROM E2 a, E2 b ), cteCalendar(calDate) AS( SELECT TOP (DATEDIFF( DD, @Start, GETDATE()) + 1) DATEADD( DD, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1, @Start) calDate FROM E4 ) SELECT REPLACE(REPLACE('IF EXISTS( (SELECT * FROM report.fnReportDealCore_original( @StartDate, @EndDate, -1) EXCEPT SELECT * FROM report.fnReportDealCore( @StartDate, @EndDate, -1)) UNION ALL (SELECT * FROM report.fnReportDealCore( @StartDate, @EndDate, -1) EXCEPT SELECT * FROM report.fnReportDealCore_original( @StartDate, @EndDate, -1))) SELECT ''Differences''' , '@StartDate', '''' + CAST( YEAR(calDate) AS char(4)) + '0101''') , '@EndDate', '''' + CONVERT( char(8), calDate, 112) + '''') FROM cteCalendar;

For month by month ranges, you could try this calculation and change the previous script.
DECLARE @Start datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);

WITH E(n) AS( SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n) ), E2(n) AS( SELECT a.n FROM E a, E b ), E4(n) AS( SELECT a.n FROM E2 a, E2 b ), cteCalendar(calDate) AS( SELECT TOP (DATEDIFF( MM, @Start, GETDATE()) + 1) DATEADD( MM, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1, @Start) calDate FROM E4 ) SELECT calDate AS MonthStart, DATEADD( MM, 1, calDate) - 1 AS MonthEnd FROM cteCalendar;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Jeff Moden SSC Guru Points: 1004436 More actions · Answer 6

My take on it is that "It Depends". If it was an optimization made by the DBAs, then the DBAs are responsible for proving identical functionality and results. As a bit of a side bar, it probably took longer to write the original post on this thread than it would take to prove the two functions produce the identical results. Get crackin' 😉

If someone else made the changes, let them prove the QA on it. Just keep in mind that anything that goes to prod should still be reviewed by the DBA for possible security risks (not likely in this case) and performance issues.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Luis Cazares SSC Guru Points: 183706 More actions · Answer 7

This migh also work.
DECLARE @Start datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);

WITH E(n) AS( SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0))E(n) ), E2(n) AS( SELECT a.n FROM E a, E b ), E4(n) AS( SELECT a.n FROM E2 a, E2 b ), cteCalendar(calDate) AS( SELECT TOP (DATEDIFF( DD, @Start, GETDATE()) + 1) DATEADD( DD, ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1, @Start) calDate FROM E4 ) SELECT CAST( CAST( YEAR(calDate) AS char(4)) AS datetime), calDate FROM cteCalendar WHERE EXISTS((SELECT * FROM report.fnReportDealCore_original( CAST( YEAR(calDate) AS char(4)), calDate, -1) EXCEPT SELECT * FROM report.fnReportDealCore( CAST( YEAR(calDate) AS char(4)), calDate, -1)) UNION ALL (SELECT * FROM report.fnReportDealCore( CAST( YEAR(calDate) AS char(4)), calDate, -1) EXCEPT SELECT * FROM report.fnReportDealCore_original( CAST( YEAR(calDate) AS char(4)), calDate, -1))) ;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Jeff Moden SSC Guru Points: 1004436 More actions · Answer 8

BTW... please post the 2 functions. I'm always interested in reporting functions and have some old "Black Arts" tricks that might provide additional performance benefits.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

TheCTEGuy Ten Centuries Points: 1032 More actions · Answer 9

Hey Luis - Please review the code below - will this perform better from your code -


declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (
Select @st as dte
union all
Select dte+1 as dte2 from cte where dte<getdate()
)
select *,min(dte)over (partition by year(dte),month(dte)) as MinDate ,max(dte)over (partition by year(dte),month(dte)) as MAXDate 
from 
cte
order by dte
option(maxrecursion 0)

First solve the problem then write the code !

TheCTEGuy Ten Centuries Points: 1032 More actions · Answer 10

Here is something strange i found - Can anyone help me explain why is this that the scan counts are less in the below code


declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (
Select @st as dte
union all
Select dte+1 as dte2 from cte where dte<getdate()
)
select *,min(dte)over (partition by year(dte),month(dte)) as MinDate ,max(dte)over (partition by year(dte),month(dte)) as MAXDate 
from 
cte
order by dte
option(maxrecursion 0)
go
/*
Table 'Worktable'. Scan count 8, logical reads 77860, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 141 ms, elapsed time = 384 ms.
*/
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (
Select @st as dte,year(@st) as yr,month(@st) as mnth 
union all
Select dte+1 as dte2,year(dte),month(dte) from cte where dte<getdate()
)
select *,min(dte)over (partition by yr,mnth) as MinDate ,max(dte)over (partition by yr,mnth) as MAXDate 
from 
cte
order by dte
option(maxrecursion 0)
/*
Table 'Worktable'. Scan count 5, logical reads 61686, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 156 ms, elapsed time = 224 ms.

*/

First solve the problem then write the code !

Phil Parkin SSC Guru Points: 246993 More actions · Answer 11

TheCTEGuy - Friday, October 6, 2017 4:53 AM
Hey Luis - Please review the code below - will this perform better from your code -
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (

; is a terminator, not an initiator.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

TheCTEGuy Ten Centuries Points: 1032 More actions · Answer 12

Phil Parkin - Friday, October 6, 2017 6:43 AM
TheCTEGuy - Friday, October 6, 2017 4:53 AM
Hey Luis - Please review the code below - will this perform better from your code -
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (
; is a terminator, not an initiator.

Are you trying to say that the code should like the below code ?

declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);

with cte as ( Select @st as dte,year(@st) as yr,month(@st) as mnth union all Select dte+1 as dte2,year(dte),month(dte) from cte where dte<getdate() )

select *,min(dte)over (partition by yr,mnth) as MinDate ,max(dte)over (partition by yr,mnth) as MAXDate from cte order by dte option(maxrecursion 0)

First solve the problem then write the code !

Phil Parkin SSC Guru Points: 246993 More actions · Answer 13

TheCTEGuy - Friday, October 6, 2017 6:50 AM
Phil Parkin - Friday, October 6, 2017 6:43 AM
TheCTEGuy - Friday, October 6, 2017 4:53 AM
Hey Luis - Please review the code below - will this perform better from your code -
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
;with cte as (
; is a terminator, not an initiator.
Are you trying to say that the code should like the below code ?
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0);
with cte as ( Select @st as dte,year(@st) as yr,month(@st) as mnth union all Select dte+1 as dte2,year(dte),month(dte) from cte where dte<getdate() )
select *,min(dte)over (partition by yr,mnth) as MinDate ,max(dte)over (partition by yr,mnth) as MAXDate from cte order by dte option(maxrecursion 0)

Pretty much. Starting CTEs with a semicolon is unnecessary (and may mislead others who read your code into thinking that it is the correct way of doing things) if you properly terminate the rest of your code.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

Luis Cazares SSC Guru Points: 183706 More actions · Answer 14

TheCTEGuy - Friday, October 6, 2017 4:53 AM
Hey Luis - Please review the code below - will this perform better from your code -
declare @st datetime = DATEADD(YY, DATEDIFF(YY, 0, GETDATE()) - 20, 0); ;with cte as ( Select @st as dte union all Select dte+1 as dte2 from cte where dte<getdate() ) select *,min(dte)over (partition by year(dte),month(dte)) as MinDate ,max(dte)over (partition by year(dte),month(dte)) as MAXDate from cte order by dte option(maxrecursion 0)

Can you prove that? I would but won't have access to my laptop until Monday.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2