March 9, 2017 at 1:13 pm
Luis Cazares - Thursday, March 9, 2017 12:30 PMNo,
This will add the same overhead as using a view, which is practically none. ITVF integrate into the query and are resolved as a set, in this case, it's just as adding the calculation which would add a Compute Scalar operator. Here's an article on this subject: http://www.sqlservercentral.com/articles/T-SQL/91724/
Thanks Luis! Knowing that this approach has been blessed by the RBAR master makes me feel much better! 😛
LinkedIn: https://www.linkedin.com/in/sqlrv
Website: https://www.sqlrv.com
March 9, 2017 at 3:34 pm
Just curious if another approach might work. If these dates are confined to a relatively short overall range, say 20 years, ... this ITVF might work well:CREATE FUNCTION dbo.YEARDIFF (
@FIRST_DATE AS date,
@SECOND_DATE AS date
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--DECLARE @FIRST_DATE AS date = '2012-02-29', @SECOND_DATE AS date = '2014-02-28';
WITH NUMBERS AS (
SELECT 1 AS N UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 10
),
DATE_DIFF AS (
SELECT ABS(DATEDIFF(day, @FIRST_DATE, @SECOND_DATE)) AS DIFF_DAYS,
CASE WHEN @FIRST_DATE >= @SECOND_DATE THEN 'F' ELSE 'S' END AS WHICH_LARGER,
CASE WHEN DATEPART(month, @FIRST_DATE) = 2 AND DATEPART(day, @FIRST_DATE) = 29 THEN 1 ELSE 0 END AS IS_LEAP_DAY
),
ALL_YEARS AS (
SELECT DATEADD(day, DD1.IS_LEAP_DAY, DATEADD(year, 0 - N1.N, @FIRST_DATE)) AS THE_DATE,
CASE
WHEN DD1.WHICH_LARGER = 'F'
AND @SECOND_DATE <= DATEADD(day, DD1.IS_LEAP_DAY, DATEADD(year, 0 - N1.N, @FIRST_DATE))
THEN 1
ELSE 0
END AS IS_BETWEEN
FROM NUMBERS AS N1
CROSS APPLY DATE_DIFF AS DD1
UNION ALL
SELECT DATEADD(day, DD2.IS_LEAP_DAY, DATEADD(year, N2.N, @FIRST_DATE)),
CASE
WHEN DD2.WHICH_LARGER = 'S'
AND @SECOND_DATE >= DATEADD(day, DD2.IS_LEAP_DAY, DATEADD(year, N2.N, @FIRST_DATE))
THEN 1
ELSE 0
END AS IS_BETWEEN
FROM NUMBERS AS N2
CROSS APPLY DATE_DIFF AS DD2
)
SELECT COUNT(Y.THE_DATE) AS DIFF_YEARS
FROM ALL_YEARS AS Y
WHERE Y.IS_BETWEEN = 1;
GO
The method here is to take the 10 years before and after FirstDate, and count the number of them where SecondDate fits in between that date and FirstDate. This could easily be adjusted to accommodate a wider range. It operates as an ITVF, which should impose a relatively low cost.
If I take your sample data and add a query including the ITVF, the following is the SQL, and the results agree with expected:DECLARE @DateTest AS TABLE (
RecordID int IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
FirstDate datetime,
SecondDate datetime,
ExpectedResult smallint,
[Description] varchar(128)
);
INSERT INTO @DateTest (FirstDate, SecondDate, ExpectedResult, [Description])
VALUES ('2012-03-10', '2013-02-18', 0, 'Almost a year'),
('2012-03-10', '2013-03-10', 1, 'Exactly a year'),
('2012-03-10', '2013-03-20', 1, '1 year + 10 days'),
('2012-02-29', '2014-02-28', 1, '2 years from leap day-1'),
('2012-03-10', '2013-08-15', 1, '1 year + several months'),
('2012-02-29', '2014-03-01', 2, '2 years from leap day+1'),
('2013-02-18', '2012-03-10', 0, 'Rev: Almost a year'),
('2013-03-10', '2012-03-10', 1, 'Rev: Exactly a year'),
('2013-03-20', '2012-03-10', 1, 'Rev: 1 year + 10 days'),
('2014-02-28', '2012-02-29', 1, 'Rev: 2 years from leap day-1'),
('2013-08-15', '2012-03-10', 1, 'Rev: 1 year + several months'),
('2014-03-01', '2012-02-29', 2, 'Rev: 2 years from leap day+1')
SELECT DT.*, Y.DIFF_YEARS
FROM @DateTest AS DT
CROSS APPLY dbo.YEARDIFF(DT.FirstDate, DT.SecondDate) AS Y
ORDER BY DT.RecordID;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
March 9, 2017 at 3:35 pm
First of all, there is no CASE statement in SQL; it is an expression. You are treating a UDF the way you would have written code in COBOL, FORTRAN, or BASIC -- you're doing computations with strings represent dates. SQL is a set oriented declarative language, so you don't programming the same way. In this case, I would use big look up table. A date is three bytes and 100 years in a calendar is only 36,525 rows. This is tiny by today's standards. It will probably fit in main storage. You can then join to it, share it among processes, etc.
The other tricks are to organize the indexes in descending order, on the assumption that current dates are more important than history. Keep your fill factor as small as possible, since it's going to be a constant look up table. In the one we always hate to do; go down and actually talk to the accounting department about setting up such calendars. Yes, yes, I know all of the jokes from the old Dilbert cartoons about visiting the trolls and accounting, but is a really good idea. They probably have some of this stuff in spreadsheets already and you can just copy it into tables in SQL. It's sort of nice if the accounting department spreadsheets in your database are actually in sync (obviously I have a horror story that I'm not going to tell you).
Please post DDL and follow ANSI/ISO standards when asking for help.
March 9, 2017 at 3:51 pm
Wow, that's a huge amount of code. Does it do something better than the code I posted earlier?:
SELECT DT.*,
DATEDIFF(YEAR, LowDate, HighDate) - CASE WHEN CONVERT(varchar(5), LowDate, 1) > CONVERT(varchar(5), HighDate, 1)
THEN 1 ELSE 0 END AS ScottMethod
FROM @DateTest DT
CROSS APPLY (
SELECT CASE WHEN SecondDate >= FirstDate THEN SecondDate ELSE FirstDate END AS HighDate,
CASE WHEN SecondDate >= FirstDate THEN FirstDate ELSE SecondDate END AS LowDate
) AS assign_alias_names1
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 10, 2017 at 6:31 am
ScottPletcher - Thursday, March 9, 2017 3:51 PMWow, that's a huge amount of code. Does it do something better than the code I posted earlier?:
SELECT DT.*,
DATEDIFF(YEAR, LowDate, HighDate) - CASE WHEN CONVERT(varchar(5), LowDate, 1) > CONVERT(varchar(5), HighDate, 1)
THEN 1 ELSE 0 END AS ScottMethod
FROM @DateTest DT
CROSS APPLY (
SELECT CASE WHEN SecondDate >= FirstDate THEN SecondDate ELSE FirstDate END AS HighDate,
CASE WHEN SecondDate >= FirstDate THEN FirstDate ELSE SecondDate END AS LowDate
) AS assign_alias_names1
Nope... and nice piece of code. I must have missed it earlier. I was pretty busy yesterday.... I'll have to add using the CONVERT to varchar(5) with Style 1 as a useful tool for date comparisons to my bag of tricks.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
March 10, 2017 at 7:44 am
jcelko212 32090 - Thursday, March 9, 2017 3:35 PM>> As the complex CASE statement [sic] is executed in multiple places, the natural thing to do is to place this into a user-defined function. However, we all know how such scalar UDFs affect performance by turning set-based logic into RBAR operations. <<First of all, there is no CASE statement in SQL; it is an expression. You are treating a UDF the way you would have written code in COBOL, FORTRAN, or BASIC -- you're doing computations with strings represent dates. SQL is a set oriented declarative language, so you don't programming the same way. In this case, I would use big look up table. A date is three bytes and 100 years in a calendar is only 36,525 rows. This is tiny by today's standards. It will probably fit in main storage. You can then join to it, share it among processes, etc.
The other tricks are to organize the indexes in descending order, on the assumption that current dates are more important than history. Keep your fill factor as small as possible, since it's going to be a constant look up table. In the one we always hate to do; go down and actually talk to the accounting department about setting up such calendars. Yes, yes, I know all of the jokes from the old Dilbert cartoons about visiting the trolls and accounting, but is a really good idea. They probably have some of this stuff in spreadsheets already and you can just copy it into tables in SQL. It's sort of nice if the accounting department spreadsheets in your database are actually in sync (obviously I have a horror story that I'm not going to tell you).
Once again, you missed the ball. You failed to understand the question which caused you to give a wrong answer.
The question is about how to get the age in full years. That has nothing to do with lookup tables or accountants.
March 10, 2017 at 9:11 am
The other tricks are to organize the indexes in descending order, on the assumption that current dates are more important than history.
"Descending order"? I think you're flashing back to COBOL. SQL can read an index backwards just as easily as it can read one forward, particularly for a small number of rows. I don't see any need for a descending index.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 10, 2017 at 9:22 am
jcelko212 32090 - Thursday, March 9, 2017 3:35 PM...In this case, I would use big look up table. A date is three bytes and 100 years in a calendar is only 36,525 rows. This is tiny by today's standards. It will probably fit in main storage. You can then join to it, share it among processes, etc.The other tricks are to organize the indexes in descending order, on the assumption that current dates are more important than history.
"Descending order"? I think you're flashing back to COBOL. SQL can read an index backwards just as easily as it can read one forward, particularly for a small number of rows. I don't see any need for a descending index.
not so fast buddy, the only set you need to be worried about is this set of punched cards i'm putting in the hopper!
March 10, 2017 at 12:14 pm
ScottPletcher - Friday, March 10, 2017 9:11 AMjcelko212 32090 - Thursday, March 9, 2017 3:35 PM...In this case, I would use big look up table. A date is three bytes and 100 years in a calendar is only 36,525 rows. This is tiny by today's standards. It will probably fit in main storage. You can then join to it, share it among processes, etc.The other tricks are to organize the indexes in descending order, on the assumption that current dates are more important than history.
"Descending order"? I think you're flashing back to COBOL. SQL can read an index backwards just as easily as it can read one forward, particularly for a small number of rows. I don't see any need for a descending index.
not so fast buddy, the only set you need to be worried about is this set of punched cards i'm putting in the hopper!
And if you need any MORE punched cards, I actually still have a couple of boxes of them...
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
March 10, 2017 at 12:40 pm
ScottPletcher - Friday, March 10, 2017 9:11 AMjcelko212 32090 - Thursday, March 9, 2017 3:35 PMnot so fast buddy, the only set you need to be worried about is this set of punched cards i'm putting in the hopper!
And if you need any MORE punched cards, I actually still have a couple of boxes of them...
Save the punch cards! They sell for $$ on Ebay as collectibles; REALLY!
My point is that a table sorted in the expected order has an advantage. This usually this means the most recent data is used before historic data.. Unfortunately, my mind is going, so I can't remember who did all the research on this specifically for SQL Server.
Please post DDL and follow ANSI/ISO standards when asking for help.
March 10, 2017 at 12:54 pm
sgmunson - Friday, March 10, 2017 12:14 PMAnd if you need any MORE punched cards, I actually still have a couple of boxes of them...
Oh, the horror! You've just brought back the nightmares of doing assembler programming on punch cards. One slight misstep and your program literally crashes all over the floor! Even after all these years it still hurts to talk about it....
LinkedIn: https://www.linkedin.com/in/sqlrv
Website: https://www.sqlrv.com
March 10, 2017 at 1:06 pm
patrickmcginnis59 10839 - Friday, March 10, 2017 9:22 AMScottPletcher - Friday, March 10, 2017 9:11 AMjcelko212 32090 - Thursday, March 9, 2017 3:35 PMnot so fast buddy, the only set you need to be worried about is this set of punched cards i'm putting in the hopper!
And if you need any MORE punched cards, I actually still have a couple of boxes of them...
Save the punch cards! They sell for $$ on Ebay as collectibles; REALLY!
My point is that a table sorted in the expected order has an advantage. This usually this means the most recent data is used before historic data.. Unfortunately, my mind is going, so I can't remember who did all the research on this specifically for SQL Server.
SQL Server's indexes are doubly linked, so I don't think reading backwards presents much of an issue for it. INSERTing backwards could be more of an issue though.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
March 11, 2017 at 8:42 pm
jcelko212 32090 - Thursday, March 9, 2017 3:35 PMYou are treating a UDF the way you would have written code in COBOL, FORTRAN, or BASIC -- you're doing computations with strings represent dates. SQL is a set oriented declarative language, so you don't programming the same way. In this case, I would use big look up table. A date is three bytes and 100 years in a calendar is only 36,525 rows. This is tiny by today's standards. It will probably fit in main storage. You can then join to it, share it among processes, etc.
Here's the 100 year calendar table with a single date column. I even took the liberty of adding a unique Clustered Index to the table in the form of a Primary Key. The domain of the table consists of all whole dates from 1950-01-01 up to and not including 2050-01-01.
--===== Create the suggested Calendar table
CREATE TABLE dbo.Calendar
(
CalendarDate DATE NOT NULL
,CONSTRAINT PK_Calendar PRIMARY KEY CLUSTERED (CalendarDate)
)
;
--===== Populate the Calendar table with sequential dates from 1950-01-01 inclusive up to 2050-01-01 exclusive.
INSERT INTO dbo.Calendar WITH (TABLOCK)
(CalendarDate)
SELECT TOP (DATEDIFF(dd,'1950','2050'))
CalendarDate = DATEADD(dd,ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1,CAST('1950' AS DATE))
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
... and here's the million row test data. Remember that the "FirstDate" may not be less than the "SecondDate" regardless of the naming convention the OP used for the column names.
--===== Create the test table (HEAP, really) and populate it on the fly.
-- Both DATE columns will contain some random date from 1950-01-01 inclusive up to 2050-01-01 exclusive.
-- Since a PK is not possible for these two columns, I'll leave indexing up to you, the user.
-- The use of ISNULL makes the resulting column NOT NULL. COALESCE won't do it.
SELECT TOP 1000000
FirstDate = ISNULL(CONVERT(DATE,DATEADD(dd,ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'1950','2050')),'1950')),CAST('1950' AS DATE))
,SecondDate = ISNULL(CONVERT(DATE,DATEADD(dd,ABS(CHECKSUM(NEWID())%DATEDIFF(dd,'1950','2050')),'1950')),CAST('1950' AS DATE))
INTO dbo.TestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
--===== The meta-data for the TestTable can be viewed by executing the following.
EXEC sp_help 'dbo.TestTable'
;
There you have it; the populated calendar table you prescribed and the table of test data the OP prescribed. Again, the goal is to correctly calculate the number of years between the two dates as a positive integer using code that will actually work in SQL Server even if you have to resort to the wiles of the proprietary language known as T-SQL.
We await your working code, good Sir!
p.s. The OP's original code, as wild or poor as it may seem to you, correctly solves the million row table in the following amount of CPU time and duration when the output is dumped to a "Unit 0" variable (you remember what "Unit 0" on a keypunch machine is, don't you. 😉 ). Your code should do at least as well.
SQL Server Execution Times:
CPU time = 1092 ms, elapsed time = 1093 ms.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 13, 2017 at 11:17 am
CREATE TABLE Calendar
(cal_date DATE NOT NULL PRIMARY KEY,
ordinal_business_day INTEGER NOT NULL,
...);
INSERT INTO Calendar VALUES ('2007-04-05', 42);
INSERT INTO Calendar VALUES ('2007-04-06', 43); -- Good Friday
INSERT INTO Calendar VALUES ('2007-04-07', 43);
INSERT INTO Calendar VALUES ('2007-04-08', 43); -- Easter Sunday
INSERT INTO Calendar VALUES ('2007-04-09', 44);
INSERT INTO Calendar VALUES ('2007-04-10', 45); -- Tuesday, back to work
To compute the business days from '2007-04-05', Thursday before Easter in 2007 to '2007-04-10', the next Tuesday:
SELECT (C2.ordinal_business_day - C1.ordinal_business_day -1) AS business_day_cnt
FROM Calendar AS C1, Calendar AS C2
WHERE C1.cal_date = '2007-04-05'
AND C2.cal_date = '2007-04-10';
This saves having to scan all of the rows within the range to get a count. The -1 is to avoid counting the current day as one of the business days, but you might have a different rule in your company.
When someone says they have a complicated rule for calculating durations, I think of various physical calendars. There’s over 150 of them in the GAAP, the last time I looked. My personal favorite, however, is crop calendars. Many decades ago I worked for a cigarette company.
They buy tobacco futures and their calendar is a 16 month “crop year†of 30 days per month. We try doing this with a regular calendar and a hell of a lot of really messy COBOL programming. Contracts require actions every month by both parties (i.e. In month five of the 2016 crop, we want to do some sampling and require the tobacco be of such and such a quality and quantity level). The crop calendar does not match up with the common era calendar very well. In fact it has to be re-adjusted if the weather changes one season. And you can have more than one crop overlapping in its lifecycle, from planting the delivery of the factory. The calendar is set up in the future, not the past.
The best way to handle complicated calendars that I found is to use this sort of look up table approach. It’s fast, it ports to any other implementation of SQL We seem to assume that everybody’s a SQL Server user in their shop; these days nobody works with just one database. If nothing else, they have to get stuff from somebody else outside their business was probably on Oracle, DB2, or post grass or who knows what.
Please post DDL and follow ANSI/ISO standards when asking for help.
March 13, 2017 at 11:45 am
jcelko212 32090 - Thursday, March 9, 2017 3:35 PMI would add an ordinal year to the calendar. This is a variant on the business days trick. Number the days from whenever your calendar starts and repeat a number for a weekend or company holiday.CREATE TABLE Calendar
(cal_date DATE NOT NULL PRIMARY KEY,
ordinal_business_day INTEGER NOT NULL,
...);INSERT INTO Calendar VALUES ('2007-04-05', 42);
INSERT INTO Calendar VALUES ('2007-04-06', 43); -- Good Friday
INSERT INTO Calendar VALUES ('2007-04-07', 43);
INSERT INTO Calendar VALUES ('2007-04-08', 43); -- Easter Sunday
INSERT INTO Calendar VALUES ('2007-04-09', 44);
INSERT INTO Calendar VALUES ('2007-04-10', 45); -- Tuesday, back to workTo compute the business days from '2007-04-05', Thursday before Easter in 2007 to '2007-04-10', the next Tuesday:
SELECT (C2.ordinal_business_day - C1.ordinal_business_day -1) AS business_day_cnt
FROM Calendar AS C1, Calendar AS C2
WHERE C1.cal_date = '2007-04-05'
AND C2.cal_date = '2007-04-10';This saves having to scan all of the rows within the range to get a count. The -1 is to avoid counting the current day as one of the business days, but you might have a different rule in your company.
When someone says they have a complicated rule for calculating durations, I think of various physical calendars. There’s over 150 of them in the GAAP, the last time I looked. My personal favorite, however, is crop calendars. Many decades ago I worked for a cigarette company.
They buy tobacco futures and their calendar is a 16 month “crop year†of 30 days per month. We try doing this with a regular calendar and a hell of a lot of really messy COBOL programming. Contracts require actions every month by both parties (i.e. In month five of the 2016 crop, we want to do some sampling and require the tobacco be of such and such a quality and quantity level). The crop calendar does not match up with the common era calendar very well. In fact it has to be re-adjusted if the weather changes one season. And you can have more than one crop overlapping in its lifecycle, from planting the delivery of the factory. The calendar is set up in the future, not the past.
The best way to handle complicated calendars that I found is to use this sort of look up table approach. It’s fast, it ports to any other implementation of SQL We seem to assume that everybody’s a SQL Server user in their shop; these days nobody works with just one database. If nothing else, they have to get stuff from somebody else outside their business was probably on Oracle, DB2, or post grass or who knows what.
A calendar table is needed for complex things (but not really for routine calendar calcs). But your approach of an ordinal business day seems very problematic and error-inducing to me. For example, if I designate a new holiday later, I have to renumber every row from that date forward. Or, less common, what about official half days off? That is perhaps rare, but some companies do have them.
As for having to scan rows if you don't use an ordinal day, that's true enough. But if you keep the row width to a reasonable size, all the rows should fit on a couple of pages max anyway, in whichever dbms it is. And I don't know of any way to read less than one page in SQL Server anyway.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 15 posts - 16 through 30 (of 58 total)
You must be logged in to reply to this topic. Login to reply