February 3, 2019 at 1:06 pm
Then why on this good Green Earth are you posting it as a solution!?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 4, 2019 at 11:56 pm
hi jeff
My reasons ..
1) For small amounts of data
2) something different
3) easy to understand .. how to code ...
There are a lot of different kinds of reasons in general:
Performance
Less code
Logical reads
Formatting
Reusability
Etc etc etc
I am a very casual sql 😀😀
February 13, 2019 at 7:14 am
Okay folks. Recursion is not necessarily a problem, although it certainly can be. For smaller data sets, it can be quite helpful. I took a look at this and my first thought was, "why not recursion?". So here we go, with a DATEDIFF so that date gaps between rows are covered.
CREATE TABLE #SampleData   -- DROP TABLE #SampleData;  -- TRUNCATE TABLE #SampleData;
(
    Row_Num      INTEGER     NOT NULL,
    Request_Date DATE        NOT NULL,
    Code         VARCHAR(50) NOT NULL
);
-- Load 20 rows of sample data: one row per day starting 2019-01-02,
-- with a pseudo-random Code driven by RAND(N).
INSERT INTO #SampleData (Row_Num, Request_Date, Code)
SELECT  N,
        DATEADD(DAY, N, '20190101'),
        CASE WHEN RIGHT(CAST(RAND(N) AS VARCHAR(50)), 1) LIKE '[13579]' THEN 'Urgent Needs' ELSE 'Awaiting' END
FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY name) AS N
        FROM sys.objects
     ) AS TempTally
WHERE N <= 20;
SELECT *
FROM #SampleData
ORDER BY Row_Num;
WITH RECURSIVE_CTE AS (
    -- Anchor: start at the first row with a running total of 0 days.
    SELECT
        SD.Row_Num,
        SD.Request_Date,
        SD.Code,
        0 AS Num_Days
    FROM #SampleData AS SD
    WHERE SD.Row_Num = 1
    UNION ALL
    -- Recursive part: walk forward one row at a time, adding the day gap
    -- while the Code stays the same and resetting to 0 when it changes.
    SELECT
        SD.Row_Num,
        SD.Request_Date,
        SD.Code,
        CASE
            WHEN RC.Code = SD.Code THEN RC.Num_Days + DATEDIFF(DAY, RC.Request_Date, SD.Request_Date)
            ELSE 0
        END AS Num_Days
    FROM RECURSIVE_CTE AS RC
    INNER JOIN #SampleData AS SD
            ON RC.Row_Num + 1 = SD.Row_Num
)
SELECT *
FROM RECURSIVE_CTE
ORDER BY Row_Num;
DROP TABLE #SampleData;
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
February 13, 2019 at 7:36 am
The answer to that question is that, even with a smaller number of rows, it's still a performance problem when an "incremental" rCTE that counts by "1" is used. It's all part of the "Death by a thousand cuts" that occurs in every database and is nearly impossible to fix because management doesn't "get it" either. 😉
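For reference, here's the same Num_Days result without any recursion at all. This is just a sketch against the #SampleData table above (it assumes SQL Server 2012 or later for LAG and the framed windowed SUM), but it reads the table once instead of re-joining it once per row:
WITH Flagged AS
(   -- Flag the rows where the Code changes from the previous row (the start of an "island").
    SELECT  SD.Row_Num,
            SD.Request_Date,
            SD.Code,
            CASE WHEN SD.Code = LAG(SD.Code) OVER (ORDER BY SD.Row_Num)
                 THEN 0 ELSE 1 END AS Island_Start
    FROM #SampleData AS SD
)
, Grouped AS
(   -- A running total of the flags numbers each island of identical Codes.
    SELECT  Row_Num,
            Request_Date,
            Code,
            SUM(Island_Start) OVER (ORDER BY Row_Num ROWS UNBOUNDED PRECEDING) AS Island
    FROM Flagged
)
SELECT  Row_Num,
        Request_Date,
        Code,
        -- Days elapsed since the first row of the current island = the rCTE's running Num_Days.
        DATEDIFF(DAY, MIN(Request_Date) OVER (PARTITION BY Island), Request_Date) AS Num_Days
FROM Grouped
ORDER BY Row_Num;
Same output, no row-by-row join, and no OPTION (MAXRECURSION) to worry about as the row count grows.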
--Jeff Moden
Change is inevitable... Change for the better is not.
February 13, 2019 at 7:43 am
okfine08 - Monday, February 4, 2019 11:56 PM
hi jeff
My reasons ..
1) For small amounts of data
2) something different
3) easy to understand .. how to code ...
There are a lot of different kinds of reasons in general:
Performance
Less code
Logical reads
Formatting
Reusability
Etc etc etc
I am a very casual sql 😀😀
Please see my reply above. There is no justification for slow code no matter the number of rows. As for performance and logical reads, please see the following article and pay particular attention to where I talk about the red "sky-rocket", which shows that the performance and logical reads of incremental rCTEs are so bad, even for small row sets, that they have their own center of gravity. Even a properly written WHILE loop will beat it in performance and logical reads.
Hidden RBAR: Counting with Recursive CTE's
Help stop the "Death by a Thousand Cuts". Stop promoting the use of "incremental rCTEs".
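If it's the counting itself you need, a cascading CROSS JOIN of a small row constructor produces the sequence with zero reads and no recursion. This is only a sketch of the kind of "cteTally" the article talks about (the names E1 and cteTally here are illustrative, not lifted from it):
WITH E1(N) AS
(   -- 10 rows from a table value constructor; no table access at all.
    SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS V(N)
)
, cteTally(N) AS
(   -- 10 x 10 = up to 100 rows; TOP caps it at however many are needed.
    SELECT TOP (20) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM E1 AS a CROSS JOIN E1 AS b
)
SELECT N
FROM cteTally;  -- the same 1..20 sequence an incremental rCTE would produce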
--Jeff Moden
Change is inevitable... Change for the better is not.
February 13, 2019 at 8:32 am
sgmunson - Wednesday, February 13, 2019 7:14 AM
Okay folks. Recursion is not necessarily a problem, although it certainly can be. For smaller data sets, it can be quite helpful. I took a look at this and my first thought was, "why not recursion?".
The answer to that question is that, even with a smaller number of rows, it's still a performance problem when an "incremental" rCTE that counts by "1" is used. It's all part of the "Death by a thousand cuts" that occurs in every database and is nearly impossible to fix because management doesn't "get it" either. 😉
I get the concept, but I'm having a hard time seeing how the entire query, including the randomized insert, running in 0.02000120000 seconds is going to become a problem. Having an insert that could form the basis for a good test, I've now tested this on 3,000,000 rows. With the results returned to SSMS, the query ran in 2 minutes, 43 seconds. Of course, it actually had to deliver 6 million rows because there are two selects, so I commented out the first one and it ran in 2 minutes, 30 seconds. I then ran the data into a temp table to remove all the row returns, and it ran in 2 minutes, 20 seconds. The initial INSERT accounts for 1 minute, 27 seconds of that time. At first I tried to test with 20,000,000 rows, but quickly found that there are only just over 3 million dates to work with between 1753-01-01 and 9999-12-31. FYI.
I'm pretty sure that applying this to considerably more data would likely be a bad idea, but for the small sample size, 0.02 seconds shouldn't be a problem.
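If anyone wants to check that ceiling, one line does it (the result comes back a little over three million):
-- Day boundaries between the two date limits mentioned above.
SELECT DATEDIFF(DAY, '17530101', '99991231') AS Available_Days;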
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
February 13, 2019 at 9:36 am
Jeff Moden - Wednesday, February 13, 2019 7:36 AM
sgmunson - Wednesday, February 13, 2019 7:14 AM
Okay folks. Recursion is not necessarily a problem, although it certainly can be. For smaller data sets, it can be quite helpful. I took a look at this and my first thought was, "why not recursion?".
The answer to that question is that, even with a smaller number of rows, it's still a performance problem when an "incremental" rCTE that counts by "1" is used. It's all part of the "Death by a thousand cuts" that occurs in every database and is nearly impossible to fix because management doesn't "get it" either. 😉
I get the concept, but I'm having a hard time seeing how the entire query, including the randomized insert, running in 0.02000120000 seconds is going to become a problem. Having an insert that could form the basis for a good test, I've now tested this on 3,000,000 rows. With the results returned to SSMS, the query ran in 2 minutes, 43 seconds. Of course, it actually had to deliver 6 million rows because there are two selects, so I commented out the first one and it ran in 2 minutes, 30 seconds. I then ran the data into a temp table to remove all the row returns, and it ran in 2 minutes, 20 seconds. The initial INSERT accounts for 1 minute, 27 seconds of that time. At first I tried to test with 20,000,000 rows, but quickly found that there are only just over 3 million dates to work with between 1753-01-01 and 9999-12-31. FYI.
I'm pretty sure that applying this to considerably more data would likely be a bad idea, but for the small sample size, 0.02 seconds shouldn't be a problem.
That's what I'm talking about. People are happy with .02 seconds. For such a small row set, it should be virtually immeasurable.
--Jeff Moden
Change is inevitable... Change for the better is not.