August 15, 2013 at 2:06 am
GPO (8/15/2013)
Chris how would you change yours to work on SQL 2005 (no cross apply and table value constructors)?
SQL2k5 introduced APPLY so you're ok with it in your query. Here's a version which uses a 2k5 compliant row generator:
;WITH Tens (n) AS (
SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0
),
RowGenerator AS (
SELECT n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM Tens a CROSS JOIN Tens b CROSS JOIN Tens c CROSS JOIN Tens d CROSS JOIN Tens e
)
SELECT
location_id,
unoccupied_start_dt = CASE WHEN seq = 1 THEN NULL ELSE unoccupied_start_dt END,
unoccupied_end_dt = CASE WHEN seq = [Rows] THEN NULL ELSE unoccupied_end_dt END
FROM (
SELECT
location_id,
unoccupied_start_dt = MIN(Timespot),
unoccupied_end_dt = MAX(Timespot),
seq = ROW_NUMBER() OVER(PARTITION BY location_id ORDER BY TimeGroup),
[Rows] = COUNT(*) OVER(PARTITION BY location_id)
FROM (
SELECT
s.location_id, s.MIN_start_dt, s.MAX_end_dt,
x.Timespot,
TimeGroup = DATEADD(minute,1-ROW_NUMBER() OVER(PARTITION BY s.location_id ORDER BY x.Timespot), x.Timespot)
FROM (SELECT location_id, MIN_start_dt = MIN(start_dt), MAX_end_dt = MAX(end_dt) FROM #stays GROUP BY location_id) s
CROSS APPLY (
SELECT TOP (DATEDIFF(minute,s.MIN_start_dt,s.MAX_end_dt)+3)
TimeSpot = DATEADD(minute,n-2,s.MIN_start_dt)
FROM RowGenerator
) x
WHERE NOT EXISTS (SELECT 1 FROM #stays l WHERE l.location_id = s.location_id
AND x.Timespot BETWEEN l.start_dt AND l.end_dt)
) d
GROUP BY location_id, TimeGroup
) o
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 15, 2013 at 4:33 am
ChrisM@Work (8/15/2013)
GPO (8/15/2013)
Hi DwainI'm enormously grateful for the code you've put up. I'll test yours and Chris's and see what I can learn from them. I'll post back my observations after some time for reflection...(he said clinging for dear life to the learning curve)
Dwain's quite capable of providing an American English description of how his code works. Here's an English description of mine π
For each LocationID, find the earliest and latest date in the set. Subtract one interval from the earliest and add one interval to the latest. An interval for this exercise is defined as one minute.
Generate a row for each interval between these two dates - a set of dates incrementing by one minute from the start date (minus a minute) to the end date (plus a minute).
Remove rows from the list which are between the start date and end date of any visit for the locationID. This will leave a date range with gaps in it, where the gaps correspond to visits.
Divine the start and end date of each contiguous date range remaining.
Finally, process the start and end date to generate the NULLs shown in your example.
Nice easy query to finish off a busy day with π
Not sure there's much "divine" about mine to explain. I was hoping the links provided would pretty much cover it. I'm open to any questions though. π
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
August 15, 2013 at 6:06 am
GPO (8/15/2013)
Awesome to have the heavy hitters on the case! Jeff, where you say:You're not actually supposed to use any data from that. You're only supposed to use the presence of rows instead of writing a loop
I understand that, but if you run that it returns 5 columns of zeros. All called [n]. And I can't work out why that's necessary. Surely one column of zeros would be enough.
Edit: I think I get it now! The cross joins mean that 10^5 rows are generated, and you're saying it's irrelevant that 5 meaningless columns just happen to be generated. Doing an extra CROSS JOIN would presumably result in 6 meaningless columns and a million rows. Does that sound right?
Edit 2:
a readless Tally Table
So that means a tally table that doesn't have to be read from the disk, and is presumably therefore faster than the permanent tally table I usually use. How am I going so far?
Thank you for the kind words and I know you meant no slight but, just to be sure, Chris and Dwain are both "heavy hitters" in my book. If they weren't, I wouldn't have learned the things from them that I've learned. π
Your edit #1 is spot on and both Chris and Dwain have each provided good explanations.
Your edit #2, however, isn't quite right. Allow me to explain. Which is faster? Reading values cached in high speed memory or calculating values? If you take a look at the following chart, you'll find that reading from a cached table ever so slightly edges out calculating the numbers. The "problem" with reading from a cached Tally Table is that it will produce a huge number of reads. While not a problem, in this case, it's still bothersome for many because one of the indicators of other types of performance challenged code is to measure the reads for each query.
If you're interested, the chart above came from the following article...
http://www.sqlservercentral.com/articles/T-SQL/74118/
Also, I've been experimenting more with the performance of the cCTE (Cascading CTE) method that Itzik Ben-Gan originally came up with and I'll post my latest tonight after work (if I can remember that long, lots going on :-P).
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2013 at 6:28 am
ChrisM@Work (8/15/2013)
Dwain's quite capable of providing an American English description of how his code works. Here's an English description of mine π
BWAAAA-HAAAA!!!! I'm not sure that's so true anymore. He's developing an accent from all the traveling he's been doing. :-):-D:-P;-)
--Jeff Moden
Change is inevitable... Change for the better is not.
August 15, 2013 at 8:24 am
As a little aside, I had a quick look at the rCTE method vs other methods of generating rows. Here's the code:
-- Q1
DROP TABLE #Temp1;
WITH RowGenerator AS (
SELECT n = 1
UNION ALL
SELECT n+1 FROM RowGenerator WHERE n < 500000
)
SELECT n
INTO #Temp1
FROM RowGenerator OPTION (MAXRECURSION 0)
GO 10
-- Q2
DROP TABLE #Temp2;
SELECT TOP (500000)
n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
INTO #Temp2
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n) -- 10
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n) -- 100
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n) -- 1000
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n) -- 10000
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n) -- 100000
CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n) -- 1000000
GO 10
-- Q3
DROP TABLE #Temp3;
;WITH
_2 (n)AS (SELECT 0 UNION ALL SELECT 0),
_4 (n)AS (SELECT a.n FROM _2 a, _2 b),
_16 (n)AS (SELECT a.n FROM _4 a, _4 b),
_256 (n)AS (SELECT a.n FROM _16 a, _16 b),
_65536 (n)AS (SELECT a.n FROM _256 a, _256 b)
SELECT TOP (500000)
n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
INTO #Temp3
FROM _65536 a, _16 b
GO 10
Using profiler to obtain CPU/duration, the rCTE is about 30x slower than the two other methods. Whilst rCTE's can be handy for calculations such as complex running totals and for hierarchy resolution, they are exceedingly expensive (read hopeless) for row generation.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 15, 2013 at 8:26 am
Jeff Moden (8/15/2013)
ChrisM@Work (8/15/2013)
Dwain's quite capable of providing an American English description of how his code works. Here's an English description of mine πBWAAAA-HAAAA!!!! I'm not sure that's so true anymore. He's developing an accent from all the traveling he's been doing. :-):-D:-P;-)
With the accent he had, he was afraid of being headhunted. Not so good in PNG :unsure:
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 15, 2013 at 7:23 pm
Jeff Moden (8/15/2013)
Thank you for the kind words and I know you meant no slight but, just to be sure, Chris and Dwain are both "heavy hitters" in my book. If they weren't, I wouldn't have learned the things from them that I've learned. π
Consider myself but a Padawan-learner I do. From a SQL Jedi Master such as yourself, coming such high praise humbles me it does.
ChrisM@Work (8/15/2013)
Jeff Moden (8/15/2013)
ChrisM@Work (8/15/2013)
Dwain's quite capable of providing an American English description of how his code works. Here's an English description of mine πBWAAAA-HAAAA!!!! I'm not sure that's so true anymore. He's developing an accent from all the traveling he's been doing. :-):-D:-P;-)
With the accent he had, he was afraid of being headhunted. Not so good in PNG :unsure:
Not only can I do Yoda-speak, I have picked up a fair amount of Aussie slang too. G'day to ya mates!
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 3, 2013 at 9:12 am
Some time ago two gap problems where published at
http://beyondrelational.com/puzzles/tsql/default.aspx
under puzzles 79 and 80.
I recently also read about 'Static Relational Interval Tree' at
http://www.solidq.com/sqj/Pages/Home.aspx
which at some point talks about how to deal with gaps in big data.
September 3, 2013 at 7:14 pm
I was going to look up this thread but Michael saved me the trouble. Thanks Michael!
What I wanted to say was, what happened to the OP? He promised to let us know what worked for him.
π
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
September 3, 2013 at 11:32 pm
Hi Dwain
Sorry for not getting back to you. I've been diverted to other projects for a while. I provided a rather simplified version of the problem as an example, but based on what you posted, I did get it working on my production data. I don't have any concrete metrics to report yet.
Cheers
...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell
Viewing 10 posts - 16 through 24 (of 24 total)
You must be logged in to reply to this topic. Login to reply