November 29, 2012 at 11:48 am
Phil Parkin (11/29/2012)
I wrote the above response before trying Craig's solution, which certainly does the business - thanks! Maybe this is one instance where a loop is the best solution. I'll leave it a while longer (no pun intended) before I implement it, to see whether anyone comes up with a set-based solution.
I poked at it a bit more yesterday and while I can make that process look cleaner (it's a bit disorganized) it's the best solution I have available without digging into indexes and possible shortcuts for some of the sub-data, like the BookingID list. The killer is in the self-referencing bookingIDs and the bi-directional hierarchy, where you're going both 'up and down' the chain simultaneously.
Glad to help, though. 🙂
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 29, 2012 at 6:58 pm
Hi
The following recursive option might work. It walks up and down from the queried invoice and joins the results.
DECLARE @invoiceID int = 3
;with rListUp as (
select a.invoiceid invoiceid, a.invoiceid currentid, a.bookingid, b.invoiceid nextinvoiceid
from #IDs a
CROSS APPLY (SELECT invoiceid, bookingid FROM #IDs m WHERE m.bookingid = a.bookingid ) b
where b.invoiceid > a.invoiceid
union all
select a.invoiceid, a.nextinvoiceid, a.bookingid, c.invoiceid
from rListUp a
CROSS APPLY (SELECT l.invoiceid, l.bookingid FROM #IDs l WHERE l.invoiceID = a.nextinvoiceid) b
CROSS APPLY (SELECT m.invoiceid, m.bookingid FROM #IDs m WHERE m.bookingid = b.bookingid and m.invoiceid > a.nextinvoiceid ) c
),
rListDown as (
select a.invoiceid invoiceid, a.invoiceid currentid, a.bookingid, b.invoiceid nextinvoiceid
from #IDs a
CROSS APPLY (SELECT invoiceid, bookingid FROM #IDs m WHERE m.bookingid = a.bookingid ) b
where b.invoiceid < a.invoiceid
union all
select a.invoiceid, a.nextinvoiceid, a.bookingid, c.invoiceid
from rListDown a
CROSS APPLY (SELECT l.invoiceid, l.bookingid FROM #IDs l WHERE l.invoiceID = a.nextinvoiceid) b
CROSS APPLY (SELECT m.invoiceid, m.bookingid FROM #IDs m WHERE m.bookingid = b.bookingid and m.invoiceid < a.nextinvoiceid ) c
),
rList AS (
SELECT invoiceid, nextinvoiceid FROM rListUp
UNION
SELECT invoiceid, nextinvoiceid FROM rListDown
)
select invoiceid, bookingid
from #IDs
where invoiceid in (
select nextinvoiceid from rlist where invoiceID = @InvoiceID
union
select @InvoiceID
)
It could probably be prettied up a bit more and there is still a chance that recursion limits will get hit if there is a long change of invoice/bookings
November 29, 2012 at 11:23 pm
mickyT (11/29/2012)
dwain.c (11/29/2012)
To all who participated in this thread:You may wish to review this new article:
http://www.sqlservercentral.com/articles/String+Manipulation/94365/
It provides a good utility function for solving this case and many other similar ones. Thanks to the OP for being the inspiration for the article!
Thanks Dwain
I've already had a look at it and I am going to have a really good read through it soon. Even though I haven't had a chance to dig into the functionality and digest it properly I was impressed with the methods you came up with. Great article:-)
Thanks for noticing and reading it, and the praise!
By all means leave some comments in the discussion thread and don't forget to rate the article!
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 29, 2012 at 11:25 pm
Hi Phil - allow me to propose something for you.
DECLARE @RowCount INT
DECLARE @IDs TABLE
(InvoiceId int not null
,BookingId int not null)
INSERT INTO @IDs
SELECT InvoiceID, BookingID
FROM #IDs
WHERE InvoiceID = 1
SET @RowCount = @@ROWCOUNT
WHILE @RowCount <> 0
BEGIN
INSERT INTO @IDs
SELECT a.InvoiceID, a.BookingID
FROM #IDs a
INNER JOIN @IDs b ON a.BookingID = b.BookingID
UNION ALL
SELECT a.InvoiceID, a.BookingID
FROM #IDs a
INNER JOIN @IDs b ON a.InvoiceID = b.InvoiceID
EXCEPT
SELECT a.InvoiceID, a.BookingID
FROM @IDs a
SET @RowCount = @@ROWCOUNT
END
SELECT * FROM @IDs
I think this resolves the original and the alternate data setup (from Kraig?) properly. As it turns out, I recently solved an problem quite similar to this and this solution has its basis in that.
Not to disappoint you Alan (as this uses a loop), it is at least a set-based loop and in the similar problem I solved any attempt at a recursive solution did not perform nearly as well. I have not done that testing here so I can't say for sure whether it will be faster or not.
The other advantage to this approach is that it eliminates feedback loops which will completely disintegrate any effort to use a rCTE.
Let me know of any cases it doesn't resolve. It is possible that another UNION ALL amidst the loop will take care of it.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 29, 2012 at 11:27 pm
Post deleted - SSC hung and posted this one to the wrong thread.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
November 30, 2012 at 3:32 pm
No loops, no cursors, no cte
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable
CREATE TABLE #TempTable
(
InvoiceId INT NOT NULL
,BookingId INT NOT NULL
,PRIMARY KEY (InvoiceId,BookingId)
)
INSERT #TempTable
(
InvoiceId
,BookingId
)
SELECT
*
FROM
( VALUES ( 1, 9), ( 1, 10), ( 1, 11), ( 2, 11), ( 3, 11), ( 3, 12), ( 3, 13), ( 4, 14), ( 5, 14) ) data (InvoiceId,BookingId)
SELECT DISTINCT
t2.InvoiceId
,t2.BookingId
FROM
(
SELECT
MAX(BookingId) AS BookingId
FROM
#TempTable
GROUP BY
BookingId
) AS Sub1
INNER JOIN
#TempTable AS t1
ON Sub1.BookingId = t1.BookingId
INNER JOIN
#TempTable AS t2
ON t1.BookingId = t2.BookingId + 1
November 30, 2012 at 3:42 pm
Steven Willis (11/30/2012)
No loops, no cursors, no cte
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable
CREATE TABLE #TempTable
(
InvoiceId INT NOT NULL
,BookingId INT NOT NULL
,PRIMARY KEY (InvoiceId,BookingId)
)
INSERT #TempTable
(
InvoiceId
,BookingId
)
SELECT
*
FROM
( VALUES ( 1, 9), ( 1, 10), ( 1, 11), ( 2, 11), ( 3, 11), ( 3, 12), ( 3, 13), ( 4, 14), ( 5, 14) ) data (InvoiceId,BookingId)
SELECT DISTINCT
t2.InvoiceId
,t2.BookingId
FROM
(
SELECT
MAX(BookingId) AS BookingId
FROM
#TempTable
GROUP BY
BookingId
) AS Sub1
INNER JOIN
#TempTable AS t1
ON Sub1.BookingId = t1.BookingId
INNER JOIN
#TempTable AS t2
ON t1.BookingId = t2.BookingId + 1
The problem with your code is that if the value (3,14) is present it does not get picked up and neither does (4,14) or (5,14)
IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
DROP TABLE #TempTable
CREATE TABLE #TempTable
(
InvoiceId INT NOT NULL
,BookingId INT NOT NULL
,PRIMARY KEY (InvoiceId,BookingId)
)
INSERT #TempTable
(
InvoiceId
,BookingId
)
SELECT
*
FROM
( VALUES ( 1, 9), ( 1, 10), ( 1, 11), ( 2, 11), ( 3, 11), ( 3, 12), ( 3, 13), (3,14), ( 4, 14), ( 5, 14) ) data (InvoiceId,BookingId)
so far an extension of my adjacency list by searching up and down (not the fastest thanks Mickey T. by the way) and dwains loop solution are the only ones that seem to work fully on the set of data we have.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
November 30, 2012 at 8:25 pm
capnhector (11/30/2012)
so far an extension of my adjacency list by searching up and down (not the fastest thanks Mickey T. by the way) and dwains loop solution are the only ones that seem to work fully on the set of data we have.
Erm, that was mine, as well. Dwain's is basically a variation. I still need a chance to check performance between the two.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 30, 2012 at 8:28 pm
Steven Willis (11/30/2012)
No loops, no cursors, no cte
You've answered the data, not the problem, with this. You cannot recurse in both directions, you cannot deal with more than 2 layers of heirarchy. You need to be able to traverse bi-directionally across bookingIDs and InvoiceIDs as well as deal with multiple invoice-booking-invoice-booking-invoice chains.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
November 30, 2012 at 8:51 pm
OK, sorry. I only read the first few posts and made some faulty assumptions. After considering the last few comments I can see that it really is more complicated than it first appeared. Mea culpa.
December 1, 2012 at 9:43 am
Alan.B (11/28/2012)
Evil Kraig F (11/28/2012)
This is NOT pretty, but it IS functional.I am trying to understand why went with a loop vs a set-based approach. I posted a more set based method earlier that gets the same results notably faster. I am not trying to be confrontational, I'm just trying to understand your approach or what I did wrong.
If the problem requires that, no matter which InvoiceID you pick, all associated rows for all associated InvoiceIDs will be returned, then it actually can't be done in what most folks consider to be a "set based" manner. There needs to be some form of recursion either as a WHILE loop or as an rCTE (Recursive CTE).
I will say that the WHILE loop and the rCTE method can more closely come to "set based" in that sets of rows can be returned for each iteration but it can't be done in a single itermation (what most folks think of as "set based") without giving up on the trait of "unknown depth" as some of the set based solutions on this thread have.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 1, 2012 at 12:27 pm
Post withdrawn. I made a mistake in the code somewhere. I'll post it as a new post if I can figure out what the heck I did wrong and fix it.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 1, 2012 at 12:48 pm
*wipes away a tear from his eye* Damn, Jeff... that's... that's just beautiful. *sniff*
I couldn't for the life of me figure out how to get out of the IN (SELECT InvoiceID From rCTE) situation.
Only one problem.
It doesn't recurse past the first series of booking IDs. Check this out:
if object_id('tempdb..#IDs') is not null
drop table #IDs
create table #IDs (
InvoiceId int not null
,BookingId int not null
) on [PRIMARY]
go
alter table #IDs add constraint PK_IDs primary key clustered (
InvoiceId
,BookingId
)
with (
STATISTICS_NORECOMPUTE = off
,IGNORE_DUP_KEY = off
,ALLOW_ROW_LOCKS = on
,ALLOW_PAGE_LOCKS = on
) on [PRIMARY]
go
INSERT #IDs VALUES (1,9)
INSERT #IDs VALUES (1,10)
INSERT #IDs VALUES (1,11)
INSERT #IDs VALUES (2,11)
INSERT #IDs VALUES (2,12)
INSERT #IDs VALUES (3,12)
INSERT #IDs VALUES (3,13)
INSERT #IDs VALUES (4,14)
INSERT #IDs VALUES (5,14)
GO
--insert #IDs ( InvoiceId, BookingId)
--select * from (values (1,9), (1,10), (1,11), (2,11), (3,11), (3,12), (3,13), (4,14), (5,14)) data(InvoiceId,BookingId)
select * from #IDs
--=================================================================================
-- Solve the problem using some "standard" "Traveling Salesman" or "Airline
-- Flight Schedule" code. I believe you'll be absolutely amazed as the
-- performance of this code.
-- This code could easily be turned into an iTVF to be used in CROSS APPLY as
-- if it were a parameterized view.
--=================================================================================
--===== Start a timer because I don't trust SET STATISTICS any more (1).
-- Ref (1): http://www.sqlservercentral.com/articles/T-SQL/91724/
DECLARE @StartTime DATETIME;
SELECT @StartTime = GETDATE();
--===== Declare a variable for the original InvoiceID that we want to find and
-- stack the others against. This is like identifying the "Start City"
-- in a "Traveling Salesman" problem.
DECLARE @StartInvoiceID INT;
SELECT @StartInvoiceID = 1; --<<< Change this to do other tests.
--<<< Any number from 1 to 300,000 will do.
--===== Solve the problem as if it were a "Traveling Salesman" problem. The key
-- to this is to leave "breadcrumbs" as to where we've been by using a
-- "Materialized Hierarchical Path" (the "BeenThere" column).
WITH
cteBuildPath AS
( --=== Find the "top" ID as we would in any hierarchy or net.
SELECT a.InvoiceId, a.BookingID,
BeenThere = CAST(QUOTENAME(a.InvoiceID,'"') AS VARCHAR(8000))
FROM #IDs AS a --as in "anchor"
WHERE InvoiceID = @StartInvoiceID
UNION ALL
--==== "Loop" through all of the possibilities with some "stop code" to prevent
-- the inevitable cycles that will appear because it's not a "DAG".
SELECT b.InvoiceID, b.BookingID,
BeenThere = r.BeenThere + CAST(QUOTENAME(b.InvoiceID,'"') AS VARCHAR(8000))
FROM #IDs b --as in "base table"
JOIN cteBuildPath r --as in "recursive part"
ON r.BookingID = b.BookingId
WHERE r.BeenThere NOT LIKE '%'+QUOTENAME(b.InvoiceId,'"')+'%'
)
SELECT * FROM cteBuildPath
--
----=== Find the full data for the InvoiceID's that we've touched at least once.
-- -- Note that the WHERE IN acts as if it were a DISTINCT so no need to explicity
-- -- use the word DISTINCT.
-- SELECT o.*
-- FROM #IDs o --as in "original"
-- WHERE o.InvoiceID IN (SELECT InvoiceId FROM cteBuildPath)
-- ORDER BY o.InvoiceID, o.BookingID
--;
--===== Let's see how long that took.
PRINT 'Total Duration = ' + CONVERT(CHAR(12),GETDATE()-@StartTime,114)
GO
The result set stops at InvoiceID 2, and doesn't chain to 3 after picking up the BookingIDs for Invoice 2. Sorry for the horrible INSERT statement but I've only got 2k5 at the house.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
December 1, 2012 at 12:57 pm
Heh... dammit. I had it working and made a final tweek to the code. Now I have to go back and figure out what the heck I changed. Sorry for the mistake. I'll fix it (I think) and will repost it. I took down my original post because it was wrong.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 1, 2012 at 1:18 pm
Looking at the data that you've included, Craig, I'm thinking that maybe I didn't have it working at all when it comes to "disconnected" data like you did with the 2,12 entry. Thanks for the test. Back to the drawing board. :blush:
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 16 through 30 (of 35 total)
You must be logged in to reply to this topic. Login to reply