WHILE LOOP alternative

Question

WHILE LOOP alternative

gregs-1000765

Mr or Mrs. 500

Points: 550
More actions
October 22, 2013 at 12:25 pm

#299488

I'm trying to figure out an alternative to a WHILE LOOP statement.
I have a table with database key numbers (PartyKey) and a person's ID number (PersonId)
partykeypersonId
11
23
31
33
44
The problem is a Person can have more than one PartyKey number, and visa versa.
For instance, PartyKey 1 and 3 both have the same PersonId Number of 1.
PartyKey 3 and Partykey 2 also share a PersonId number of 3.
I need to get all combinations of Partykeys. In this case I would need PartyKeys 1, 2, and 3.
The WHILE LOOP works but this type of query is hard to understand and took quite some time to figure out.
Does anyone know of a better way to get the desired output? I have spent days working on this but other solutions took twice as long as the WHILE LOOP.
The desired output is this:
orderidpartykeynbrOfIterations
732141
882911
882932
882923
932331
932312
932322
Here is some test data and the desired resultset.
DECLARE @increment TINYINT
DECLARE @didinsert TINYINT
/* create test data for person id combinations */
CREATE TABLE #personIds(
partykey INT
,personId INT
)
INSERT INTO #personIds
SELECT 1, 1
UNION ALL
SELECT 2, 3
UNION ALL
SELECT 3, 1
UNION ALL
SELECT 3, 3
UNION ALL
SELECT 4, 4
;
/* create test data for orders */
CREATE TABLE #orders (
orderId INT
,partykey INT
)
INSERT INTO #orders
SELECT8829, 1
UNION ALL
SELECT 9323, 3
UNION ALL
SELECT 7321, 4
;
/* to store all partykey combinations */
CREATE TABLE #orderPartyIds (
orderid INT
,partykey INT
,incrementNbr TINYINT
)
;
SET @increment = 1
SET @didinsert = 1
/* insert order data orderspartyids table */
/* to be used in while loop below */
INSERT INTO #orderPartyIds (
orderid
,partykey
,incrementNbr
)
SELECTorderid
,partykey
,@increment /* to show this partykey is the original in the order */
FROM#orders
/* this routine loops thru the orderparties and personids tables to get all posible partykey combinations */
/* this actually works, just looks messy */
WHILE @increment < 10 AND @didinsert = 1
BEGIN
INSERT INTO #orderPartyIds (orderid, partykey, incrementNbr)
SELECT DISTINCT o.orderid, p1.partykey, @increment
FROM #orderPartyIds o
INNER JOIN #personIds p ON o.partykey = p.partykey
INNER JOIN #personIds p1 ON p.personId = p1.personId
WHEREincrementNbr = @increment - 1
AND NOT EXISTS (
SELECTorderid
FROM#orderPartyIds o1
WHEREo1.orderid = o.orderid
ANDo1.partykey = p1.partykey
)
SET @didinsert = 0
/* set the variable back to one if data was inserted */
SELECT@didinsert = 1
FROM#orderPartyIds
WHEREincrementNbr = @increment
/* increment the counter by one */
SET @increment = @increment + 1
END /* end of WHILE loop */
;
/* expected output */
SELECTorderid
,partykey
,incrementNbr AS nbrOfIterations
FROM#orderPartyIds
ORDER BY orderid
,incrementNbr
;
/* clean up */
DROP TABLE #personIds
DROP TABLE #orders
DROP TABLE #orderPartyIds

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Dhruvesh Shah SSCrazy Points: 2303 More actions · Answer 1

If you try grouping order id and partkey with count(*) that should give you number of unique combination of order id and part key.

If you want 1 2 3 next to it you can use Rownumber().

Nevyn SSChampion Points: 14210 More actions · Answer 2

Thats not what he wants if you look at his expected results. He wants to do a "six degrees of kevin bacon" on his data, finding all the parties associated to an order through mutual people.

It can be done with a recursive CTE, which is still not ideal, but likely better than the while loop. I don't think you can avoid some type of recursion here given that table, but I'd love to be proven wrong. There might be a way to do it with dynamic sql, too, but my head hurts just thinking about it.

EDIT: I THINK it can be done with a recursive CTE. No longer sure

Nevyn SSChampion Points: 14210 More actions · Answer 3

Here is the recursive CTE solution.

Not sure if it will be any quicker at all, and you still need to know the max iteration depth to put in an exit condition.

WITH

recursiveCTE AS (

SELECTorderid,

o.partykey,

p.personId,

1 AS incrementNbr

FROM #orders o

INNER JOIN #personIds p

ON p.partykey = o.partykey

UNION ALL

SELECT r.orderid,

p2.partykey,

p2.personId,

incrementNbr + 1 AS incrementNbr

FROM recursiveCTE r

INNER JOIN #personIds p1

ON p1.partykey = r.partykey

INNER JOIN #personIds p2

ON p2.personId = p1.personId

AND p2.partykey <> r.partykey

WHERE incrementNbr < 10

)

SELECT orderid,partykey,min(incrementNbr) AS nbrOfIterations

FROM recursiveCTE

GROUP by orderid,partykey

ORDER BY orderid, min(incrementNbr),partykey

gregs-1000765 Mr or Mrs. 500 Points: 550 More actions · Answer 4

Thanks Nevyn. That help me figure out what I did wrong.

Learned something new today. Didn't know what a "six degrees of kevin bacon" was. Interesting readying.

Anyway, the WHILE LOOP was still faster when executed against a table with 2,789,799 records.

I got the below script from some article on SSC some time ago and use it frequently on db's I have limited access on.

You can see the results of recursive CTE and the WHILE LOOP statement. SQL 2008 R2 on an i7 HP laptop.

The recursive CTE took almost one minute and the WHILE LOOP about 10 seconds or less. Did not clear cache between runs.

Anyway, I was hoping to clean up some code, since it being used more frequently, but the original developer may have had this right. Not giving up but it has to be shelved for right now.

Set NoCount On;

Declare @cpu_ int;

Declare @lreads_ int;

Declare @eMsec_ int;

Select

@cpu_ = cpu_time,

@lreads_ = logical_reads,

@eMsec_ = total_elapsed_time

From

sys.dm_exec_requests

Where

session_id = @@spid;

--====== Paste Code Below ============

--WHILE LOOP - three runs

----CpuMsLogRdsElapsed

----2596641465099

----2719641565540

----2841641465366

--Recursive CTE - two runs

----CpuMsLogRdsElapsed

----57360160042057833

----57687160041459077

--====== Paste Code Above =============

Select

cpu_time-@cpu_ as CpuMs,

logical_reads- @lreads_ as LogRds,

total_elapsed_time - @eMsec_ as Elapsed

From

sys.dm_exec_requests

Where

session_id = @@spid

GO

Nevyn SSChampion Points: 14210 More actions · Answer 5

Yeah, I'm not surprised.

If the deepest # of iterations isn't that high, the while loop performance would not be that bad.

The recursive cte can't tell which matches it has already found (I tried to integrate that but it did not go well), so it keeps linking down to max iterations, and then fixes it in the grouping.

I don't think the dynamic sql would be a performance boost either. Maybe if hundreds of iterations were possible one of these could become worth it.

You might be better off tweaking the loop for performance gains instead of replacing it.

Try out (loop only below, needs the rest of your code):

WHILE @increment < 10 AND (@didinsert > 0 OR @increment <= 2)

BEGIN

INSERT INTO #orderPartyIds (orderid, partykey, incrementNbr)

SELECT DISTINCT o.orderid, p1.partykey, @increment

FROM #orderPartyIds o

INNER JOIN #personIds p ON o.partykey = p.partykey

INNER JOIN #personIds p1 ON p.personId = p1.personId

WHEREincrementNbr = @increment - 1

AND NOT EXISTS (

SELECTorderid

FROM#orderPartyIds o1

WHEREo1.orderid = o.orderid

ANDo1.partykey = p1.partykey

)

SET @didinsert = @@ROWCOUNT

/* set the variable back to one if data was inserted */

--SELECT@didinsert = 1

--FROM#orderPartyIds

--WHEREincrementNbr = @increment

/* increment the counter by one */

SET @increment = @increment + 1

END /* end of WHILE loop */

In other words, you should not need to query your temp table to find out if you are at max depth. @@rowcount will tell you if anything got inserted on the current increment.

Nevyn SSChampion Points: 14210 More actions · Answer 6

As for the 'six degrees of kevin bacon' analogy, I thought it was a good fit to the problem you describe 🙂

Kind of a strange requirement ...

gregs-1000765 Mr or Mrs. 500 Points: 550 More actions · Answer 7

You know, sometimes the obvious is just too easy to see. Thanks.

I have been concentrating on getting rid of the WHILE LOOP.

When I first looked at the stored procedure it had over 960 lines of code. It also was not very readable (stretched out across the whole screen). I was able to take out repeatable processes and put them in either scalar functions or table valued functions, and made use of views where possible. This improved the performance of the proc tremendously and eliminated most, not all (yet), of the table scans.

The data we're working with is from integrations for other systems. Unfortunately in the source system a person can have more than one id and one id can have more than one person. Geez.

Just have to work with it. Not liking it but Oh Well.

Now that I've resigned myself to sticking with the WHILE LOOP, for awhile, I'll take your suggestion and look at ways to improve the logic.

Thanks again.

Dwain Camps SSC Guru Points: 86978 More actions · Answer 8

I guess its going to depend on how many levels you need but I believe this works on your test data:

WITH AllPersons AS

(

SELECT DISTINCT c.partykey

,personid=CASE WHEN a.personid = b.personid AND a.personid = c.personid THEN a.personid

WHEN a.personid = b.personid THEN c.personid

WHEN b.personid = c.personid THEN a.personid

ELSE b.personid END

FROM #personIDs a

JOIN #personIDs b ON a.partykey = b.partykey

JOIN #personIDs c ON b.personid = c.personid

)

SELECT DISTINCT orderid, c.partykey

FROM #orders a

JOIN AllPersons b ON a.partykey = b.partykey

JOIN #personIDs c ON b.personid = c.personid;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

This WHILE LOOP might be a little cleaner and faster than what you've got and should resolve to any number of levels.

DECLARE @increment TINYINT

DECLARE @didinsert TINYINT

/* create test data for person id combinations */

CREATE TABLE #personIds(

partykey INT

,personId INT

)

INSERT INTO #personIds

SELECT 1, 1

UNION ALL

SELECT 2, 3

UNION ALL

SELECT 3, 1

UNION ALL

SELECT 3, 3

UNION ALL

SELECT 4, 4

;

/* create test data for orders */

CREATE TABLE #orders (

orderId INT

,partykey INT

)

INSERT INTO #orders

SELECT8829, 1

UNION ALL

SELECT 9323, 3

UNION ALL

SELECT 7321, 4

;

/* to store all partykey combinations */

CREATE TABLE #orderPartyIds (

orderid INT

,partykey INT

,personid INT

,incrementNbr TINYINT

)

;

DECLARE @RowsCount INT, @Iteration INT = 1;

INSERT INTO #orderPartyIDs (orderid, partykey, personid, incrementNbr)

SELECT orderid, a.partykey, b.personid, @Iteration

FROM #orders a

JOIN #personIDs b ON a.partykey = b.partykey;

SELECT @RowsCount = @@ROWCOUNT, @Iteration = @Iteration + 1;

WHILE @RowsCount > 0

BEGIN

INSERT INTO #orderPartyIDs (orderid, partykey, personid, incrementNbr)

SELECT orderid, partykey, personid, @Iteration

FROM

(

SELECT a.orderid, b.partykey, b.personid

FROM #orderPartyIDs a

JOIN #personIDs b ON a.personid = b.personid OR a.partykey = b.partykey

EXCEPT

SELECT orderid, partykey, personid

FROM #orderPartyIDs

) a;

SELECT @RowsCount = @@ROWCOUNT, @Iteration = @Iteration + 1;

END

SELECT orderid, partykey, IncrementNbr

FROM

(

SELECT orderid, partykey, IncrementNbr

,rn=ROW_NUMBER() OVER (PARTITION BY orderid, partykey ORDER BY IncrementNbr)

FROM #orderPartyIDs

) a

WHERE rn=1;

GO

/* clean up */

DROP TABLE #personIds

DROP TABLE #orders

DROP TABLE #orderPartyIds

Note that I added a column to your #orderPartyIDs temp table.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 10

Hmmm... I was really hoping to hear back from the OP whether my suggestion helped.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St