August 20, 2008 at 5:58 am
I have the following query that is supposed to update a date column in a table, based on the lowest value of another date column in the same table, for every userid:
UPDATE dbo.S_TABLE
SET LOAD_END_DTS = GETDATE()
WHERE LOAD_END_DTS IS NULL
AND LOAD_DTS = (SELECT MIN(X.LOAD_DTS)
                FROM dbo.S_TABLE X
                WHERE USERID = X.USERID)
AND USERID IN (SELECT Y.USERID
               FROM dbo.S_TABLE Y
               WHERE Y.LOAD_END_DTS IS NULL
               GROUP BY Y.USERID
               HAVING COUNT(Y.LOAD_DTS) > 1)
The first subquery (the one computing MIN(X.LOAD_DTS)) is the correlated subquery that should make sure that the action is performed for EACH USERID.
The second subquery is just there to avoid updating rows for a USERID that has only one row with LOAD_END_DTS as null.
However, the correlation does not seem to work: the MIN(X.LOAD_DTS) is established for the entire table and not for each USERID.
What am I missing? This used to work perfectly in, for instance, Oracle.
August 20, 2008 at 9:47 am
If I understand it correctly, you want to update the load_end_dts date where load_dts is the lowest value and where a user has more than one entry in the table?
The catch is that your outer table has no alias, so the unqualified USERID inside the subquery binds to X (the nearest table in scope) and the comparison is always true; the MIN is then taken over the whole table. Alias the outer table and correlate explicitly. I think you should write the query this way:
Update s
Set s.load_end_dts = GetDate()
From dbo.s_table as s
Where s.load_end_dts is NULL
AND Load_DTS = (SELECT MIN(x.Load_dts) from dbo.s_table x
where x.userid = s.userid)
and s.userid IN (SELECT Y.USERID FROM dbo.S_TABLE Y
WHERE Y.LOAD_END_DTS IS NULL
GROUP BY Y.USERID
HAVING COUNT(Y.LOAD_DTS)>1)
that should do it...
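To make the scoping concrete: without an alias on the outer table, both sides of the comparison in your subquery resolve to X, so it is effectively this, and the MIN covers the whole table:
SELECT MIN(X.LOAD_DTS)
FROM dbo.S_TABLE X
WHERE X.USERID = X.USERID -- always true (for non-null USERID), so no correlation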
August 20, 2008 at 5:32 pm
How about using a derived table?
Update s
Set s.load_end_dts = GetDate()
From dbo.s_table as s
Join (Select UserId, Min(Load_Dts) as MinDts From dbo.s_table Group By UserId) as z
On s.UserId = z.UserId And s.Load_Dts = z.MinDts
Where s.load_end_dts is NULL
and s.userid IN (SELECT Y.USERID FROM dbo.S_TABLE Y
WHERE Y.LOAD_END_DTS IS NULL
GROUP BY Y.USERID
HAVING COUNT(Y.LOAD_DTS)>1)
It might be more efficient than a correlated subquery.
August 20, 2008 at 11:34 pm
Exactly. Go for the derived table; it is much, much more performant than a subquery.
August 21, 2008 at 5:25 am
I don't think that I agree.
More readable? Maybe. But more performant? I don't think so.
If you can show where the performance gains are, then I will be more than happy to agree with you.
August 21, 2008 at 10:07 am
I said it might be more efficient.
You can never tell for sure without testing. Comparative performance depends on data volumes, statistics, indexes, etc.
August 21, 2008 at 11:01 am
Richard, my reply was not directed at you; it was directed at arjun. Testing would be necessary to determine which method is faster, and indeed it all depends.
My previous post was intended to say that testing would be the only way of being certain, but I now see that it did not come out that way.
In the end, we have both solved the problem, albeit through different routes; it is now up to the OP to decide which way to go.
Maybe if I have some time when I am home tonight, I will recreate the problem, fill the table up to 200.000 rows, and do some performance testing (because I am curious).
I will let you know my results.
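In case anyone wants to play along at home, something like this should build the test set. Just a sketch, untested; the table name and column types are assumptions based on the OP's description:
-- throwaway copy of the OP's table (assumed shape)
CREATE TABLE dbo.s_table_test
(
    userid       INT,
    load_dts     DATETIME,
    load_end_dts DATETIME NULL
);
-- 200.000 rows, roughly 10 per userid, about a third of them closed
INSERT INTO dbo.s_table_test (userid, load_dts, load_end_dts)
SELECT n % 20000,
       DATEADD(dd, n % 365, '20080101'),
       CASE WHEN n % 3 = 0
            THEN DATEADD(dd, (n % 365) + 30, '20080101')
       END
FROM (SELECT TOP 200000
             ROW_NUMBER() OVER (ORDER BY a.object_id) AS n
      FROM sys.all_columns a
      CROSS JOIN sys.all_columns b) AS t;
Then run each UPDATE against a fresh copy of the data with SET STATISTICS TIME ON and compare.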
August 23, 2008 at 4:18 am
I have only had one chance to test the first suggestion by Hans, but it did not have the proper result. It looks like it only gave the proper result for the lowest key (the one that matches the requirement that two records must exist for a given key with load_end_dts = null).
Strange..
August 23, 2008 at 5:36 am
Blom, can you give an example? I am not sure I am following you.
When you say key, what is the key? The subquery as I wrote it gets the min value for Load_dts, not the key...
I am more than happy to see where it goes wrong, but I need a small example. Thanks!
August 26, 2008 at 3:34 am
I have the following table:
GEBID   LOAD_DTS            LOAD_END_DTS
1001    7-7-2008 0:00:00    NULL
1002    8-7-2008 0:00:00    14-7-2008 0:00:00
1002    15-7-2008 0:00:00   NULL
1002    20-8-2008 0:00:00   NULL
1003    1-8-2008 0:00:00    NULL
1003    20-8-2008 0:00:00   NULL
I use the script:
Update s
Set s.load_end_dts = dateadd(ss,-1,(GetDate()))
From dbo.s_gebruikers2 as s
Where s.load_end_dts is NULL
AND Load_DTS = (SELECT MIN(x.Load_dts) from dbo.s_gebruikers2 x
where x.gebid = s.gebid)
and s.gebid IN (SELECT Y.GEBID FROM dbo.s_gebruikers2 Y
WHERE Y.LOAD_END_DTS IS NULL
GROUP BY Y.GEBID
HAVING COUNT(Y.LOAD_DTS)>1)
Which results in:
GEBID   LOAD_DTS            LOAD_END_DTS
1001    7-7-2008 0:00:00    NULL
1002    8-7-2008 0:00:00    14-7-2008 0:00:00
1002    15-7-2008 0:00:00   NULL
1002    20-8-2008 0:00:00   NULL
1003    1-8-2008 0:00:00    26-8-2008 11:17:05
1003    20-8-2008 0:00:00   NULL
Where I would expect the 1002 row with LOAD_DTS 15-7-2008 to be updated to 26-8-2008 11:17:05 as well.
I have looked very hard, but have not found the cause yet..
August 27, 2008 at 8:57 am
Blom,
This should do what you need:
Update s
Set s.load_end_dts = dateadd(ss,-1,(GetDate()))
From dbo.s_gebruikers2 as s
Where s.Load_DTS = (SELECT MIN(x.Load_dts) from dbo.s_gebruikers2 x
where x.gebid = s.gebid and x.load_end_dts is NULL)
and s.gebid IN (SELECT Y.GEBID FROM dbo.s_gebruikers2 Y
WHERE Y.LOAD_END_DTS IS NULL
GROUP BY Y.GEBID
HAVING COUNT(Y.LOAD_DTS)>1)
It was excluding the 1002 rows before because MIN(x.Load_dts) for 1002 is the 8-7-2008 row, which already has a load_end_dts value; since the query only updates rows where load_end_dts is null, no open 1002 row could ever match that minimum. Restricting the MIN subquery to open rows (x.load_end_dts is NULL) fixes that.
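You can see it against your sample data with a quick check of the uncorrected minimum per gebid:
SELECT x.gebid, MIN(x.load_dts) AS min_dts
FROM dbo.s_gebruikers2 x
GROUP BY x.gebid
-- 1001 -> 7-7-2008, 1002 -> 8-7-2008, 1003 -> 1-8-2008
-- for 1002 the minimum is the row that was already closed on 14-7-2008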
Try this and let me know!
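For completeness: on SQL 2005 you could also do it with ROW_NUMBER through an updatable CTE. Just a sketch, untested, same table as above:
;WITH open_rows AS
(
    SELECT load_end_dts,
           ROW_NUMBER() OVER (PARTITION BY gebid ORDER BY load_dts) AS rn,
           COUNT(*)     OVER (PARTITION BY gebid) AS cnt
    FROM dbo.s_gebruikers2
    WHERE load_end_dts IS NULL        -- only the open rows count
)
UPDATE open_rows
SET load_end_dts = dateadd(ss, -1, GetDate())
WHERE rn = 1    -- the earliest open row per gebid
  AND cnt > 1;  -- only when there is more than one open row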
Hans..
September 3, 2008 at 1:22 am
Yep,
Hans, this seems to work out right. Pretty obvious in hindsight, but the more one stares, the more one misses such details.
Thanks again for your input!
September 3, 2008 at 7:36 pm
Heh... one person says it might be more performant, another says, with some vigor I may add, probably not. Neither has done a test and posted it. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
September 4, 2008 at 1:31 am
From my point of view the performance issue is secondary, since we're talking in the range of 5000 - 100000 records for a table. With SQL Server 2005 and a decent server, I expect this to be a breeze for either solution.
However, your point is still valid.
For those interested in the outcome, let us create a 100000 record sample table and test performance 🙂
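If we do, wrapping each version like this should give comparable numbers (run each UPDATE against a fresh copy of the data so they do the same work):
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- correlated-subquery version of the UPDATE here
-- derived-table version of the UPDATE here
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;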
September 4, 2008 at 8:38 pm
blom0344 (9/4/2008)
From my point of view the performance issue is secondary, since we're talking in the range of 5000 - 100000 records for a table. With SQL Server 2005 and a decent server, I expect this to be a breeze for either solution. However, your point is still valid.
For those interested in the outcome, let us create a 100000 record sample table and test performance 🙂
So... anybody gonna test this or what? 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.