Progress on long queries?

  • I'm currently running a query that is updating one field on each record of a table. The table has 110 million records. The update has been running for 46 hours and I am trying to determine its progress. I have used SQL Profiler to check the status of physical disk reads/sec and writes/sec but I don't really know how to relate that to how much time may be remaining.

    The database and log files are on separate RAID arrays. I can see that the log file is growing, so I assume the query will complete sometime. There are about 8 indexes on the table and I did not drop them before the update started. The machine has dual 2GHz Xeon processors and 4GB of RAM. I wouldn't have thought it would take this long, but that is another issue, I guess.

    Is there any way of measuring progress of this query?

    Thanks,

    David

  • 46 hours seems long; it could have stalled. One thing that can give you a clue is whether your transaction log is set to grow and happens to do so. But I suspect it is either not running, or it is blocked and just not timing out.
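
    As a rough way to look, here is a minimal sketch, assuming SQL Server 2000 system objects; the spid value is just a placeholder for whatever sp_who2 reports for the session running the UPDATE:

    -- Log space in use; a steadily growing "Log Space Used (%)"
    -- suggests the update is still writing.
    DBCC SQLPERF(LOGSPACE)

    -- Is the updating session blocked, or still accumulating I/O?
    -- Replace 55 with the spid of the session running the UPDATE.
    SELECT spid, blocked, waittype, cpu, physical_io
    FROM master..sysprocesses
    WHERE spid = 55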

  • The query needs to complete. Assuming it is hosed up now, how can I go about estimating an appropriate amount of time for this query? This is what is running now:

    begin transaction

    update detail
    set nameid = (select nameid
                  from header
                  where recordid = detail.recordid)

    commit transaction

    Is there a better way to write this? There is a relationship between the header and detail table on recordid. Any suggestions would be greatly appreciated.

  • The way you wrote it may not work right; in fact, it may be setting the value to every value in header, or to itself. We have found quirks with subqueries that do not use the table.col reference inside them, where the column was assumed to belong to the outside table instead.

    Try this instead:

    BEGIN TRANSACTION

    UPDATE detail
    SET nameid = header.nameid
    FROM detail
    INNER JOIN header
        ON header.recordid = detail.recordid

    COMMIT TRANSACTION

    Not sure how long the other version could take, but I would think only a few hours at the most, unless there is an index on recordid in the tables, which will speed it up.
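
    If such indexes are not already in place, a minimal sketch of what they might look like (the index names are just illustrative, and an existing primary key or foreign key index may already cover this):

    CREATE INDEX IX_header_recordid ON header (recordid)
    CREATE INDEX IX_detail_recordid ON detail (recordid)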


  • Thanks for the suggestion. I tried the new update on a small range of data (using a where clause) to get execution times, so I can estimate how long this will take.

    After 3 runs I am getting an average of 1 updated record every 0.00324 seconds. To complete the 107 million record update, that suggests about 96 hours. That seems long.

    I feel like the join must be what is slowing things down. There is a foreign key defined on the detail table with a reference to the header table. However, when I created the relationship, I did not choose the "check existing data on creation" option. Would that have made a difference?

    Thanks for all the help.

  • David,

    I have found that with long-running queries it is often better to break the query up into a number of smaller queries.

    The main reason for this is that if, for whatever reason, your query stops before it completes, the server will then roll back the updates, which can in itself take a long time. The server will also have to redo all the work that was rolled back when you rerun the query.

    If you split up the query into blocks of 1 million rows, then if you have to stop the query, you would only lose the updates for the current set of 1 million rows. All the other updates would have been committed to the database. For example:

    declare @count int
    set @count = 0

    while @count <= 110
    begin
        /* Display counter info */
        print convert(varchar, getdate())
        print convert(varchar, @count)

        begin transaction

        /* Run update sql with where clause */
        UPDATE detail
        SET nameid = header.nameid
        FROM detail
        INNER JOIN header
            ON header.recordid = detail.recordid
        WHERE detail.recordid BETWEEN (@count * 1000000) AND ((@count * 1000000) + 999999)

        commit transaction

        /* Increment counter */
        set @count = @count + 1
    end

    I haven't checked that this all works in Query Analyzer, but the principle is that it would update 1 million rows at a time. The first time round the loop it would update records 0 to 999,999; the second time round it would update records 1,000,000 to 1,999,999, and so on.

    This has the advantage that it will print out a date and time and counter information each time around so you can easily see whether the query is still running. Also, if you need to restart the query then you can change the start value of @count.

    You could also try issuing a table lock by replacing UPDATE detail with UPDATE detail WITH (TABLOCK). This might prevent other users from accessing the table, but SQL Server might spend less time managing the row locks and more time processing the update.
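
    If it helps to see it in context, here is a minimal sketch with the hint applied to the detail reference in the FROM clause of the batched update above (another valid place to put a table hint in an update that uses a join):

    UPDATE detail
    SET nameid = header.nameid
    FROM detail WITH (TABLOCK)
    INNER JOIN header
        ON header.recordid = detail.recordid
    WHERE detail.recordid BETWEEN (@count * 1000000) AND ((@count * 1000000) + 999999)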

  • Another way to do that is to issue a SET ROWCOUNT 1000000, or whatever value. It definitely makes sense to break the update up into chunks; you could even run multiple spids doing the update, as long as you segmented the where clause to avoid overlap.
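
    For instance, a rough sketch of how the range might be segmented across two sessions, using the same detail/header update from above (the boundary values are just illustrative):

    -- Session 1: first half of the recordid range
    UPDATE detail
    SET nameid = header.nameid
    FROM detail
    INNER JOIN header ON header.recordid = detail.recordid
    WHERE detail.recordid BETWEEN 0 AND 54999999

    -- Session 2: second half of the recordid range
    UPDATE detail
    SET nameid = header.nameid
    FROM detail
    INNER JOIN header ON header.recordid = detail.recordid
    WHERE detail.recordid BETWEEN 55000000 AND 109999999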

    Andy

    http://www.sqlservercentral.com/columnists/awarren/

  • Just one more thing. You are using a one-statement update, so the BEGIN TRANSACTION and COMMIT TRANSACTION statements are NOT necessary; in fact, they are NOT recommended for single-statement transactions.

    Noel

  • I frequently use a "batch template" with the SET ROWCOUNT statement to run updates like these. They are definitely the way to go. With a bit of extra code you could easily print out how many rows have been processed and estimate the time until completion; a sketch of that extra code follows the template below.

    --========================================================
    -- Run query in batches of a certain row size
    --========================================================
    Declare @BatchSize int
           ,@RowsAffected int

    Select @BatchSize = 100000
          ,@RowsAffected = @BatchSize

    Set Rowcount @BatchSize

    While @RowsAffected = @BatchSize
    Begin
        --====== Add Code
        Select @RowsAffected = @@Rowcount
    End

    Set Rowcount 0
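
    And here is a rough sketch of the extra progress code mentioned above, with the detail/header update from earlier in the thread dropped into the template. The running total and timing variables are just illustrative, and the WHERE detail.nameid IS NULL filter assumes nameid starts out NULL, so rows already updated are skipped on each pass:

    Declare @BatchSize int
           ,@RowsAffected int
           ,@TotalRows int
           ,@StartTime datetime

    Select @BatchSize = 100000
          ,@RowsAffected = @BatchSize
          ,@TotalRows = 0
          ,@StartTime = getdate()

    Set Rowcount @BatchSize

    While @RowsAffected = @BatchSize
    Begin
        -- Assumes nameid is NULL until it has been updated
        UPDATE detail
        SET nameid = header.nameid
        FROM detail
        INNER JOIN header ON header.recordid = detail.recordid
        WHERE detail.nameid IS NULL

        Select @RowsAffected = @@Rowcount
        Select @TotalRows = @TotalRows + @RowsAffected

        Print 'Rows processed so far: ' + convert(varchar, @TotalRows)
        Print 'Seconds elapsed so far: ' + convert(varchar, datediff(second, @StartTime, getdate()))
    End

    Set Rowcount 0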
