February 22, 2007 at 1:33 pm
I am currently trying to update one 30-character column in a table (11.5 million rows, 40 columns) with an inner join to another table (i.e. update the county based on a person's zip code by looking it up in the zip code table). It has been running for 1.5 hours and I'm not sure if it should be taking this long. There is no primary or foreign key involved.
UPDATE Standard2
SET Current_County = ZipMkt06.County2
FROM Standard2
INNER JOIN ZipMkt06
  ON Standard2.patzip = ZipMkt06.zip
Should I expect 11.5mil rows to take this long to update based on what I'm trying to do or is something wrong? Thanks for your help.
February 22, 2007 at 2:00 pm
I suppose it could take that long depending on your hardware, how busy your db is, and whether there are any indexes involved... Also, are both of your zip code columns the same type? char(5) or varchar(9)? If the engine has to convert values between the two, that may add some time as well...
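If the join columns aren't indexed, a sketch like the following may help (assuming the table and column names mentioned in this thread; adjust to your actual schema):

```sql
-- Hypothetical indexes on the join columns used by the UPDATE in this
-- thread. Without them, the engine may scan 11.5M rows repeatedly.
CREATE INDEX IX_Standard2_patzip ON Standard2 (patzip);
CREATE INDEX IX_ZipMkt06_zip ON ZipMkt06 (zip);
```

Building the index itself takes time on a table this size, but it usually pays for itself on the join.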
February 22, 2007 at 2:23 pm
Since you are updating Standard2, you should not have joined it. Stop the query and do it this way:
UPDATE Standard2
SET Standard2.Current_County = ZipMkt06.County2
FROM ZipMkt06
WHERE Standard2.patzip = ZipMkt06.zip
Lowell
February 22, 2007 at 2:41 pm
I'm trying that now. Any thoughts on how long it could potentially run?
February 26, 2007 at 6:46 am
to LauraS
"I suppose it could take that long depending on your hardware, how busy your db is and if there are any indexes involved"
If you can answer some of the above questions, you may get some progress on this. A start would be to post ALL the DDL of the tables involved.
to Lowell
Currently your code is equivalent to LauraS's, just written without an explicit JOIN clause (the execution plan should be the same). However, LauraS's is the better style to practice, as it also supports the other types of joins without a change of style, particularly the LEFT JOIN option. I also believe that, in the long term, LauraS's is the form the ANSI standard is promoting and the one M$ is not proposing to make obsolete.
February 26, 2007 at 6:54 am
I appreciate everyone's assistance in this. I am fairly new to SQL so I am learning how the various queries operate.
I ended up creating a new table with all the columns from the Standard table and populating the Current_County column from the zipmkt table. For some reason, creating the table from scratch completed in less than 15 minutes, whereas the update query ran for 16 hours until I stopped it. It really did not make sense to me how a SELECT statement requesting all that information could run in a few seconds, while the same query as an UPDATE just kept spinning.
I'm not sure how busy the db is, because the server is maintained by a different department.
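For anyone following along, the "build a new table" approach described above might look something like this sketch (column names are assumptions from the thread; since Standard2 already has a Current_County column, you would list the columns explicitly rather than use s.*):

```sql
-- Hypothetical SELECT ... INTO version of the update. SELECT ... INTO can
-- be minimally logged (depending on the recovery model), which is one
-- reason it can beat a fully logged 11.5M-row UPDATE by a wide margin.
SELECT s.patzip,
       -- ...the other Standard2 columns here...
       z.County2 AS Current_County
INTO Standard2_New
FROM Standard2 AS s
LEFT JOIN ZipMkt06 AS z
  ON s.patzip = z.zip;
```

A LEFT JOIN keeps rows whose zip code has no match in the lookup table; an INNER JOIN would silently drop them.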
February 26, 2007 at 7:23 pm
Oh no, no, no... not true... check Books Online... every single example where a join is involved has the target of the update in the FROM clause, as well. In fact, we've had several problems on our systems where they used the form of update you suggest... pegged a 4 CPU box for 2 hours on what should have been a 2 minute update. Once the form of the UPDATE was corrected, the update only took 2 minutes. And, it was repeatable in our case... when we changed the form back, it would again peg the CPU's.
Both forms will work most of the time... the form where the target table is included in the FROM clause in the presence of a join will work all the time.
The fact that there is no primary key on the table could certainly be the cause of the slowness. Another factor may be that if the log file is too small, you could be waiting for growth. Same with TempDB... most folks just make it way too small.
And, there's a "tipping point" on every box... for example, if I update a million rows on a particular table on my box at home, it has no problem and does the update in short order. If I update 10 million rows of the same table, it works as expected, taking about 10 times as long as the million rows... but when I tried to update 30 million rows, instead of taking only 30 times as long, it ran a whole weekend before I killed it.
To find out if you've reached that "tipping point", try a smaller update, then a larger, etc. If you really need to UPDATE the whole table and it's beyond what you figure as a tipping point, break the update into several smaller pieces of, say, a million rows per update (most systems have no problems with a million rows). The CPU will love you...
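The batching advice above can be sketched as a loop (a minimal example assuming SQL Server 2005+ and the table and column names from this thread; pick a batch size your own box is comfortable with):

```sql
-- Update in batches of at most one million rows per pass. Each pass
-- touches only rows that still need the county filled in, so the loop
-- makes progress and log growth stays bounded.
DECLARE @rows INT;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (1000000) s
    SET Current_County = z.County2
    FROM Standard2 AS s
    INNER JOIN ZipMkt06 AS z
      ON s.patzip = z.zip
    WHERE s.Current_County IS NULL
       OR s.Current_County <> z.County2;

    SET @rows = @@ROWCOUNT;
END;
```

The WHERE clause is what lets the loop terminate: once a row is correct, the next pass skips it.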
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2007 at 9:23 pm
It might also help to add a WHERE clause if the majority of the rows don't need to be updated because the Current_County is already correct.
UPDATE Standard2
SET Current_County = ZipMkt06.County2
FROM Standard2
INNER JOIN ZipMkt06
  ON Standard2.patzip = ZipMkt06.zip
WHERE Standard2.Current_County IS NULL
   OR Standard2.Current_County <> ZipMkt06.County2