April 4, 2007 at 7:07 pm
Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/bFernando/2954.asp
May 20, 2007 at 11:16 pm
I think this could do the job...
set nocount on
SET ROWCOUNT 1000
declare @rowcount int, @batchcount int, @rowsupdated int
declare @LastKey nvarchar(341)
select @rowcount = 1,@batchcount = 0, @rowsupdated = 0, @LastKey = ''
while @rowcount > 0
begin
begin tran
update MyTable
set NewZip = ZipCode,
@LastKey = NameKey
where NewZip is NULL
and NameKey >= @LastKey
set @rowcount = @@rowcount
select @rowsupdated = @rowsupdated + @rowcount
if @rowcount > 0 select @batchcount = @batchcount + 1
commit
end
*If NameKey were unique, we could use ">" instead of ">=" .
May 21, 2007 at 1:28 am
hehe, very nice solution - simple is beautiful
May 21, 2007 at 1:37 am
> update MyTable
> set NewZip = ZipCode,
> @LastKey = NameKey
That may well work in this example, but it is not something I'd like to use for production use. You are relying on @LastKey being set to the last (alphabetically) NameKey value updated. Your updating a set, it could take any of the values from that set.
May 21, 2007 at 2:09 am
Ians, I believe you can use it for production.
The last value assigned to @LastKey is the largest NameKey in the set because the update query definitely uses the index [LargeTable__IE1].
May 21, 2007 at 2:27 am
Can you point me at the documentation that guarantees that behaviour?
May 21, 2007 at 2:33 am
You are actually both right.
But just to make sure of this behavior, you can use an index hint in the query. This pretty much guarantees the results.
May 21, 2007 at 2:59 am
We cant both be right.
From BOL:-
Setting Variables and Columns
Variable names can be used in UPDATE statements to show the old and new values affected. This should only be used when the UPDATE statement affects a single record; if the UPDATE statement affects multiple records, the variables only contain the values for one of the updated rows.
It just looks wrong to me, trying to set a single variable to many values from a table is a bad idea.
May 21, 2007 at 6:15 am
Regardless of the "perfect" SQL for this task, 35 minutes is sufficiently better than 20 hours that the solution in the article is certainly good enough. If it generalizes better than an ideal but single-purpose query, then (imo) it has better reusability.
What is valuable in this article is that a problem is broken down and explained so we can see WHY the solution works so much better than the original example.
I would be interested in the performance/timing of setting the default value to a UDF at creation of the column? (which could be removed after the initial load)
May 21, 2007 at 6:58 am
Haven't tried it, but couldn't you just use a ranking function over ordering by the clustered index key to get a pseudo identity column? I can't think of any drawback to this approach, except that you can't do it in SS2K
May 21, 2007 at 6:58 am
Yes you can both be right. The value will be set to the last row updated, however in a few rare cases the, update may be done in something that looks like a random order, using the index hint "should" aleviate that possibility.
May 21, 2007 at 7:05 am
One other caveat I thought of with either solution, is that if the column will end up being mandatory aka Not Null, there will need to be a check at the end to ensure that there are no null rows that were added after the fact. Also unless you can guarantee that the code will provide a value a default is almost always needed.
For the question on using a default when adding the table, I would first crank up the size of the log file by a few GB, and I expect it would possibly be shorter than the 35 minutes, but it only helps if you have the same value to insert or a derivable formula.
Enjoy, Brian
May 21, 2007 at 7:41 am
This is extra useful since SET ROWCOUNT affecting updates is going away in the next release of SQL. http://msdn2.microsoft.com/en-us/library/ms188774.aspx
May 21, 2007 at 7:59 am
Very interesting article, definitely worth seeing.
Btw, though, shouldn't the columns be NCHAR(2) instead of NVARCHAR(2)? It's a waste of space and processing overhead to make a 2-byte column variable.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
May 21, 2007 at 8:03 am
Ha! Very nice. I'm facing a similar situation here. This article demonstrates a nice way of dealing with this situation. Not to mention the issue of blocking / locking on the table while just such an update is occurring. Bravo.
Viewing 15 posts - 1 through 15 (of 28 total)
You must be logged in to reply to this topic. Login to reply