New Column Updates

Question

Post reply

New Column Updates

Bimal Fernando

SSCommitted

Points: 1945
More actions
April 4, 2007 at 7:07 pm

#177558

Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/bFernando/2954.asp

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply

JH Park SSC Enthusiast Points: 160 More actions · Answer 1

I think this could do the job...

set nocount on

SET ROWCOUNT 1000

declare @rowcount int, @batchcount int, @rowsupdated int

declare @LastKey nvarchar(341)

select @rowcount = 1,@batchcount = 0, @rowsupdated = 0, @LastKey = ''

while @rowcount > 0

begin

begin tran

update MyTable

set NewZip = ZipCode,

@LastKey = NameKey

where NewZip is NULL

and NameKey >= @LastKey

set @rowcount = @@rowcount

select @rowsupdated = @rowsupdated + @rowcount

if @rowcount > 0 select @batchcount = @batchcount + 1

commit

end

*If NameKey were unique, we could use ">" instead of ">=" .

Jure Right there with Babe Points: 751 More actions · Answer 2

Jure

Right there with Babe

Points: 751

May 21, 2007 at 1:28 am

#707740

hehe, very nice solution - simple is beautiful

ians SSC Eights! Points: 964 More actions · Answer 3

> update MyTable

> set NewZip = ZipCode,

> @LastKey = NameKey

That may well work in this example, but it is not something I'd like to use for production use. You are relying on @LastKey being set to the last (alphabetically) NameKey value updated. Your updating a set, it could take any of the values from that set.

JH Park SSC Enthusiast Points: 160 More actions · Answer 4

Ians, I believe you can use it for production.

The last value assigned to @LastKey is the largest NameKey in the set because the update query definitely uses the index [LargeTable__IE1].

ians SSC Eights! Points: 964 More actions · Answer 5

Can you point me at the documentation that guarantees that behaviour?

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

You are actually both right.

But just to make sure of this behavior, you can use an index hint in the query. This pretty much guarantees the results.

ians SSC Eights! Points: 964 More actions · Answer 7

We cant both be right.

From BOL:-

Setting Variables and Columns

Variable names can be used in UPDATE statements to show the old and new values affected. This should only be used when the UPDATE statement affects a single record; if the UPDATE statement affects multiple records, the variables only contain the values for one of the updated rows.

It just looks wrong to me, trying to set a single variable to many values from a table is a bad idea.

Mike Dougherty-384281 SSCrazy Points: 2794 More actions · Answer 8

Regardless of the "perfect" SQL for this task, 35 minutes is sufficiently better than 20 hours that the solution in the article is certainly good enough. If it generalizes better than an ideal but single-purpose query, then (imo) it has better reusability.

What is valuable in this article is that a problem is broken down and explained so we can see WHY the solution works so much better than the original example.

I would be interested in the performance/timing of setting the default value to a UDF at creation of the column? (which could be removed after the initial load)

Adrian Hains Ten Centuries Points: 1322 More actions · Answer 9

Haven't tried it, but couldn't you just use a ranking function over ordering by the clustered index key to get a pseudo identity column? I can't think of any drawback to this approach, except that you can't do it in SS2K

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

Yes you can both be right. The value will be set to the last row updated, however in a few rare cases the, update may be done in something that looks like a random order, using the index hint "should" aleviate that possibility.

Brian Munier SSCrazy Points: 2875 More actions · Answer 11

One other caveat I thought of with either solution, is that if the column will end up being mandatory aka Not Null, there will need to be a check at the end to ensure that there are no null rows that were added after the fact. Also unless you can guarantee that the code will provide a value a default is almost always needed.

For the question on using a default when adding the table, I would first crank up the size of the log file by a few GB, and I expect it would possibly be shorter than the 35 minutes, but it only helps if you have the same value to insert or a derivable formula.

Enjoy, Brian

Rob Scholl Old Hand Points: 300 More actions · Answer 12

This is extra useful since SET ROWCOUNT affecting updates is going away in the next release of SQL. http://msdn2.microsoft.com/en-us/library/ms188774.aspx

ScottPletcher SSC Guru Points: 101120 More actions · Answer 13

Very interesting article, definitely worth seeing.

Btw, though, shouldn't the columns be NCHAR(2) instead of NVARCHAR(2)? It's a waste of space and processing overhead to make a 2-byte column variable.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeremy Brown SSCarpal Tunnel Points: 4223 More actions · Answer 14

Ha! Very nice. I'm facing a similar situation here. This article demonstrates a nice way of dealing with this situation. Not to mention the issue of blocking / locking on the table while just such an update is occurring. Bravo.