Updating 9 million records takes too much time...

Question

Updating 9 million records takes too much time...

reg-653649

Valued Member

Points: 61
More actions
February 26, 2008 at 4:07 am

#182560

After about 15 min the update times out...
Tabledef:
CREATE TABLE [dbo].[tblObjectUpdateInfo](
[ObjectID] [int] NOT NULL DEFAULT (0),
[PocketPCID] [int] NOT NULL DEFAULT (0),
[UpdateInfo] [tinyint] NOT NULL DEFAULT (0),
CONSTRAINT [PK_tblObjectUpdateInfo] PRIMARY KEY CLUSTERED
(
[PocketPCID] ASC,
[ObjectID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Update to be done:
UPDATE tblObjectUpdateInfo SET ObjectID = ObjectID + 99999;
9 million record is of course a lot of records, but it should take this long anyway I think... any suggestions?

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 1

Because the column you are updating is part of the clustered index, the update will be a delete followed by an insert. If you are in full recovery mode, this will be quite a load on your transaction log. Check if changing to bulk logged mode will help.

Also, do you have any other indexes? It will most likely be worth to disable them for the update.

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

reg-653649 Valued Member Points: 61 More actions · Answer 2

I'm running Simple logging.

I have this Index too... and one more questions about that. Is this necessery when I have the clustered [ObjectID+PocketPCID] index? Will a search without the extra nonclustered [Object] index still be indexed using the first part of the clustered index (when serching on only ObjectID, or doing a join on TableA.ObjectID = tblObjectUpdateInfo.ObjectID)?

CREATE NONCLUSTERED INDEX [I_ObjectID] ON [dbo].[tblObjectUpdateInfo]

(

[ObjectID] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 3

reg (2/26/2008)
I'm running Simple logging.
I have this Index too... and one more questions about that. Is this necessery when I have the clustered [ObjectID+PocketPCID] index? Will a search without the extra nonclustered [Object] index still be indexed using the first part of the clustered index?
CREATE NONCLUSTERED INDEX [I_ObjectID] ON [dbo].[tblObjectUpdateInfo]
(
[ObjectID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Because of the order of your clustered index, the second index is useful. Disabling it for the table update should speed things up too.

(alter index .. on .. disable)

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 4

Since you're doing this all in one transaction, I don't think it'll make any difference what your recovery mode is.

The first part of your clustered index key is PocketPCID and so if you are searching on ObjectID then your query won't use the clustered index for that purpose.

John

reg-653649 Valued Member Points: 61 More actions · Answer 5

Got error running this:

ALTER INDEX I_ObjectID ON tblObjectUpdateInfo DISABLE

ALTER INDEX PK_tblObjectUpdateInfo ON tblObjectUpdateInfo DISABLE

UPDATE tblObjectUpdateInfo SET ObjectID = ObjectID + 99999;

Msg 8655, Level 16, State 1, Line 3

The query processor is unable to produce a plan because the index 'PK_tblObjectUpdateInfo' on table or view 'tblObjectUpdateInfo' is disabled.

Andras Belokosztolszki SSC-Insane Points: 22031 More actions · Answer 6

Please only disable the nonclustered indexes. If you disable the clustered index, you will not be able to access the table.

Regards,

Andras

Andras Belokosztolszki, MCPD, PhD
GoldenGate Software

G.R.Prithiviraj Kulasingham SSCarpal Tunnel Points: 4105 More actions · Answer 7

You also need to consider, updating the records in small chucks at a time. Even though it will take more time than usual, the rest of the table will be available for others while you work with updates.

When you do one update over all rows, SQL Server places an exclusive lock at table level

Cheers,
Prithiviraj Kulasingham

http://preethiviraj.blogspot.com/

reg-653649 Valued Member Points: 61 More actions · Answer 8

Is there a smart way to do updates in small chucks at a time just specifying what amount of records I want to update? Or do I have to use a: WHERE ObjectID > X AND ObjcetID < Y?

Tommy Bollhofer SSChampion Points: 14940 More actions · Answer 9

Use a WHILE loop to incrementaly update the records using a ROWLOCK hint. A crude example, but you get the idea 🙂

create table ##UpdateRows(somePK int)

insert into ##UpdateRows

select somePK

from sometable

where something = something

create nonclustered index IX_UpdateRows on ##UpdateRows(somePK)

while (select count(*) from ##UpdateRows) > 0

begin

select getdate()

create table ##UpdateRowsWorking(case_id int)

insert into ##UpdateRowsWorking (somePK)

select top 1000 somePK from ##UpdateRows

create nonclustered index IX_UpdateRowsWorking on ##UpdateRowsWorking(somePK)

begin transaction

/* Begin Do Stuff */

update sometable with(rowlock)

set somevalue = B.somevalue

from sometable A

join someothertable B

on A.somePK = B.somePK

/* End Do Stuff */

commit transaction

delete from ##UpdateRows

where somePK in (select somePK from ##UpdateRowsWorking)

drop table ##UpdateRowsWorking

end

drop table ##UpdateRows

end

Tommy

Follow @sqlscribe

Jeff Moden SSC Guru Points: 1004434 More actions · Answer 10

If you really want this to fly, use a SELECT/INTO with the correct calculation... once you're sure the data is correct, rename the tables and add the correct indexes to the new table that has been renamed to the same as the old table.

The method also allows you to "play" without screwing up your original data and having to restore from a backup 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 11

Also - since you're touching something in the clustered index - if you plan on doing a "walking update", do it from the Back to the front. Meaning -walk the index in descending order. Since the clustered index controls the physical order, you don't necessarily want to cause it to have to reorder data multiple times for no reason (actually - you don't want the update to reorder the data at all, given the update you're running).

Jeff's right though - a new table would probably be substantially faster. You just have to remember to rebuild everything on the "new table": indexes, constraints, etc....

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?