October 14, 2008 at 11:44 pm
Hi,
I am having a large table (>2000000 rows)
I need to perform 135 update statements on them (only one field is being changed).
E.g.
update large_table
set field1 = 3.45
where identifier = 'A' and range <= 10
update large_table
set field1 = 2.75
where identifier = 'A' and range > 10 and range <= 200
update large_table
set field1 = 4.56
where identifier = 'A' and range > 200
update large_table
set field1 = 1.22
where identifier = 'B' and some_value = 'T'
update large_table
set field1 = 5.78
where identifier = 'B' and some_value = 'F'
and so on...
This whole operation takes > 2hrs to complete.
Is there any way to do this more effectively?
Please note the update statements are mutually exclusive.
The rows affected by one update statement are not affected by the other update statements.
I was thinking about having a separate table populated with the columns/values used in the where condition.
E.g. new_table
field1 | identifier | range_start | range_end | some_value
3.45 | A | NULL | 10 | NULL
2.75 | A | 10 | 200 | NULL
4.56 | A | 200 | NULL | NULL
1.22 | B | NULL | NULL | T
5.78 | B | NULL | NULL | F
and the update statement would be
update large_table
set field1 = n.field1
from large_table t inner join new_table n
on t.identifier = n.identifier and (n.range_start is null or t.range > n.range_start)
and (n.range_end is null or t.range <= n.range_end)
and (n.some_value is null or t.some_value = n.some_value)
I think the above statement works
But is there a better way to do this?
The new_table has too many null values esp. in the some_value column
In my original procedure, of the 135 update statements, only 6 use the some_value column
Please help?
Thanks,
Kapadia Shalin. P
October 14, 2008 at 11:58 pm
Some points....
i think the idea of population a table with necessary values and then doing the update is correct.
From what i have understood
You either update according to ranges or according to "somevalue" . If this is correct then better have 2 queries one based on the ranges and other based on somevalue.
This will increase the clarity.
Instead of null values its better to use 0. Makes things clear.
I think your conditions in the where clause may result in some updates not happening. (Just a thought).
"Keep Trying"
October 15, 2008 at 12:21 am
Thanks for your reply.
The problem with having separate update statements for range and some_value is that certain update statements use both the columns.
E.g. update large_table
set field1 = 4.00
where range > 200 and identifer = 'C' and some_value = 'T'
Also, range, some_value and identifier are not the only columns used in the where condition.
Some update statements use other columns as well.
Including all these columns in the new_table will lead to a large number of NULLs or 0, as you said.
Does this affect performance?
And about - 'I think your conditions in the where clause may result in some updates not happening.'
Could you give me an example?
I was thinking about using this solution
It would be disastrous if some rows where missed?
Thanks,
Shalin
October 15, 2008 at 7:15 am
This still seems like a semi-poor way to do it, but I believe a very long case statement would probably beat out your 135 update statements.
EG:
UPDATE large_table
SET Field1 = CASE WHEN identifier = 'A' and range < = 10 Then 2.75
CASE WHEN identifier = 'B' and range between 10 and 20 Then 3.5
...
END
Between a single update, and maybe an added index, I doubt this would take near to 2 hours to update.
October 15, 2008 at 1:30 pm
Quickest UPDATE would be drop the table and create a view with the same name to return calculated values based on look-up criteria.
_____________
Code for TallyGenerator
October 15, 2008 at 3:04 pm
Personally, I would do it the way that Garadin is doing it. Just be sure to include an ELSE clause to reassign any uncovered cases:
UPDATE large_table
SET Field1 = CASE
WHEN identifier = 'A' and range < = 10 Then 2.75
WHEN identifier = 'B' and range between 10 and 20 Then 3.5
...
ELSE Field1
END
Otherwise it will get set to NULL.
I would expect that this should take from 1.5 to 4.0 minutes to run, although it is hard to tell without more information.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
October 15, 2008 at 9:02 pm
Thanks all for your replies
They were extremely useful
October 15, 2008 at 9:38 pm
Be sure to let us know how it worked out.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
October 16, 2008 at 12:42 am
Sorry for the delay.
Since you are going to use both the range columns and also the value column, CASE is the best way to go about it.
So thats going to be 135 case statements. As pointed out pls be sure to keep the ELSE part in your update stmt.
BUT how many updates (amongst your 135) are going to use both range columns and some column in your update statement.
"Keep Trying"
October 16, 2008 at 1:36 am
I have a new problem now
It seems the conditions in the update statement are susceptible to change.
So, I was thinking about having all the case statements in a table
something like -
case_clause
WHEN identifier = 'A' and range < = 10 Then 2.75
WHEN identifier = 'B' and range between 10 and 20 Then 3.5
and adding all these case statements into a dynamic sql statement
DECLARE @cmd VARCHAR(8000)
SET @cmd = 'UPDATE large_table SET field1 = CASE'
SELECT @cmd = @cmd + ' ' + case_clause FROM conditions_table
SET @cmd = @cmd + ' ELSE field1 END'
EXEC (@cmd)
This works if the length of @cmd is less than 8000
But, in my case, it exceeds 8000
I get an error at EXEC (@cmd)
Please note I am using SQL Server 2000
So, I can't declare VARCHAR(MAX)
Any help would be appreciated????
Thanks,
Shalin
October 16, 2008 at 4:12 am
Put your Dynamic Query in two different variables and then
run it as
EXEC (@cmd + @cmd1)
October 16, 2008 at 7:43 am
Thats kind of odd, because with 135 CASE clauses, you should be able to use about 50-55 characters per case clause safely. Can you show us the actual command and case text please?
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
August 7, 2009 at 9:56 am
Split as batches "Go"
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply