March 22, 2005 at 8:03 am
Hi guys,
I'm running SQL 2000 sp3a on Windows 2000.
In one of our DB's here we have a table called D_DETAILS and in which there is a row called USERFIELD1. USERFIELD1 is an alphanumeric value made up of four letters and four numbers; the first four letters indicating that office that this piece of information is specific to. As fate would have it, the person responsible for updating this database has entered every single office code for one of our offices incorrectly (I think it's about 1000 entries) and is now asking for my help to automatically correct this.
Breaking the problem down I want to be able to select all entries in the USERFIELD1 row which have the first four characters in their value equal to "abcd" and then update these first three characters to "efgh".
At first I tried being lazy and using DTs to export the column to excel, do a find and replace and then import it. The export worked fine, the import didn't want to play.
Then I delved into T-SQL and tried to think of a way of doing this. I thought that assigning an alias to the first four characters might work and thus I attempted:
This bombed out on me - it seemed that the update part of the query didn't see "firstfour" as being valid.
If anyone could offer any help I'd be most grateful.
Cheers,
Iain
March 22, 2005 at 8:09 am
Update d_details
set UserField1 = 'efgh' + right(UserField1, len(UserField1) - 4)
where left(UserField1, 4) = 'abcd'
I'm using len(UserField1) - 4 in case the field can be more than 8 chars (can be especially if it's because of wrong user input)
March 22, 2005 at 8:11 am
UPDATE d_details
SET USERFIELD1 = 'efgh' + Substring(USERFIELD1, 5, 999)
WHERE USERFIELD1 Like 'abcd%'
[Edit] Remi wins
March 22, 2005 at 8:18 am
Cheers!
Job's a good 'un!
March 22, 2005 at 8:32 am
Actually I redid some test and it turns out that pw's solution is better than mine :
Substring(name, 3, 999) seems to be a tad faster than right(UserField1, len(UserField1) - 4) (no hard proof of that because of my sample data is too small but there's 1 less operation to do so I can assume at least as fast if not faster).
Also left(UserField1, 4) will not always run as fast as UserField1 like 'abcd%' because the left() operation will always force an index scan, while the like operator can use the index seek. This can give a great boost in performance if there's a lot of rows.
March 22, 2005 at 11:40 am
"...operation will always force an index scan, while the like operator can use the index seek..."
Remi,
Could you please explain the difference between an index scan and an index seek.
thanks, ron
March 22, 2005 at 12:20 pm
Index scan :
Sql server will analyse ALL the rows of the index and check if Left(.....
So 1 000 000 rows = 1 000 000 checks.
Index seek :
Sql server will do some sort of binary search (no idea how it's really done) and find a row that matches like 'abcd%'. Once a row is found and because the index is sorted, sql server can start scanning up and down from that point in the index for all matching rows. So for 1 000 000 rows it will take sql server a MAXIMUM of 20 checks to find the first match or none, and then 1 check for each line returned + 2 checks for the mismatches and index seek stop in both directions.
So let's say your select returns 100 rows, the index scan will do the left() operation 1 000 000 times, while the index seek will do it in
(
2 --end the seek
+ 99 --99 more lines to find
+ 14 -- 1 000 000 / (2 ^ 14) = 61 >> 100 hits vs 61 possible matches = impossible not to have found it by now
)
= 115 checks being almost 8700 times faster (assuming that the left and like checks are done exactly the same way).
March 22, 2005 at 11:08 pm
Wha'cha figure guys? Faster still?
UPDATE d_details
SET UserField1 = STUFF(UserField1, 1, 4, 'efgh')
WHERE UserField1 LIKE 'abcd%'
--Jeff Moden
Change is inevitable... Change for the better is not.
March 23, 2005 at 6:41 am
I can't prove it any faster or slower. Looks pretty much the same.
March 23, 2005 at 6:55 am
Thought it might be a little faster based on the fact that it's eliminated the final calculateion from the SET statement.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 23, 2005 at 7:09 am
hmm... I don't see a final calculation here :
'abcd' + Substring(name, 5, 999)
If there's a difference it must be so minute that I can't detect it with my test data (only 10k rows and very small varchar(128) field >> data is usually less than 20 chars.
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply