Updating First the First Few Characters in a Cell

  • Hi guys,

    I'm running SQL 2000 sp3a on Windows 2000.

    In one of our DB's here we have a table called D_DETAILS and in which there is a row called USERFIELD1.  USERFIELD1 is an alphanumeric value made up of four letters and four numbers; the first four letters indicating that office that this piece of information is specific to.  As fate would have it, the person responsible for updating this database has entered every single office code for one of our offices incorrectly (I think it's about 1000 entries) and is now asking for my help to automatically correct this.

    Breaking the problem down I want to be able to select all entries in the USERFIELD1 row which have the first four characters in their value equal to "abcd" and then update these first three characters to "efgh".

    At first I tried being lazy and using DTs to export the column to excel, do a find and replace and then import it.  The export worked fine, the import didn't want to play.

    Then I delved into T-SQL and tried to think of a way of doing this.  I thought that assigning an alias to the first four characters might work and thus I attempted:

    select left(userfield1,4)
    from d_details as firstfour
    update vendors
    set firstfour = 'efgh'
    where firstfour = 'abcd'

    This bombed out on me - it seemed that the update part of the query didn't see "firstfour" as being valid.

    If anyone could offer any help I'd be most grateful.

    Cheers,

    Iain

  • Update d_details

    set UserField1 = 'efgh' + right(UserField1, len(UserField1) - 4)

    where left(UserField1, 4) = 'abcd'

    I'm using len(UserField1) - 4 in case the field can be more than 8 chars (can be especially if it's because of wrong user input)

  • UPDATE d_details

    SET USERFIELD1 = 'efgh' + Substring(USERFIELD1, 5, 999)

    WHERE USERFIELD1 Like 'abcd%'

    [Edit] Remi wins

  • Cheers!

    Job's a good 'un!

  • Actually I redid some test and it turns out that pw's solution is better than mine :

    Substring(name, 3, 999) seems to be a tad faster than right(UserField1, len(UserField1) - 4) (no hard proof of that because of my sample data is too small but there's 1 less operation to do so I can assume at least as fast if not faster).

    Also left(UserField1, 4) will not always run as fast as UserField1 like 'abcd%' because the left() operation will always force an index scan, while the like operator can use the index seek. This can give a great boost in performance if there's a lot of rows.

  • "...operation will always force an index scan, while the like operator can use the index seek..."

    Remi,

    Could you please explain the difference between an index scan and an index seek.

    thanks, ron

  • Index scan :

    Sql server will analyse ALL the rows of the index and check if Left(.....

    So 1 000 000 rows = 1 000 000 checks.

    Index seek :

    Sql server will do some sort of binary search (no idea how it's really done) and find a row that matches like 'abcd%'. Once a row is found and because the index is sorted, sql server can start scanning up and down from that point in the index for all matching rows. So for 1 000 000 rows it will take sql server a MAXIMUM of 20 checks to find the first match or none, and then 1 check for each line returned + 2 checks for the mismatches and index seek stop in both directions.

    So let's say your select returns 100 rows, the index scan will do the left() operation 1 000 000 times, while the index seek will do it in

    (

    2 --end the seek

    + 99 --99 more lines to find

    + 14 -- 1 000 000 / (2 ^ 14) = 61 >> 100 hits vs 61 possible matches = impossible not to have found it by now

    )

    = 115 checks being almost 8700 times faster (assuming that the left and like checks are done exactly the same way).

  • Wha'cha figure guys?  Faster still?

     UPDATE d_details

        SET UserField1 = STUFF(UserField1, 1, 4, 'efgh')

      WHERE UserField1 LIKE 'abcd%'

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I can't prove it any faster or slower. Looks pretty much the same.

  • Thought it might be a little faster based on the fact that it's eliminated the final calculateion from the SET statement.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • hmm... I don't see a final calculation here :

    'abcd' + Substring(name, 5, 999)

    If there's a difference it must be so minute that I can't detect it with my test data (only 10k rows and very small varchar(128) field >> data is usually less than 20 chars.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply