Use of Function with update

  • Hi there,

    I have a table with 20 million rows of Employee information. This Table has Employee Address that needs to be reformatted. Index is only created on EmpID. I have written a function that Formats the address( Changes st. to Street; Rd to Road; Ste to Suite; and many more such small modifications) and returns formatted address. Address could be null.

    I first need a count of how many records could be affected by this function and also need to update the table.

    Create Table dbo.Employee(EmpID int not null, Empname varchar(100) not null, Address varchar(1000) null, Phone varchar(20) null,Fax varchar(20) null,EmpCountry varchar(100) not null)

    insert into dbo.Employee(1,'Amy1','2 main st, salem ,ma 02000','','','USA');

    insert into dbo.Employee(2,'Amy2','2 main rd, salem ,pa 01000','1234567890','','USA');

    insert into dbo.Employee(3,'Amy3','2 main st,ste 200, salem ,ga 03000','9999999999','','USA');

    insert into dbo.Employee(4,'Amy4','2 main ln, salem ,la 04500','','6666666666','RUS');

    insert into dbo.Employee(5,'Amy5','','1111111111','2222222222','USA');

    Output will look like this

    select EmpID,dbo.fnFormatAddress(Address)

    from dbo.Employee

    1,2 Main Street, Salem MA 02000

    2,2 Main Road, Salem ,PA 01000

    4,2 Main Lane, Salem ,LA 04500

    5, NULL

    This is the code I have

    Begin

    SET NOCOUNT ON;

    DECLARE @RowCount1 INTEGER;

    UPDATE dbo.Employee set dbo.fnFormatAddress(Address) = Address

    where EmpCountry='USA'

    SELECT @RowCount1 = @@ROWCOUNT

    SELECT @RowCount1

    End

    This is taking quite a while to run. Is there a better way of writing this?

    Thanks

    R

  • My recommendation is to not mess with this even for formatting the output of rows never mind doing an update to the table with homegrown code.

    Instead, by a CASS Certification product. They're not horribly expensive (much cheaper than the hours you're getting ready to spend possibly doing it wrong in the process) and will not only do the formatting correctly, but they'll validate the addresses, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I don't think we have are approved to use CASS in our company.

    Thanks

    R

  • rash3554 (1/22/2015)


    I don't think we have are approved to use CASS in our company.

    Thanks

    R

    The answer is always "No" until you ask. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rash3554 (1/22/2015)


    This is the code I have

    Begin

    SET NOCOUNT ON;

    DECLARE @RowCount1 INTEGER;

    UPDATE dbo.Employee set dbo.fnFormatAddress(Address) = Address

    where EmpCountry='USA'

    SELECT @RowCount1 = @@ROWCOUNT

    SELECT @RowCount1

    End

    There are several problems with this code.

    1. The SET statement equation is backwards. The function should be on the right of the "=" sign.

    2. The function is a scalar function. Please post the code for the function so we can see if there's a better way.

    3. Every system has a "tipping point". On an unindexed fixed width datatype column, SQL Server will update a million rows in something like 1 to 4 seconds (barring any functions). Two million rows will only take twice as long. Three million rows might take only 3 times as long. If 4 million rows is the "tipping point", it could take several minutes to more than an hour. In such cases, it would be better to do segmented updates of, say, just a million or two rows at a time.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply