January 22, 2015 at 9:53 am
Hi there,
I have a table with 20 million rows of Employee information. This Table has Employee Address that needs to be reformatted. Index is only created on EmpID. I have written a function that Formats the address( Changes st. to Street; Rd to Road; Ste to Suite; and many more such small modifications) and returns formatted address. Address could be null.
I first need a count of how many records could be affected by this function and also need to update the table.
Create Table dbo.Employee(EmpID int not null, Empname varchar(100) not null, Address varchar(1000) null, Phone varchar(20) null,Fax varchar(20) null,EmpCountry varchar(100) not null)
insert into dbo.Employee(1,'Amy1','2 main st, salem ,ma 02000','','','USA');
insert into dbo.Employee(2,'Amy2','2 main rd, salem ,pa 01000','1234567890','','USA');
insert into dbo.Employee(3,'Amy3','2 main st,ste 200, salem ,ga 03000','9999999999','','USA');
insert into dbo.Employee(4,'Amy4','2 main ln, salem ,la 04500','','6666666666','RUS');
insert into dbo.Employee(5,'Amy5','','1111111111','2222222222','USA');
Output will look like this
select EmpID,dbo.fnFormatAddress(Address)
from dbo.Employee
1,2 Main Street, Salem MA 02000
2,2 Main Road, Salem ,PA 01000
4,2 Main Lane, Salem ,LA 04500
5, NULL
This is the code I have
Begin
SET NOCOUNT ON;
DECLARE @RowCount1 INTEGER;
UPDATE dbo.Employee set dbo.fnFormatAddress(Address) = Address
where EmpCountry='USA'
SELECT @RowCount1 = @@ROWCOUNT
SELECT @RowCount1
End
This is taking quite a while to run. Is there a better way of writing this?
Thanks
R
January 22, 2015 at 10:44 am
My recommendation is to not mess with this even for formatting the output of rows never mind doing an update to the table with homegrown code.
Instead, by a CASS Certification product. They're not horribly expensive (much cheaper than the hours you're getting ready to spend possibly doing it wrong in the process) and will not only do the formatting correctly, but they'll validate the addresses, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 22, 2015 at 10:50 am
I don't think we have are approved to use CASS in our company.
Thanks
R
January 22, 2015 at 11:03 pm
rash3554 (1/22/2015)
I don't think we have are approved to use CASS in our company.Thanks
R
The answer is always "No" until you ask. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
January 22, 2015 at 11:10 pm
rash3554 (1/22/2015)
This is the code I have
Begin
SET NOCOUNT ON;
DECLARE @RowCount1 INTEGER;
UPDATE dbo.Employee set dbo.fnFormatAddress(Address) = Address
where EmpCountry='USA'
SELECT @RowCount1 = @@ROWCOUNT
SELECT @RowCount1
End
There are several problems with this code.
1. The SET statement equation is backwards. The function should be on the right of the "=" sign.
2. The function is a scalar function. Please post the code for the function so we can see if there's a better way.
3. Every system has a "tipping point". On an unindexed fixed width datatype column, SQL Server will update a million rows in something like 1 to 4 seconds (barring any functions). Two million rows will only take twice as long. Three million rows might take only 3 times as long. If 4 million rows is the "tipping point", it could take several minutes to more than an hour. In such cases, it would be better to do segmented updates of, say, just a million or two rows at a time.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply