July 20, 2009 at 8:15 am
I have a data issue with one column of data in a table with 400 million+ rows. I would like to standardize the format of the data to make searching it easier. Here is what I need to do:
The format I want is a letter followed by nine numbers so that it looks like this Z123456789.
I need a stored procedure that checks for several things and corrects them. Here are the things I need to check:
First. Check the field (znumber) to see if it is between 7 and 9 characters. Anything less or if it is a null value or it is 10 characters, the SP should skip to the next row.
Second, check to see if the first position has a letter character. If not, insert a "Z", and move on.
Third. Check to see how many numbers follow the letter Z. It should be nine numbers. If it is eight numbers, a zero should be inserted after the Z. If it is 7 numbers, two zeros should be inserted after the Z.
I need to do this on the entire data set and all new in coming data. Any direction would be much appreciated. Thanks
July 20, 2009 at 8:49 am
This can be done. Here are a couple of tips that should help. Check out the STUFF() function in BOL and lookup Tally table here on SSC. I think both of these will help you with the solution.
Could you post a table definition and some sample rows (some for each of the scenarios)? See the first link in my signature for how to post this. This makes it easier to provide an accurate and tested solution.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
July 20, 2009 at 8:53 am
The thing is that SQL doesn't process row by row. It works on the entire table or result set at a time.
So what you can do is write conditions to check things. I'd look at the SUBSTRING function to parse out sections of your string. Then you can use IN or LEN to check for some of your conditions and then filter out those rows. The thing you haven't provided is what should happen to those rows that don't match.
July 20, 2009 at 9:11 am
Is something like this what you are looking for? I just used a temp table but check out the update statement. I think it is close to what you are asking.
CREATE TABLE #temp (test VARCHAR(10))
INSERT INTO #temp (test) values ('Z123456')
INSERT INTO #temp (test) values ('123456789')
INSERT INTO #temp (test) values ('Z123456789')
INSERT INTO #temp (test) VALUES (NULL)
INSERT INTO #temp (test) VALUES ('123')
SELECT * FROM #temp
UPDATE #temp SET test = CASE WHEN ((LEN(test) BETWEEN 7 AND 9) OR test IS NULL) THEN
CASE WHEN LEFT(test,1) = 'Z' then
LEFT(test,1) + REPLICATE('0',10-LEN(test))+SUBSTRING(test,2,LEN(test))
ELSE
'Z' + REPLICATE('0',9-LEN(test))+test
END
ELSE
test
END
SELECT * FROM #temp
drop table #temp
As for the "future data", you could try a constraint on the table.
July 20, 2009 at 9:32 am
Matt, that was perfect. I hate how easy you guys all make it seem. I screwed around with Case statement for an entire day. Thanks again.
July 20, 2009 at 9:46 am
I'd make one change to Matt's script.
I'd put something like this in the WHERE clause to avoid touching rows that you are not going to update:
UPDATE #temp
SET test = CASE WHEN LEFT(test, 1) = 'Z'
then LEFT(test, 1) + REPLICATE('0',
10 - LEN(test))
+ SUBSTRING(test, 2, LEN(test))
ELSE 'Z' + REPLICATE('0', 9 - LEN(test)) + test
END
WHERE
LEN(test) BETWEEN 7 AND 9 OR test IS NOT NULL
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply