August 22, 2007 at 9:42 am
Hi,
I try to extract number from a string, is there a SQL Server function to do that? If not, how to get such function?
Thanks a lot
August 22, 2007 at 9:56 am
I'm sure you will find as many methods to do this as there are developers/DBAs, but this should work for you. Feel free to put this into a UDF for ease of use.
DECLARE @string varchar(100)
SET @string = 'sk123;fja4567afj;as890'
WHILE PATINDEX('%[^0-9]%',@string) <> 0
SET @string = STUFF(@string,PATINDEX('%[^0-9]%',@string),1,'')
SELECT @string
August 22, 2007 at 10:01 am
Actually in my case I need to select street number from a address string, which means if the string is '1234 dummy789 road', I only want to get '1234', not '1234789'.
Thanks
August 22, 2007 at 10:04 am
OK, so can you define in English exactly what you want? Are you looking for all numeric characters prior to the first blank character ' '? Or all numeric characters prior to the first non-numeric?
August 22, 2007 at 10:07 am
all numeric characters prior to the first blank character
August 22, 2007 at 10:19 am
DECLARE @string varchar(100)
SET @string = '12345 Test Address Dr.'
SELECT LEFT(@String,CHARINDEX(' ', @string))
August 22, 2007 at 10:24 am
Here it is with the PATINDEX used also:
DECLARE @string varchar(100)
SET @string = '12345 Test Address Dr.'
SELECT LEFT(@String,CHARINDEX(' ', @string)),
LEFT(@string,PATINDEX('%[^0-9]%',@string))
August 23, 2007 at 5:55 am
Not sure if this process could ever by 100% accurate, there will always be some clever dick who sticks random characters and numbers into a house name for example, there are also going to be addresses where the house number is prefixed with "Number" or "No." or a flat with the number 12A etc.
The following should find the first occurrence of a number in a string, although there is never going to be a guarantee that this is the house number.
DECLARE @string varchar(100),
@start int,
@end int,
@len int
SET @string = 'the hollies 12345 Test Address Dr.'
set @string = replace(@string, ' ' , '')
set @len = len(@string)
set @start = PATINDEX('%[0-9]%',@string)
set @end = PATINDEX('%[^0-9]%',substring(@string, @start, @len))-1
print substring(@string, @start, @end)
August 23, 2007 at 8:00 am
Fishbarnriots has a good point. (what is with that nickname?)
To get close to 100%, I'd actually go through a cleaning process. See if you can wipe out stuff before the house number. You may find some other cases and I'd actually recommend extracting this into a temp or working table (or new column) and running a few through a few passes to get it down to just the house number.
If you are using this for some other process regularly, I'd actually see if you could add a new column to store this.
November 15, 2007 at 1:22 pm
I was just reading thru this thread and I have a similar issue. I'm trying to extract the number string from a StarRating of Hotels column. For example the column currently has
StarRating:
0
1
1.5
NA
2
2 STAR
3.00
4-STAR
4
NULL
What I would like to see as the results in this column is:
0
1
1.5
0
2
2
0
3.00
4
4
0
And if possible replace the blank,NA and NULL with 0.
Could someone please assist.
November 15, 2007 at 3:01 pm
I've been beating the heck out of this drum lately - but an actual regular expression will help you find all of the various issues you're looking at here.
If you're using 2000, someone has been kind enough to create REGEX support in XP's which can then be added to your server. If you're on 2005 - build a CLR function (I've posted 4 versions in the last 2-3 days), which will definitively pick up these various issues patterns you're looking for.
Check out the scripts in here:
http://www.sqlservercentral.com/Forums/Topic419472-65-1.aspx
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 15, 2007 at 5:24 pm
Matt is correct... he's beat the snare right off that drum... and guess what? He's right! Matt and I have done a huge amout of testing and using a RegEx "xp" in SQL Server 2000 is definitely a great way to go...
On the off chance that you have a DBA that refuses to allow a "non-MS XP", there's a fairly easy way to do it... the basis of the method is covered in the same thread that Matt sited. But, for everyone's convenience, here it is...
First, you need a Tally table... it's nothing more than a table with a single column of well indexed sequential numbers... you can make it as a Temp Table or, better yet, add it to your database as a permanent table. It's got a lot of uses so I recommend the latter. Here's how to make one...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Next, you'll need a function... here's a pretty useful generic function that you can "program" to return your desired result...
CREATE FUNCTION dbo.fnLeaveOnly
/***********************************************************************************
Purpose:
This function accepts a string, the "LIKE" pattern that all characters must match
to be "kept", and a value to return if a NULL is the result of the function.
-- Jeff Moden
***********************************************************************************/
--===== Define the I/O Parameters
(
@String VARCHAR(8000), --String to be cleaned
@CharPattern VARCHAR(100), --Pattern a character must meet to keep
@NullValue VARCHAR(100) --Return this if a NULL is the result
)
RETURNS VARCHAR(8000) --The "cleaned" string
AS
BEGIN
--===== Declare the return variable
DECLARE @Return VARCHAR(8000)
--===== Clean the string leaving only what's in the character pattern
SELECT @Return = ISNULL(@Return,'')+SUBSTRING(@String,N,1)
FROM dbo.Tally
WHERE N <= LEN(@String)
AND SUBSTRING(@String,N,1) LIKE @CharPattern
--===== Return the "Cleaned" string substituting the null value if result is null
RETURN ISNULL(@Return,@NullValue)
END
Now, let's setup a test with all the values you stated and see what happens... do notice the "pattern" used...
DECLARE @yourtable TABLE (StarRating VARCHAR(100))
INSERT INTO @yourtable (StarRating)
SELECT '0' UNION ALL
SELECT '1' UNION ALL
SELECT '1.5' UNION ALL
SELECT 'NA' UNION ALL
SELECT '2' UNION ALL
SELECT '2 STAR' UNION ALL
SELECT ' ' UNION ALL
SELECT '3.00' UNION ALL
SELECT '4-STAR' UNION ALL
SELECT '4' UNION ALL
SELECT NULL UNION ALL
SELECT '~`!@#$%^&*()-_=+\|[{]};:''", /?a1_b2C3$'
SELECT StarRating AS Original,
dbo.fnLeaveOnly(StarRating,'[0-9.]',0)
FROM @yourtable
Result:
(12 row(s) affected)
Original Cleaned
----------------------------------------- ----------
0 0
1 1
1.5 1.5
NA 0
2 2
2 STAR 2
0
3.00 3.00
4-STAR 4
4 4
NULL 0
~`!@#$%^&*()-_=+\|[{]};:'", /?a1_b2C3$ .123
(12 row(s) affected)
Hope that helps...
--Jeff Moden
Change is inevitable... Change for the better is not.
November 15, 2007 at 8:23 pm
Thank you.
November 16, 2007 at 5:44 am
Thats beautiful work guys!
What if the record has the string '4-Stars out of 5'. Woud this return 45?
I was thinking about creating a function which loops through the string one character at a time and does some math with the results.
1. You would need a try catch block to error trap and then exit out.
2. You would have to store the previous result to use when an error is found so you could roll back to it.
3. You may have to check for sequences such as 10E which might get interpreted as Scientific Notation.
November 16, 2007 at 6:43 am
Heh... Yes... it would return "45" on "4 out of 5 stars"... it wasn't designed to make "descisions"... current function was only designed to return characters that meet a pattern.
I agree... you could add functionality to return numbers as separate numbers in a table variable if they are separated by any non-numeric characters. You could even make it find things like "10R3" and "3D5" and do other ISNUMERIC extractions. And, you can do it all without an explicit loop as this function does or, you can use a loop.
Just remember, the more functionality you add, the slower the function will be... better to write a more specific function to meet your expected needs.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply