March 20, 2009 at 1:46 am
I'm looking for an sql function that can brean following phone string and retun phone no and extension independently. not all r here but i have some similarity of these:
there are almost 500K phone contacts like this which i have to fix in their respective fields individually
i.e.
select Udf_GetPhone('913.135.9000 x:6390') will return 913.135.9000
select Udf_GetExt('913.135.9000 x:6390') will return 6390
following is some same:
913.135.9000 x:6390
(305)529-6911x2101
(206)965-2626
109.391.3261
201 933-3101 ext 6696
(919) 333-9395 x252
190-325-3121 Ext. 5096
(313) 196-1000, ex. 63936
(901) 123-3990 ext. 1209
305.263.0311
223 123 1922
301-191-3255
619 932 6660
(939)651-3631
913-931-9300
609-665-5312
512.322.1396
(101) 261-1911 x 6033
316-361-5900 ext 12021
223-123-1105
903-202-3339 x23031
631-261-5500
(950) 993-1151....9999
939-599-2100 x11200
915-939-5500, x2139
Shamshad Ali.
March 20, 2009 at 2:02 am
Is the phone number (without the extension) always the same number of figures, ignoring brackets, full stops, etc?
It looks like it from the sample data you hae provided.
If so I would remove all non-numeric characters, then use Left() to pick up the phone number - the remainder would be the extension
March 20, 2009 at 3:20 am
Hi Ali,
Hope this may help you.
DECLARE @VARIABLE AS VARCHAR(50)
SET @VARIABLE = UPPER('(901) 123-3990 ext. 1209')
--SET @VARIABLE = UPPER('(305)529-6911x2101')
--SET @VARIABLE = UPPER('(313) 196-1000, ex. 63936')
IF(CHARINDEX('EXT',@VARIABLE) > 0)
SELECT REPLACE(SUBSTRING(@VARIABLE, 0, CHARINDEX('EXT',@VARIABLE)), ',', '') AS PHONENUMBER
, REPLACE(SUBSTRING(@VARIABLE, CHARINDEX('EXT',@VARIABLE) + 3, LEN(@VARIABLE)),'.','') AS EXTENSION
ELSE IF(CHARINDEX('EX',@VARIABLE) > 0)
SELECT REPLACE(SUBSTRING(@VARIABLE, 0, CHARINDEX('EX',@VARIABLE)),',','') AS PHONENUMBER
, REPLACE(SUBSTRING(@VARIABLE, CHARINDEX('EX',@VARIABLE) + 2, LEN(@VARIABLE)),'.','') AS EXTENSION
ELSE IF(CHARINDEX('X',@VARIABLE) > 0)
SELECT REPLACE(SUBSTRING(@VARIABLE, 0, CHARINDEX('X',@VARIABLE)),',','') AS PHONENUMBER
, REPLACE(SUBSTRING(@VARIABLE, CHARINDEX('X',@VARIABLE) + 1, LEN(@VARIABLE)),'.','') AS EXTENSION
I have used substring, charindex and replace function to get your result.
Regards,
Ashok S
March 20, 2009 at 4:04 am
Thanks - Highly appreciated your help.
when i use this 512.322.1396, the logic don't return anything. also same with 713-326-XXXX as input it should return only numbers
could u plz see this and also i would like if it always return phone number in one standard format format like
7138754885
Shamshad Ali.
March 20, 2009 at 4:21 am
I'd use something like this in your UDF:
declare @val varchar(50)
declare @p varchar(10) -- phone number
declare @e varchar(10) -- extension
set @val = '(950) 993-1151....9999'
WHILE PATINDEX('%[^0-9]%', @val) > 0
SET @val = REPLACE(@val, SUBSTRING(@val,PATINDEX('%[^0-9]%', @val),1),'')
select @val
set @p = left(@val, 10)
select @p
set @e = substring(@val, 11, len(@val) - 10)
select @e
As I said before though, this depends on the phone number always being made up of the same number of digits
March 20, 2009 at 8:58 am
here's how i would do it.
I'm assuming that you are using US format for area codes..(800) xxx-yyyy ext.zzzz
some of your area code exampels started with 1, i.e. 109, which does not really exist.
i'm assumming everything can be calculated based on the string length..
it may not be obvious, but the end results, you could take any of the cases that assume 'ext' in them, and have that in your separate UDF for extensions.
hope this gets the ideas accross.
here's what i did.
two functions..one to strip non-numeric chars, and another to format to my pattern.
--assuming you already have a tally table:
CREATE FUNCTION StripNonNumeric(@OriginalText VARCHAR(8000))
RETURNS VARCHAR(8000)
BEGIN
DECLARE @CleanedText VARCHAR(8000)
SELECT @CleanedText = ISNULL(@CleanedText,'') +
CASE
--ascii numbers are 48(for '0') thru 57 (for '9')
WHEN ASCII(SUBSTRING(@OriginalText,Tally.N,1)) BETWEEN 48 AND 57 THEN SUBSTRING(@OriginalText,Tally.N,1) ELSE '' END
FROM dbo.Tally WHERE Tally.N <= LEN(@OriginalText)
RETURN @CleanedText
END
GO
CREATE FUNCTION [FormatPhone] (@PhoneString varchar(50))
RETURNS varchar(50)
AS
BEGIN
DECLARE @FinalString varchar(50)
SET @FinalString = ''
--strip out any existing formatting:
SELECT @PhoneString = dbo.StripNonNumeric(@PhoneString)
IF LEFT(@PhoneString,1) = '1' -- "1-(800)" format
BEGIN
SET @PhoneString = SUBSTRING(@PhoneString,2,50)
SET @FinalString = '1-'
END
SELECT @FinalString =
CASE
--less than 7 characters? just return the string
WHEN LEN(@PhoneString) < 7
THEN @FinalString + @PhoneString
--exactly 7 xxx-xxxx
WHEN LEN(@PhoneString) = 7
THEN @FinalString + STUFF(@PhoneString,4,0,'-')
--less than 10 xxx-xxxx ext yy
WHEN LEN(@PhoneString) < 10
THEN @FinalString + STUFF(STUFF(@PhoneString,4,0,'-') ,9,0,' ext ')
--exactly 10 (yyy) xxx-xxxx
WHEN LEN(@PhoneString) = 10
THEN @FinalString + STUFF(STUFF(STUFF(@PhoneString,4,0,') '),1,0,' ('),10,0,'-')
--more than 10 (yyy) xxx-xxxx ext y+
WHEN LEN(@PhoneString) > 10
THEN @FinalString + STUFF(STUFF(STUFF(STUFF(@PhoneString,4,0,') '),1,0,' ('),11,0,'-'),16,0,' ext. ')
END
RETURN @FinalString
END
GO
and here's a sample data and script to show you the results:
declare @PhoneNumbers table(Num varchar(50))
insert into @PhoneNumbers
SELECT '1-800 427-4567 extension 456780' UNION ALL
SELECT '470' UNION ALL
SELECT '555-4561 x 1' UNION ALL
SELECT '913.135.9000 x:6390' UNION ALL
SELECT '(305)529-6911x2101' UNION ALL
SELECT '(206)965-2626' UNION ALL
SELECT '109.391.3261' UNION ALL
SELECT '201 933-3101 ext 6696' UNION ALL
SELECT '(919) 333-9395 x252' UNION ALL
SELECT '190-325-3121 Ext. 5096' UNION ALL
SELECT '(313) 196-1000, ex. 63936' UNION ALL
SELECT '(901) 123-3990 ext. 1209' UNION ALL
SELECT '305.263.0311' UNION ALL
SELECT '223 123 1922' UNION ALL
SELECT '301-191-3255' UNION ALL
SELECT '619 932 6660' UNION ALL
SELECT '(939)651-3631' UNION ALL
SELECT '913-931-9300' UNION ALL
SELECT '609-665-5312' UNION ALL
SELECT '512.322.1396' UNION ALL
SELECT '(101) 261-1911 x 6033' UNION ALL
SELECT '316-361-5900 ext 12021' UNION ALL
SELECT '223-123-1105' UNION ALL
SELECT '903-202-3339 x23031' UNION ALL
SELECT '631-261-5500' UNION ALL
SELECT '(950) 993-1151....9999' UNION ALL
SELECT '939-599-2100 x11200' UNION ALL
SELECT '915-939-5500, x2139'
SELECT num,dbo.StripNonNumeric(Num),dbo.FormatPhone(Num) from @PhoneNumbers
and here are my results:
Orig Stripped Formatted
1-800 427-4567 exten 18004274567456780 1- (800) 427-4567 ext. 456780
470 470 470
555-4561 x 1 55545611 555-4561 ext 1
913.135.9000 x:6390 91313590006390 (913) 135-9000 ext. 6390
(305)529-6911x2101 30552969112101 (305) 529-6911 ext. 2101
(206)965-2626 2069652626 (206) 96-52626
109.391.3261 1093913261 1-093-9132 ext 61
201 933-3101 ext 669 20193331016696 (201) 933-3101 ext. 6696
(919) 333-9395 x252 9193339395252 (919) 333-9395 ext. 252
190-325-3121 Ext. 50 19032531215096 1- (903) 253-1215 ext. 096
(313) 196-1000, ex. 313196100063936 (313) 196-1000 ext. 63936
(901) 123-3990 ext. 90112339901209 (901) 123-3990 ext. 1209
305.263.0311 3052630311 (305) 26-30311
223 123 1922 2231231922 (223) 12-31922
301-191-3255 3011913255 (301) 19-13255
619 932 6660 6199326660 (619) 93-26660
(939)651-3631 9396513631 (939) 65-13631
913-931-9300 9139319300 (913) 93-19300
609-665-5312 6096655312 (609) 66-55312
512.322.1396 5123221396 (512) 32-21396
(101) 261-1911 x 603 10126119116033 1- (012) 611-9116 ext. 033
316-361-5900 ext 120 316361590012021 (316) 361-5900 ext. 12021
223-123-1105 2231231105 (223) 12-31105
903-202-3339 x23031 903202333923031 (903) 202-3339 ext. 23031
631-261-5500 6312615500 (631) 26-15500
(950) 993-1151....99 95099311519999 (950) 993-1151 ext. 9999
939-599-2100 x11200 939599210011200 (939) 599-2100 ext. 11200
915-939-5500, x2139 91593955002139 (915) 939-5500 ext. 2139
Lowell
March 20, 2009 at 9:05 am
ok i think i'm wrong about area codes between 100 and 199(starting with 1), so you might want to remove the logic i put in place to assume starting with 1 means the 1-(xxx) yyy-zzzz ext aaaaa format
Lowell
March 20, 2009 at 7:30 pm
Lowell (3/20/2009)
ok i think i'm wrong about area codes between 100 and 199(starting with 1), so you might want to remove the logic i put in place to assume starting with 1 means the 1-(xxx) yyy-zzzz ext aaaaa format
Nah... you were correct. The first area code for public use is 201. Still, I wouldn't get phone number validation mixed in with cleaning up phone numbers. In fact, if I were cleaning them for my company, I certainly wouldn't clean them just to have them as a 10 digit blob of numbers with separators... I'd split them into 3, 3, 4, + digits for NPA, NXX, Line, and Extension. It's actually easier to do that than to clean them up and reformat them. The formatting should probably be done on the GUI side of the house.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 20, 2009 at 7:47 pm
Ok Shamshad... you've got almost 400 points to your name. That means you've been around long enough to know how to post data for sample data to make it easier on us. If you don't know how, then please study the article at the link in my signature below and study the following code because that's the best way to post sample data like what you have. Why? Because we can use it to test our own code so you get a better answer, quicker. 😉
Here's the test data/table...
SELECT '913.135.9000 x:6390' AS FullPhoneNumber
INTO #OriginalTable UNION ALL
SELECT '(305)529-6911x2101' UNION ALL
SELECT '(206)965-2626' UNION ALL
SELECT '109.391.3261' UNION ALL
SELECT '201 933-3101 ext 6696' UNION ALL
SELECT '(919) 333-9395 x252' UNION ALL
SELECT '190-325-3121 Ext. 5096' UNION ALL
SELECT '(313) 196-1000, ex. 63936' UNION ALL
SELECT '(901) 123-3990 ext. 1209' UNION ALL
SELECT '305.263.0311' UNION ALL
SELECT '223 123 1922' UNION ALL
SELECT '301-191-3255' UNION ALL
SELECT 'My new phone # is Area Code 637, my number is 5551515, and I''m at extension 1234. Got that?' UNION ALL
SELECT 'Yep... and mine is (842) number 444-1212.' UNION ALL
SELECT '619 932 6660' UNION ALL
SELECT '(939)651-3631' UNION ALL
SELECT '913-931-9300' UNION ALL
SELECT '609-665-5312 Extension 2-8-4444 'UNION ALL
SELECT '512.322.1396' UNION ALL
SELECT '(101) 261-1911 x 6033' UNION ALL
SELECT '316-361-5900 ext 12021' UNION ALL
SELECT '223-123-1105' UNION ALL
SELECT '903-202-3339 x23031' UNION ALL
SELECT '631-261-5500' UNION ALL
SELECT '(950) 993-1151....9999' UNION ALL
SELECT '939-599-2100 x11200' UNION ALL
SELECT '915-939-5500, x2139'
Notice that I threw in a couple of huge curve balls on the data... if you're going to clean something pretty much standardized like a phone number (in the "NANPA" areas, always 10 digits followed by an extension or not), you might as well expect the unexpected when it comes to stuff you might have to remove. This does NOT, however, validate the phone numbers or strip leading 1's... that should be done by something else.
Now, I suppose you can use a UDF to make life easier in the future... but, I wanna show you how to clean the whole bloody table all at once... and, yes, you do need a Tally table for the following code...
SELECT FullPhoneNumber,
t.N AS DigitPosition,
SUBSTRING(FullPhoneNumber,t.N,1) AS Digit
INTO #CleanTable
FROM dbo.Tally t
CROSS JOIN #OriginalTable yt
WHERE t.N <= LEN(FullPhoneNumber)
AND SUBSTRING(FullPhoneNumber,t.N,1) LIKE '[0-9]'
SELECT c1.FullPhoneNumber,
STUFF(
STUFF(
LEFT(
(
SELECT ''+c2.Digit
FROM #CleanTable c2
WHERE c1.FullPhoneNumber = c2.FullPhoneNumber
ORDER BY c2.DigitPosition
FOR XML PATH('')
)
,10) --Left 10 of all NANPA phone numbers for NPA NXX LINE
,7,0,'.') --Add period between NXX and LINE
,4,0,'.') --Add period between NPA and NXX
AS CleanNumber,
SUBSTRING(
(
SELECT ''+c2.Digit
FROM #CleanTable c2
WHERE c1.FullPhoneNumber = c2.FullPhoneNumber
ORDER BY c2.DigitPosition
FOR XML PATH('')
)
,11,1000)
AS CleanExtension
FROM #CleanTable c1
GROUP BY c1.FullPhoneNumber
I use a Temp table as interim storage instead of a CTE because the CTE would have to be referenced at least 3 times... that also means that the CTE would be executed 3 times.
And, if you need to know how a Tally table works and how to build it correctly, take a look at the following article...
http://www.sqlservercentral.com/articles/TSQL/62867/
If you want to see other articles about splitting and rebuilding parameters and the like, just click on my name in the Tally table article header.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 20, 2009 at 7:55 pm
Nope... bad move on my part... it's going to have to be a function because there's no guarantee that you won't have duplicate phone numbers after the split and that makes for some pretty ugly and long (not to mention, wrong, extensions). I'll be back in a minute.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 20, 2009 at 8:33 pm
Ok... again... here's the test data...
drop table #OriginalTable
go
SELECT '913.135.9000 x:6390' AS FullPhoneNumber
INTO #OriginalTable UNION ALL
SELECT '(305)529-6911x2101' UNION ALL
SELECT '(206)965-2626' UNION ALL
SELECT '109.391.3261' UNION ALL
SELECT '201 933-3101 ext 6696' UNION ALL
SELECT '(919) 333-9395 x252' UNION ALL
SELECT '190-325-3121 Ext. 5096' UNION ALL
SELECT '(313) 196-1000, ex. 63936' UNION ALL
SELECT '(901) 123-3990 ext. 1209' UNION ALL
SELECT '305.263.0311' UNION ALL
SELECT '223 123 1922' UNION ALL
SELECT '301-191-3255' UNION ALL
SELECT 'My new phone # is Area Code 637, my number is 5551515, and I''m at extension 1234. Got that?' UNION ALL
SELECT 'Yep... and mine is (842) number 444-1212.' UNION ALL
SELECT '619 932 6660' UNION ALL
SELECT '(939)651-3631' UNION ALL
SELECT '913-931-9300' UNION ALL
SELECT '609-665-5312 Extension 2-8-4444 'UNION ALL
SELECT '512.322.1396' UNION ALL
SELECT '512.322.1396' UNION ALL
SELECT '(101) 261-1911 x 6033' UNION ALL
SELECT '316-361-5900 ext 12021' UNION ALL
SELECT '223-123-1105' UNION ALL
SELECT '903-202-3339 x23031' UNION ALL
SELECT '631-261-5500' UNION ALL
SELECT '(950) 993-1151....9999' UNION ALL
SELECT '939-599-2100 x11200' UNION ALL
SELECT '915-939-5500, x2139'
Here's the functions...
CREATE FUNCTION dbo.Udf_GetPhone
(@FullPhoneNumber VARCHAR(200))
RETURNS CHAR(12)
AS
BEGIN
DECLARE @CleanPhone VARCHAR(200)
SELECT @CleanPhone = ''
SELECT @CleanPhone = @CleanPhone + SUBSTRING(@FullPhoneNumber,t.N,1)
FROM dbo.Tally t
WHERE t.N <= LEN(@FullPhoneNumber)
AND SUBSTRING(@FullPhoneNumber,t.N,1) LIKE '[0-9]'
RETURN (SELECT STUFF(STUFF(LEFT(@CleanPhone,10),7,0,'.'),4,0,'.'))
END
GO
CREATE FUNCTION dbo.Udf_GetExtension
(@FullPhoneNumber VARCHAR(200))
RETURNS CHAR(12)
AS
BEGIN
DECLARE @CleanPhone VARCHAR(200)
SELECT @CleanPhone = ''
SELECT @CleanPhone = @CleanPhone + SUBSTRING(@FullPhoneNumber,t.N,1)
FROM dbo.Tally t
WHERE t.N <= LEN(@FullPhoneNumber)
AND SUBSTRING(@FullPhoneNumber,t.N,1) LIKE '[0-9]'
RETURN (SUBSTRING(@CleanPhone,11,20))
END
And here's how to use them...
SELECT FullPhoneNumber,
dbo.Udf_GetPhone(FullPhoneNumber) AS CleanPhoneNumber,
dbo.Udf_GetExtension(FullPhoneNumber) AS CleanExtension
FROM #OriginalTable
It may be even faster if you make a table valued function to return both the phone number and the extension on a single row and do a CROSS APPLY... I won't do that, though, because, like I said, I'd split it down to the NPA/NXX/LINE/Extension columns.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 23, 2009 at 7:19 am
Hello Friends,
Infront of u biggiesss i m too small to write antyhing but ,,, still i tried and i got some ansewer....... here is the code .....
I have considered in any case first ten digit will be the phone number.........
DECLARE @VARIABLE AS VARCHAR(50)
SET @VARIABLE = UPPER('913.135.9000 x:6390')
DECLARE @I INT
SET @I=LEN(@VARIABLE)
PRINT @I
WHILE @I<>0
BEGIN
IF SUBSTRING(@VARIABLE,@I,1) NOT BETWEEN '0' AND '9'
BEGIN
SET @VARIABLE=REPLACE(@VARIABLE,SUBSTRING(@VARIABLE,@I,1),' ')
END
SET @I=@I-1
END
SET @VARIABLE=REPLACE(@VARIABLE,' ','')
--style 1
SELECT SUBSTRING(@VARIABLE,1,10) AS PHONE_NUMBER,
SUBSTRING(@VARIABLE,11,LEN(@VARIABLE)) AS EXT_NUMBER
---style 2
SELECT '('+SUBSTRING(@VARIABLE,1,3)+')'+'('+SUBSTRING(@VARIABLE,4,3)+')'+'('+SUBSTRING(@VARIABLE,7,4)+')' AS PHONE_NUMBER,
SUBSTRING(@VARIABLE,11,LEN(@VARIABLE)) AS EXT_NUMBER
--style 3
SELECT SUBSTRING(@VARIABLE,1,3)+'.'+SUBSTRING(@VARIABLE,4,3)+'.'+SUBSTRING(@VARIABLE,7,4) AS PHONE_NUMBER,
SUBSTRING(@VARIABLE,11,LEN(@VARIABLE)) AS EXT_NUMBER
just need to copy and run the code , try changing the values to the @variable string.....
let me know the bugs my script have.....
thanks @ regards,
Mithun
March 24, 2009 at 7:35 pm
Hey Shamshad... any feedback on this?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply