December 7, 2014 at 5:46 pm
Hello Good Evening, i have a phone number table which has single column in it with different style of phone numbers, i am trying to change it as required format
Required format are
1) Always start with a 1
2) Only Left 14 charcaters after spaces elimination (if there is any)
3) No spaces
4) special characters like (,),-,/ need to replace with '.'
5) if there is no separation/space then format 1.800.###.#### need to apply
create table #myphoneno (phone varchar(20))
Insert into #myphoneno values ('(866) 987-3847' ) --- Rule 1,3,4
Insert into #myphoneno values ('1-800-222-SAVE(7283)') --- Rule 2,4
Insert into #myphoneno values ('(800) 111-4015') --Rule 3,4
Insert into #myphoneno values ('18002288775') ---Rule 5
Insert into #myphoneno values ('(800) 2223415')--- Rule 1,3,4,5
Insert into #myphoneno values ('(800) 2-MOHAN')--- Rule 1,,3,4
Insert into #myphoneno values ('1-800-228-4995') ---Rule 4
Insert into #myphoneno values ('877-233-0112')---- Rule 1,4
Insert into #myphoneno values ('800 228 3315')--- 1,3,4
Insert into #myphoneno values ('18002284225')---Rule 5
Insert into #myphoneno values ('800/219-1140') --- Rule 1,4
Insert into #myphoneno values ('1 (855) ECO-8107') -- Rule 3,4
Insert into #myphoneno values ('(888) BOM4BQC')-- Rule1,3,4
Insert into #myphoneno values ('877-421K-MOX')-- Rule 1,4
Select * from #myphoneno
create table #myphonenoExpected (phone varchar(20))
Insert into #myphonenoExpected values ('1.866.987.3847' )
Insert into #myphonenoExpected values ('1.800.222.SAVE')
Insert into #myphonenoExpected values ('1.800.111.4015')
Insert into #myphonenoExpected values ('1.800.228.8775')
Insert into #myphonenoExpected values ('1.800.222.3415')
Insert into #myphonenoExpected values ('1.800.2.MOHAN')
Insert into #myphonenoExpected values ('1.800.228.4995')
Insert into #myphonenoExpected values ('1.877.233.0112')
Insert into #myphonenoExpected values ('1.800.228.3315')
Insert into #myphonenoExpected values ('1.800.228.4225')
Insert into #myphonenoExpected values ('1.800.219.1140')
Insert into #myphonenoExpected values ('1.855.ECO.8107')
Insert into #myphonenoExpected values ('1.888.BOM.4BQC')
Insert into #myphonenoExpected values ('1.877.421K.MOX')
Select * from #myphonenoExpected
Thank you in advance
Asiti
December 7, 2014 at 7:13 pm
I'm getting closer...
try something like this:
SELECT phone
, FixPhone
, LEFT(FixPhone,1) + '-(' + SUBSTRING(FixPhone,2,3) + ') ' + SUBSTRING(FixPhone,5,3) + '-' + SUBSTRING(FixPhone,8,4)
FROM (
SELECT phone
, CASE WHEN LEFT(phone,1)='1' THEN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(phone,'(','.'),')','.'),'.',''),'-',''),' ','')
ELSE '1' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(phone,'(','.'),')','.'),'.',''),'-',''),' ','') END AS FixPhone
FROM myphoneno) x;
December 8, 2014 at 3:38 am
Your examples show phone numbers from the US only. If your data can validly contain phone numbers from other countries then your validation and formatting is not correct.
Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.
When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara
December 8, 2014 at 3:41 am
Quick solution, counts the spaces after "cleaning" to determin whether there is a format present
😎
USE tempdb;
GO
SET NOCOUNT ON;
;WITH BASE_CLEANING AS
(
SELECT
P.phone
,RTRIM(LTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(P.phone,'-'
,CHAR(32)),'(',CHAR(32)),')',CHAR(32)),'/',CHAR(32)),'-',CHAR(32)),' ',CHAR(32)),' ',CHAR(32)))
) AS CLS_STR
FROM #myphoneno P
)
,FORMAT_DETECT AS
(
SELECT
BC.phone
,BC.CLS_STR
,LEN('-' + BC.CLS_STR + '-') - LEN(REPLACE('-' + BC.CLS_STR + '-',' ','')) AS NUM_SPACES
,CASE WHEN SUBSTRING(BC.CLS_STR,1,1) = '1' THEN 1 ELSE 0 END AS HAS_LEADING_ONE
FROM BASE_CLEANING BC
)
SELECT
FD.phone AS PHONE_BEFORE
,CASE
WHEN FD.HAS_LEADING_ONE = 1 AND FD.NUM_SPACES > 2 THEN SUBSTRING(FD.CLS_STR,1,14)
WHEN FD.HAS_LEADING_ONE = 0 AND FD.NUM_SPACES > 1 THEN SUBSTRING('1 ' + FD.CLS_STR,1,14)
WHEN FD.HAS_LEADING_ONE = 1 AND FD.NUM_SPACES = 0 THEN '1 ' + SUBSTRING(FD.CLS_STR,2,3) + CHAR(32)
+ SUBSTRING(FD.CLS_STR,5,3) + CHAR(32)
+ SUBSTRING(FD.CLS_STR,8,4)
WHEN FD.HAS_LEADING_ONE = 0 AND FD.NUM_SPACES = 0 THEN '1 ' + SUBSTRING(FD.CLS_STR,1,3) + CHAR(32)
+ SUBSTRING(FD.CLS_STR,4,3) + CHAR(32)
+ SUBSTRING(FD.CLS_STR,7,4)
WHEN FD.HAS_LEADING_ONE = 0 AND FD.NUM_SPACES = 1 THEN '1 ' + SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),1,3) + CHAR(32)
+ SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),4,3) + CHAR(32)
+ SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),7,4)
WHEN FD.HAS_LEADING_ONE = 1 AND FD.NUM_SPACES < 3 THEN '1 ' + SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),2,3) + CHAR(32)
+ SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),5,3) + CHAR(32)
+ SUBSTRING(REPLACE(FD.CLS_STR,CHAR(32),''),8,4)
ELSE 'UNKNOWN (' + FD.CLS_STR + ')'
END AS PHONE_AFTER
FROM FORMAT_DETECT FD;
Results
PHONE_BEFORE PHONE_AFTER
-------------------- ---------------
(866) 987-3847 1 866 987 3847
1-800-222-SAVE(7283) 1 800 222 SAVE
(800) 111-4015 1 800 111 4015
18002288775 1 800 228 8775
(800) 2223415 1 800 222 3415
(800) 2-MOHAN 1 800 2 MOHAN
1-800-228-4995 1 800 228 4995
877-233-0112 1 877 233 0112
800 228 3315 1 800 228 3315
18002284225 1 800 228 4225
800/219-1140 1 800 219 1140
1 (855) ECO-8107 1 855 ECO 8107
(888) BOM4BQC 1 888 BOM 4BQC
877-421K-MOX 1 877 421K MOX
December 8, 2014 at 12:12 pm
Are you required to store the number with the formatting? It can be easier to apply the formatting when retrieving the phone number.
Don Simpson
December 8, 2014 at 1:12 pm
DonlSimpson (12/8/2014)
Are you required to store the number with the formatting? It can be easier to apply the formatting when retrieving the phone number.
Quick thought, if there are formatting rules applicable such as in this case, the input should validate rather than the storage. On the other hand, if the data is coming from external sources, it is generally better to unify the storage format, saves one from lots of problems down the line....
😎
December 8, 2014 at 1:28 pm
Eirikur Eiriksson (12/8/2014)
DonlSimpson (12/8/2014)
Are you required to store the number with the formatting? It can be easier to apply the formatting when retrieving the phone number.Quick thought, if there are formatting rules applicable such as in this case, the input should validate rather than the storage. On the other hand, if the data is coming from external sources, it is generally better to unify the storage format, saves one from lots of problems down the line....
😎
I agree completely. What I meant to suggest was stripping all formatting from the number and storing the un-formatted number, then formatting as appropriate when displaying.
Don Simpson
December 8, 2014 at 1:43 pm
My approach below. I added a phone_id to the phone#s in the tables so I could join and show the expected output in the same query as the original phone# and edited phone#. This code won't necessarily handle 100% of all variations as you'd like if letters are involved, but you can tweak it as needed.
set nocount on
if object_id('tempdb.dbo.#myphoneno') is not null drop table #myphoneno
create table #myphoneno (phone_id int, phone varchar(20))
Insert into #myphoneno values (01, '(866) 987-3847' ) --- Rule 1,3,4
Insert into #myphoneno values (02, '1-800-222-SAVE(7283)') --- Rule 2,4
Insert into #myphoneno values (03, '(800) 111-4015') --Rule 3,4
Insert into #myphoneno values (04, '18002288775') ---Rule 5
Insert into #myphoneno values (05, '(800) 2223415')--- Rule 1,3,4,5
Insert into #myphoneno values (06, '(800) 2-MOHAN')--- Rule 1,,3,4
Insert into #myphoneno values (07, '1-800-228-4995') ---Rule 4
Insert into #myphoneno values (08, '877-233-0112')---- Rule 1,4
Insert into #myphoneno values (09, '800 228 3315')--- 1,3,4
Insert into #myphoneno values (10, '18002284225')---Rule 5
Insert into #myphoneno values (11, '800/219-1140') --- Rule 1,4
Insert into #myphoneno values (12, '1 (855) ECO-8107') -- Rule 3,4
Insert into #myphoneno values (13, '(888) BOM4BQC')-- Rule1,3,4
Insert into #myphoneno values (14, '877-421K-MOX')-- Rule 1,4
--Select * from #myphoneno
if object_id('tempdb.dbo.#myphonenoexpected') is not null drop table #myphonenoexpected
create table #myphonenoExpected (phone_id int, phone varchar(20))
Insert into #myphonenoExpected values (01, '1.866.987.3847' )
Insert into #myphonenoExpected values (02, '1.800.222.SAVE')
Insert into #myphonenoExpected values (03, '1.800.111.4015')
Insert into #myphonenoExpected values (04, '1.800.228.8775')
Insert into #myphonenoExpected values (05, '1.800.222.3415')
Insert into #myphonenoExpected values (06, '1.800.2.MOHAN')
Insert into #myphonenoExpected values (07, '1.800.228.4995')
Insert into #myphonenoExpected values (08, '1.877.233.0112')
Insert into #myphonenoExpected values (09, '1.800.228.3315')
Insert into #myphonenoExpected values (10, '1.800.228.4225')
Insert into #myphonenoExpected values (11, '1.800.219.1140')
Insert into #myphonenoExpected values (12, '1.855.ECO.8107')
Insert into #myphonenoExpected values (13, '1.888.BOM.4BQC')
Insert into #myphonenoExpected values (14, '1.877.421K.MOX')
set nocount off
SELECT mp.phone AS phone_original, mpe.phone AS phone_expected, phone_edited5
FROM #myphoneno mp
left outer join #myphonenoExpected mpe on
mpe.phone_id = mp.phone_id
CROSS APPLY (
SELECT LEFT(CASE WHEN SUBSTRING(phone_edited4,10, 1) LIKE '[.A-Z]' THEN phone_edited4 ELSE STUFF(phone_edited4,10, 0, '.') END, 14)
AS phone_edited5
FROM (
SELECT CASE WHEN SUBSTRING(phone_edited3, 6, 1) LIKE '[.A-Z]' THEN phone_edited3 ELSE STUFF(phone_edited3, 6, 0, '.') END
AS phone_edited4
FROM (
SELECT CASE WHEN SUBSTRING(phone_edited2, 2, 1) LIKE '[.A-Z]' THEN phone_edited2 ELSE STUFF(phone_edited2, 2, 0, '.') END
AS phone_edited3
FROM (
SELECT CASE WHEN LEFT(phone_edited1, 1) = '1' THEN '' ELSE '1' END + phone_edited1
AS phone_edited2
FROM (
SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(mp.phone,
SPACE(1), ''), '(', '.'), ')', '.'), '-', '.'), '/', '.') AS phone_edited1
) AS derived1
) AS derived2
) AS derived3
) AS derived4
) AS assign_alias_names
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 9, 2014 at 8:19 am
Borrowing from Scott's table code, and adding in Dwain's sweet pattern splitter[/url], which I used for a long time to 'clean' phone numbers. This will get you most of the way to where you're going pretty quickly based on sample data.
-- PatternSplitCM will split a string based on a pattern of the form
-- supported by LIKE and PATINDEX
--
-- Created by: Chris Morris 12-Oct-2012
CREATE FUNCTION [dbo].[PatternSplitCM]
(
@List VARCHAR(8000) = NULL
,@Pattern VARCHAR(50)
) RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH numbers AS (
SELECT TOP(ISNULL(DATALENGTH(@List), 0))
n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) f (n),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) g (n))
SELECT
ItemNumber = ROW_NUMBER() OVER(ORDER BY MIN(n)),
Item = SUBSTRING(@List,MIN(n),1+MAX(n)-MIN(n)),
[Matched]
FROM (
SELECT n, y.[Matched], Grouper = n - ROW_NUMBER() OVER(ORDER BY y.[Matched],n)
FROM numbers
CROSS APPLY (
SELECT [Matched] = CASE WHEN SUBSTRING(@List,n,1) LIKE @Pattern THEN 1 ELSE 0 END
) y
) d
GROUP BY [Matched], Grouper
SET nocount on
if object_id('tempdb.dbo.#myphoneno') is not null drop table #myphoneno
create table #myphoneno (phone_id int, phone varchar(20))
Insert into #myphoneno values (01, '(866) 987-3847' ) --- Rule 1,3,4
Insert into #myphoneno values (02, '1-800-222-SAVE(7283)') --- Rule 2,4
Insert into #myphoneno values (03, '(800) 111-4015') --Rule 3,4
Insert into #myphoneno values (04, '18002288775') ---Rule 5
Insert into #myphoneno values (05, '(800) 2223415')--- Rule 1,3,4,5
Insert into #myphoneno values (06, '(800) 2-MOHAN')--- Rule 1,,3,4
Insert into #myphoneno values (07, '1-800-228-4995') ---Rule 4
Insert into #myphoneno values (08, '877-233-0112')---- Rule 1,4
Insert into #myphoneno values (09, '800 228 3315')--- 1,3,4
Insert into #myphoneno values (10, '18002284225')---Rule 5
Insert into #myphoneno values (11, '800/219-1140') --- Rule 1,4
Insert into #myphoneno values (12, '1 (855) ECO-8107') -- Rule 3,4
Insert into #myphoneno values (13, '(888) BOM4BQC')-- Rule1,3,4
Insert into #myphoneno values (14, '877-421K-MOX')-- Rule 1,4
--Select * from #myphoneno
if object_id('tempdb.dbo.#myphonenoexpected') is not null drop table #myphonenoexpected
create table #myphonenoExpected (phone_id int, phone varchar(20))
Insert into #myphonenoExpected values (01, '1.866.987.3847' )
Insert into #myphonenoExpected values (02, '1.800.222.SAVE')
Insert into #myphonenoExpected values (03, '1.800.111.4015')
Insert into #myphonenoExpected values (04, '1.800.228.8775')
Insert into #myphonenoExpected values (05, '1.800.222.3415')
Insert into #myphonenoExpected values (06, '1.800.2.MOHAN')
Insert into #myphonenoExpected values (07, '1.800.228.4995')
Insert into #myphonenoExpected values (08, '1.877.233.0112')
Insert into #myphonenoExpected values (09, '1.800.228.3315')
Insert into #myphonenoExpected values (10, '1.800.228.4225')
Insert into #myphonenoExpected values (11, '1.800.219.1140')
Insert into #myphonenoExpected values (12, '1.855.ECO.8107')
Insert into #myphonenoExpected values (13, '1.888.BOM.4BQC')
Insert into #myphonenoExpected values (14, '1.877.421K.MOX')
set nocount OFF
SELECT * ,
( SELECT ' ' + Item
FROM dbo.PatternSplitCM([phone], '[0-9a-zA-Z]')
WHERE [Matched] = 1
FOR
XML PATH('')
)
FROM #myphonenoExpected
December 10, 2014 at 12:39 pm
Using this PatReplace8K function:
CREATE FUNCTION dbo.PatReplace8K
(
@String VARCHAR(8000),
@Pattern VARCHAR(50),
@Replace VARCHAR(1)
)
/*
Created by Alan Burstein Nov(ish)/2014
With help by Eirikur Eiriksson ;-)
*/
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS E1(N)),
Tally(N) AS (SELECT TOP (LEN(@String)) CHECKSUM(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM E1 a,E1 b,E1 c,E1 d)
SELECT NewString =
CAST
(
(
SELECT CASE
WHEN PATINDEX(@Pattern,SUBSTRING(@String COLLATE Latin1_General_BIN,N,1)) = 0
THEN SUBSTRING(@String,N,1)+''
ELSE @replace+''
END
FROM Tally
FOR XML PATH(''), TYPE
)
AS varchar(8000)
);
GO
You could do this:
Insert into #myphoneno values ('800-222-SAVE(7283)') --- Dup, just adding to test the "1 in front" situation
WITH
prep AS
(SELECT *, px = patindex('%[a-z][a-z][a-z][a-z]([0-9][0-9][0-9][0-9])%',phone)
FROM #myphoneno
),
transform1 AS
(SELECT phone, newnum = newstring
FROM prep
CROSS APPLY PatReplace8K(phone, '[^0-9]','') xx
WHERE 0 <> px
UNION ALL
SELECT phone, xx9.newstring
FROM prep
CROSS APPLY dbo.PatReplace8K(phone, '[a-cA-C]','2') xx1
CROSS APPLY dbo.PatReplace8K(xx1.newstring, '[d-fD-F]','3') xx2
CROSS APPLY dbo.PatReplace8K(xx2.newstring, '[g-iG-I]','4') xx3
CROSS APPLY dbo.PatReplace8K(xx3.newstring, '[j-lJ-L]','5') xx4
CROSS APPLY dbo.PatReplace8K(xx4.newstring, '[m-oM-O]','6') xx5
CROSS APPLY dbo.PatReplace8K(xx5.newstring, '[p-sP-S]','7') xx6
CROSS APPLY dbo.PatReplace8K(xx6.newstring, '[t-vT-V]','8') xx7
CROSS APPLY dbo.PatReplace8K(xx7.newstring, '[w-zW-Z]','9') xx8
CROSS APPLY dbo.PatReplace8K(xx8.newstring, '[^0-9]','') xx9
WHERE 0 = px
),
transform2 AS
(SELECT old = phone, newnum = IIF(substring(newnum,1,1)=1, newnum, concat(1,newnum))
FROM transform1
)
SELECT old, newnum = IIF(len(newnum)<>11,'BAD!',stuff(stuff(stuff(newnum, 2,0,'.'),6,0,'.'),10,0,'.'))
FROM transform2;
and get this:
oldnewnum
--------------------------------------------------------------------------------------------------------------------------
1-800-222-SAVE(7283)1.800.222.7283
800-222-SAVE(7283) 1.800.222.7283
(866) 987-3847 1.866.987.3847
(800) 111-40151.800.111.4015
180022887751.800.228.8775
(800) 22234151.800.222.3415
(800) 2-MOHANBAD!
1-800-228-49951.800.228.4995
877-233-01121.877.233.0112
800 228 33151.800.228.3315
180022842251.800.228.4225
800/219-11401.800.219.1140
1 (855) ECO-81071.855.326.8107
(888) BOM4BQC1.888.266.4272
877-421K-MOX1.877.421.5669
-- Itzik Ben-Gan 2001
December 10, 2014 at 6:18 pm
Now that there are multiple methods of cleaning up the data, have you considered cleaning up the source of the data so that this type of stuff doesn't happen to begin with?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2014 at 3:18 pm
Hi Jeff, I fully agree with cleaning the source. Only problem is it depends if you are responsible for it, if you are an application or sql developer, you can but maybe the company has a data guy that's responsible for Inserts/Updates/Deletes (db maintenance), if that's the case you can ask the data guy to do it (hopefully he gets the time for it, else it might never get done or only after a few months). So what do you do in the meanwhile when customer want's job done like yesterday ? Guess the same as most of us, work-around it. If you do BI you can work around it with fancy expressions in SSIS, but that's again a work-around. If only all developers had enough time to do cleaning / optimization then development would be much easier, SQL queries execute much faster. Unfortunately, with the worldwide economy not looking that bright (companies retrenching staff or closing shop), in many cases a developer might end up doing 2 to 3 developers jobs, meaning cleaning / optimization might never happen.
December 21, 2014 at 2:15 pm
Oh, I definitely agree, Kevin. Sometimes you do have to play with the cards you're dealt. I do try to get the dealer to deal better cards, though.
So, are you all set now?
--Jeff Moden
Change is inevitable... Change for the better is not.
December 22, 2014 at 2:03 pm
CELKO (12/22/2014)
Pull all the punctuation out and pad with digits as needed until you get the Standard International phone number. Let the presentation layers handle presentations.http://en.wikipedia.org/wiki/E.123
You also might want to look at a data scrubbing package to clean up this and other data.
BWAA-HAAAA... with 1 to 3 digits for a country code, 0 to 4 digits for a city code, the dialer zones of Mexico, and the NPA/NXX format in the North American Numbering plan, I'm thinking that there isn't actually a consistent standard other than using just spaces for semantical groupings. Even the Wiki article used a "+" sign to indicate a dialer prefix.
--Jeff Moden
Change is inevitable... Change for the better is not.
December 22, 2014 at 2:57 pm
Jeff Moden (12/22/2014)
CELKO (12/22/2014)
Pull all the punctuation out and pad with digits as needed until you get the Standard International phone number. Let the presentation layers handle presentations.http://en.wikipedia.org/wiki/E.123
You also might want to look at a data scrubbing package to clean up this and other data.
BWAA-HAAAA... with 1 to 3 digits for a country code, 0 to 4 digits for a city code, the dialer zones of Mexico, and the NPA/NXX format in the North American Numbering plan, I'm thinking that there isn't actually a consistent standard other than using just spaces for semantical groupings. Even the Wiki article used a "+" sign to indicate a dialer prefix.
Store a formatting code for how the output needs to be. I can't image you'd need more than a tinyint for that. You still never need to store the formatting in the number itself, and you shouldn't.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply