November 22, 2007 at 12:08 pm
I'm still battling with PatIndex.I have two issues TYpe_1 is partly correct except
that if
1.
For the Column Type_1 : using the first record , I'm trying to extract the '80' as the TYPE_1
its seperated by a space. Sometimes in one space sometimes its a double space.
2.
For Column Code1 : i.e '80 98439 9991 997 -50.000' I want to extract the 98439.Sometimes this
is 4 characters long or 3 or five.
If I do it in two steps it works. How can I do it in one step and split those two field.
Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990
CREATE TABLE #Test (String Varchar(100),ROWID INT IDENTITY(1,1))
INSERT #test VALUES ('80 98439 9991 997 -50.000')
INSERT #test VALUES ('0000058916 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('01 9990 9991 997 -127.910')
INSERT #test VALUES ('01 759 9991 997 -0.560')
INSERT #test VALUES ('80 98440 9991 997 -84.330')
INSERT #test VALUES ('80 98474 9991 997 -2.000')
INSERT #test VALUES ('80 98402 9991 997 -0.280')
INSERT #test VALUES ('0000058916 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('01 6060 3 997 -1000.000 CB')
INSERT #test VALUES ('METLIFE_MALL_2 LA TREE D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -9.500')
INSERT #test VALUES ('80 98402 3 997 -2.030')
INSERT #test VALUES ('01 6060 3 997 -500.000 CB')
INSERT #test VALUES ('M.A.N_SUPERMKT KI NG WILLIAM D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -4.750')
INSERT #test VALUES ('80 98402 3 997 -1.370')
INSERT #test VALUES ('01 6060 3 997 -100.000 CB')
INSERT #test VALUES ('M.A.N_SUPERMKT KI NG WILLIAM D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -0.950')
INSERT #test VALUES ('80 98402 3 997 -0.830')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('01 6060 3 997 -20.000 CB')
INSERT #test VALUES ('METLIFE_MALL_2 LA TREE D')
INSERT #test VALUES ('80 98418 3 997 -5.000')
INSERT #test VALUES ('80 98421 3 997 -0.190')
INSERT #test VALUES ('80 98402 3 997 -0.730')
INSERT #test VALUES ('80 98484 9991 997 -50.000')
INSERT #test VALUES ('80 98402 9991 997 -7.000')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
INSERT #test VALUES ('0000074164 00000074000039708')
INSERT #test VALUES ('80 98408 9991 997 -10.000')
INSERT #test VALUES ('80 98402 9991 997 -1.400')
INSERT #test VALUES ('80 98416 3 997 -1.750')
INSERT #test VALUES ('80 98402 3 997 -0.250')
----Step 1
drop table ##DDD
SELECT
SUBSTRING(String, 1, CHARINDEX(' ', String) - 1) AS TYPE_1,
LTRIM(RTRIM(SUBSTRING(String,CHARINDEX(' ', String) + 1, LEN(String)))) AS Code
,MonetoryValue =
CASE
WHEN CHARINDEX('.',REVERSE(String)) = 0 THEN NULL ELSE
REVERSE(SUBSTRING(REVERSE(String), CHARINDEX('.',REVERSE(String))+1,
CHARINDEX(' ',REVERSE(String),CHARINDEX('.',REVERSE(String))) - CHARINDEX('.',REVERSE(String))))
END +
CASE
WHEN CHARINDEX('.',String) =0 THEN NULL ELSE
REPLACE(REPLACE(SUBSTRING(String,CHARINDEX('.',String)
,LEN(String) - CHARINDEX('.',String)+1),'CB',''),'FC','')
END
,String INTO ##DDD
FROM
#Test WHERE String != 'M.A.N_SUPERMKT KI NG WILLIAM D'
--Step 2
SELECT Type_1,MonetoryValue,LEFT(Code, NULLIF(CHARINDEX(' ', Code) - 1, -1)) AS Code1,Code,string FROM ##DDD
November 22, 2007 at 9:15 pm
Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990
Hey! Matt Miller! This looks like a great place for one of those awesome Regex things you do! 🙂
RayM,
In the absense of Regex, you first need to build a Tally table to make your life easier for a lot of things. A Tally table is nothing more than a table with a single column of very well indexed sequential numbers... it's used to replace loops and the like... here's how to build one...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Once that's done, your problem becomes a whole lot easier and doesn't require performance robbers like the REVERSE function...
SELECT s.RowID,
Type = MAX(CASE WHEN s.Posit =1 AND LEN(s.SplitString) =2 THEN s.SplitString ELSE NULL END),
Code = MAX(CASE WHEN s.Posit =2 AND LEN(s.SplitString)<=5 THEN s.SplitString ELSE NULL END)
FROM (--==== Derived table "s" splits all of the strings and marks the position (ordinal index) of each
SELECT RowID,
SplitString = SUBSTRING(' '+h.String+' ', t.N+1, CHARINDEX(' ', ' '+h.String+' ', t.N+1)-t.N-1),
Posit = t.N-LEN(REPLACE(LEFT(' '+h.String+' ',t.N), ' ', ''))
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case String is NULL
#Test h
ON SUBSTRING(' '+h.String+' ', t.N, 1) = ' '
AND t.N < LEN(' '+h.String+' ')
AND h.String NOT LIKE '%[_]%'
)s
WHERE s.Posit IN (1,2)
GROUP BY s.RowID
--Jeff Moden
Change is inevitable... Change for the better is not.
November 22, 2007 at 9:23 pm
Sorry... missed the "MonetaryValue" column that you included in your code but not in your problem definition... this will do it...
SELECT s.RowID,
Type = MAX(CASE WHEN s.Posit =1 AND LEN(s.SplitString) =2 THEN s.SplitString ELSE NULL END),
Code = MAX(CASE WHEN s.Posit =2 AND LEN(s.SplitString)<=6 THEN s.SplitString ELSE NULL END),
MonetaryValue = MAX(CASE WHEN s.Posit =5 THEN s.SplitString ELSE NULL END)
FROM (--==== Derived table "s" splits all of the strings and marks the position (ordinal index) of each
SELECT RowID,
SplitString = SUBSTRING(' '+h.String+' ', t.N+1, CHARINDEX(' ', ' '+h.String+' ', t.N+1)-t.N-1),
Posit = t.N-LEN(REPLACE(LEFT(' '+h.String+' ',t.N), ' ', ''))
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case String is NULL
#Test h
ON SUBSTRING(' '+h.String+' ', t.N, 1) = ' '
AND t.N < LEN(' '+h.String+' ')
AND h.String NOT LIKE '%[_]%'
)s
WHERE s.Posit IN (1,2,5)
GROUP BY s.RowID
--Jeff Moden
Change is inevitable... Change for the better is not.
November 23, 2007 at 5:53 am
SELECT * FROM tbPROCESSED
This is the full picture of what I'm trying to do.The problematic field is called Actual_Text which is the string that requires manipulation.
I'm deriving three fields from field Actual_Text
fields Type,Code & MonetoryValue
Field Type is the digits starting from the left and end when it encounters a space
only where there is a Valid monetoryValue
ie. '80 98437 3 997 -2.500'
in this case Type will be = 80
ie. 'PARK AVENUE MANOR Pr RAZOO 0'
in this case Type will be = null
ie. '530719000254'
in this case Type will be = null
-----------------------------Code
Field Code is the digits after the first space starting from the left and ends when it encounters the
second space ,only where there is a Valid monetoryValue
ie. '80 98437 3 997 -2.500'
Code will be = 98437
ie. 'PARK AVENUE MANOR Pr RAZOO 0'
Code will be = null
ie. '530719000254'
Code will be = null
ie. '01 6060 3 997 -20.000 CO'
Code will be = 6060
ie. '0997 2247'
Code will be = null because there is no MonetoryValue
---MonetoryValue
I can extract the monetory value only problem is ho do I remove
the alpha characters without using replace
i.e '01 6060 3 997 -20.000 CO'
'-20.000 CO' to exclude CO etc...
'01 6060 3 997 -50.000 CB'
November 23, 2007 at 11:12 am
The code I posted extracts all three fields even for the "01" records...
You didn't even try the code, did you?
--Jeff Moden
Change is inevitable... Change for the better is not.
November 26, 2007 at 12:33 pm
Jeff Moden (11/22/2007)
Output should be:
'80 98439 9991 997 -50.000' should be Type = 80 Code = 98439
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'0000058916 00000074000039708' should be Type = NULL Code = NULL
'01 759 9991 997 -0.560' should be Type = 01 Code = 759
'01 9990 9991 997 -127.910' should be Type = 01 Code = 9990
Hey! Matt Miller! This looks like a great place for one of those awesome Regex things you do! 🙂
Sorry - just realized I got "paged" to the SSC forum during the long vacation weekend....:)
Yes - you could use a CLR Regex Function to do just that if you want to. It's actually rather cute, since it allows you to "throw out" the result if you don't have BOTH (if you so desire).
I just put this together, if you should want to go down that path.
Here's the CLR (.NET code):
Imports System
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports System.Text.RegularExpressions
Partial Public Class UserDefinedFunctions
Private Const optionS As RegexOptions = RegexOptions.CultureInvariant
<Microsoft.SqlServer.Server.SqlFunction(IsDeterministic:=True, IsPrecise:=True)< _
Public Shared Function RegexMatch(ByVal input As SqlChars, ByVal pattern As SqlString) As SqlString
' Add your code here
Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)
Return New SqlString(CType(rex.IsMatch(New String(input.Value)), String))
End Function
<Microsoft.SqlServer.Server.SqlFunction(IsDeterministic:=True, IsPrecise:=True)< _
Public Shared Function RegexMatchGroup(ByVal input As SqlChars, ByVal pattern As SqlString, byval groupnum as SqlInt32) As SqlString
Dim rex As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern.Value, optionS)
Return New SqlString(CType(rex.Matches(New String(input.Value))(0).Groups(groupnum.Value).ToString, String))
End Function
End Class
At that point, you could use something like this to pick out the pieces:
select dbo.regexmatchgroup('80 12345 p9a996458aaa','^(?<1>[0-9]{2}) (?<2>[0-9]{3,5})',2)
The last parameter could be 0, 1, or 2 (0=whole string, 1/2 = the named backreferences to 80 and 12345 respectively in my example).
It does require (at least for me) knowing how to correctly structure the grouping constructs for .NET, but this should give you a running start.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
November 26, 2007 at 3:55 pm
There he is! And I knew he'd have a pretty good answer for this one, too!
How was the weekend, Matt?
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply