April 5, 2021 at 10:16 am
Hi,
I need some guidelines to convert Teradata REGEXP_INSTR logic into SQL server code. Please find the attached file in which I had provided a sample script.
Appreciate your reply in advance.
Thanks
April 5, 2021 at 10:18 am
Not allow me to attahed .sql file so copy paste code here.
SELECT case when REGEXP_INSTR(string, '/ OB/') > 0
OR REGEXP_INSTR(string, '/OB/') > 0
OR REGEXP_INSTR(string, 'OB//') > 0
OR REGEXP_INSTR(string, 'T/O') > 0
OR REGEXP_INSTR(string, '/ OB') > 0
OR REGEXP_INSTR(string, 'ON OB') > 0
THEN 'OB'
when REGEXP_INSTR(string, 'TP[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RR[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RRO/[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP [0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP NUM[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RRA[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, '/R[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP NO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP NBR[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RR NBR[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RRO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, '/RRO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, '/RR [0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RR[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TN-[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP NO.[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TRIP PASS[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TPAUTH[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TPNO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TAO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, '/RPO[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RRP[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'TP/[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, '/RFA[0-9]{5,7}') > 0
OR REGEXP_INSTR(string, 'RRA[0-9]{5,7}') > 0
THEN 'TP'
ELSE 'PP' END XYZ
FROM TABLE A
April 5, 2021 at 8:13 pm
Think the easiest way will be to break out what each REGEX expression you indicated means and build up case statements that reflect this. I am just going to help with the first one and let you go from there. The "teach a man to fish" approach. So you have this:
SELECT case when REGEXP_INSTR(string, '/ OB/') > 0
OR REGEXP_INSTR(string, '/OB/') > 0
OR REGEXP_INSTR(string, 'OB//') > 0
OR REGEXP_INSTR(string, 'T/O') > 0
OR REGEXP_INSTR(string, '/ OB') > 0
OR REGEXP_INSTR(string, 'ON OB') > 0
THEN 'OB'
So, lets grab each of these in turn and parse them. /OB/ is the first one. Now, without knowing 100% how teradata handles this or what your data looks like, I am going to assume that those are all literal things you are looking up. That is you are looking up the literal string /OB/. The reason I think this is because /OB/ is not valid REGEX (as far as I know and based on the online regex tool I was testing this with). So, what I am thinking is that this is really doing "INSTR" (in string) and not actual REGEX based on how you wrote it. So to convert the above to TSQL, something like this would work:
SELECT CASE WHEN string LIKE '%/OB/%'
OR string LIKE '%OB//%'
OR string LIKE '%T/O%'
OR string LIKE '%/ OB%'
OR string LIKE '%ON OB%'
THEN 'OB'
Now, if you are using actual REGEX (or some modified form of Regex used by teradata), you will need to tweak the above.
The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.
April 6, 2021 at 5:18 am
Thanks for your reply, Mr. Brian. I got your point. I will first check with the client to run this script on their Teradata environment and based on the result will apply to convert logic.
April 6, 2021 at 4:43 pm
You can create your own INSTR function leveraging ngrams8k.
CREATE OR ALTER FUNCTION dbo.Instr8K
(
@string VARCHAR(8000),
@search VARCHAR(100),
@instance INT
)
/*
Created by Alan Burstein, 20210406
Similar to Teradata and Oracles INSTR and REGEX_INSTR functions
*/
RETURNS TABLE AS RETURN
SELECT Position = ISNULL(MAX(f.Position),0)
FROM
(
SELECT ng.Position, ng.Token, Instance = ROW_NUMBER() OVER (ORDER BY (ng.Position))
FROM samd.ngrams8k(@string,DATALENGTH(@search)) AS ng
WHERE ng.Token = @search
) AS f
WHERE f.instance = ISNULL(@instance,1);
GO
Here's how you would find the position of the 1st, 2nd, 3rd OR 4th instance of the text "AB" inside the string "ABCXYZ123ABCAB".
DECLARE @string VARCHAR(1000) = 'ABCXYZ123ABCAB',
@search VARCHAR(100) = 'AB';
SELECT instr.Position FROM dbo.Instr8K(@string,@search,1) AS instr;
SELECT instr.Position FROM dbo.Instr8K(@string,@search,2) AS instr;
SELECT instr.Position FROM dbo.Instr8K(@string,@search,3) AS instr;
SELECT instr.Position FROM dbo.Instr8K(@string,@search,4) AS instr;
This shows that the first instance of AB is at position 1, the second at position 10, 13 for the third and 0 for the 4th instance as there are only three. Now for a simplified version of your problem which should help you get started.
DECLARE @things TABLE (String VARCHAR(1000) UNIQUE);
INSERT @things
VALUES('>>>/OB///'),(' /OBXYZ'),('OB///GYN'),(''),('(^*^)'),('555'),('{5,7}'),('{5,5}'),
('MOT/O'),('88.{5,7},{5,9}')
SELECT
t.String, p.Pattern, i.Position
FROM @things AS t
CROSS JOIN (VALUES('/OB/'),('OB//'),('/OB'),('T/O')) AS p(Pattern)
CROSS APPLY dbo.Instr8K(t.String,p.Pattern,1) AS i
WHERE i.Position > 0;
This returns:
String Pattern Position
------------ ------- ----------
/OBXYZ /OB 2
>>>/OB/// /OB/ 4
>>>/OB/// OB// 5
>>>/OB/// /OB 4
MOT/O T/O 3
OB///GYN OB// 1
For a distinct list of matches I can modify my query like so:
SELECT TOP(1) WITH TIES
t.String, p.Pattern, i.Position
FROM @things AS t
CROSS JOIN (VALUES('/OB/'),('OB//'),('/OB'),('T/O')) AS p(Pattern)
CROSS APPLY dbo.Instr8K(t.String,p.Pattern,1) AS i
WHERE i.Position > 0
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.String ORDER BY Position);
-- Itzik Ben-Gan 2001
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply