Parse SQL data

Question

Post reply

Parse SQL data

rocky_498

SSCertifiable

Points: 6532
More actions
July 18, 2022 at 9:46 pm
Go to Answer
#4065014
Below is the test data. What I am looking for is...
I want to PARSE any value starting with PD- from the description column. Below is sample data. However, it could be possible the "Description" column could have n number PD id in the column, not restricted to any numbers.
Any advice would be highly appreciated.
Here is the result that I am looking.
k_id, pd_id,description,value_of_week,
1,,No PS update required. This issue, PS328498,Discussed with the respective Division Risk Manager and product/ client impact was identified.,2
2,PD-0051,There were two large events in the month of July 2021 related to PD-0051 One event was a near miss of $250,000- A Transaction Banking client received three ACH transactions that were not authorized.,6
3,PD-01484,IS116748 - User Access Recertification - Global Trust Services (GTS) - This is a High Operations and Technology PD-01484 Project Finance Services and PD-01485 Services which are currently under review and any relevant RCSAs.PD-00148: Global Custody International Custody Services PD-00155: ,8
3,PD-01485,IS116748 - User Access Recertification - Global Trust Services (GTS) - This is a High Operations and Technology PD-01484 Project Finance Services and PD-01485 Services which are currently under review and any relevant RCSAs.PD-00148: Global Custody International Custody Services PD-00155: ,8
3,PD-00148,IS116748 - User Access Recertification - Global Trust Services (GTS) - This is a High Operations and Technology PD-01484 Project Finance Services and PD-01485 Services which are currently under review and any relevant RCSAs.PD-00148: Global Custody International Custody Services PD-00155: ,8
3,PD-00155,IS116748 - User Access Recertification - Global Trust Services (GTS) - This is a High Operations and Technology PD-01484 Project Finance Services and PD-01485 Services which are currently under review and any relevant RCSAs.PD-00148: Global Custody International Custody Services PD-00155: ,8
4,PD-00713,PD-00713 (Preferred Securities & Partnership), PD-00714 (Financial Solutions Loans): OCC MRA 2021-32 noted that ,12
4,PD-00714,PD-00713 (Preferred Securities & Partnership), PD-00714 (Financial Solutions Loans): OCC MRA 2021-32 noted that ,12
```
DECLARE @Parse_Test TABLE(k_id INT,description VARCHAR(500) , value_of_week int)


INSERT INTO @Parse_Test

VALUES

(1,'No PS update required. This issue, PS328498,Discussed with the respective Division Risk Manager and product/ client  impact was identified.',2)  ,

(2,'There were two large events in the month of July 2021 related to PD-0051 One event was a near miss of $250,000- A Transaction Banking client received three ACH transactions that were not authorized.',6) ,

(3,'IS116748 - User Access Recertification - Global Trust Services (GTS) - This is a High Operations and Technology   PD-01484 Project Finance Services and PD-01485 Services which are currently under review and any relevant RCSAs.PD-00148: Global Custody International Custody Services PD-00155: ',8) ,

(4,'PD-00713 (Preferred Securities & Partnership), PD-00714 (Financial Solutions Loans): OCC MRA 2021-32 noted that ',12)

 

SELECT * FROM @Parse_Test
```

Steve Collins

SSCrazy

Points: 2176

July 18, 2022 at 10:34 pm

Answer

#4065029

To split out the pd_id's you could try STRING_SPLIT to separate based on space delimiter. Since some 'PD-[0-9]' values are encompassed by other extra characters a conditional substring is done based on PATINDEX location. Also, I'm using Azure SQL level 150 (which is roughly equivalent to 2019) which has the extra 'enable_ordinal' parameter set to 1 which then outputs the 'ordinal' column

select pt.k_id, 
       iif(ss.[value] like 'PD-[0-9]%', ss.[value], 
           substring(ss.[value], ptndx.loc,
           len(ss.[value])-ptndx.loc)) pd_id,
       pt.[description], pt.value_of_week
from @Parse_Test pt
     outer apply string_split(pt.description, ' ', 1) ss
     outer apply (values (patindex('%PD-[0-9]%', ss.[value]))) ptndx(loc)
where ss.[value] like '%PD-[0-9]%'
order by pt.k_id, ss.ordinal;

If the order of the rows doesn't matter the ordinal column could be removed

select pt.k_id, 
       iif(ss.[value] like 'PD-[0-9]%', ss.[value], 
           substring(ss.[value], ptndx.loc,
           len(ss.[value])-ptndx.loc)) pd_id,
       pt.[description], pt.value_of_week
from @Parse_Test pt
     outer apply string_split(pt.description, ' ') ss
     outer apply (values (patindex('%PD-[0-9]%', ss.[value]))) ptndx(loc)
where ss.[value] like '%PD-[0-9]%'
order by pt.k_id;

This reply was modified 2 years, 5 months ago by Steve Collins.

Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004432 More actions · Answer 1

Steve Collins wrote:

To split out the pd_id's you could try STRING_SPLIT to separate based on space delimiter. Since some 'PD-[0-9]' values are encompassed by other extra characters a conditional substring is done based on PATINDEX location. Also, I'm using Azure SQL level 150 (which is roughly equivalent to 2019) which has the extra 'enable_ordinal' parameter set to 1 which then outputs the 'ordinal' column
If the order of the rows doesn't matter the ordinal column could be removed

The OP has posted in the 2019 forum, which doesn't have the "Ordinal Position" operand. Like it says in the MS documentation for STRING_SPLIT()...

Note
The enable_ordinal argument and ordinal output column are currently supported in Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics (serverless SQL pool only). Beginning with SQL Server 2022 (16.x) Preview, the argument and output column are available in SQL Server.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

rocky_498 SSCertifiable Points: 6532 More actions · Answer 2

Hi Steve,

It works for the sample data. I am sure it will work on the original data as well. I marked this question as answered. I appreciate your help.