How To Extract Values from the string that is key value pairs with different formats
I have a table called DataForce. with column DataVal (Nvarchar 2000).
it has below 3 records
How to get CID Value, ID Value, Email Value, Zip from below strings please advise?
{'ani':'80052242','dnis':'90425935','cid':'4601923812','Id':'046067ce-7fec-4af7-bb51-e4d31b4720','No':'3048','FN':'Bod Conner','Ent':'2012-08-27','Email':'sounage@nefcom.net','Add':'P.o. Box 146','Zip':'31040','Not':'English','Off':'FL','ACode':'','OffId':'0002 Home Office FL','Except':''},
{"ani":"30926539","dnis":"33300180","cid":"4594296356","Id":"a2a5bac2-b6ad-4239-95b2-2da730c2b6","No":"93872","FN":"Martinael u0026 Kar Schmale","Ent":"2023-07-08","Email":"martinhael.d.schmale@gmail.com","Add":"552 WPlainview cir","Zip":"13498","Not":"","Off":"FL","ACode":"1|a2a5bac2-b6ad-4239-95b2-2da780c2b6|0061FL|V|000000000000000000|00000000","OffId":"0061 Rand FL","Except":""},
{ani:63179364,dnis:88825454,cid:4563900520,Id:c1144de0-1945-4862-b057-fe25eabe73,No:97158,FN:Bombay and Marsha Osrow,Ent:2023-11-03,Email:bombayrow7634@aol.com,Add:10014 SW Vitalia Court,Zip:14987,Not:English,Off:FL,ACode:,OffId:0021 Zanfini FL,Except:},
Thank you in advance
asita
January 25, 2024 at 12:24 am
Tried this below code but no luck 🙁
Can somebody please help. Thank you in advance
SELECT
Dataval,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","No"', SUBSTRING(Dataval, CHARINDEX('"Id":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',No', SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval))) > 0 THEN CHARINDEX(',No:', SUBSTRING(Dataval, CHARINDEX(',Id:', Dataval) + 5, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS COSTID,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) > 0 THEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) > 0 THEN CHARINDEX('","Except"', SUBSTRING(Dataval, CHARINDEX('"OffId":"', Dataval) + 9, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Except', SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval))) > 0 THEN CHARINDEX(',Except', SUBSTRING(Dataval, CHARINDEX(',OffId:', Dataval) + 7, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS Branch,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX('","Add"', SUBSTRING(Dataval, CHARINDEX('"Email":"', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Add:', SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval))) > 0 THEN CHARINDEX(',Add:', SUBSTRING(Dataval, CHARINDEX(',Email:', Dataval) + 8, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS ContactEmail,
CASE
WHEN LEN(SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)) > 70
THEN SUBSTRING(SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval)), 1, CASE WHEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) > 0 THEN CHARINDEX('","Not"', SUBSTRING(Dataval, CHARINDEX('"Zip":"', Dataval) + 6, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
ELSE SUBSTRING(SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval)), 1, CASE WHEN CHARINDEX(',Not', SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval))) > 0 THEN CHARINDEX(',Not', SUBSTRING(Dataval, CHARINDEX(',Zip:', Dataval) + 5, LEN(Dataval))) - 1 ELSE LEN(Dataval) END)
END AS PostalCode
FROM DataForce c
;
January 25, 2024 at 9:17 am
I've done the first few for you. You should be able to add the others using similar logic.
DROP TABLE IF EXISTS #JSONish;
CREATE TABLE #JSONish
(
Id INT IDENTITY(1, 1)
,Dataval NVARCHAR(MAX) NOT NULL
);
INSERT #JSONish
(
Dataval
)
VALUES
(N'{''ani'':''80052242'',''dnis'':''90425935'',''cid'':''4601923812'',''Id'':''046067ce-7fec-4af7-bb51-e4d31b4720'',''No'':''3048'',''FN'':''Bod Conner'',''Ent'':''2012-08-27'',''Email'':''sounage@nefcom.net'',''Add'':''P.o. Box 146'',''Zip'':''31040'',''Not'':''English'',''Off'':''FL'',''ACode'':'''',''OffId'':''0002 Home Office FL'',''Except'':''''}, ' -- Dataval - nvarchar(max)
)
,(N'{"ani":"30926539","dnis":"33300180","cid":"4594296356","Id":"a2a5bac2-b6ad-4239-95b2-2da730c2b6","No":"93872","FN":"Martinael u0026 Kar Schmale","Ent":"2023-07-08","Email":"martinhael.d.schmale@gmail.com","Add":"552 WPlainview cir","Zip":"13498","Not":"","Off":"FL","ACode":"1|a2a5bac2-b6ad-4239-95b2-2da780c2b6|0061FL|V|000000000000000000|00000000","OffId":"0061 Rand FL","Except":""},')
,(N'{ani:63179364,dnis:88825454,cid:4563900520,Id:c1144de0-1945-4862-b057-fe25eabe73,No:97158,FN:Bombay and Marsha Osrow,Ent:2023-11-03,Email:bombayrow7634@aol.com,Add:10014 SW Vitalia Court,Zip:14987,Not:English,Off:FL,ACode:,OffId:0021 Zanfini FL,Except:},');
SELECT jn.Id
,cid = MAX (REPLACE (IIF(c2.ItemNumber = 3, c2.Item, ''), 'cid:', ''))
,Id = MAX (REPLACE (IIF(c2.ItemNumber = 4, c2.Item, ''), 'Id:', ''))
,Email = MAX (REPLACE (IIF(c2.ItemNumber = 8, c2.Item, ''), 'Email:', ''))
FROM #JSONish jn
CROSS APPLY
(SELECT Cleaned = REPLACE (REPLACE (jn.Dataval, '''', ''), '"', '')) c1
CROSS APPLY
(SELECT ss.* FROM dbo.DelimitedSplit8K (c1.Cleaned, ',') ss ) c2
GROUP BY jn.Id
ORDER BY jn.Id;
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 25, 2024 at 9:20 am
Obviously, this relies on Jeff Moden's splitter, but as you're on 2022, you can also use the native string_split() function.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
Here is the STRING_SPLIT version
SELECT
jn.Id
, cid = MAX (REPLACE (IIF(c2.ordinal = 3, c2.value, ''), 'cid:', ''))
, Id = MAX (REPLACE (IIF(c2.ordinal = 4, c2.value, ''), 'Id:', ''))
, Email = MAX (REPLACE (IIF(c2.ordinal = 8, c2.value, ''), 'Email:', ''))
FROM #JSONish jn
CROSS APPLY
(SELECT Cleaned = REPLACE (REPLACE (jn.Dataval, '''', ''), '"', '')) c1
CROSS APPLY
(SELECT ss.* FROM STRING_SPLIT(c1.Cleaned, ',', 1) ss) c2
GROUP BY jn.Id
ORDER BY jn.Id;
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 25, 2024 at 4:29 pm
Thank you very much Phil. I am testing some more code for additions to my tables. will keep you post.
Thank you a Ton Phil.
January 26, 2024 at 7:49 pm
Good Morning Phil,
I have few issues I am encountering that is just my wrong data your solution is great.
Can I ask you a quick question, is there anyway can we just get all Values as column values and keys as headings of column names?
Please
I grateful to you, Phil
Thank you in advance
asita
January 26, 2024 at 9:05 pm
If you are hoping for a dynamic solution which does not require you to type in the names of each key, that takes quite a chunk of coding and requires the use of dynamic SQL.
Alternatively, most reporting tools will do it for you. I was able to put the output of the inner query below (ie, the 'dta' CTE), and use PowerQuery to do all of the pivoting.
But if you are happy to type in all of the key names, you can use a variation of my first query
WITH dta
AS (SELECT jn.Id
,Ky = LEFT(c2.Item, CHARINDEX (':', c2.Item) - 1)
,Val = TRIM (RIGHT(c2.Item, LEN (c2.Item) - CHARINDEX (':', c2.Item)))
FROM #JSONish jn
CROSS APPLY
(
SELECT Cleaned = REPLACE (REPLACE (REPLACE (REPLACE (jn.Dataval, '''', ''), '"', ''), '{', ''), '}', '')
) c1
CROSS APPLY
(SELECT ss.* FROM dbo.DelimitedSplit8K (c1.Cleaned, ',') ss ) c2
WHERE LEN (c2.Item) > 0)
SELECT d.Id
,cid = MAX (IIF(d.Ky = 'cid', d.Val, ''))
,id = MAX (IIF(d.Ky = 'id', d.Val, ''))
,Email = MAX (IIF(d.Ky = 'Email', d.Val, ''))
FROM dta d
GROUP BY d.Id
ORDER BY d.Id;
And here is the PQ I came up with:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Ky", type text}, {"Val", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Id", "Id1"}}),
#"Pivoted Column" = Table.Pivot(#"Renamed Columns", List.Distinct(#"Renamed Columns"[Ky]), "Ky", "Val")
in
#"Pivoted Column"
Which gives this
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 29, 2024 at 11:03 am
This was removed by the editor as SPAM
January 29, 2024 at 2:58 pm
Thank you much Phil, I tweaked little hear and there your first code is working great. may I know what is this 8k 4k apart from String Split function? please
I owe you drink 🙂
God Bless you my friend.
Thanks Again
Best regards
January 29, 2024 at 3:38 pm
DelimitedSplit8k is a highly optimised string splitter that was written in the days when there was nothing comparable built in to T-SQL. It has since become the de facto standard splitter for many people. For SQL Server 2022 and above, STRING_SPLIT() may be used instead.
You can find the code and supporting article here.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 29, 2024 at 5:36 pm
Got it Great! Thank you phil, one last question for this one
so here the value (example below record) "WCo:Underwriters at Lloyds, London" it is splitting the Underwriters at Lloyds , and london is going to another column value.
is there any fix that we try to avoid this in between spring splitting befroe next tag (:) just checking if there is any
{cd:430009532,PolId:c2e336-6448-4c9a-ac56-4f909cb38,PolNo:TFU-U,PolEff:2002-08-01,polExp:2020-08-01,stat:Active,LOB:Installation/Builder Risk (P),Prod:caiJei bigwood,PCo:comp Underwriters Inc,WCo:Underwriters at Lloyds, London, TOB:P,PGroup:Florida,ClaimPNum:(800) 334-1234,BMethod:Direct Bill},
SELECT
jn.Id
, cid = MAX (REPLACE (IIF(c2.ordinal = 3, c2.value, ''), 'cid:', ''))
, Id = MAX (REPLACE (IIF(c2.ordinal = 4, c2.value, ''), 'Id:', ''))
, Email = MAX (REPLACE (IIF(c2.ordinal = 8, c2.value, ''), 'Email:', ''))
FROM #JSONish jn
CROSS APPLY
(SELECT Cleaned = REPLACE (REPLACE (jn.Dataval, '''', ''), '"', '')) c1
CROSS APPLY
(SELECT ss.* FROM STRING_SPLIT(c1.Cleaned, ',', 1) ss) c2
GROUP BY jn.Id
ORDER BY jn.Id;
this is the code I am using just fyi. Thank you again.
January 29, 2024 at 5:48 pm
Dear phil,
I am thinking first replace comma with space or empty in the WCO string, then apply the logic
Thank you
Z Sita
January 29, 2024 at 5:49 pm
Do you have any control over the format of the input file? If you could get them to use a delimiter which will not appear elsewhere in the text (| is a common one, I've also used ~ in the past), this problem would go away.
Otherwise, you have a problem, because (as per your initial post) some, but not all, of your data is 'text qualified'.
I can see the following solutions:
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 29, 2024 at 6:57 pm
Thank you Phil.
There is no control in the history data which is what I am working on 🙁
So here what I am thinking is Replace commas in the WCO: value if any before we string split. then your code works and proved already
you think can we do any way?
data_val2 =
STUFF(
data_val,
CHARINDEX('WCo:', data_val),
CHARINDEX(',', data_val, CHARINDEX('WCo:', data_val)) - CHARINDEX('WCo:', data_val) + 1,
REPLACE(
SUBSTRING(
data_value,
CHARINDEX('WCo:', data_val),
CHARINDEX(',', data_value, CHARINDEX('WCo:', data_val) + LEN('WCo:')) - CHARINDEX('WCo:', data_val)
),
',', ' '
)
)
from #my_temp
WHERE CHARINDEX('WCo:', data_val) > 0
AND CHARINDEX(',', data_val, CHARINDEX('WCO:', data_val)) > 0;
but no luck yet.
Thanks Again
Zond
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply