September 5, 2017 at 3:46 am
Hi All,
I have been asked to clean and move data from a column called [worktitle] to another column called [supplementaryInformation] when the data within the brackets matches a value stored matrix\lookup table.
e.g [worktitle] column = 'some random (acoustic) texts coconuts'
now move the matched lookup text with int the brackets (acoustic) to column [supplementaryInformation], then finally remove the brackets from [worktitle] and [supplementaryinformation]
The catch is the data within the [worktitle] column does not have consistent opening and closing brackets its in quite a bad state, e.g 'some text ) here (version) but ignore the other' or 'some other (( random (acoustic)stuff in this column'.
Lookup Codes Sample (file attached, lookup codes)
ID Text
GW, ACOUSTIC
GW, VERSION
etc.....
Current Sample Rows (file attached, sample data 3)
WorkIdentifier WorkTitle
GW37327314 NAUGHTY BITS (2008 VERSION)
GW37327317 UNTRODDEN PATHS (WOLVES PT. II)
GW37327602 ) RAW RAP FEAT. ICON THE MIC KING, (VERSION) COLDFRONT, DA RISIN SUN
GW37327614 THROWN (AZ HEAT) FEAT. JAMES CIPHURPHACE, COPYWRITE, BIG MER
GW37327643 SET A PRISONER FREE (ACOUSTIC)
GW37327669 BRAIN DEATH (AH CAMA SOTZ REMIX)
GW37327670 GERM CODE” (DISHARMONY REMIX)
Required Rows
WorkIdentifier WorkTitle SupplementaryInformation
GW37327314 NAUGHTY BITS VERSION
GW37327317 UNTRODDEN PATHS (WOLVES PT. II)
GW37327602 RAW RAP FEAT. ICON THE MIC KING, COLDFRONT, DA RISIN SUN VERSION
GW37327643 SET A PRISONER FREE ACOUSTIC
In my SQL I have created a temp table to store the [workidentifier], [worktitle] and [supplementary information] then a variable to hold the lookup value.
My code works for 90% of rows but when the [worktitle] column has extra brackets these rows become a problem. I use a wildcard in the join against the lookup VAL (value) to match it against the string in [worktitle]
IF OBJECT_ID('tempdb..#tempWorkTitle3') IS NOT NULL
DROP TABLE #tempWorkTitle3
DECLARE @Information3 VARCHAR(MAX)
SELECT WorkIdentifier,WorkTitle,SupplementaryInformation = @Information3
INTO #tempWorkTitle3
FROM Work_Test
WHERE WorkTitle like '%(%' and (WorkIdentifier Like 'GW%' )
CREATE CLUSTERED INDEX IDX_Title3_Indt ON #tempWorkTitle3(WorkIdentifier)
DECLARE @textsvalueGW TABLE(catgeory VARCHAR(2),val VARCHAR(100),SupplementaryInformation VARCHAR(100))
Insert into @textsvalueGW SELECT category,texts, '' from .[BracketedPhraseBGGW_Matrix] WHERE Category in ('GW') order by texts
--Remove leading single bracket
UPDATE #tempWorkTitle3
SET WorkTitle = ( CASE WHEN CHARINDEX(')',t.WorkTitle)-CHARINDEX('(',t.WorkTitle) < 0
THEN STUFF(t.WorkTitle,PATINDEX('%)%',t.worktitle),LEN(t.worktitle),SUBSTRING(t.worktitle,PATINDEX('%)%',t.worktitle)+1,LEN(t.worktitle))) END)
FROM Work_Test t
JOIN #tempWorkTitle3 w on
t.WorkIdentifier = w.WorkIdentifier
WHERE (CHARINDEX(')',t.WorkTitle)-CHARINDEX('(',t.WorkTitle) < 0 and t.WorkIdentifier like 'Gw%' and t.WorkTitle like '%(%' AND t.WorkTitle like '%)%')
--UPDATE work title for double brackets
UPDATE #tempWorkTitle3
SET SupplementaryInformation = SUBSTRING(t.WorkTitle,CHARINDEX('(',t.WorkTitle),CHARINDEX(')',t.WorkTitle)-CHARINDEX('(',t.WorkTitle))
FROM #tempWorkTitle3 t
JOIN @textsvalueGW
ON RTRIM(LTRIM(SUBSTRING(WorkTitle,CHARINDEX('(',WorkTitle)+1,CHARINDEX(')',t.WorkTitle)))) LIKE '%' + RTRIM(LTRIM(val)) + '%'
WHERE t.workTitle like '%(%' and t.workTitle like '%)%' UPDATE Work_Test
SET SupplementaryInformation = t.SupplementaryInformation
FROM #tempWorkTitle3 t
JOIN Work_Test w
ON t.WorkIdentifier = w.WorkIdentifier
WHERE w.workTitle like '%(%' and w.workTitle like '%)%'
UPDATE #tempWorkTitle3
SET WorkTitle = STUFF(t.WorkTitle,CHARINDEX('(',t.WorkTitle),CHARINDEX(')',t.WorkTitle),'')
FROM #tempWorkTitle3 t
WHERE t.workTitle like '%(%' and t.workTitle like '%)%'
UPDATE Work_Test
SET WorkTitle = t.WorkTitle
FROM #tempWorkTitle3 t
JOIN CnP.Work_Test w
ON t.WorkIdentifier = w.WorkIdentifier
WHERE w.workTitle like '%(%' and w.workTitle like '%)%'
--UPDATE work title for single brackets
UPDATE #tempWorkTitle3
SET WorkTitle = STUFF(WorkTitle,CHARINDEX('(',WorkTitle),LEN(WorkTitle),'')
FROM #tempWorkTitle3
JOIN @textsvalueGW
ON RTRIM(LTRIM(SUBSTRING(WorkTitle,CHARINDEX('(',WorkTitle)+1,LEN(WorkTitle)))) LIKE RTRIM(LTRIM(val)) + '%'
WHERE workTitle like '%(%'
UPDATE Work_Test
SET WorkTitle = t.WorkTitle
FROM #tempWorkTitle3 t
JOIN Work_Test w
ON t.WorkIdentifier = w.WorkIdentifier
WHERE w.workTitle like '%(%'
UPDATE #tempWorkTitle3
SET WorkTitle = STUFF(t.WorkTitle,CHARINDEX('(',t.WorkTitle),LEN(t.WorkTitle),'')
FROM #tempWorkTitle3 t
WHERE workTitle like '%(%'
UPDATE Work_Test
SET WorkTitle = t.WorkTitle
FROM #tempWorkTitle3 t
JOIN Work_Test w
ON t.WorkIdentifier = w.WorkIdentifier
WHERE w.workTitle like '%(%'
I have been looking at this for while so any assistance would be much appreciated.
Thanks
September 6, 2017 at 6:48 am
The first thing I have done is find the rows where there are multiple brackets, so there is a odd number.
SELECt WorkIdentifier,WorkTitle
FROM cnp.work_test
WHERE WorkIdentifier like 'GW%'and (workTitle like '%(%' or workTitle like '%)%')
and (len(WorkTitle) - len(replace(WorkTitle, '(', '')))!=(len(WorkTitle) - len(replace(WorkTitle, ')', '')))
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply