July 3, 2013 at 11:55 am
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.
Table1
AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems
Table 2
AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems
Please let me know how to get rid of this task.
thanks
July 3, 2013 at 12:12 pm
you are going to have to cleanup the data, it looks to me;
in your example, using charindex, you only get a single match.
unless you can guarantee the three letter abbreviations won't return false matches ("COR" would find "Google Corporation", for example), i don't see anything to reliably join the data.
/*
AppIDAppnameAppIDAppname
NullCSP2Credit Servicing Portal(CSP)
*/
;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
INNER JOIN Table2
ON CHARINDEX(Table1.Appname,Table2.Appname) > 0
Lowell
July 3, 2013 at 12:16 pm
Hi Lowell,
Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it
Please advise me
Thanks
July 3, 2013 at 1:24 pm
DBA12345 (7/3/2013)
Hi Lowell,Thanks for your pormpt reply. But i do have 1000 records to macth in both tables. So, we have to put all the thousand values in the way you mentioned?? or do we have any other way to do it
Please advise me
Thanks
well the obvious fix is hard work. you've inherited an app that didn't normalize these fields, and now you need to sanitize and normalize it after the fact...it's not fun.
1. you have to create a master table for the possible values of AppName, with an identity primary key/unique constraint on Appname., making sure to add an acronym for versatility matching.
my wild guess:
CREATE TABLE [dbo].[APPNAMELOOKUP] (
[APPNAMEID] INT IDENTITY(1,1) NOT NULL,
APPACRONYM VARCHAR(3),
[APPNAME] VARCHAR(100) NOT NULL,
CONSTRAINT [PK__AppNameLookup__11C2C212] PRIMARY KEY CLUSTERED (AppNameID),
CONSTRAINT [UQ__AppNameLookup__12B6E64B] UNIQUE NONCLUSTERED (AppName))
2. you have to go through all the descriptions and select only ONE per "real" appname.
3.Whatever application that allows users to freetext an appName must now be modified to select the value from a drop down menu instead.
4. Your existing tables now need new column , which now point to the new table as a foreign key, instead of having descriptions
5. that means a migration by analysis to update the new column to the "correct" value for the Designated Appname. This would probably be a series of scripts, That you custom write yourself, that updates based on comparing charindex like above, then trying for partial matches, and then by 3 letter acronyms, and then finally, for any that did not get caught in the above scripts, a stack of manually assigning records based on eyeball analysis.
Lowell
July 3, 2013 at 2:17 pm
I'm with Lowell on cleaning up the data.
However, this might help you, but you still have to review the matches given.
;With Table1([AppID],[Appname])
AS
(
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null','CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
),Table2([AppID],[Appname])
AS
(
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT *
FROM Table1
JOIN Table2 ON Table1.Appname LIKE SUBSTRING( Table2.Appname, CHARINDEX( '(',Table2.Appname) + 1, CHARINDEX( ')',Table2.Appname) - CHARINDEX( '(',Table2.Appname) - 1) + '%'
July 3, 2013 at 4:20 pm
DBA12345 (7/3/2013)
I need update the AppID in table1 by comparing Table2. the data looks like below. Here the problem is AppNames in the both tables are not exactly same but technically they are same with partial match.Table1
AppID||||Appname
Null ||||CSO Services Operations
Null |||| CSP
Null ||||AAC Claims Systems
Table 2
AppID|||||Appname
1 |||||Corporate Services Operations(CSO)
2 |||||Credit Servicing Portal(CSP)
3 |||||American Assuarance Company (AAC) Claims Systems
Please let me know how to get rid of this task.
thanks
Are all of the AppNames in Table1 absolutely guaranteed to contain the abbreviation contained in the parenthesis of the AppNames in Table2 and are all the AppNames in Table2 absolutely guaranteed to contain the abbreviation in parenthesis?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 3, 2013 at 7:17 pm
Lot's of assumptions here including all of Jeff's, Lowell's that you don't have an abbreviation like COR that matches to Corporation and that also there are not extra parens in Table2, but here's a shot at it:
WITH Table1([AppID],[Appname]) AS (
SELECT 'Null','CSO Services Operations' UNION ALL
SELECT 'Null',' CSP' UNION ALL
SELECT 'Null','AAC Claims Systems'
)
,Table2([AppID],[Appname]) AS (
SELECT '1','Corporate Services Operations(CSO)' UNION ALL
SELECT '2','Credit Servicing Portal(CSP)' UNION ALL
SELECT '3','American Assuarance Company (AAC) Claims Systems'
)
SELECT a.*, e.*
FROM Table2 a
CROSS APPLY dbo.PatternSplitCM(a.Appname, '[()A-Za-z]') b
CROSS APPLY (SELECT RIGHT(Item, LEN(Item)-CHARINDEX('(', Item))) c(Item2)
CROSS APPLY (SELECT '%' + STUFF(Item2, LEN(Item2), 1, '') + '%') d(Item3)
INNER JOIN Table1 e ON e.AppName LIKE Item3
WHERE [Matched]=1 AND CHARINDEX('(', Item) > 0 AND CHARINDEX(')', Item) > 0
PatternSplitCM can be found in the 4th article in my signature links.
I wouldn't want to be responsible for this running in Prod though without a whole lot of real thorough testing through all the possible cases. Still, it might be possible to make it work if you could apply additional assumptions like that the abbreviation always starts in the first non-blank character of Table1.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply