February 9, 2014 at 4:29 pm
I need my column called 'nameInternal' to derive from another column called 'nameFormal'. If the latter is "Oracle Corporation", the 'nameInternal' column would compute to 'Oracle'. In pseudo code, this might look like:
CREATE TABLE Company
nameInternal AS
CASE WHEN nameFormal LIKE ('%[corporation]'
THEN nameInternal ('%[^corporation])'--need to match chars except for 'corporation'
END
I've searched pattern matching and CASE but can't deduce this particular task. I'm running 2008R2 on Windows7.
February 9, 2014 at 10:18 pm
It sounds like you are looking at a Jaro-Winkler type of requirement.
Here is an article on that
http://www.sqlservercentral.com/articles/Fuzzy+Match/65702/
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 9, 2014 at 11:25 pm
Hi Jason: Thanks for your reply. I don't think the Jaro-Winkler approach is quite what I'm looking for. Let me explain further: Both the nameFormal and the nameInternal attributes will be used in different ways; they're not fuzzy duplicates or mistakes.
I'm using LIKE to detect a "formal" name like 'Oracle Corporation'. And I'd like to use pattern matching to strip off certain "fluff" words such as corporation, corp, inc, ltd, and so on. In the example below, I'd like to match all the chars in 'Oracle Corporation' except for 'corporation'. Pattern matching is obviously easy with LIKE but I don't know how to use it to strip off certain chars.
CREATE TABLE Company
nameInternal AS
CASE WHEN nameFormal LIKE ('%[corporation]'
THEN nameInternal ('%[^corporation])'--need to match all chars except for 'corporation'
END
February 10, 2014 at 12:21 am
DECLARE @tbl TABLE
(
ID INT IDENTITY(1,1),
Name VARCHAR(100),
Name1 AS CASE WHEN Name LIKE '%corporation' THEN SUBSTRING(Name,1,LEN(Name)-11) ELSE Name END
)
INSERT INTO @tbl(Name)
SELECT 'Oracle Corporation'
UNION
SELECT 'Union Corporation'
UNION
SELECT 'Union Corporate'
select * from @tbl
Regards,
Mitesh Oswal
+918698619998
Regards,
Mitesh OSwal
+918698619998
February 10, 2014 at 9:34 am
timSF (2/9/2014)
Hi Jason: Thanks for your reply. I don't think the Jaro-Winkler approach is quite what I'm looking for. Let me explain further: Both the nameFormal and the nameInternal attributes will be used in different ways; they're not fuzzy duplicates or mistakes.I'm using LIKE to detect a "formal" name like 'Oracle Corporation'. And I'd like to use pattern matching to strip off certain "fluff" words such as corporation, corp, inc, ltd, and so on. In the example below, I'd like to match all the chars in 'Oracle Corporation' except for 'corporation'. Pattern matching is obviously easy with LIKE but I don't know how to use it to strip off certain chars.
CREATE TABLE Company
nameInternal AS
CASE WHEN nameFormal LIKE ('%[corporation]'
THEN nameInternal ('%[^corporation])'--need to match all chars except for 'corporation'
END
On a previous project, we did this very thing and used the Jaro-Winkler method to get to what you are seeking. We also built a "terms" table where we could strip off those things like corporation, llc, etc etc etc.
You could also try to build something using patindex to strip off those terms that should not be returned.
Using patindex you could search for the starting point of these terms and return everything outside of that pattern (left or right).
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
February 10, 2014 at 9:35 am
Thanks Mitesh. Using Substring makes perfect sense.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply