July 31, 2012 at 4:05 pm
This is an issue we had to solve for a customer a few weeks back. Imagine you had list
of 10,000 US business names stored in a free form text field in a database. There are no
descriptive attributes with the US business names indicating the type of business. A sample
set of the US business names would appear as follows:
CITGROUP
WALMART
IBM
SECU
NATIONS BANK
NBA
We have a need in our application to distinguish banks from non-banks. Thus, the issue was
that we had to determine which US business names belonged to US banks. Please describe how
would you go about approaching/solving this problem?
Note: All you have is the US business name. There is NO CHANCE of having an additional
attribute of some sort that indicates that it is a bank within the database where this information
is stored. Also we are not looking for an exact 100% solution nor are we looking for written
code, we just want to understand your thought process as to how you would go about solving
this problem.
July 31, 2012 at 4:28 pm
Well, since you can't alter the database, I don't think there is a good way to solve this issue.
Personally, I would opt for an attribute or for an additional table that lists all of the banks. Then compare the text field (in the case of the separate table listing the banks) to the banks table.
Other methods might involve a call out to fdic.gov on some sort of interface. That would also require a call out to NCUA for credit unions, imho. I don't think these are viable options in most cases because of the delay that could be caused.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
July 31, 2012 at 4:33 pm
Obtain from somewhere a list of all banks (another database, government info, etc), then use SSIS fuzzy lookup.
Without that list of all banks, that's impossible to solve automatically. How would you tell if "Mutual & Federal" is a bank? Or "ABSA"?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 1, 2012 at 2:11 pm
I would explain to the customer that if I can't create a db field to store this info, that hard-coding is a terrible solution. Not knowing how/whether the list of businesses is modified, it's not going to be maintainable if "Cash Now" becomes a bank business on the list.
If there is no ability for pushback, then, fine, I would case out matches to 'bank', 'credit union', maybe 'savings institution' and be done with it, explaining to the customer what the limitations are.
August 1, 2012 at 3:23 pm
a quick google search of "FDIC insured banks" gives me a starting point like Gail suggested:
7200+ institutions on the search page , all part of a downloadable csv file.
you'd have to get a similar list for credit unions and savings and loans, and you are back to Gail's fuzzy lookup for matching the names.
Lowell
August 1, 2012 at 3:33 pm
And if you can't put another attribute in this database, you put a table into another with a lookup table once you've done the fuzzy lookup.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 2, 2012 at 2:18 am
I used to do this kind of thing all the time in a company I worked at a few years ago.
I would query a list of all the companies in the UK (we had lots of company data - but you can easily obtain it), that were listed as a bank.
Then simply match on name - we didn't have SSIS back then and had to use third-party applications but Fuzzy Lookup would do the job I'm sure.
August 2, 2012 at 3:18 pm
GilaMonster (7/31/2012)
How would you tell if "Mutual & Federal" is a bank?
He, Mutual and Federal isn't a bank. Doesn't everyone know that? 😀
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply