May 14, 2015 at 3:07 am
Hi
I have a table that has for example data
I am looking to write a script that will change the first table into the second table.......all help really appreciated.......
Table 1
Account No Name other field
1 Mr T and Mrs M Lambert xxx
I need to rewrite this as
Table 2
Account No split Name other field
1 a Mr T Lambert xxx
1 b Mrs M Lambert xxx
May 14, 2015 at 3:40 am
this seems to be oversimplified......are your initial rows always, always formatted like you present ...or are there others?
for example
2 T & M Lambert xxx
3 Mr T, Mrs M and Miss L Lambert xxx
________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day
May 14, 2015 at 3:47 am
Sorry your correct its over simplified
The name field is a free text field and could contain things like
McDonalds
Mr and Mrs Jones
Mr Smith
Mrs Peabody
Mr L Thomas & Mrs P Smith
A company Name
Jones and Smith
I need to split out where there are 2 people into one record each
but not for a company name like Jones and Smith which would be a company.
it may be a nightmare to do this but looking for any help I can get..........
May 14, 2015 at 3:48 am
Too many unknown parameters. First, what are criteria to extract family name? Last word exactly? Lookup table of all known family names? second word after [last] 'and' plus all the following words ?
You'd better provide more verbose task definition.
Generally it should look like
SELECT [Account No], splitter.Name, other field
FROM [Table 1]
CROSS APPLY (
-- my fancy splitter text
) splitter
May 14, 2015 at 3:52 am
It is a nightmare and it's going to be near-impossible to do correctly in all cases.
What you can do is list all the valid titles that you have in the table and only split in cases where you find "And <title>" in the text and take the before and after the and, but even that's going to be wrong in some cases, for example "Mr and Dr T van Zyl" will give incorrect results if you do that.
This will probably end up being 3/4 a manual process as no matter what the script does you will have to check for and fix places the result was wrong.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 14, 2015 at 3:53 am
Sorry
probably best to split on the and or & symbol except where there is no Mr or Mrs in the field
eg if name Jones and Smith leave alone as this is company
but if Mr Jones and Mrs Smith then it would need to be split
May 14, 2015 at 4:04 am
My 2 cents worth
Firstly, as has already been pointed out this is going to be a nightmare to get any sort of accuracy - can you go back to the source and get the data in a proper format
Seccond, assuming that you can't then I would suggest that SSIS may be a better tool for this than SQL directly. Load the data into a source table and then use that as the SSIS data source and pipe data through text filters. This will be infinitely more flexible than SQL.
E.g.
If the string contains Ltd, Ltd., PLC, PLC., T/A, T/as then route the data down a pipe for company names
If the data contains MR, Mr or Mr. and it contains MRS, Mrs or Mrs. then it is two people
If the data contains ' and ' then it is two people
You can split each pipe as many time as you like and recombine them where necessary and output each pipe to a separate file - your user can then review each file for errors and where they find one present you with a new rule to filter and redirect the data.
I know its another skillset to learn, but I think you will find it to be the most flexible solution
May 14, 2015 at 4:12 am
aaron.reese (5/14/2015)
If the data contains MR, Mr or Mr. and it contains MRS, Mrs or Mrs. then it is two people
Careful, that makes large assumptions. What about
Prof and Mrs?
Mr and Dr?
or (like a couple people I know who have all sorts of problems with assumptions around titles)
Prof and Dr
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 14, 2015 at 6:20 am
@Gila,
My point exactly - As your end user finds new rules that need to be applied, it is (IMHO) easier to do that with another inline filter and merge/join tasks in SSIS than to build parser rules into SQL code.
SSIS is pretty good at this stuff because it is a row based process rather than SQL which is a set based process.
FWIW I have done this in the past with Crystal Reports but only to sanitise the data, not split it in to separate rows
May 14, 2015 at 9:00 am
I would like to thanks you and Gail for all the help. I will try SSIS to resolve my issue using your filter ideas
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply