August 20, 2012 at 7:19 am
Good afternoon,
With a bit of effort this morning, I've managed to get my function to split my name field. There are three tables required by the function:
Performance is not too bad but not too great. Can I have some second opinions on this please?
create table dbo.NameSplit_Title_Table (ID int not null identity(1, 1), Title varchar(50))
insert into NameSplit_Title_Table
select 'Abbot' union all
select 'Arq' union all
select 'Baroness' union all
select 'Brig' union all
select 'Brigadier' union all
select 'Canon' union all
select 'Capitaine' union all
select 'Capt' union all
select 'Captai' union all
select 'Captain' union all
select 'Chief' union all
select 'Cmdr' union all
select 'CMDT' union all
select 'Col' union all
select 'Colonel' union all
select 'Commander' union all
select 'Commodore' union all
select 'Councillor' union all
select 'Count' union all
select 'Countess' union all
select 'Cpt' union all
select 'Dame' union all
select 'De Heer' union all
select 'Deacon' union all
select 'Deaconess' union all
select 'Detect' union all
select 'Dhr' union all
select 'District Judge' union all
select 'Doctor' union all
select 'Dokter' union all
select 'Dr' union all
select 'DRS' union all
select 'F Lt' union all
select 'Father' union all
select 'Flight Lieut' union all
select 'Flt Lieutenant' union all
select 'Flt Lt' union all
select 'Frau' union all
select 'Frauline' union all
select 'Fthr' union all
select 'Gen' union all
select 'General' union all
select 'Group Capt' union all
select 'Group Captain' union all
select 'Heer' union all
select 'Herr' union all
select 'Hon' union all
select 'Hr' union all
select 'Ing' union all
select 'Insp' union all
select 'Inspec' union all
select 'Inspector' union all
select 'Ir' union all
select 'Lady' union all
select 'Lcol' union all
select 'Lieut Col' union all
select 'Lieut Colonel' union all
select 'Lieutenant' union all
select 'Lord' union all
select 'Lt' union all
select 'Lt C' union all
select 'Lt Cmdr' union all
select 'Lt Cmr' union all
select 'Lt Col' union all
select 'Ltcdr' union all
select 'Madam' union all
select 'Madame' union all
select 'Maj' union all
select 'Maj Gen' union all
select 'Major' union all
select 'Major General' union all
select 'Marquis' union all
select 'Master' union all
select 'Meister' union all
select 'Mej' union all
select 'Mevrouw' union all
select 'Miss' union all
select 'Mjr' union all
select 'Mlle' union all
select 'Mme' union all
select 'Mon' union all
select 'Monsieur' union all
select 'Mother' union all
select 'Mr' union all
select 'Mrs' union all
select 'Ms' union all
select 'Mstr' union all
select 'Mthr' union all
select 'Pastoor' union all
select 'Pastor' union all
select 'Prof' union all
select 'Professor' union all
select 'Rev' union all
select 'Rev Canon' union all
select 'Rev''d' union all
select 'Revd' union all
select 'Reverend' union all
select 'Señorita' union all
select 'Sgt' union all
select 'Sig' union all
select 'Sir' union all
select 'Sis' union all
select 'Sister' union all
select 'Snr' union all
select 'Sqn' union all
select 'Sqn Ldr' union all
select 'Squadron Leader' union all
select 'Sr' union all
select 'Sra' union all
select 'Str' union all
select 'The Honourable' union all
select 'The Very Reverand' union all
select 'Very Rev' union all
select 'Wg Cdr' union all
select 'Wing Cdr' union all
select 'Wing Comdr' union all
select 'Wing Commander'
create table dbo.NameSplit_Prefix_Table (ID int not null identity(1, 1), Prefix varchar(50))
insert into dbo.NameSplit_Prefix_Table
select 'Du' union all
select 'De' union all
select 'Von' union all
select 'Van' union all
select 'Da' union all
select 'Le' union all
select 'La' union all
select 'St' union all
select 'El' union all
select 'Dos'
create table dbo.NameSplit_suffix_table (ID int not null identity(1, 1), Suffix varchar(50))
insert into NameSplit_suffix_table
select 'Jr' union all
select 'Phd' union all
select 'Msc' union all
select 'Sr' union all
select 'Hons' union all
select 'Jnr' union all
select 'Snr' union all
select 'Obe' union all
select 'Cbe' union all
select 'Mbe' union all
select 'Ba' union all
select 'Bsc' union all
select 'II' union all
select 'III' union all
select 'MCIOB' union all
select 'FCIOB' union all
select 'FRICS' union all
select 'MRICS' union all
select 'Esq'
The function:
CREATE FUNCTION [dbo].[fn_NameSplit_Step1] (@name varchar(100))
RETURNS varchar(200) AS
BEGIN
declare @salutation varchar(100)
declare @forename varchar(100)
declare @initials varchar(100)
declare @surname varchar(100)
declare @suffix varchar(100)
declare @myword varchar(100)
declare @icount integer
declare @icountprevious integer
declare @namelen integer
declare @returnval varchar(100)
if @name is not null
begin
set @icount = 1
set @icountprevious = 1
set @myword = null
set @salutation = ''
set @forename = ''
set @initials = ''
set @surname = ''
set @Suffix = ''
-- Search suffix and remove from @name
select @myword = Suffix from NameSplit_suffix_table where patindex('% ' + Suffix ,@name)<> 0 or @name = Suffix
while @@rowcount <> 0
begin
set @suffix = ltrim(substring(@name, len(@name) - len(@myword), len(@myword)+1) + ' ' + @suffix)
set @name = rtrim(substring(@name, 1, len(@name) - len(@myword)))
select @myword = Suffix from NameSplit_suffix_table where patindex('% ' + Suffix ,@name)<> 0 or @name = Suffix
end
-- Search Salutation and remove from @name
select @myword = Title from NameSplit_Title_Table where patindex(Title + ' %',@name)<> 0 or @name = Title
while @@rowcount <> 0
begin
set @salutation = ltrim(@salutation + ' ' + substring(@name, 1, len(@myword)))
set @name = ltrim(substring(@name, len(@myword)+1 ,len(@name) - len(@myword)))
select @myword = Title from NameSplit_Title_Table where patindex(Title + ' %',@name)<> 0 or @name = Title
end
-- Split @name into words
set @namelen = len(@name)
while @icount < @namelen
begin
if substring(@name, @icount,1) = ' '
begin
set @myword = rtrim(ltrim(substring(@name, @icountprevious, @icount - @icountprevious)))
-- if word is not surname prefix, then
-- add to @forename with first letter added to @initials
if not exists (select * from NameSplit_Prefix_Table where Prefix = @myword)
begin
set @forename = ltrim(@forename + ' ' + @myword)
set @initials = ltrim(@initials + ' ' + left(@myword,1))
set @icountprevious = @icount
end
-- else exit loop
else break
end
set @icount = @icount + 1
end
-- Add remaining to Surname
if @surname ='' set @surname = ltrim(substring(@name, @icountprevious ,@namelen - @icountprevious +1))
end
set @returnval = '|1'+ @salutation + '|2'+ @forename + '|3'+ @initials + '|4'+ @surname + '|5'+ @suffix
return @returnval
END
Test case:
select dbo.[fn_NameSplit_Step1] ('Professor Ludvic Van Fritz')
--|1Professor|2Ludvic|3L|4Van Fritz|5
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
August 20, 2012 at 7:40 am
I have to ask why you have "Señorita" but not "Señor", and a few others like that. Or did I just miss them in the list?
As far as the code goes, you might be better off moving this kind of thing into a CLR object. Performance will probably be better there, as T-SQL is poor at both string functions and loops, while CLR is good at both of those.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 20, 2012 at 7:45 am
Thanks for the feedback GSquared.
I started off with a small list for my titles but then I added some more from a list of titles I found online 😀
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
August 20, 2012 at 8:27 am
As a proof-of-concept exercise, I built a "set-based" version of your UDF. It doesn't use any explicit loops, and so on. I tested the two side-by-side, and the one you have is about 4X faster than the set-based version. Because of the complexity of the rules here, it's massively over-complex, including multiple CTEs, some Outer Apply functions, and so on. Just a mess.
So CLR is almost certainly your best bet, as expected.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
August 20, 2012 at 8:32 am
Thanks CSGuard. I will try and create a CLR implementation if time permits and will report back.
---------------------------------------------------------
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply