May 2, 2014 at 2:32 am
Thanks for the great info.
I was previously using a modified version of
Jeff Moden's Tally OH! An Improved SQL 8K βCSV Splitterβ Function
to parse text files that where pipe delimited.
I really needed a way to parse proper csv files that are comma delimited with quote text qualifiers
as the data may have embedded commas.
i.e. "900 N. May ST., #5"
using the previous splitter it would get split into "900 N. May ST. and #5"
After searching the forums I found this and it works perfect but..
The production server for this project is 2008 R2.
Is there any way to replicate the Lag() and Lead() functions with 2008 equivalents?
Any help would be greatly appreciated.
October 7, 2014 at 10:22 pm
Even li'l ol' MySQL has GROUP_CONCAT().
You mean only MySQL has GROUP_CONCAT(). BOL has a SQLCLR version http://msdn.microsoft.com/en-us/library/ms131056(v=sql.105).aspx and there is an improved versoin in codeplex http://groupconcat.codeplex.com/%5B/url%5D.
September 1, 2015 at 2:59 pm
Hope it's not too late to say, "Great article Eirikur!". I just finished reading this for a second time (there is a problem that I was struggling with that your article helped me solve). I have referred many people to this article as a good example of "How to reap the benefits of Window functions".
I had one small question... I noticed you used the Latin1_General_BIN collation trick (WHERE SUBSTRING(@pString,t.N,1) COLLATE Latin1_General_BIN = @pDelimiter) in you CSV example but did not use it in dbo_DelimitedSplit8K_LEAD. Is there a specific reason that you did not use it there?
-- Itzik Ben-Gan 2001
April 22, 2016 at 7:27 am
Alan.B (9/1/2015)
Hope it's not too late to say, "Great article Eirikur!". I just finished reading this for a second time (there is a problem that I was struggling with that your article helped me solve). I have referred many people to this article as a good example of "How to reap the benefits of Window functions".I had one small question... I noticed you used the Latin1_General_BIN collation trick (WHERE SUBSTRING(@pString,t.N,1) COLLATE Latin1_General_BIN = @pDelimiter) in you CSV example but did not use it in dbo_DelimitedSplit8K_LEAD. Is there a specific reason that you did not use it there?
Not as late as my answer Alan:-D
The reason for not doing a binary collation on the first part was not changing anything within the original DelimitedSplit8K code apart from introducing the lead function in order to reflect only the changes from charindex to lead.
π
April 22, 2016 at 7:51 am
Good article, well, lengthy. There are better areas to demonstrate the benefit of window functions than parsing strings. For example, when you need to compare rows based on some sequence.
Using xml functions in SQL server is much easier to parse strings.:-)
April 22, 2016 at 8:12 am
lucyliu0301 (4/22/2016)
Good article, well, lengthy. There are better areas to demonstrate the benefit of window functions than parsing strings. For example, when you need to compare rows based on some sequence.Using xml functions in SQL server is much easier to parse strings.:-)
Using XML functions to split CSV is also much slower.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 22, 2016 at 11:31 am
Nice to see this article again. Still awesome!:-P
-- Itzik Ben-Gan 2001
April 22, 2016 at 12:16 pm
Jeff Moden (4/22/2016)
lucyliu0301 (4/22/2016)
Good article, well, lengthy. There are better areas to demonstrate the benefit of window functions than parsing strings. For example, when you need to compare rows based on some sequence.Using xml functions in SQL server is much easier to parse strings.:-)
Using XML functions to split CSV is also much slower.
Second Jeff's input here, for less than 8000/4000 characters, XML is much slower. Normally I see this assumption where no proper testing has been done, few functions have gone through as rigorous testing as the DelimitedSplit8K/4K functions, thanks to Jeff (cudos Jeff and SSC) which means that on the areas where to demonstrate the benefits of the Window functions, there are hardly any better challenges.
π
Further if one needs more than the number of elements one can fit within 8000/4000 characters string then as Jeff recently posted, "you are doing something wrong";-)
April 22, 2016 at 12:17 pm
Alan.B (4/22/2016)
Nice to see this article again. Still awesome!:-P
Thanks Alan! When are you going to do a piece about your interesting work?
π
April 23, 2016 at 3:17 pm
Eirikur Eiriksson (4/22/2016)
Alan.B (4/22/2016)
Nice to see this article again. Still awesome!:-PThanks Alan! When are you going to do a piece about your interesting work?
π
Very soon sir, I have a few things I just need to clean up a little before submitting.:-D
-- Itzik Ben-Gan 2001
April 25, 2016 at 6:43 am
Thanks for the article.
April 25, 2016 at 6:45 am
robert_verell (3/24/2014)
I like this article if anything for the West Point, MS reference.
For a moment I thought of the Academy.
March 20, 2018 at 4:12 am
Having just moved from SQL 2008, I'm finally able to use this improved version of the splitter function.
I have just one comment, the ISNULL(NULLIF(LEAD(s.N1,1,1) over (order by s.N1)-1,0)-s.N1,8000) seems to be doing a little more work than it needs to - the default value for the LEAD function could be changed to 8000, so you don't need the isnull or nullif. It must make a few microseconds of a difference in performance.
LEAD(s.N1,1,8000) over (order by s.N1)-1-s.N1
From the little testing I've done, this does seem to work.
November 2, 2018 at 7:52 am
Maybe I missed it in the article, I have read through it twice, on the CSV code why are you using the Latin1_General_BIN collation?
November 2, 2018 at 8:08 am
Lynn Pettis - Friday, November 2, 2018 7:52 AMMaybe I missed it in the article, I have read through it twice, on the CSV code why are you using the Latin1_General_BIN collation?
The reason is that it using a binary collations is more efficient than using a language specific collations.
π
Viewing 15 posts - 16 through 30 (of 57 total)
You must be logged in to reply to this topic. Login to reply