May 16, 2016 at 1:53 pm
Comments posted to this topic are about the item New T-SQL Fixed Width "Splitter"
-- Itzik Ben-Gan 2001
June 20, 2016 at 11:17 am
Hey there. Thanks for the shout-out on SQL#[/url] :-). I like the T-SQL approach and have a few notes (just notes, please don't take any of the following as criticism):
Take care,
Solomon...
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
June 21, 2016 at 1:50 pm
Solomon Rutzky (6/20/2016)
Hey there. Thanks for the shout-out on SQL#[/url] :-). I like the T-SQL approach and have a few notes (just notes, please don't take any of the following as criticism):
- For the version that accepts VARCHAR(MAX), why not name it NSplitMax?
- Regarding VARCHAR, why not do NVARCHAR so that it can work in more situations? Or I suppose it would be easy enough to create NVARCHAR variations of these two functions, using DATALENGTH / 2
- Why are the @TokenLen and @DelimLen parameters BIGINT? The upper-limit of bytes for all 3 MAX types is the same as the upper-limit of the INT datatype (i.e. 2,147,483,647). See MSDN page for int, bigint, smallint, and tinyint.
Take care,
Solomon...
Thanks for chiming in!
NSplitMax would actually be better; I wish I thought of that. I used to name the [n]varchar(max) version of these types of functions <function>VCMax and thought 2B was simpler. I think <function>MAX is actually a better and may just change it.
NVarchar is a little slower - I leave that up to people that need an NVarchar version.
I go with bigint because getnumsAB (the tally table iTVF that the function uses) takes bigint and I wanted to avoid an implicit conversion. To be honest, I have not tested the performance difference between int+implicit conversion vs bigint+no implicit conversion.
-- Itzik Ben-Gan 2001
June 21, 2016 at 2:01 pm
Alan.B (6/21/2016)
NVarchar is a little slower - I leave that up to people that need an NVarchar version.
Makes sense. I was just curious.
I go with bigint because getnumsAB (the tally table iTVF that the function uses) takes bigint and I wanted to avoid an implicit conversion. To be honest, I have not tested the performance difference between int+implicit conversion vs bigint+no implicit conversion.
This might be a 6 of one / half a baker's dozen of the other π type situation. If you are speaking of the input parameters to the getnumsAB iTVF, then I'm not sure if anything is gained regardless since there should still be an implicit conversion happening when calling your split function, assuming that numeric literals are passed into the function call instead of variables. Numeric literals are of type INT, though you could use a variable declared as BIGINT and pass that in. Then all levels would be using the same type. But I doubt there is a noticeable impact in either case. I think it's fine to keep it as BIGINT.
SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR
June 22, 2016 at 2:46 am
Excellent writeup Alan, thanks for posting this. I had no idea about n-grams, despite having used them for fuzzy-matching for years.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 22, 2016 at 12:42 pm
ChrisM@Work (6/22/2016)
Excellent writeup Alan, thanks for posting this. I had no idea about n-grams, despite having used them for fuzzy-matching for years.
Thanks Chris!
Regarding N-Grams - My second SSC artcicle, which will be the first of a 6-part series on N-Grams will be published tomorrow. π I'd love your thoughts on the subject.
-- Itzik Ben-Gan 2001
July 5, 2016 at 6:55 pm
I don't know if it's just my machine, but the final code box is empty. I tried Chrome, IE, and IE with compatibility settings. I assume it contains the code for GetNumsAB.
Any suggestions?
July 7, 2016 at 7:23 am
Marcia J (7/5/2016)
I don't know if it's just my machine, but the final code box is empty. I tried Chrome, IE, and IE with compatibility settings. I assume it contains the code for GetNumsAB.Any suggestions?
Hi Marcia. Sorry I missed this post.
It has nothing to do with your browser (s), that last empty box is nothing. I don't know why it is there. The interface for submitting scripts and articles is very good but can be a little quirky at first.
I have been meaning to fix this and re-submit the script/post.
Cheers!
-- Itzik Ben-Gan 2001
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply