May 22, 2012 at 4:43 am
peter-757102 (5/22/2012)
Too bad (for me) this new round cames at a bad time as I will be too bussy this week to particpate in this round of challenges.As for bad performance with some implementations, please try to bench with option( maxdop 1) too as there exists a bug in SQL Server 2008 and later that can (unpredictably) cause extreme slowdowns when parallel plans are used. We would not want to see results tainted by this of course.
You are right. AFAIK, this bug is present in SQL Server 2005 as well for some queries (which currently the requirement is for) 😉 But so far, there is not much difference by using the Maxdop(1) option, but for RCTE.
As I said earlier, MAXDOP(1) would slow down the splitter process for RCTE varchar(max) version. It includes Index spool and sorting of large result set which are good candidates for parallel plans. This is one of the reasons, I may prefer XML version instead of RCTE version in future case. Another reason is when the thread requirement of the parallel plan for a specific degree of parallelism cannot be satisfied, the Database Engine decreases the degree of parallelism automatically or completely abandons the parallel plan in the specified workload context, so the actual performance may decrease at anytime. But if it is a seldom requirement or to be done at a time when there is less workload, then that could certainly be used. As always, It will always depend upon the environment 🙂
I agree that MAXDOP(1) OR MAXDOP(N) performance should also be bench marked, as some of the environments do have this option as a policy for most queries. Also, this sort of performance testing should be the done in any case before putting into the production.
May 22, 2012 at 5:28 am
peter-757102 (3/15/2012)
Jeff Moden (3/15/2012)
peter-757102 (3/15/2012)
Jeff, did you receive any news regarding the performance characteristics of the code (or variations) on SQL Server 2012?And on the extra bright side, in a few months I will be working in a .NET driven development environment, then it is time for me to do some experiments with use of that in SQL Server :).
Haven't tried it there yet but I don't see why any of this would change.
So far as the .Net driven development environment goes, take a look at the CLR splitter that Paul White wrote for the comparision testing in this article.
It certainly could be quite different if one of the following internals have been modified in SQL Server 2012 versus 2008:
* string handling
* memory management
* query optimizer changes (CTE/operator stuff)
Also the difference in performance for varchar(max) and varchar types might be gone as there is no good reason for large difference there from a pure technical perspective. As for .Net coding, I will certainly look at Paul White's splitter. To begin with as a good source to learn from. I haven't done anything .Net much, nor recent.
By the way, did I miss your long string splitter version somehow, or was that work never finished?
Sadly, as expected, it seems that things have not changed much. I have tried it on SQL 2012, and even after replacing the CHARINDEX function with new LEAD function, the performance was still dismal. Here is what I tried (only a template)
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
--Create Split Tally Function VARCHAR(MAX) Version
CREATE FUNCTION dbo.Split8KTallyM_2012 (
@pString VARCHAR(MAX), @pDelimiter VARCHAR(1)
)
RETURNS @Results TABLE (ItemNumber BIGINT, ItemValue VARCHAR(MAX))
AS
BEGIN
SET @pString = @pDelimiter --To avoid the OR condition in the WHERE clause for the first item
+ @pString
+ @pDelimiter --To avoid the ISNULL/NULLIF
INSERT INTO @Results
(
[ItemNumber]
,[ItemValue]
)
SELECT ItemNumber = ROW_NUMBER() OVER ( ORDER BY N )
, Item = SUBSTRING(@pString, (N+1),
LEAD(N,1, CONVERT(INT,LEN(@pString))) OVER ( ORDER BY N )
- (N+1))
FROM [dbo].[Tally] WITH ( NOLOCK ) --NOLOCK HINT IS NOT NECESSARY
WHERE N BETWEEN 1 AND
CONVERT(INT,LEN(@pString)) -- CAST TO THE COLUMN DATATYPE
- 1 -- TO EXCLUDE LAST DELIMITER
AND SUBSTRING(@pString, N,1) = @pDelimiter
RETURN;
END
GO
For RCTE, this is how I did it
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO
--Create Split_RCTE function VARCHAR(MAX) version
CREATE FUNCTION dbo.Split_RCTE_2012
(
@pString VARCHAR(MAX)
,@pDelimiter VARCHAR(1)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
WITH cteSplit
AS ( SELECT EndPosition = CONVERT(INT, CHARINDEX(@pDelimiter,
@pString COLLATE Latin1_General_BIN)) + 1
UNION ALL
SELECT EndPosition = CONVERT(INT, CHARINDEX(@pDelimiter,
@pString COLLATE Latin1_General_BIN,
EndPosition)) + 1
FROM cteSplit
WHERE EndPosition > 1
)
SELECT [ItemNumber] = ROW_NUMBER() OVER ( ORDER BY EndPosition )
, ItemValue = SUBSTRING(@pString, EndPosition,
LEAD(EndPosition, 1, LEN(@pString) + (EndPosition+1)) OVER ( ORDER BY EndPosition )
- (EndPosition + 1))
FROM cteSplit
May 22, 2012 at 5:59 am
Gatekeeper (8/8/2011)
mark hutchinson (5/25/2011)
@JeffHere's an idea to kick around for varchar(max) data...use a tally table (or cte equivalent) to split the very large string into strings less than 8000 characters and then rejoin those parsed strings.
Mark, we used Jeff's function and expanded it out to varchar(max) with a fair amount of linearity, though I'm looking forward to seeing Jeff's with max handled.
String Length : Elements : CPU Time in ms
74242
148482
296962
5921923
118438432
23687687
4736153613
9472307256
18944614449
378881228896
7577624576193
15155249152385
303104983045006
60620819660810085
121241639321619989
UDF
/*
See sometime for improvements to this: http://www.sqlservercentral.com/articles/Tally+Table/72993/
Jeff Moden 2011-05-02
*/
CREATE FUNCTION udf_StrList2Table (
@List NVARCHAR(MAX) = N'',
@Delimiter NCHAR(1) = N','
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
/*
Following inline delimited spliter written by Jeff Moden.
http://www.sqlservercentral.com/articles/Tally+Table/72993/
*/
WITH E1(N) AS
( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
E8(N) AS
(
--==== Limits the number of rows right up front for both a performance gain and prevention of accidental "overruns"
SELECT TOP (DATALENGTH(ISNULL(@List,1))) 1 FROM E4 a, E4 b -- 100,000,000
),
cteTally(N) AS
( --==== This provides the "zero base"
SELECT 0 UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8
),
cteStart(N1) AS
( --==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1 FROM cteTally t WHERE (SUBSTRING(@List,t.N,1) = @Delimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT
ROW_NUMBER() OVER(ORDER BY s.N1) as ListPos,
SUBSTRING(@List,s.N1,ISNULL(NULLIF(CHARINDEX(@Delimiter,@List,s.N1),0)-s.N1,DATALENGTH(ISNULL(@List,1)))) as Value
FROM cteStart s
GO
Testing
DECLARE
@List NVARCHAR(MAX),
@Delimiter NCHAR(1) = N',',
@ListCnt int,
@StartTime datetime2,
@EndTime datetime2,
@ReplicationSeed int = 2,
@ReplicationMultiplier int = 2
DECLARE @Results TABLE (ListLength int, ElementCount int, MSTime int)
WHILE @ReplicationSeed <= 50000
BEGIN
SELECT @List = REPLICATE(CAST(N'ab,a,aeae,3,3,a3,23,4,asa,,434,q4,345' as nvarchar(max)), @ReplicationSeed)
SELECT @StartTime = SYSDATETIME()
SELECT @ListCnt = COUNT(*) FROM udf_StrList2Table(@List, @Delimiter)
SELECT @EndTime = SYSDATETIME()
INSERT INTO @Results (ListLength, ElementCount, MSTime)
SELECT LEN(@List), LEN(@List) - LEN(REPLACE(@List, ',', '')), DATEDIFF(MS, @StartTime, @EndTime)
SELECT
@ReplicationSeed = @ReplicationSeed * @ReplicationMultiplier
END
SELECT * FROM @Results
The optimizer seems to be smart enough to return the count without doing the actual Split. For instance, the above returns this output on my machine
ListLengthElementCountMSTime
74240
148483
296960
5921920
11843843
23687686
4736153613
9472307223
18944614450
378881228896
7577624576193
15155249152390
30310498304780
6062081966088316
121241639321616510
But if I change the Count(*) statement to
SELECT @ItemNumber = [ListPos], @ItemValue = [Value] FROM udf_StrList2Table(@List, @Delimiter)
where the declaration of the above were
DECLARE @ItemNumber BIGINT, @ItemValue VARCHAR(MAX)
Following is the output
ListLengthElementCountMSTime
74240
148480
296963
5921923
11843843
236876810
4736153623
9472307243
18944614493
3788812288180
7577624576356
15155249152700
303104983041400
60620819660814033
121241639321628070
May 22, 2012 at 6:17 am
Usman Butt (5/22/2012)
Since, in the future, we may need the VARCHAR(MAX) version as well, I did some testing on the other solutions as tally table solution is a sore loser on wider strings. Two of them performed really well1.Inline Recursive CTE solution (Split_RCTE, Tweaked a little bit for better performance)
2.TVF XML solution by Oleg (Split_XML_Solution_By_Oleg)
The simple SQLCLR splitter 'B' handles Unicode and ANSI strings of MAX length, can use parallelism if appropriate, and works on all versions from SQL Server 2005 onward. I ran the tests you provided, with the following results:
1. Recursive CTE: 2 minutes 30 seconds
2. XML: 4 minutes 16 seconds
3. SQLCLR: 15 seconds
The set up code is below:
-- ======================================
-- Splitter B : NET2 explicit enumeration
-- ======================================
-- ======================================
-- Drop the CLR function if it exists
-- ======================================
IF EXISTS
(
SELECT 1
FROM sys.objects
WHERE
[object_id] = OBJECT_ID(N'dbo.SplitterB')
AND type_desc = N'CLR_TABLE_VALUED_FUNCTION'
)
BEGIN
DROP FUNCTION dbo.SplitterB;
END;
GO
-- ======================================
-- Drop the assembly if it exists
-- ======================================
IF EXISTS
(
SELECT 1
FROM sys.assemblies
WHERE
name = N'SplitterB'
)
BEGIN
DROP ASSEMBLY SplitterB;
END;
GO
-- ======================================
-- Create assembly
-- ======================================
CREATE ASSEMBLY SplitterB
AUTHORIZATION dbo
FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C0103003B0DC14D0000000000000000E00002210B010800000E00000006000000000000AE2D0000002000000040000000004000002000000002000004000000000000000400000000000000008000000002000000000000030040850000100000100000000010000010000000000000100000000000000000000000582D000053000000004000008003000000000000000000000000000000000000006000000C00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E74657874000000B40D000000200000000E000000020000000000000000000000000000200000602E7273726300000080030000004000000004000000100000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000001400000000000000000000000000004000004200000000000000000000000000000000902D0000000000004800000002000500E0210000780B00000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000A2026F1300000A2D0E026F1400000A0373040000062B0C168D140000011673040000068C030000022A000000133002001A0000000100001102A5040000020A031200280800000654041200280A000006512A1E02281500000A2AC202037D0100000402047D0300000402027B010000048E697D02000004027C05000004FE150400000202167D040000042A0013300500C200000002000011027B04000004027B020000043302162A027B040000040A2B56027B010000040693027B030000043342027C0500000425280800000617582809000006027C05000004027B01000004027B0400000406027B0400000459731700000A280B000006020617587D04000004172A0617580A06027B0200000432A1027C0500000425280800000617582809000006027C05000004027B01000004027B04000004027B02000004027B0400000459731700000A280B00000602027B020000047D04000004172A32027B050000048C040000022A1A731800000A7A1E027B060000042A2202037D060000042A1E027B070000042A2202037D070000042A42534A4201000100000000000C00000076322E302E35303732370000000005006C000000CC030000237E0000380400008C04000023537472696E677300000000C40800000800000023555300CC080000100000002347554944000000DC0800009C02000023426C6F6200000000000000020000015717A2030902000000FA253300160000010000001900000004000000070000000B00000009000000010000001900000012000000020000000200000003000000050000000300000001000000020000000200000000000A00010000000000060056004F0006005D004F0006007A0067000A00A70092000A0031021602060079025A020600A30291020600BA0291020600D70291020600F602910206000F03910206002803910206004303910206005E039102060077035A0206008B0391020600C403A4030600E403A4030A000204160206002C044F00060031045A02060047045A02060052044F00060059044F0006007104A403000000000100000000000100010001001000180000000500010001000B0110002D0000000900010004000B0110003D00000009000600080021005B012E002100610132002100680135000100720132000100780138000100D20132000100EC0152005020000000009600B0000A0001007C20000000009600BA0012000300A220000000008618C2001B000600AA20000000008318C2001F000600DC2000000000E101C80026000800AA2100000000E109F9002A000800B72100000000E10130011B000800BE21000000008308A60140000800C621000000008308B30144000800CF21000000008308C00149000900D721000000008308C9014D000900000001001002000002004302000001004D02020002005102020003008602000001001002000002004302000001008B02000001008B0203000D001900F0002600190024012A00190055011B002900C2001B003100C2001B003900C2004D004100C2004D004900C2004D005100C2004D005900C2004D006100C2004D006900C2004D007100C2004D007900C20070008100C2004D008900C20044009100C2001B009900C2001B00210017042600210022040D020900C2001B00A900C2001702B900C2001D02C100C2001B00C900C2001B00200093007500240023005D002E0033002E022E0043003D022E008B007C022E004B0043022E0053002E022E0073003D022E003B003D022E00830073022E005B0052022E0063003D02C100CB002902E100CB0029020001CB0029022001CB0029024001CB0029026001CB00290212022502030001000400020000007F013C0000000202550000000B02590002000600030001000900050002000800050002000A00070001000B00070003000A00030003000C00050003000E00070004800000010000002D108596000000000000B000000002000000000000000000000001004600000000000200000000000000000000000100860000000000030002000400020000000000003C4D6F64756C653E0053706C6974746572422E646C6C0055736572446566696E656446756E6374696F6E730053706C6974456E756D657261746F720053706C6974526F77006D73636F726C69620053797374656D004F626A6563740056616C7565547970650053797374656D2E436F6C6C656374696F6E730049456E756D657261746F720053797374656D2E446174610053797374656D2E446174612E53716C54797065730053716C43686172730053706C6974746572420046696C6C526F77002E63746F720053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E4D6F76654E657874004D6F76654E6578740053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E6765745F43757272656E74006765745F43757272656E740053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E526573657400526573657400696E707574006C656E6774680064656C696D69746572007374617274007265636F72640053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E43757272656E74006765745F53657175656E6365007365745F53657175656E6365006765745F4974656D007365745F4974656D003C53657175656E63653E6B5F5F4261636B696E674669656C64003C4974656D3E6B5F5F4261636B696E674669656C640053657175656E6365004974656D00496E707574004D6963726F736F66742E53716C5365727665722E5365727665720053716C46616365744174747269627574650044656C696D69746572006F626A0073657175656E63650053797374656D2E52756E74696D652E496E7465726F705365727669636573004F7574417474726962757465006974656D0076616C75650053797374656D2E5265666C656374696F6E00417373656D626C795469746C6541747472696275746500417373656D626C794465736372697074696F6E41747472696275746500417373656D626C79436F6E66696775726174696F6E41747472696275746500417373656D626C79436F6D70616E7941747472696275746500417373656D626C7950726F6475637441747472696275746500417373656D626C79436F7079726967687441747472696275746500417373656D626C7954726164656D61726B41747472696275746500417373656D626C7943756C7475726541747472696275746500436F6D56697369626C6541747472696275746500417373656D626C7956657273696F6E4174747269627574650053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C6974794174747269627574650053716C46756E6374696F6E417474726962757465006765745F49734E756C6C006765745F56616C75650043686172005374727563744C61796F7574417474726962757465004C61796F75744B696E6400537472696E67004E6F74496D706C656D656E746564457863657074696F6E00436F6D70696C657247656E657261746564417474726962757465000003200000000000E6C339A53C7D1F4CA3A2AE4AC04EF7A60008B77A5C561934E089070002120D121103080003011C1008100E03200001062002011D0303032000020320001C03061D03020608020603030611100328001C0320000804200101080320000E042001010E02060E032800080328000E12010001005408074D617853697A65FFFFFFFF04200101028196010006005455794D6963726F736F66742E53716C5365727665722E5365727665722E446174614163636573734B696E642C2053797374656D2E446174612C2056657273696F6E3D322E302E302E302C2043756C747572653D6E65757472616C2C205075626C69634B6579546F6B656E3D623737613563353631393334653038390A446174614163636573730000000054557F4D6963726F736F66742E53716C5365727665722E5365727665722E53797374656D446174614163636573734B696E642C2053797374656D2E446174612C2056657273696F6E3D322E302E302E302C2043756C747572653D6E65757472616C2C205075626C69634B6579546F6B656E3D623737613563353631393334653038391053797374656D446174614163636573730000000054020F497344657465726D696E69737469630154020949735072656369736501540E1146696C6C526F774D6574686F644E616D650746696C6C526F77540E0F5461626C65446566696E6974696F6E2173657175656E636520494E542C206974656D204E564152434841522834303030290420001D030407011110052001011159072003011D0308080307010804010000000E01000953706C69747465724200000501000000000E0100094D6963726F736F667400002001001B436F7079726967687420C2A9204D6963726F736F6674203230313100000801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F77730100802D000000000000000000009E2D0000002000000000000000000000000000000000000000000000902D000000000000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF25002040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000001800008000000000000000000000000000000100010000003000008000000000000000000000000000000100000000004800000058400000280300000000000000000000280334000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE000001000000010085962D100000010085962D103F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B00488020000010053007400720069006E006700460069006C00650049006E0066006F00000064020000010030003000300030003000340062003000000034000A00010043006F006D00700061006E0079004E0061006D006500000000004D006900630072006F0073006F006600740000003C000A000100460069006C0065004400650073006300720069007000740069006F006E0000000000530070006C00690074007400650072004200000040000F000100460069006C006500560065007200730069006F006E000000000031002E0030002E0034003100340031002E0033003800350033003300000000003C000E00010049006E007400650072006E0061006C004E0061006D0065000000530070006C006900740074006500720042002E0064006C006C0000005C001B0001004C006500670061006C0043006F007000790072006900670068007400000043006F0070007900720069006700680074002000A90020004D006900630072006F0073006F0066007400200032003000310031000000000044000E0001004F0072006900670069006E0061006C00460069006C0065006E0061006D0065000000530070006C006900740074006500720042002E0064006C006C00000034000A000100500072006F0064007500630074004E0061006D00650000000000530070006C00690074007400650072004200000044000F000100500072006F006400750063007400560065007200730069006F006E00000031002E0030002E0034003100340031002E00330038003500330033000000000048000F00010041007300730065006D0062006C0079002000560065007200730069006F006E00000031002E0030002E0034003100340031002E0033003800350033003300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000C000000B03D00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
WITH PERMISSION_SET = SAFE;
GO
-- ======================================
-- Create the function
-- ======================================
CREATE FUNCTION dbo.SplitterB
(
@Input NVARCHAR(MAX),
@Delimiter NCHAR(1)
)
RETURNS TABLE
(
sequence INTEGER NULL,
item NVARCHAR(4000) NULL
)
AS EXTERNAL NAME SplitterB.UserDefinedFunctions.SplitterB;
GO
-- ======================================
-- Enable CLR if necessary
-- ======================================
IF NOT EXISTS
(
SELECT 1
FROM sys.configurations
WHERE
name = N'clr enabled'
AND value_in_use = 1
)
BEGIN
EXECUTE sys.sp_configure
@configname = N'clr enabled',
@configvalue = 1;
RECONFIGURE;
END;
GO
-- ======================================
-- Test the function
-- ======================================
SELECT
sequence,
item
FROM dbo.SplitterB(N'A,B,C', N',');
GO
Source:
using System.Collections;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
/**
* How SQL Server SQLCLR table-valued functions work:
*
* 1. SQL Server passes the input parameters in to the function and receives an enumeration object in return
* 2. SQL Server calls the MoveNext() method on the enumeration object
* 3. If the MoveNext() call returns true, SQL Server calls the Current() method to get an object for the current row
* 4. SQL Server calls the FillRow method to obtain column values for the current row
* 5. Repeat from step 2, until MoveNext() returns false
*
* */
[SqlFunction
(
DataAccess = DataAccessKind.None, // No user data access by this function
SystemDataAccess = SystemDataAccessKind.None, // No system data access by this function
IsDeterministic = true, // This function is deterministic
IsPrecise = true, // This function is precise
FillRowMethodName = "FillRow", // The method called by SQL Server to obtain the next row
TableDefinition =
"sequence INT, item NVARCHAR(4000)" // Returned table definition
)
]
// 1. SQL Server passes input parameters and receives an enumration object
public static IEnumerator SplitterB
(
[SqlFacet(MaxSize = -1)] SqlChars Input,
char Delimiter
)
{
return Input.IsNull ?
new SplitEnumerator(new char[0], char.MinValue) :
new SplitEnumerator(Input.Value, Delimiter);
}
// The enumeration object
struct SplitEnumerator : IEnumerator
{
// Constructor (called once when the object is created)
internal SplitEnumerator(char[] Input, char Delimiter)
{
// Save references
input = Input;
delimiter = Delimiter;
// Remember the length of the character array
length = input.Length;
// Structure holding split rows
record = new SplitRow();
// Starting at the first character
start = 0;
}
// Enumerator implementation
#region IEnumerator Methods
// 2. SQL Server calls the MoveNext() method on the enumeration object
bool IEnumerator.MoveNext()
{
// No more rows?
if (start == length) { return false; }
// Find the next delimiter
for (int i = start; i < length; i++)
{
if (input == delimiter)
{
// Increment the sequence number
record.Sequence++;
// Save the split element
record.Item = new string(input, start, i - start);
// Set the next element search start point
start = i + 1;
return true;
}
}
// Last item
record.Sequence++;
record.Item = new string(input, start, length - start);
start = length;
return true;
}
// 3. SQL Server calls the Current() method to get an object for the current row
// (We pack the current row data in an OutputRecord structure)
object IEnumerator.Current
{
get { return record; }
}
// Required by the IEnumerator interface, but not needed for this implementation
void IEnumerator.Reset()
{
throw new System.NotImplementedException();
}
#endregion
readonly char[] input; // Reference to the string to be split
readonly int length; // Length of the input string
readonly char delimiter; // The delimiter character
int start; // Current search start position
SplitRow record; // Each row to be returned
}
// 4. SQL Server calls the FillRow method to obtain column values for the current row
public static void FillRow(object obj, out int sequence, out string item)
{
// The passed-in object is an OutputRecord
var r = (SplitRow)obj;
// Set the output parameter values
sequence = r.Sequence;
item = r.Item;
}
// Structure used to hold each row
struct SplitRow
{
internal int Sequence { get; set; } // Sequence of the element
internal string Item { get; set; } // The element
}
};
May 22, 2012 at 6:18 am
peter-757102 (5/22/2012)
As for bad performance with some implementations, please try to bench with option( maxdop 1) too as there exists a bug in SQL Server 2008 and later that can (unpredictably) cause extreme slowdowns when parallel plans are used. We would not want to see results tainted by this of course.
To which 'bug' do you refer?
May 22, 2012 at 6:39 am
SQL Kiwi (5/22/2012)
peter-757102 (5/22/2012)
As for bad performance with some implementations, please try to bench with option( maxdop 1) too as there exists a bug in SQL Server 2008 and later that can (unpredictably) cause extreme slowdowns when parallel plans are used. We would not want to see results tainted by this of course.To which 'bug' do you refer?
http://sqlblog.com/blogs/paul_white/archive/2012/05/03/parallel-row-goals-gone-rogue.aspx
You are familiar with your own blog i presume 😉
In my view, what you shown us there is a bug (in the sense that the processing times become unpredicatble and unreasonable).
After testing for myself and trying to evade the issue using a few contructs, I concluded it to be quite serious.
At least with the option ( maxdop 1 ) we can exclude this unfavorable scenario for code that is very sensitive to this issue (like recursive CTE generating large sequences of numbers used in splitters and the like).
May 22, 2012 at 6:50 am
Usman Butt (5/22/2012)
Since, in the future, we may need the VARCHAR(MAX) version as well, I did some testing on the other solutions as tally table solution is a sore loser on wider strings. Two of them performed really well
I'm pretty sure that I stated that the code in the article is NOT the correct solution for VARCHAR(MAX) because just changing the input datatype to VARCHAR(MAX) causes the code to run twice as slow.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 22, 2012 at 6:59 am
SQL Kiwi (5/22/2012)
Usman Butt (5/22/2012)
Since, in the future, we may need the VARCHAR(MAX) version as well, I did some testing on the other solutions as tally table solution is a sore loser on wider strings. Two of them performed really well1.Inline Recursive CTE solution (Split_RCTE, Tweaked a little bit for better performance)
2.TVF XML solution by Oleg (Split_XML_Solution_By_Oleg)
The simple SQLCLR splitter 'B' handles Unicode and ANSI strings of MAX length, can use parallelism if appropriate, and works on all versions from SQL Server 2005 onward. I ran the tests you provided, with the following results:
1. Recursive CTE: 2 minutes 30 seconds
2. XML: 4 minutes 16 seconds
3. SQLCLR: 15 seconds
The set up code is below:
-- ======================================
-- Splitter B : NET2 explicit enumeration
-- ======================================
-- ======================================
-- Drop the CLR function if it exists
-- ======================================
IF EXISTS
(
SELECT 1
FROM sys.objects
WHERE
[object_id] = OBJECT_ID(N'dbo.SplitterB')
AND type_desc = N'CLR_TABLE_VALUED_FUNCTION'
)
BEGIN
DROP FUNCTION dbo.SplitterB;
END;
GO
-- ======================================
-- Drop the assembly if it exists
-- ======================================
IF EXISTS
(
SELECT 1
FROM sys.assemblies
WHERE
name = N'SplitterB'
)
BEGIN
DROP ASSEMBLY SplitterB;
END;
GO
-- ======================================
-- Create assembly
-- ======================================
CREATE ASSEMBLY SplitterB
AUTHORIZATION dbo
FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C0103003B0DC14D0000000000000000E00002210B010800000E00000006000000000000AE2D0000002000000040000000004000002000000002000004000000000000000400000000000000008000000002000000000000030040850000100000100000000010000010000000000000100000000000000000000000582D000053000000004000008003000000000000000000000000000000000000006000000C00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E74657874000000B40D000000200000000E000000020000000000000000000000000000200000602E7273726300000080030000004000000004000000100000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000001400000000000000000000000000004000004200000000000000000000000000000000902D0000000000004800000002000500E0210000780B00000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000A2026F1300000A2D0E026F1400000A0373040000062B0C168D140000011673040000068C030000022A000000133002001A0000000100001102A5040000020A031200280800000654041200280A000006512A1E02281500000A2AC202037D0100000402047D0300000402027B010000048E697D02000004027C05000004FE150400000202167D040000042A0013300500C200000002000011027B04000004027B020000043302162A027B040000040A2B56027B010000040693027B030000043342027C0500000425280800000617582809000006027C05000004027B01000004027B0400000406027B0400000459731700000A280B000006020617587D04000004172A0617580A06027B0200000432A1027C0500000425280800000617582809000006027C05000004027B01000004027B04000004027B02000004027B0400000459731700000A280B00000602027B020000047D04000004172A32027B050000048C040000022A1A731800000A7A1E027B060000042A2202037D060000042A1E027B070000042A2202037D070000042A42534A4201000100000000000C00000076322E302E35303732370000000005006C000000CC030000237E0000380400008C04000023537472696E677300000000C40800000800000023555300CC080000100000002347554944000000DC0800009C02000023426C6F6200000000000000020000015717A2030902000000FA253300160000010000001900000004000000070000000B00000009000000010000001900000012000000020000000200000003000000050000000300000001000000020000000200000000000A00010000000000060056004F0006005D004F0006007A0067000A00A70092000A0031021602060079025A020600A30291020600BA0291020600D70291020600F602910206000F03910206002803910206004303910206005E039102060077035A0206008B0391020600C403A4030600E403A4030A000204160206002C044F00060031045A02060047045A02060052044F00060059044F0006007104A403000000000100000000000100010001001000180000000500010001000B0110002D0000000900010004000B0110003D00000009000600080021005B012E002100610132002100680135000100720132000100780138000100D20132000100EC0152005020000000009600B0000A0001007C20000000009600BA0012000300A220000000008618C2001B000600AA20000000008318C2001F000600DC2000000000E101C80026000800AA2100000000E109F9002A000800B72100000000E10130011B000800BE21000000008308A60140000800C621000000008308B30144000800CF21000000008308C00149000900D721000000008308C9014D000900000001001002000002004302000001004D02020002005102020003008602000001001002000002004302000001008B02000001008B0203000D001900F0002600190024012A00190055011B002900C2001B003100C2001B003900C2004D004100C2004D004900C2004D005100C2004D005900C2004D006100C2004D006900C2004D007100C2004D007900C20070008100C2004D008900C20044009100C2001B009900C2001B00210017042600210022040D020900C2001B00A900C2001702B900C2001D02C100C2001B00C900C2001B00200093007500240023005D002E0033002E022E0043003D022E008B007C022E004B0043022E0053002E022E0073003D022E003B003D022E00830073022E005B0052022E0063003D02C100CB002902E100CB0029020001CB0029022001CB0029024001CB0029026001CB00290212022502030001000400020000007F013C0000000202550000000B02590002000600030001000900050002000800050002000A00070001000B00070003000A00030003000C00050003000E00070004800000010000002D108596000000000000B000000002000000000000000000000001004600000000000200000000000000000000000100860000000000030002000400020000000000003C4D6F64756C653E0053706C6974746572422E646C6C0055736572446566696E656446756E6374696F6E730053706C6974456E756D657261746F720053706C6974526F77006D73636F726C69620053797374656D004F626A6563740056616C7565547970650053797374656D2E436F6C6C656374696F6E730049456E756D657261746F720053797374656D2E446174610053797374656D2E446174612E53716C54797065730053716C43686172730053706C6974746572420046696C6C526F77002E63746F720053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E4D6F76654E657874004D6F76654E6578740053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E6765745F43757272656E74006765745F43757272656E740053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E526573657400526573657400696E707574006C656E6774680064656C696D69746572007374617274007265636F72640053797374656D2E436F6C6C656374696F6E732E49456E756D657261746F722E43757272656E74006765745F53657175656E6365007365745F53657175656E6365006765745F4974656D007365745F4974656D003C53657175656E63653E6B5F5F4261636B696E674669656C64003C4974656D3E6B5F5F4261636B696E674669656C640053657175656E6365004974656D00496E707574004D6963726F736F66742E53716C5365727665722E5365727665720053716C46616365744174747269627574650044656C696D69746572006F626A0073657175656E63650053797374656D2E52756E74696D652E496E7465726F705365727669636573004F7574417474726962757465006974656D0076616C75650053797374656D2E5265666C656374696F6E00417373656D626C795469746C6541747472696275746500417373656D626C794465736372697074696F6E41747472696275746500417373656D626C79436F6E66696775726174696F6E41747472696275746500417373656D626C79436F6D70616E7941747472696275746500417373656D626C7950726F6475637441747472696275746500417373656D626C79436F7079726967687441747472696275746500417373656D626C7954726164656D61726B41747472696275746500417373656D626C7943756C7475726541747472696275746500436F6D56697369626C6541747472696275746500417373656D626C7956657273696F6E4174747269627574650053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C6974794174747269627574650053716C46756E6374696F6E417474726962757465006765745F49734E756C6C006765745F56616C75650043686172005374727563744C61796F7574417474726962757465004C61796F75744B696E6400537472696E67004E6F74496D706C656D656E746564457863657074696F6E00436F6D70696C657247656E657261746564417474726962757465000003200000000000E6C339A53C7D1F4CA3A2AE4AC04EF7A60008B77A5C561934E089070002120D121103080003011C1008100E03200001062002011D0303032000020320001C03061D03020608020603030611100328001C0320000804200101080320000E042001010E02060E032800080328000E12010001005408074D617853697A65FFFFFFFF04200101028196010006005455794D6963726F736F66742E53716C5365727665722E5365727665722E446174614163636573734B696E642C2053797374656D2E446174612C2056657273696F6E3D322E302E302E302C2043756C747572653D6E65757472616C2C205075626C69634B6579546F6B656E3D623737613563353631393334653038390A446174614163636573730000000054557F4D6963726F736F66742E53716C5365727665722E5365727665722E53797374656D446174614163636573734B696E642C2053797374656D2E446174612C2056657273696F6E3D322E302E302E302C2043756C747572653D6E65757472616C2C205075626C69634B6579546F6B656E3D623737613563353631393334653038391053797374656D446174614163636573730000000054020F497344657465726D696E69737469630154020949735072656369736501540E1146696C6C526F774D6574686F644E616D650746696C6C526F77540E0F5461626C65446566696E6974696F6E2173657175656E636520494E542C206974656D204E564152434841522834303030290420001D030407011110052001011159072003011D0308080307010804010000000E01000953706C69747465724200000501000000000E0100094D6963726F736F667400002001001B436F7079726967687420C2A9204D6963726F736F6674203230313100000801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F77730100802D000000000000000000009E2D0000002000000000000000000000000000000000000000000000902D000000000000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF25002040000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000001800008000000000000000000000000000000100010000003000008000000000000000000000000000000100000000004800000058400000280300000000000000000000280334000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE000001000000010085962D100000010085962D103F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B00488020000010053007400720069006E006700460069006C00650049006E0066006F00000064020000010030003000300030003000340062003000000034000A00010043006F006D00700061006E0079004E0061006D006500000000004D006900630072006F0073006F006600740000003C000A000100460069006C0065004400650073006300720069007000740069006F006E0000000000530070006C00690074007400650072004200000040000F000100460069006C006500560065007200730069006F006E000000000031002E0030002E0034003100340031002E0033003800350033003300000000003C000E00010049006E007400650072006E0061006C004E0061006D0065000000530070006C006900740074006500720042002E0064006C006C0000005C001B0001004C006500670061006C0043006F007000790072006900670068007400000043006F0070007900720069006700680074002000A90020004D006900630072006F0073006F0066007400200032003000310031000000000044000E0001004F0072006900670069006E0061006C00460069006C0065006E0061006D0065000000530070006C006900740074006500720042002E0064006C006C00000034000A000100500072006F0064007500630074004E0061006D00650000000000530070006C00690074007400650072004200000044000F000100500072006F006400750063007400560065007200730069006F006E00000031002E0030002E0034003100340031002E00330038003500330033000000000048000F00010041007300730065006D0062006C0079002000560065007200730069006F006E00000031002E0030002E0034003100340031002E0033003800350033003300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000C000000B03D00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
WITH PERMISSION_SET = SAFE;
GO
-- ======================================
-- Create the function
-- ======================================
CREATE FUNCTION dbo.SplitterB
(
@Input NVARCHAR(MAX),
@Delimiter NCHAR(1)
)
RETURNS TABLE
(
sequence INTEGER NULL,
item NVARCHAR(4000) NULL
)
AS EXTERNAL NAME SplitterB.UserDefinedFunctions.SplitterB;
GO
-- ======================================
-- Enable CLR if necessary
-- ======================================
IF NOT EXISTS
(
SELECT 1
FROM sys.configurations
WHERE
name = N'clr enabled'
AND value_in_use = 1
)
BEGIN
EXECUTE sys.sp_configure
@configname = N'clr enabled',
@configvalue = 1;
RECONFIGURE;
END;
GO
-- ======================================
-- Test the function
-- ======================================
SELECT
sequence,
item
FROM dbo.SplitterB(N'A,B,C', N',');
GO
Source:
using System.Collections;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
public partial class UserDefinedFunctions
{
/**
* How SQL Server SQLCLR table-valued functions work:
*
* 1. SQL Server passes the input parameters in to the function and receives an enumeration object in return
* 2. SQL Server calls the MoveNext() method on the enumeration object
* 3. If the MoveNext() call returns true, SQL Server calls the Current() method to get an object for the current row
* 4. SQL Server calls the FillRow method to obtain column values for the current row
* 5. Repeat from step 2, until MoveNext() returns false
*
* */
[SqlFunction
(
DataAccess = DataAccessKind.None, // No user data access by this function
SystemDataAccess = SystemDataAccessKind.None, // No system data access by this function
IsDeterministic = true, // This function is deterministic
IsPrecise = true, // This function is precise
FillRowMethodName = "FillRow", // The method called by SQL Server to obtain the next row
TableDefinition =
"sequence INT, item NVARCHAR(4000)" // Returned table definition
)
]
// 1. SQL Server passes input parameters and receives an enumration object
public static IEnumerator SplitterB
(
[SqlFacet(MaxSize = -1)] SqlChars Input,
char Delimiter
)
{
return Input.IsNull ?
new SplitEnumerator(new char[0], char.MinValue) :
new SplitEnumerator(Input.Value, Delimiter);
}
// The enumeration object
struct SplitEnumerator : IEnumerator
{
// Constructor (called once when the object is created)
internal SplitEnumerator(char[] Input, char Delimiter)
{
// Save references
input = Input;
delimiter = Delimiter;
// Remember the length of the character array
length = input.Length;
// Structure holding split rows
record = new SplitRow();
// Starting at the first character
start = 0;
}
// Enumerator implementation
#region IEnumerator Methods
// 2. SQL Server calls the MoveNext() method on the enumeration object
bool IEnumerator.MoveNext()
{
// No more rows?
if (start == length) { return false; }
// Find the next delimiter
for (int i = start; i < length; i++)
{
if (input == delimiter)
{
// Increment the sequence number
record.Sequence++;
// Save the split element
record.Item = new string(input, start, i - start);
// Set the next element search start point
start = i + 1;
return true;
}
}
// Last item
record.Sequence++;
record.Item = new string(input, start, length - start);
start = length;
return true;
}
// 3. SQL Server calls the Current() method to get an object for the current row
// (We pack the current row data in an OutputRecord structure)
object IEnumerator.Current
{
get { return record; }
}
// Required by the IEnumerator interface, but not needed for this implementation
void IEnumerator.Reset()
{
throw new System.NotImplementedException();
}
#endregion
readonly char[] input; // Reference to the string to be split
readonly int length; // Length of the input string
readonly char delimiter; // The delimiter character
int start; // Current search start position
SplitRow record; // Each row to be returned
}
// 4. SQL Server calls the FillRow method to obtain column values for the current row
public static void FillRow(object obj, out int sequence, out string item)
{
// The passed-in object is an OutputRecord
var r = (SplitRow)obj;
// Set the output parameter values
sequence = r.Sequence;
item = r.Item;
}
// Structure used to hold each row
struct SplitRow
{
internal int Sequence { get; set; } // Sequence of the element
internal string Item { get; set; } // The element
}
};
Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂
May 22, 2012 at 6:59 am
Usman Butt (5/22/2012)
Sadly, as expected, it seems that things have not changed much. I have tried it on SQL 2012, and even after replacing the CHARINDEX function with new LEAD function, the performance was still dismal. Here is what I tried (only a template)
Considering that I said that this method wouldn't work well at all with VARCHAR(MAX), are you surprised? 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
May 22, 2012 at 7:08 am
Jeff Moden (5/22/2012)
Usman Butt (5/22/2012)
Sadly, as expected, it seems that things have not changed much. I have tried it on SQL 2012, and even after replacing the CHARINDEX function with new LEAD function, the performance was still dismal. Here is what I tried (only a template)Considering that I said that this method wouldn't work well at all with VARCHAR(MAX), are you surprised? 😉
Not at all, but had to test it to confirm 🙂 As I said, I was expecting the same to happen. And you stated that you could be wrong, so had to prove that you were right 😉
May 22, 2012 at 7:30 am
Usman Butt (5/22/2012)
Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂
That is sad. Not least because CLR was new *seven years ago*. 🙂
May 22, 2012 at 7:32 am
peter-757102 (5/22/2012)
SQL Kiwi (5/22/2012)
To which 'bug' do you refer?http://sqlblog.com/blogs/paul_white/archive/2012/05/03/parallel-row-goals-gone-rogue.aspx
You are familiar with your own blog i presume 😉
In my view, what you shown us there is a bug (in the sense that the processing times become unpredicatble and unreasonable).
After testing for myself and trying to evade the issue using a few contructs, I concluded it to be quite serious.
At least with the option ( maxdop 1 ) we can exclude this unfavorable scenario for code that is very sensitive to this issue (like recursive CTE generating large sequences of numbers used in splitters and the like).
Vaguely familiar, yes 🙂 Actually, that's why I asked; I wanted to be sure you meant that issue.
May 22, 2012 at 7:34 am
SQL Kiwi (5/22/2012)
Usman Butt (5/22/2012)
Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂That is sad. Not least because CLR was new *seven years ago*. 🙂
*seven years ago* SQL 2k5 was also new, and they are still on SQL 2k5. Things have not changed much for them 😉
May 22, 2012 at 7:43 am
The new licensing model for SQL Server 2012 model isn't very encouraging to the company I work for to make a switch to this version. But we did made it to SQL Server 2008 at least and in doing so, got to use some nice features along the way.
The step up from 2005 did pay off in terms of usable features!
May 22, 2012 at 7:53 am
Usman Butt (5/22/2012)
SQL Kiwi (5/22/2012)
Usman Butt (5/22/2012)
Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂That is sad. Not least because CLR was new *seven years ago*. 🙂
*seven years ago* SQL 2k5 was also new, and they are still on SQL 2k5. Things have not changed much for them 😉
:laugh: I see. But still, they've had *seven years* to get over their fears...
Viewing 15 posts - 301 through 315 (of 990 total)
You must be logged in to reply to this topic. Login to reply