September 22, 2010 at 7:26 pm
Paul White NZ (9/22/2010)
Jeff Moden (9/22/2010)
The problem with the Tally Table related splits is that I forgot :blush: to add 1 to the length....WHERE N BETWEEN 1 AND LEN(@Parameter)+1
Indeed. Even with that correction, however, the LEN prevents correct results when using a space as the delimiter:
ALTER FUNCTION dbo.Split8KTally
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND LEN(@Parameter)+1
AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma
GO
SELECT ItemNumber, ItemValue FROM dbo.Split8KTally('A,B,', ','); -- 3 rows (correct)
SELECT ItemNumber, ItemValue FROM dbo.Split8KTally('A B ', ' '); -- 2 rows (oops!)
Sorry. 🙁
Nah... don't be sorry. This is all good. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2010 at 7:28 pm
steve-893342 (9/22/2010)
Jeff Moden (9/22/2010)
The problem with the Tally Table related splits is that I forgot :blush: to add 1 to the length....WHERE N BETWEEN 1 AND LEN(@Parameter)+1
Shouldn't that be
WHERE N < DATALENGTH(@Parameter) + 2
I don't believe we need the +2 but we do need DATALENGTH if we need to use a space as a delimiter as Paul suggests.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2010 at 9:17 pm
Deleted. Made an error on datalength.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 23, 2010 at 2:15 am
Good thread , lots of useful material coming from it ,but there is an issue,
SET STATISTICS TIME ON
Carries a performance overhead when used with Scalar functions , specifically with 2005 , but still noticeable with 2008.
Heres a simple example....
http://sqlblogcasts.com/blogs/sqlandthelike/archive/2009/11/24/the-observer-effect-in-action.aspx
So by doubling up the work (as shown below) and timing both with datediff , one with STATISTICS TIME ON , the other off , the run without stats is ALWAYS faster, ranging from 100 to 400 ms
So , right now , i think :unsure: i am saying, that the MultiStatement functions will be showing an artificial slowness in all the testing done so far.
BEGIN TRAN
--===== Do this in a nice safe place that everyone has
USE tempdb;
GO
--===================================================================
-- Create a Tally table from 1 to 11000
--===================================================================
--===== If the Tally table already exists here, drop it to make reruns easier.
IF OBJECT_ID('tempdb.dbo.Tally','U') IS NOT NULL
DROP TABLE dbo.Tally;
--===== Create and populate the Tally table on the fly.
-- This ISNULL function makes the column NOT NULL
-- so we can put a Primary Key on it
SELECT TOP 11000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
;
--===== Add a CLUSTERED Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
;
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
;
GO
--=====================================================================================================================
-- Create the various functions to test
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE tempdb;
--===== Tally Table (Split8KTally iTVF) ===============================================================================
IF OBJECT_ID('tempdb.dbo.Split8KTally') IS NOT NULL
DROP FUNCTION dbo.Split8KTally
GO
CREATE FUNCTION dbo.Split8KTally
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND LEN(@Parameter)+1
AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma
GO
--===== XML-Brad (Split8KXMLBrad iTVF) ================================================================================
IF OBJECT_ID('dbo.Split8KXMLBrad') IS NOT NULL
DROP FUNCTION dbo.Split8KXMLBrad
GO
CREATE FUNCTION dbo.Split8KXMLBrad
(@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
RETURNS TABLE
WITH SCHEMABINDING AS
RETURN
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
x.i.value('(./text())[1]','varchar(8000)') AS ItemValue
from (select XMLList=cast('<i>'+replace(@Parameter,@Delimiter,'</i><i>')+'</i>' as xml).query('.')) a
cross apply XMLList.nodes('i') x(i)
;
GO
--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================
IF OBJECT_ID('tempdb.dbo.Split8KL1') IS NOT NULL
DROP FUNCTION dbo.Split8KL1
GO
CREATE FUNCTION dbo.Split8KL1
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))
WITH SCHEMABINDING AS
BEGIN
--===== Declare a variable to remember the position of the current comma
DECLARE @N INT;
--===== Add start and end commas to the Parameter so we can handle
-- single elements
SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,
--===== Preassign the current comma as the first character
@N = 1;
--===== Loop through and find each comma, then insert the string value
-- found between the current comma and the next comma. @N is
-- the position of the current comma.
WHILE @N < LEN(@Parameter) --Don't include the last comma
BEGIN
--==== Do the insert using the value between the commas
INSERT INTO @Result (ItemValue)
SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);
--==== Find the next comma
SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);
END; --END While
RETURN;
END; --END Function
GO
--===== Tally Table (Split8KTallyMulti iTVF) ==========================================================================
IF OBJECT_ID('tempdb.dbo.Split8KTallyMulti') IS NOT NULL
DROP FUNCTION dbo.Split8KTallyMulti
;
GO
CREATE FUNCTION dbo.Split8KTallyMulti
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS @Split TABLE (ItemNumber INT, ItemValue VARCHAR(8000))
WITH SCHEMABINDING AS
BEGIN
INSERT INTO @Split
(ItemNumber,ItemValue)
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND LEN(@Parameter)+1
AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma
RETURN;
END;
GO
--=====================================================================================================================
-- Build some test data from the table that is actually using in the tests
--=====================================================================================================================
--===== Build a thousand item CSV row without it being dependent on other test tables
DECLARE @CSVList VARCHAR(8000);
SELECT @CSVList = STUFF(
(
SELECT TOP 1000 ','+ CAST(SalesOrderDetailID AS VARCHAR(10))
FROM AdventureWorks.Sales.SalesOrderDetail
ORDER BY NEWID()
FOR XML PATH('')
)
,1,1,'');
--===== Show what we have for a parameter now...
PRINT @CSVList;
--=====================================================================================================================
-- Run the tests using Brad's code
--=====================================================================================================================
--===== Commence testing ...
declare @RunTimeStart datetime
select @RunTimeStart = getdate()
SET STATISTICS TIME,IO ON;
--And use it in a query:
PRINT '--===== Split8KTally (iTVF) ==========================================================================================='
declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KTally(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '--===== Split8KXMLBrad (XML)==========================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KXMLBrad(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '--===== Split8KL1 (Loop mlTVF)========================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KL1(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '
--===== Split8KTallyMulti (mlTVF )====================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KTallyMulti(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
select datediff(ms,@RunTimeStart,getdate()) as 'RunTimeWithStats'
SET STATISTICS TIME,IO OFF;
select @RunTimeStart = getdate()
--And use it in a query:
PRINT '--===== Split8KTally (iTVF) ==========================================================================================='
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KTally(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '--===== Split8KXMLBrad (XML)==========================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KXMLBrad(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '--===== Split8KL1 (Loop mlTVF)========================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KL1(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
--declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
PRINT '
--===== Split8KTallyMulti (mlTVF )====================================================================================='
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KTallyMulti(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
select datediff(ms,@RunTimeStart,getdate()) as 'RunTimeNoStats'
go
--=====================================================================================================================
-- Houskeeping
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE tempdb;
GO
--===== Drop all the object that we just created
DROP FUNCTION dbo.Split8KTally, dbo.Split8KL1, dbo.Split8KXMLBrad, dbo.Split8KTallyMulti;
DROP TABLE dbo.Tally;
GO
ROLLBACK
September 23, 2010 at 3:05 am
Dave Ballantyne (9/23/2010)
Good thread , lots of useful material coming from it ,but there is an issue, SET STATISTICS TIME ON carries a performance overhead when used with Scalar functions , specifically with 2005 , but still noticeable with 2008.
:ermm:
There aren't any scalar T-SQL functions in the test suite, Dave - just in-line TVFs and multi-statement TVFs. Stats on does add considerable overhead to scalar functions, particularly if they are invoked (per row, naturally) on a large number of input rows. The functions used in the test you quote, aside from not being scalar, are only called once. Am I missing something vital here?
September 23, 2010 at 3:27 am
Im lumping the multi-statement TVF in with scalar functions, my bad terminology. My 'mental' division is "Multi Statement Function" = Scalar = Bad , "Inline Function " = One optimize-able statement = good 😉
But by running the latest test code with SET STATISTICS TIME ON and then OFF , there is a consistent difference. (Microsoft SQL Server 2005 - 9.00.4207.00 (Intel X86) Dec 17 2008 14:34:38 Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2) , hmm need to update with some CU's 🙂
Im assuming the "Set Statistics Time on" monitors and accumulates the time of each statement executed within the function and that is causing the overhead ,rather than a strict number of executions of the function, which as you say is once only.
Heres a more cut down version using only the Split8KL1 function.
BEGIN TRAN
--===== Do this in a nice safe place that everyone has
USE tempdb;
GO
--===================================================================
-- Create a Tally table from 1 to 11000
--===================================================================
--===== If the Tally table already exists here, drop it to make reruns easier.
IF OBJECT_ID('tempdb.dbo.Tally','U') IS NOT NULL
DROP TABLE dbo.Tally;
--===== Create and populate the Tally table on the fly.
-- This ISNULL function makes the column NOT NULL
-- so we can put a Primary Key on it
SELECT TOP 11000
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2
;
--===== Add a CLUSTERED Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
;
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
;
GO
--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================
IF OBJECT_ID('tempdb.dbo.Split8KL1') IS NOT NULL
DROP FUNCTION dbo.Split8KL1
GO
CREATE FUNCTION dbo.Split8KL1
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))
WITH SCHEMABINDING AS
BEGIN
--===== Declare a variable to remember the position of the current comma
DECLARE @N INT;
--===== Add start and end commas to the Parameter so we can handle
-- single elements
SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,
--===== Preassign the current comma as the first character
@N = 1;
--===== Loop through and find each comma, then insert the string value
-- found between the current comma and the next comma. @N is
-- the position of the current comma.
WHILE @N < LEN(@Parameter) --Don't include the last comma
BEGIN
--==== Do the insert using the value between the commas
INSERT INTO @Result (ItemValue)
SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);
--==== Find the next comma
SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);
END; --END While
RETURN;
END; --END Function
GO
--=====================================================================================================================
-- Build some test data from the table that is actually using in the tests
--=====================================================================================================================
--===== Build a thousand item CSV row without it being dependent on other test tables
DECLARE @CSVList VARCHAR(8000);
SELECT @CSVList = STUFF(
(
SELECT TOP 1000 ','+ CAST(SalesOrderDetailID AS VARCHAR(10))
FROM AdventureWorks.Sales.SalesOrderDetail
ORDER BY NEWID()
FOR XML PATH('')
)
,1,1,'');
--===== Show what we have for a parameter now...
PRINT @CSVList;
--=====================================================================================================================
-- Run the tests using Brad's code
--=====================================================================================================================
--===== Commence testing ...
declare @RunTimeStart datetime
select @RunTimeStart = getdate()
SET STATISTICS TIME,IO ON;
declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);
select @RunTimeStart = getdate()
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KL1(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
select datediff(ms,@RunTimeStart,getdate()) as 'RunTimeWithStats'
SET STATISTICS TIME,IO OFF;
select @RunTimeStart = getdate()
select @ProductID=d.ProductID
,@ProductName=p.Name
,@SalesOrderID=h.SalesOrderID
,@OrderDate=h.OrderDate
,@TerritoryName=t.Name
from dbo.Split8KL1(@CSVList,',') a
join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID
join AdventureWorks.Production.Product p on d.ProductID=p.ProductID
join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID
join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID
order by d.ProductID,h.SalesOrderID ;
select datediff(ms,@RunTimeStart,getdate()) as 'RunTimeNoStats'
go
--=====================================================================================================================
-- Houskeeping
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE tempdb;
GO
--===== Drop all the object that we just created
DROP FUNCTION dbo.Split8KL1;
DROP TABLE dbo.Tally;
GO
ROLLBACK
September 23, 2010 at 3:57 am
Dave Ballantyne (9/23/2010)
Im lumping the multi-statement TVF in with scalar functions, my bad terminology. My 'mental' division is "Multi Statement Function" = Scalar = Bad , "Inline Function " = One optimize-able statement = good
Right, I see what you're getting at now.
Yes, a loop inside a multi-statement TVF will produce much the same effect as calling a scalar function many times. It's the same effect you get by running a trace that captures the SP:StmtCompleted event in a trace - the WHILE loop executes many times and each statement is traced separately.
I'm a bit surprised you chose the single-row input test to illustrate the point with - I would have thought the effect would be much more pronounced on the 10,000 row original script?
Also interesting: If you run the scripts while tracing SP:StmtCompleted for the session id in question, the runtime with stats is considerable less than running with stats off.
September 23, 2010 at 5:11 am
Jeff Moden (9/22/2010)
steve-893342 (9/22/2010)
Jeff Moden (9/22/2010)
The problem with the Tally Table related splits is that I forgot :blush: to add 1 to the length....WHERE N BETWEEN 1 AND LEN(@Parameter)+1
Shouldn't that be
WHERE N < DATALENGTH(@Parameter) + 2
I don't believe we need the +2 but we do need DATALENGTH if we need to use a space as a delimiter as Paul suggests.
You will need to use the +2 to ensure the last element is returned in all cases
VARCHAR version
WHERE N < DATALENGTH(@Parameter) + 2
NVARCHAR version
WHERE N < DATALENGTH(@Parameter)/2 + 2
September 23, 2010 at 8:43 pm
steve-893342 (9/23/2010)
Jeff Moden (9/22/2010)
steve-893342 (9/22/2010)
Jeff Moden (9/22/2010)
The problem with the Tally Table related splits is that I forgot :blush: to add 1 to the length....WHERE N BETWEEN 1 AND LEN(@Parameter)+1
Shouldn't that be
WHERE N < DATALENGTH(@Parameter) + 2
I don't believe we need the +2 but we do need DATALENGTH if we need to use a space as a delimiter as Paul suggests.
You will need to use the +2 to ensure the last element is returned in all cases
VARCHAR version
WHERE N < DATALENGTH(@Parameter) + 2
NVARCHAR version
WHERE N < DATALENGTH(@Parameter)/2 + 2
Ah... thanks for watching out for me but not true. Only DATALENGTH is doing its thing at the Byte level. Substring and "N" are working at the character level. See for yourself in the following 3 examples...
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,a';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,ab';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--Jeff Moden
Change is inevitable... Change for the better is not.
September 23, 2010 at 9:51 pm
Here are the results for the two test runs that I think you were looking for. This is from SQL Server 2008 R2 CU3, on my personal play server at home. (If I had it to do over again I would have built it better for SQL Server.)
Sorry for coming in so late, but since I didn't see a lot of R2 results I thought you might be interested. (I'm curious how my performance compares to the rest, but I haven't had a chance to look yet.)
Personally I have been fighting the trace flag 4199 issues. (Both with SQL Server 2005 SP3 CU9 and SQL Server 2008 R2 CU3.) Microsoft recommends it to fix things and turn on some 'optimizations' but I am finding some queries that go from just over 2 minutes to almost 5 hours when the flag is turned on.
September 24, 2010 at 12:38 am
UMG Developer (9/23/2010)
Personally I have been fighting the trace flag 4199 issues. (Both with SQL Server 2005 SP3 CU9 and SQL Server 2008 R2 CU3.) Microsoft recommends it to fix things and turn on some 'optimizations' but I am finding some queries that go from just over 2 minutes to almost 5 hours when the flag is turned on.
Fighting the trace flag issues?! :unsure:
TF 4199 enables all QO updates that might cause a different plan to be produced. There is a separate trace flag 41xx for all the individual QO fixes.
Microsoft deliberately make plan-affecting CU optimizer updates an opt-in issue (by requiring you to set a trace flag). Generally, you should only apply flags for QO fixes you have experienced and want to avoid, after testing. Applying 4199 without a full regression test seems extremely unwise to me.
It seems much more sensible to only apply trace flags for specific fixes that solve a problem you have directly experienced. Otherwise, wait for the next Service Pack or SQL11 - both of which will have been more fully regression tested.
More info: http://support.microsoft.com/kb/974006/
Paul
edit: corrected link
September 24, 2010 at 12:53 am
Jeff Moden (9/23/2010)
steve-893342 (9/23/2010)
Jeff Moden (9/22/2010)
steve-893342 (9/22/2010)
Jeff Moden (9/22/2010)
The problem with the Tally Table related splits is that I forgot :blush: to add 1 to the length....WHERE N BETWEEN 1 AND LEN(@Parameter)+1
Shouldn't that be
WHERE N < DATALENGTH(@Parameter) + 2
I don't believe we need the +2 but we do need DATALENGTH if we need to use a space as a delimiter as Paul suggests.
You will need to use the +2 to ensure the last element is returned in all cases
VARCHAR version
WHERE N < DATALENGTH(@Parameter) + 2
NVARCHAR version
WHERE N < DATALENGTH(@Parameter)/2 + 2
Ah... thanks for watching out for me but not true. Only DATALENGTH is doing its thing at the Byte level. Substring and "N" are working at the character level. See for yourself in the following 3 examples...
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,a';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,ab';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
Don't know what you mean. Unless BETWEEN is more efficient here, you just need the LESS THAN
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
-- WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
WHERE N < DATALENGTH(@Parameter)/2 + 2
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,a';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
-- WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
WHERE N < DATALENGTH(@Parameter)/2 + 2
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
--===== Simulate a passed parameter
DECLARE @Parameter NVARCHAR(4000);
SET @Parameter = N'Element01,Element02,Element03,Element04,Element05,ab';
--===== Optimized ===================================================================
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(N',',@Parameter+N',',N)-N) AS ItemValue
FROM dbo.Tally
-- WHERE N BETWEEN 1 AND DATALENGTH(@Parameter)/2+1
WHERE N < DATALENGTH(@Parameter)/2 + 2
AND SUBSTRING(N','+@Parameter,N,1) = N','; --Notice how we find the comma
GO
September 24, 2010 at 10:31 am
Gentlemen, a thank you for doing this entire thread publically. I'm relatively sure my brain is attempting an escape as I try to keep up with your experience and knowledge, but I'm learning a lot of little intricate things that rarely come up otherwise, just following the conversation.
I'm sorry I don't have more to contribute other than another box for a test shell, but I just wanted to thank you.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
September 24, 2010 at 12:48 pm
Paul White NZ (9/24/2010)
TF 4199 enables all QO updates that might cause a different plan to be produced. There is a separate trace flag 41xx for all the individual QO fixes.Microsoft deliberately make plan-affecting CU optimizer updates an opt-in issue (by requiring you to set a trace flag). Generally, you should only apply flags for QO fixes you have experienced and want to avoid, after testing. Applying 4199 without a full regression test seems extremely unwise to me.
It seems much more sensible to only apply trace flags for specific fixes that solve a problem you have directly experienced. Otherwise, wait for the next Service Pack or SQL11 - both of which will have been more fully regression tested.
The whole point of 4199 is to get a different plan for queries to fix bugs and performance regressions. I do have issues that I am trying to resolve where Microsoft says I have to turn 4199 on, and I am testing it before putting it into production.
I can't find any documentation about individual trace flags for current fixes:
In this link for a particular fix Microsoft says you have to turn 4199 on: http://blogs.msdn.com/b/psssql/archive/2010/09/01/slow-query-performance-because-inaccurate-cardinality-estimate-when-using-anti-semi-join-following-upgrade-from-sql-server-2000.aspx
In this blog posting Microsoft is talking about changes they made to the BPA that tells people to turn 4199 on to resolve issues: http://blogs.msdn.com/b/psssql/archive/2010/08/24/why-use-sql-server-2008-r2-bpa-case-1-missing-updates.aspx
You have to use 4199 to fix the MIN/MAX performance problem: http://support.microsoft.com/kb/973255/EN-US
And for CTE problems: http://support.microsoft.com/kb/976935
And for partitioning problems: http://support.microsoft.com/kb/982441/EN-US
And for free-text issues: http://support.microsoft.com/kb/2264562/en-US
And for CHANGETABLE issues: http://support.microsoft.com/kb/2276330
And finally the one that spells out the issue: http://support.microsoft.com/kb/974006/ "This one trace flag can be used to enable all the fixes that were previously made for the query processor under many trace flags. In addition, all future query processor fixes will be controlled by using this trace flag."
So as far as I can tell 4199 is the only trace flag to use going forward. (Assuming you need any fix released after trace flag 4133. If you have other information that lists the individual trace flags for the newer fixes that would be great as it would help me narrow down the problem for my report to Microsoft, but so far Microsoft has told me to send them a bunch of information, mainly execution plans with/with-out 4199, so that they can research what fix is causing the issue.
September 24, 2010 at 10:33 pm
UMG Developer (9/24/2010)
So as far as I can tell 4199 is the only trace flag to use going forward. (Assuming you need any fix released after trace flag 4133)
Hmm, for some reason the link in my previous post ended up as a link to the whole support.microsoft.com site - it was intended to be a link to KB 974006, which lists the individual trace flags. I have corrected the link now. I hadn't noticed that the practice had been discontinued in favour of 4199 after 4133. Sorry about that. Not having separate trace flags seems a backward step to me.
Viewing 15 posts - 181 through 195 (of 214 total)
You must be logged in to reply to this topic. Login to reply