May 1, 2009 at 9:40 am
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
May 1, 2009 at 9:45 am
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.
Depending on how they are used, aren't In-Line TVF expanded before execution? This particular In-Line TVF doesn't even access any tables, and everything should be small enough that it could (should) easily be done in memory. Please note, I'm not an expert on the internals of SQL Server, in fact I'm still learning a lot.
May 1, 2009 at 9:48 am
Lynn Pettis (5/1/2009)
Ah, padawan, set-based solution to populate your table I created, hmm.Look for the good in the bad, you may see it.
Amusing!
Yes I did get that - but I was hoping to avoid drawing attention to the fact that the CPU and IO involved in setting up my 'custom tally' was separated out from the core of the routine I posted. Including the CPU, even through cunning use of a constant scan, would make it look slightly less impressive :hehe:
I went on to think how I might expand it to a set-based solution.
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 1, 2009 at 9:49 am
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.
Good point. Thanks!
Just tried with Profiler:
Description CPU Reads Duration
----------- --- ----- --------
Paul 0 3 0
Jeff 0 4 0
Lynn 0 0 0
... not kidding
Greets
Flo
May 1, 2009 at 9:51 am
Lynn Pettis (5/1/2009)
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.Depending on how they are used, aren't In-Line TVF expanded before execution? This particular In-Line TVF doesn't even access any tables, and everything should be small enough that it could (should) easily be done in memory.
Exactly correct, though I can't think of a time when an in-line TVF wasn't inserted directly into the query plan.
There may be examples, but I got (0 row(s) affected) from my internal query.
The biggest cost with UNION ALL constants is compilation time - scale the constant scan up and compilation time can run into the minutes.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 1, 2009 at 9:53 am
Dear oh dear.
I have *just* realized the point of Lynn's solution!
It avoids the I/O on the custom tally table completely.
Not my finest moment.
Luckily it won't be available on the internet for all eternity...oh wait
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 1, 2009 at 10:27 am
I've just been running some tests on the original problem and hit a curiosity of the Replace function. (I was unsuccessfully trying to get a winning UPDATE solution)
It is used a lot for getting the number of instances of a substring within a string. God knows, I use it myself for doing line counts. It doesn't work in some instances. I feel sure I'm doing something stupid, but what? (Surely, The result should always have no rows? I get five)
[font="Courier New"]SELECT [name] FROM
(
SELECT TOP(200) o.name, m.definition
FROM MASTER.sys.all_objects o
JOIN MASTER.sys.all_sql_modules m ON o.OBJECT_ID = m.OBJECT_ID
WHERE TYPE = 'P'
) f
WHERE --get the count using the Replace trick
(LEN(definition)- LEN(REPLACE(definition,CHAR(13)+CHAR(10),'')))/2
<> --use a tally table
(SELECT COUNT(*) FROM (SELECT SUBSTRING(definition,number,2)AS 'ch' FROM numbers WHERE number<=LEN(definition)) f
WHERE ch=CHAR(13)+CHAR(10))
[/font]
Best wishes,
Phil Factor
May 1, 2009 at 11:07 am
I'll take a look, Paul. Thanks.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 1, 2009 at 11:14 am
Florian Reischl (5/1/2009)
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.Good point. Thanks!
Just tried with Profiler:
Description CPU Reads Duration
----------- --- ----- --------
Paul 0 3 0
Jeff 0 4 0
Lynn 0 0 0
... not kidding
Greets
Flo
Ok... I've lost it... what are you measuring here? The code to gen 256 characters?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 1, 2009 at 11:17 am
Jeff Moden (5/1/2009)
Florian Reischl (5/1/2009)
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.Good point. Thanks!
Just tried with Profiler:
Description CPU Reads Duration
----------- --- ----- --------
Paul 0 3 0
Jeff 0 4 0
Lynn 0 0 0
... not kidding
Greets
Flo
Ok... I've lost it... what are you measuring here? The code to gen 256 characters?
Who knows, Jeff. To me it just looks like we went off on some kind of tangent. But then that seems to happen all the time here on SSC, but it generates a lot of interesting conversations and tidbits of knowledge (useful or not is still to be determined).
May 1, 2009 at 11:35 am
Jeff Moden (5/1/2009)
Ok... I've lost it... what are you measuring here? The code to gen 256 characters?
The 'A & B Cleaners' thing you posted.
It was a hyperbole-ic tangent.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 1, 2009 at 11:41 am
Jeff Moden (5/1/2009)
Florian Reischl (5/1/2009)
TheSQLGuru (5/1/2009)
Flo, are you getting your reads from SET STATISTICS IO ON? IIRC UDFs don't report IO correctly in this manner. Try getting them with Profiler.Good point. Thanks!
Just tried with Profiler:
Description CPU Reads Duration
----------- --- ----- --------
Paul 0 3 0
Jeff 0 4 0
Lynn 0 0 0
... not kidding
Greets
Flo
Ok... I've lost it... what are you measuring here? The code to gen 256 characters?
Oups, sorry. :Whistling:
Just tried the three versions (I removed mine) to clean-up a 256 char string.
[font="Courier New"]--== Helper table for Pauls version
----DROP TABLE dbo.UnwantedChars
IF (OBJECT_ID('dbo.UnwantedChars') IS NULL)
BEGIN
CREATE TABLE dbo.UnwantedChars (BadChar VARCHAR(1) NOT NULL);
-- Populate a table with the chars we want to remove
DECLARE @char_code INT,
@char VARCHAR(1);
SET @char_code = -1
WHILE @char_code <= 255
BEGIN
SELECT @char_code = @char_code + 1,
@char = CHAR(@char_code);
INSERT dbo.UnwantedChars(BadChar)
SELECT @char
WHERE PATINDEX('[^A-Z]', @char) = 1;
END;
END
GO
--==== Paul ======================
PRINT '-- Paul''s Version'
DECLARE @CompanyName VARCHAR(256)
SET @CompanyName = 'A & B Cleaners';
UPDATE dbo.UnwantedChars
SET @CompanyName = CONVERT(VARCHAR(256), REPLACE(@CompanyName, BadChar, ''))
WHERE CHARINDEX(BadChar, @CompanyName) != 0;
PRINT @CompanyName
GO
--==== Jeff ======================
PRINT '-- Jeff''s Version'
DECLARE @CompanyName VARCHAR(256)
SELECT @CompanyName = 'A & B Cleaners'
-- Do the clean-up
SELECT @CompanyName = STUFF(@CompanyName,PATINDEX('%[^A-Z]%',@CompanyName),1,'')
FROM dbo.Tally t
WHERE t.N <= LEN(@CompanyName)
AND SUBSTRING(@CompanyName,t.N,1) LIKE '[^A-Z]'
PRINT @CompanyName
GO
--==== Lynn ======================
PRINT '-- Lynn''s Version'
DECLARE @CompanyName VARCHAR(256)
SET @CompanyName = 'A & B Cleaners';
; WITH
Tally2 (N) AS ( SELECT 1 UNION ALL SELECT 2 ),
Tally4 (N) AS ( SELECT t1.N FROM Tally2 t1, Tally2 t2 ),
Tally16 (N) AS ( SELECT t1.N FROM Tally4 t1, Tally4 t2 ),
Tally256 (N) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM Tally16 t1, Tally16 t2 )
SELECT @CompanyName = STUFF(@CompanyName,PATINDEX('%[^A-Z]%',@CompanyName),1,'')
FROM Tally256 t
WHERE t.N <= LEN(@CompanyName)
AND SUBSTRING(@CompanyName,t.N,1) LIKE '[^A-Z]'
PRINT @CompanyName
[/font]
Greets
Flo
May 1, 2009 at 12:03 pm
Phil Factor (5/1/2009)
Jeff,...but their developers seem unable to fix things so they work like they used to.
Ouch.
They're doing all they can to put in a better solution ASAP. The problem was that it became impossible to improve the browser-side 'prettifier' without a lot of effort (Javascript regex solutions can only do so much) and they are planning on using a vastly better server-side solution. They're working on it urgently at this very moment.
You can always use my Prettifier in the meantime!
Heh... understood and sorry for the pork chop. But I sure wouldn't have released it without testing it for all the mistakes the code windows currently enjoy. Why not just put things back the way they were until you can get it hammered out because, right now, doing my own formating or using the prettifier sure does slow down the works.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 1, 2009 at 7:02 pm
Paul White (5/1/2009)
Dear oh dear.I have *just* realized the point of Lynn's solution!
It avoids the I/O on the custom tally table completely.
Not my finest moment.
Luckily it won't be available on the internet for all eternity...oh wait
Flo actually used a permutation of Itzik's method. It will quite handily build a billion row table without driving the log file nuts like most every other method will.
A word of caution though... test it on more than just one string... I've found that it's relatively slow (still fast as hell) compared to some of the other methods until you get up into some of the larger numbers.
[font="Courier New"]
DECLARE @BitBucket INT
DECLARE @TestSize INT
SET @TestSize = 1000000
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100) ---------------------------------------------------------
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
PRINT '===== Matt Miller''s Method ====='
SET STATISTICS IO ON
SET STATISTICS TIME ON
;with cte1 (N) as (select 1 union ALL select 2), --2
cte2 as (select c1.N from cte1 c1, cte1 c2, cte1 c3, cte1 c4), --16
cte3 as (select c1.N from cte2 c1, cte2 c2, cte2 c3, cte2 c4), --65536
cte4(n) as (select null from cte3 c1,cte2 c2,cte2 c3)
select top (@TestSize)
@Bitbucket = row_number() over (order by N)
from CTE4
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100) ---------------------------------------------------------
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
PRINT '===== Itzek''s Method ====='
SET STATISTICS IO ON
SET STATISTICS TIME ON
; WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS n FROM L5)
SELECT top (@TestSize) @BitBucket = N FROM Nums
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100) ---------------------------------------------------------
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
PRINT '===== Jeff Moden''s Method' --Fastest overall...
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @SomeConstant INT
SET @SomeConstant = 1
; WITH cTally AS
(-----------------------------------------------------------------------------
SELECT TOP (@TestSize)
ROW_NUMBER() OVER (ORDER BY @SomeConstant) AS N
FROM Master.sys.SysColumns t1
CROSS JOIN Master.sys.SysColumns t2
)-----------------------------------------------------------------------------
SELECT @BitBucket = N FROM cTally
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100) ---------------------------------------------------------
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
declare @root int
select @root = sqrt(@testsize)+1
PRINT '===== RBarryYoung''s Method'
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH cTally AS
(-----------------------------------------------------------------------------
SELECT TOP (@TestSize)
ROW_NUMBER() OVER (ORDER BY t1.ID) AS N
FROM (Select TOP (@root) ID from Master.sys.SysColumns) t1
CROSS JOIN (Select TOP (@root) ID from Master.sys.SysColumns) t2
)-----------------------------------------------------------------------------
SELECT @BitBucket = N FROM cTally
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',100) ---------------------------------------------------------
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
PRINT '===== Combined Method'
SET STATISTICS IO ON
SET STATISTICS TIME ON
--DECLARE @SomeConstant INT
-- SET @SomeConstant = 1
; WITH cTally AS
(-----------------------------------------------------------------------------
SELECT TOP (@TestSize)
ROW_NUMBER() OVER (ORDER BY @SomeConstant) AS N
FROM (Select TOP (@root) ID from Master.sys.SysColumns) t1
CROSS JOIN (Select TOP (@root) ID from Master.sys.SysColumns) t2
)-----------------------------------------------------------------------------
SELECT @BitBucket = N FROM cTally
SET STATISTICS TIME OFF
SET STATISTICS IO OFF[/font]
Here's the cleaned up results...
[font="Courier New"]====================================================================================================
===== Matt Miller's Method =====
SQL Server Execution Times:
CPU time = 907 ms, elapsed time = 935 ms.
====================================================================================================
===== Itzek's Method =====
SQL Server Execution Times:
CPU time = 906 ms, elapsed time = 1014 ms.
====================================================================================================
===== Jeff Moden's Method
SQL Server Execution Times:
CPU time = 734 ms, elapsed time = 941 ms.
====================================================================================================
===== RBarryYoung's Method
SQL Server Execution Times:
CPU time = 750 ms, elapsed time = 887 ms.
====================================================================================================
===== Combined Method
SQL Server Execution Times:
CPU time = 766 ms, elapsed time = 861 ms.
[/font]
--Jeff Moden
Change is inevitable... Change for the better is not.
May 1, 2009 at 7:58 pm
Phil Factor (5/1/2009)
I've just been running some tests on the original problem and hit a curiosity of the Replace function. (I was unsuccessfully trying to get a winning UPDATE solution)It is used a lot for getting the number of instances of a substring within a string. God knows, I use it myself for doing line counts. It doesn't work in some instances. I feel sure I'm doing something stupid, but what? (Surely, The result should always have no rows? I get five)
[font="Courier New"]SELECT [name] FROM
(
SELECT TOP(200) o.name, m.definition
FROM MASTER.sys.all_objects o
JOIN MASTER.sys.all_sql_modules m ON o.OBJECT_ID = m.OBJECT_ID
WHERE TYPE = 'P'
) f
WHERE --get the count using the Replace trick
(LEN(definition)- LEN(REPLACE(definition,CHAR(13)+CHAR(10),'')))/2
<> --use a tally table
(SELECT COUNT(*) FROM (SELECT SUBSTRING(definition,number,2)AS 'ch' FROM numbers WHERE number<=LEN(definition)) f
WHERE ch=CHAR(13)+CHAR(10))
[/font]
NVARCHAR environment apparently throws LEN off when search for those 2 characters... trying to figure out why...
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 361 through 375 (of 522 total)
You must be logged in to reply to this topic. Login to reply