April 6, 2006 at 6:34 pm
They do.
But LEN ignores trailing spaces.
_____________
Code for TallyGenerator
April 6, 2006 at 6:48 pm
Sorry but this was the post:
LEN(RTRIM(LTRIM(ERR))) > 0
is equal to
LEN(ERR) > 0
and that is not correct. Len ignores trailing spaces but not the begining ones.
select Len ( ' Space in the begining '), Len (rtrim(ltrim(' Space in the begining ')))
returns : 23, 21
datalength is a completly different function
* Noel
April 6, 2006 at 8:43 pm
noeld,
when LEN(Err) = 0 there are no beginning spaces.
All spaces are trailing.
Otherwise LEN gonna be > 0
In this task we don't care about not empty strings, string is not empty no matter is it 21 or 23 charachters long.
So,
LEN(RTRIM(LTRIM(ERR))) > 0
is equal to
LEN(ERR) > 0
and that is correct.
_____________
Code for TallyGenerator
April 6, 2006 at 8:56 pm
declare @ERR char(20)
set @ERR = ' '
select 1
where @ERR > ''
select 1
where len(@ERR) > 0
It does not return anything. No matter that @ERR has actually 20 spaces inside.
_____________
Code for TallyGenerator
April 7, 2006 at 3:39 am
Am I correct in saying that you're looking for records where err is not full full of spaces?
Put an index on the err column and try this
SELECT <fields> FROM <table> WHERE Err != SPACE(255)
Any form of function will prevent index use. the column can't be null and since it's a char it can't be an empty string, but will be always padded out to 255 char. The only way is can be empty, if I understand properly, is if all 255 characters are spaces, hence the comparison with SPACE(255).
Does this do what you need?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 7, 2006 at 6:13 am
First off, thanks for all the replies.
The row count is somewhere near 5,000,000-7,000,000. About 100,000 total will be updated daily. The LEN(RTRIM(LTRIM(ERR))) > 0 will be evaluated probably about 5,000 times a day, maybe more. Allowing nulls to speed up the evaluation, but slowing down the updates is not a fair trade.
All I really need to check for is that if the field has any text in it. If it has any text in it besides spaces, then I assume that it contains a error message for the row. So I don't care if the spaces are before or after the text, just if there is any text in the message.
So do we have anything that's faster than RTRIM(ERR)>''?
April 7, 2006 at 6:18 am
Whether the column allows NULLs or not, the following will produce returns only if the column is NOT NULL and the column has something other than all blanks regardless of the number of blanks... In other words, the following is equivelent to saying "Find everything that is not null and is not blank"...
SELECT * FROM yourtable WHERE yourcolumnname > ''
Because there is no calculation on "yourcolumnname", it is possible for an index to be used if one is available and the optimizer chooses to do so.
In other words, go back and look at Serqiy's post because he's right
--Jeff Moden
Change is inevitable... Change for the better is not.
April 7, 2006 at 6:25 am
So do we have anything that's faster than RTRIM(ERR)>''?
Did you read my last post&
WHERE Err>''
is all you need.
_____________
Code for TallyGenerator
April 7, 2006 at 6:58 am
Is '5 spaces'>'' ?
Can you store '5 spaces' in a varchar, or will it just reduce it to ''?
I am contenplating converting the column to varchar to save space, but I want to make sure it won't mess up this comparison.
April 7, 2006 at 8:41 am
if we are talking about 5 - 7 millions I still think that the calculated column (with an index) will be a lot faster for two reasons.
1.The value is an integer (could be a tinyint) therefore fast searches can be performed
2. index width is very narrow storage requirement for the index is a lot less than what char or varchar equivqlent will offer.
Just make sure that on the query plan that index is used.
Can you store '5 spaces' in a varchar, or will it just reduce it to ''?
it will treat them as '' in a comparison with the standard connection defaults.
* Noel
April 7, 2006 at 11:03 pm
They say that one test is worth a thousand calculations and a million guesses... Noeld is correct. But don't take my word for it... I expanded my normally well behaved 1-million row test into a nasty 10-million row test and here's the results (see bold areas)...
Creating a 10-million row test table
(10000000 row(s) affected)
Adding the calculated column Adding the primary key Adding an index to the calculated column Wounding the "Kicks" column with spaces
(1000000 row(s) affected)
Add the index on the "Kicks" column Heres the stats on how many different groups of spacing there are String NumberOfSpaces Length NumberPresent -------------------------------- -------------- ----------- ------------- || 0 0 99887 | | 1 0 100226 | | 2 0 99861 | | 3 0 99682 | | 4 0 99838 | | 5 0 99906 | | 6 0 100278 | | 7 0 100075 | | 8 0 100235 | | 9 0 100012
(10 row(s) affected)
============================================================================== DBCC execution completed. If DBCC printed error messages, contact your system administrator. DBCC execution completed. If DBCC printed error messages, contact your system administrator. ----------- 9000000
(1 row(s) affected)
9.826 Seconds for CalcCol > 0 ============================================================================== DBCC execution completed. If DBCC printed error messages, contact your system administrator. DBCC execution completed. If DBCC printed error messages, contact your system administrator. ----------- 9000000
(1 row(s) affected)
21.486 Seconds for LEN(Kicks) > 0 ============================================================================== DBCC execution completed. If DBCC printed error messages, contact your system administrator. DBCC execution completed. If DBCC printed error messages, contact your system administrator. ----------- 9000000
(1 row(s) affected)
20.500 Seconds for Kicks > '' ============================================================================== ============================================================================== DBCC execution completed. If DBCC printed error messages, contact your system administrator. DBCC execution completed. If DBCC printed error messages, contact your system administrator. ----------- 9000000
(1 row(s) affected)
22.486 Seconds for LEN(LTRIM(RTRIM(Kicks))) > 0
Here's the code to make the test table (note that the "Kicks" column is the column under test and it's setup as a VARCHAR column), but before you decide to run it on a production box, be advised....
DO NOT RUN THE FOLLOWING CODE ON A PRODUCTION SERVER!!!
THE FOLLOWING CODE USES A HIGHLY AGRESSIVE TRIPLE CROSS-JOIN TO CREATE THE ROWS IN THE TEST TABLE THAT WILL CAUSE 1 OR MORE CPU'S TO MAX OUT AND TAKES NEARLY 100% OF THE HARD-DISK BUSS BANDWIDTH FOR APPROXIMATELY 6.5 MINUTES AND IT DOES SO IN AN EXCLUSIVE MANNER... NO ONE ELSE CAN GET IN OR DO ANYTHING FOR THAT TIME PERIOD. FURTHER, THE TEST TABLE WILL CONSUME 1.7 GIGA-BYTES OF HARD-DISK SPACE AND THE INDEXES WILL CONSUME 0.7 GIGA-BYTES OF HARD-DISK SPACE.
WARNING! WARNING! WARNING! WARNING! WARNING!
DO NOT RUN THE FOLLOWING CODE ON A PRODUCTION SERVER!!!
Here's the code to create the test table, indexes, calculated column, etc... do be careful...
PRINT 'Creating a 10-million row test table' GO
--===== Create a populate a 10-million row test table -- DO NOT RUN THIS CODE ON A PRODUCTION SERVER!!!! SELECT TOP 10000000 IDENTITY(INT,1,1) AS RowNum, CAST(RAND(CAST(NEWID() AS VARBINARY))*50000+1 AS INT) AS UserID, --1 thru 50000 randomly CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS SomeValue, --'AA' thru 'ZZ' randomly CAST(' A column for kicks ' AS VARCHAR(30)) AS Kicks, 'Still another column just for proofing' AS StillAnother, CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS MONEY) AS SomeNumber, --0.0000 thru 99.9999 randomly CAST(RAND(CAST(NEWID() AS VARBINARY))*3653.0+36524.0 AS DATETIME) AS ADate --(>=01/01/2000 <01/01/2010) INTO dbo.BigTest FROM Master.dbo.SysColumns sc1, Master.dbo.SysColumns sc2, Master.dbo.SysColumns sc3
PRINT 'Adding the calculated column' GO --===== Add a calculated column ALTER TABLE dbo.BigTest ADD CalcCol AS LEN(Kicks)
PRINT 'Adding the primary key' GO --===== A table is not properly formed unless a Primary Key has been assigned ALTER TABLE dbo.BigTest ADD PRIMARY KEY NONCLUSTERED (RowNum)
PRINT 'Adding an index to the calculated column' GO --===== Add an index to the calculated column CREATE INDEX IX_BigTest_CalcCol ON dbo.BigTest (CalcCol)
PRINT 'Wounding the "Kicks" column with spaces' GO --===== Change every 10th row to have a random number of spaces from 0 to 9 UPDATE dbo.BigTest SET Kicks = SPACE(CAST(RAND(CAST(NEWID() AS VARBINARY))*10 AS INT)) WHERE RowNum%10 = 0
PRINT 'Add the index on the "Kicks" column' --===== Add an index to the "Kicks" column (just for "Kicks" and to demo indexing) CREATE INDEX IX_BigTest_Kicks ON dbo.BigTest (Kicks)
PRINT 'Heres the stats on how many different groups of spacing there are' GO --===== Prove the spaces are there including "0" spaces which is an empty string SELECT '|'+Kicks+'|' AS String, LEN('|'+Kicks+'|')-2 AS NumberOfSpaces, LEN(Kicks) AS Length, COUNT(*) AS NumberPresent FROM dbo.BIGTEST WHERE RowNum%10 = 0 GROUP BY '|'+Kicks+'|',LEN('|'+Kicks+'|')-2,LEN(Kicks) ORDER BY '|'+Kicks+'|' DESC GO
... and here's the code that does the actual testing...
/*************************************************************************************************/ --===== Create some performance measuring variables DECLARE @StartTime DATETIME --Start of server duration time for run DECLARE @EndTime DATETIME --Start of server duration time for run
--================================================================================================= --===== Clear the cache, do a measured run for CalcCol > 0 PRINT REPLICATE('=',78) DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE SET @StartTime = GETDATE() ------------------------------- SELECT COUNT(*) FROM dbo.BigTest WHERE CalcCol > 0 ------------------------------- SET @EndTime = GETDATE() PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for CalcCol > 0'
--================================================================================================= --===== Clear the cache, do a measured run for LEN(Kicks) > 0 PRINT REPLICATE('=',78) DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE SET @StartTime = GETDATE() ------------------------------- SELECT COUNT(*) FROM dbo.BigTest WHERE LEN(Kicks) > 0 ------------------------------- SET @EndTime = GETDATE() PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for LEN(Kicks) > 0'
--================================================================================================= --===== Clear the cache, do a measured run for Kicks > '' PRINT REPLICATE('=',78) DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE SET @StartTime = GETDATE() ------------------------------- SELECT COUNT(*) FROM dbo.BigTest WHERE Kicks > '' ------------------------------- SET @EndTime = GETDATE() PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for Kicks > ''''' PRINT REPLICATE('=',78)
--================================================================================================= --===== Clear the cache, do a measured run for LEN(LTRIM(RTRIM(Kicks))) > 0 PRINT REPLICATE('=',78) DBCC DROPCLEANBUFFERS DBCC FREEPROCCACHE SET @StartTime = GETDATE() ------------------------------- SELECT COUNT(*) FROM dbo.BigTest WHERE LEN(LTRIM(RTRIM(Kicks))) > 0 ------------------------------- SET @EndTime = GETDATE() PRINT STR(DATEDIFF(ms,@StartTime,@EndTime)/1000.0,10,3) + ' Seconds for LEN(LTRIM(RTRIM(Kicks))) > 0'
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 12 posts - 16 through 26 (of 26 total)
You must be logged in to reply to this topic. Login to reply