February 10, 2010 at 4:05 am
peter-757102 (2/10/2010)
CraigIW (2/10/2010)
nicholasw (2/10/2010)
Instead of: IF EXISTS(SELECT 1 FROM ......WHERE....)
Would the following be quicker still?
IF EXISTS(SELECT TOP 1 1 FROM ......WHERE....)
No, as the "top 1" is only done after the "select 1 from".
Funny enough, it is not always that simple. I've seen cases where a:
select top 1 max(id)
outperformed
select max(id)
many times over, and I suspect TOP sometimes acts as an undocumented hint. I still have some testing to do on it, and if it pans out it might also be a trick for controlling table variables better by using TOP to provide an estimate other than the constant 1 and thus get better query plans.
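A rough way to test it would be something like this (dbo.SomeBigTable and id are just placeholders, not a real table from the article):
-- Hypothetical test harness: compare the plain aggregate with the TOP 1 form
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT MAX(id) FROM dbo.SomeBigTable;         -- plain aggregate
SELECT TOP 1 MAX(id) FROM dbo.SomeBigTable;   -- same aggregate wrapped in TOP 1

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;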
Crazy! Good to know that though.
February 10, 2010 at 4:26 am
Well done-- very useful!
February 10, 2010 at 4:41 am
Catch all Lookup tables are the bane of my life.
I've lost count of how many times I've had to write queries which join to the same lookup table 10 or more times to retrieve some basic data items.
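For anyone who hasn't hit it, the pattern ends up looking something like this (table and column names are made up purely for illustration):
-- Hypothetical example: one join to the catch-all lookup per coded column
SELECT o.OrderID,
       st.Description AS OrderStatus,
       pt.Description AS PaymentType,
       sm.Description AS ShippingMethod
FROM dbo.Orders o
JOIN dbo.Lookup st ON st.LookupID = o.StatusID      AND st.LookupType = 'OrderStatus'
JOIN dbo.Lookup pt ON pt.LookupID = o.PaymentTypeID AND pt.LookupType = 'PaymentType'
JOIN dbo.Lookup sm ON sm.LookupID = o.ShipMethodID  AND sm.LookupType = 'ShippingMethod'
-- ...and so on for every coded column; real queries can need 10 or more of these joins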
As your article points out, it makes enforcing business logic very difficult; sometimes these data values should be mandatory, but there is no straightforward way of enforcing that.
It also hides the true data structure, so when someone new to the database is mining the information schema tables to find certain data items, they find themselves unable to locate key columns which have instead been hidden in a lookup!
Finally, the performance hit: all the extra processing that has to go on to flatten the data from the lookup tables. Storage is cheap; CPU cycles waste energy!
February 10, 2010 at 4:52 am
If you look through the source of the built-in system stored procs you will see quite a few cases of IF (SELECT COUNT(*)...) > 0
Try sp_helptext 'sp_helptext' for an example!
February 10, 2010 at 6:02 am
"Catch all Lookup tables are the bane of my life.
I've lost count of how many times I've had to write queries which join to the same lookup table 10 or more times to retrieve some basic data items."
I've come across the same; this becomes worse if proper documentation is not present.
But having lookup tables has its own advantages. There are scenarios when you are dealing with a huge DB and, if it's not normalized at least to the 2nd level, you face loads of performance-related problems, especially with data retrieval.
February 10, 2010 at 6:21 am
Ryan C. Price (2/10/2010)
Not that I want to encourage bad habits, but I seem to recall a discussion (I thought it was here @ SQL Server Central, but I can't find it) regarding COUNT(*) vs EXISTS, and tests were carried out (in SQL 2005 or 2008) where the compiler was smart enough to recognize that a programmer really intended to use EXISTS not COUNT(*), and so actually generated the same execution plan for both - the SQL Server engineering team compensating for the huge amount of abuse of COUNT(*) that goes on.
Depends.
IF (SELECT Count(*) FROM SomeTable) > 0
and
IF EXISTS (SELECT 1 FROM SomeTable)
are treated the same, because SQL can tell that all you want is to check for the presence/absence of a row.
DECLARE @i int
SET @i = (SELECT count(*) FROM SomeTable);
IF @i >0
AND
IF EXISTS (SELECT 1 FROM SomeTable)
are not treated the same way.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 10, 2010 at 6:35 am
Thank you! I wish I had read this 5 years ago. This article validates lessons I have learned the hard way. I am now in the process of overcoming problems I created with the original db design and this article hit on MOST of them!
So, what's a common best practice for dealing with older data that needs to get purged? I still need to get to it for comparative and historical purposes, but I need to purge certain large tables (easily >2M rows).
February 10, 2010 at 6:56 am
Loved the article. Wish I could be one of those nodding my head and saying, "yes, been there, done that, won't do it again." However, sadly I don't understand what the problem and the correct solutions are in these cases... I need more help.
I have worked with two major systems written by other developers. Usually more in a support/configuration/customization role. I think both of them have the OTLT that you describe.
For the OTLT, one system had a code table with the primary key on codeId and codeType. Then there was one additional column, longName. There was a view created for each of the codeTypes (I think this might have been done automatically), matching any columns in the tables whose names ended in Cd and matched a codeType. I think there were delete triggers on this table to search for any instances of a code before deleting it, but really the general rule was just to never delete a code. You would think very carefully before adding one, and these were used for things that weren't likely to change over time.
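To make that concrete, my understanding of the structure is roughly the sketch below (the table and column names are my guesses, not the actual ones from that system):
-- Hypothetical reconstruction of the code table described above
CREATE TABLE dbo.Code
(
    codeId   int          NOT NULL,
    codeType varchar(50)  NOT NULL,
    longName varchar(200) NOT NULL,
    CONSTRAINT PK_Code PRIMARY KEY (codeId, codeType)
);
GO
-- one view per codeType, so columns ending in Cd can join to a narrow list of codes
CREATE VIEW dbo.OrderStatusCd
AS
SELECT codeId, longName
FROM dbo.Code
WHERE codeType = 'OrderStatus';
GO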
Is this what you are describing as a time bomb? What is the big problem with this? What is the correct way to handle these lookups? Hundreds of small individual lookup tables?
Thank you very much for sharing your knowledge and experience!
February 10, 2010 at 7:00 am
Where's the 10 stars button for the vote??
February 10, 2010 at 7:21 am
February 10, 2010 at 7:49 am
Samuel Vella (2/10/2010)
Storage is cheap, CPU cycles wastes energy!
I take it you've never compared the power usage of even a small SAN to that of a midsized database server... Storage may be cheap to purchase, but it's expensive to run. I agree with your overall position, but generalizations like that are dangerous.
February 10, 2010 at 8:05 am
"TOP" versus "SELECT 1" performance varies depending on many factors . One of them is then "where" clause is in query and it is the most frequent usage of TOP and , as prev comment mentioned make sense with “Order By” clause.
If TOP 1 clause used in following query :
select top 1 * from table_with_10_mln_rows where Record_Date > '01/01/2008'
The CPU time is reported like this:
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
The other option, with select 1 from table_with_10_mln_rows where Record_Date > '01/01/2008', has the CPU time reported like this:
SQL Server Execution Times:
CPU time = 10 ms, elapsed time = 26 ms.
However, the CPU times swing the other way around when there is no "WHERE" clause.
In my case it is because there is clustered index on Record_Date column in my table that I tested.
My point is that when you are in doubt about which query is best for you, use the following options to check what is better in your case:
SET STATISTICS IO ON
SET STATISTICS TIME ON
The full output after execution is:
CPU time = 0 ms, elapsed time = 166 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'MY_TABLE'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0.
SQL Server Execution Times:
CPU time = 10 ms, elapsed time = 26 ms.
This line tells you about memory use; ideal execution is when physical reads is 0 and logical reads are at a minimum:
Table 'MY_TABLE'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0.
This line tells you the actual CPU execution time on the server; ideally it is 0, which happens when SQL Server chooses not to report the time because it is a very small fraction of a second:
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 26 ms.
Whichever query gives less CPU time and fewer physical reads wins the performance dispute.
February 10, 2010 at 8:19 am
If you run a query twice in a row, the second will (very likely) have 0 physical reads because the data is cached.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 10, 2010 at 8:28 am
In regard to "The dreaded BIT field problem", what are your thoughts on using an indexed filtered view instead of a table? I.e.:
CREATE VIEW dbo.DocumentList WITH SCHEMABINDING
AS
SELECT DocumentID
FROM dbo.ProcessingQueue
WHERE IsProcessed = 0 AND IgnoreForProcessing = 0
GO
-- an indexed view needs a unique clustered index, and schema binding needs two-part names
CREATE UNIQUE CLUSTERED INDEX [DocumentID] ON [dbo].[DocumentList] (DocumentID ASC)
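Or, for comparison, I suppose the same predicate could be pushed into a filtered index directly on the base table on SQL Server 2008 and later (a rough sketch reusing the column names above):
-- Alternative sketch: filtered index on the table instead of an indexed view
CREATE NONCLUSTERED INDEX IX_ProcessingQueue_Unprocessed
ON dbo.ProcessingQueue (DocumentID)
WHERE IsProcessed = 0 AND IgnoreForProcessing = 0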
February 10, 2010 at 8:40 am
You have got to clear the procedure cache every time if you do repeatable executions. Run DBCC FREEPROCCACHE when you want to clear all execution plans from the cache. There is more about this on SQLMAG.COM by Kalen Delaney, a well-known performance expert.
But, for a start, it is good for developers who are starting to learn query coding for performance. :-P However, if you just execute the queries in a separate query window, the first run will show you clean stats.
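For completeness, on a test server (never production) the data cache Gail mentioned can be cleared as well, along the lines of:
-- Test-server only: clear cached plans and cached data pages between runs
CHECKPOINT;              -- flush dirty pages so DROPCLEANBUFFERS can empty the buffer pool
DBCC FREEPROCCACHE;      -- clear all execution plans
DBCC DROPCLEANBUFFERS;   -- clear cached data pages, so physical reads show up again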