August 10, 2009 at 7:21 am
I saw Gail's answers and I think she hit the ones I was going to hit, but I'm feeling the need to pile on a bit.
#8: KEEPFIXED PLAN can prevent some recompiles, but not all, but, remember, recompiles aren't necessarily bad. sp_executesql is not a mechanism for preventing recompiles. It's a mechanism for promoting plan reuse, a completely different topic. Your average EXEC 'somestring' doesn't suffer from recompiles. It suffers from multiple compiles, and no reuse. That's just two different options. But even the use of the hint needs to be very carefully weighed because, again, recompiles are not necessarily bad.
#11 WHOA! Bad choice. A function on a column like that will prevent an index on that column from ever getting used. Better to reverse it and set the variable to max or min or whatever, or just let it run (if the case insensitive coalation is in use, it's unecessary).
#13 I agree with the idea of limiting the use of DISTINCT. However, ORDER BY can be a very inexpensive operation that can, in some ways benefit performance. Do a comparison between getting the MAX of a value and getting the TOP(1) of a value where an index is available for the TOP(1) operation.
#16 Huh?
#18 Wrong. Temp tables and table variables make use of tempdb and memory in the same way. The difference is statistics, which is why the table variable doesn't cause recompiles. But that same difference makes table variables a very weak choice when dealing with large amounts of data.
#19 Index scans are not necessarily faster than table scans. It honestly depends. This doesn't change the tip, use proper indexes, which is correct, but as an explanation, it's weak.
#20 Can we rephrase this to use traces? Profiler itself causes problems when connected to a production system directly.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 10, 2009 at 7:22 am
GilaMonster (8/10/2009)
ta.bu.shi.da.yu (8/10/2009)
Strange as it may seem, if you run the Oracle Automatic Workload Repository ADDM report then you'll often get Oracle warning not to use the operator. It seems that they say this because it does a full scan against the tables, whereas if you do something like number > 0 and number 0 OR SomeColumn < 0) and seek on an appropriate index. It's still two partial scans of the table but, if there are lots of cases where SomeColumn=0, it's likely better than a full scan of the index or table (likely, because this can be a bit of an edge case as to whether seeks which do partial scans are better than full scans)
I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used? I ask because, tucked in the recesses of my mind, I remember some time ago seeing an obscure SQL Server 2000 performance bug that caused it to understimate the cardinality of certain expressions that used an inequality or range operator.
August 10, 2009 at 7:28 am
ta.bu.shi.da.yu (8/10/2009)
I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used?
Possibly. Selectivity's difficult with inequalities, especially when there's parameters or variables in the mix.
Probably doesn't answer your question, but ... http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 10, 2009 at 7:34 am
Yep, Gail wins the timezone war! She is spot on too, as usual.
I am also surprised this article got the green light.
I hate it when I'm late to the game.
The only little bits I would add are:
Order of operation can help speed a little bit, since you stop checking once of the conditions is met, kinda like the EXISTS issue, but you would have to be using A LOT of them to see any real difference, you may run into the optimizer picking a simple plan before this issue even comes up I'd have to do some testing to prove that out though.
Table variables return 1 row on estimated plans because they don't have any stats on them. A # table gets stats generated just like any other table.
DDL to the top of the proc, not dispersed through the procedure helps, the partial recompile work better, so even if you can't avoid a recompile you may not have to recompile the more complicated parts of your proc.
August 10, 2009 at 7:36 am
Hi,
Some good suggestions raised here. Another common performance tip is the use of UNION ALL in preference to UNION statement.
Provided you can guarantee that all the rows will be unique from each select rowset of the union query then you do not need expense of SQL to perform an unnecessary elimination of duplicated rows performed by UNION statement.
Cheers
Terry
August 10, 2009 at 7:40 am
ta.bu.shi.da.yu (8/10/2009)
I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used? I ask because, tucked in the recesses of my mind, I remember some time ago seeing an obscure SQL Server 2000 performance bug that caused it to understimate the cardinality of certain expressions that used an inequality or range operator.
Yes.
CREATE TABLE #Employee (emp_id INT NOT NULL, padding VARCHAR(MAX) NULL);
CREATE INDEX nc1 ON #Employee (emp_id);
INSERT #Employee SELECT TOP (2500) ROW_NUMBER() OVER (ORDER BY C1.[object_id]), REPLICATE('X', 1000) FROM master.sys.all_columns C1, master.sys.all_columns C2
SELECT * FROM #Employee E WHERE emp_id 0 -- Table scan
-- |--Table Scan(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), WHERE: ([tempdb].[dbo].[#Employee].[emp_id] as [E].[emp_id](0)))
TRUNCATE TABLE #Employee
INSERT #Employee SELECT TOP (2500) 0, REPLICATE('X', 1000) FROM master.sys.all_columns C1, master.sys.all_columns C2
SELECT * FROM #Employee E WHERE emp_id 0 -- Index seek
--|--Nested Loops(Inner Join, OUTER REFERENCES: ([Bmk1000]) OPTIMIZED)
-- |--Compute Scalar(DEFINE: ([Expr1003]=BmkToPage([Bmk1000])))
-- | |--Index Seek(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), SEEK: ([E].[emp_id] (0)) ORDERED FORWARD)
-- |--RID Lookup(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), SEEK: ([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)DROP TABLE #Employee
DROP TABLE #Employee;
edit: fixed the colon plus bracket smileys 🙂
August 10, 2009 at 7:41 am
rja.carnegie (8/10/2009)
Is there a reason besides less typing, distinction between DML and variable operations, another issue that only existed in SQL Server 6.5, or maybe being less compatible with a competitor's SQL product?
No idea. I don't know the basis or reason for that comment in BoL
If the BoL entry doesn't say something like "This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature." then the feature is not officially deprecated.
Likewise you can SELECT @variable = expression [, ...] FROM query, or SET @variable = (SELECT expression FROM query). Now which of those is it that raises an error if there's more than one row... I think it's the second.
The second. In the first, if there are multiple rows you get the value of one of them. Which row you get is not defined.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 10, 2009 at 7:42 am
Yeah, very weak article. Nothing in here that good developers don't already know, plus a number of errors and just plain bad advice.
August 10, 2009 at 7:48 am
Wesley Brown (8/10/2009)
Order of operation can help speed a little bit, since you stop checking once of the conditions is met, kinda like the EXISTS issue, but you would have to be using A LOT of them to see any real difference, you may run into the optimizer picking a simple plan before this issue even comes up I'd have to do some testing to prove that out though.
You sure about that?
http://sqlinthewild.co.za/index.php/2007/07/29/order-of-execution/ (old post, needs updating, mostly still valid)
While there are certain circumstance under which SQL can do position-based short circuiting of conditions in the where clause, under most circumstances the order that conditions are evaluated depends on the available indexes and the plan that the optimiser comes up with.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 10, 2009 at 7:48 am
ta.bu.shi.da.yu (8/10/2009)
vetri (8/10/2009)
Hi,I tried with these two statements but the second one is not get executing and it says "Incorrect syntax near the keyword 'exists'." What wrong with this or is this wrong statement?
SELECT * FROM employee WHERE emp_no NOT IN (SELECT emp_no from emp_detail)
SELECT * FROM employee WHERE NOT EXISTS (SELECT emp_no FROM emp_detail)
Change the second one to:
SELECT * FROM employee WHERE NOT EXISTS (SELECT * FROM emp_detail)
Not the same thing. to get the equivalance of the first statement in a NOT EXISTS you need to reference the outer table.
SELECT * FROM employee a WHERE NOT EXISTS (SELECT * FROM emp_detail b WHERE a.emp_no = b.emp_no)
August 10, 2009 at 7:51 am
Adrian Green (8/10/2009)
Overall I've rated this article poorly. Sorry, a "Top Twenty Tips" sounds good (a nice bit alliteration), but unfortunately for a topic as complex as this you can only mislead.
It doesn't say "Top Twenty Tips". It doesn't say "Twenty Good Tips", but the word "Good" is in there, so a certain expectation remains 😉
August 10, 2009 at 7:55 am
ronmoses (8/10/2009)
10. WHERE clauses - In a WHERE clause, the various operators used directly affect how fast a query can run. Here are the conditional operators used in the WHERE clause, ordered by their performance.
=, >, =, <=, , !=, !>, !' be much further up the performance list to '!<' as they are both mean exactly the same thing.
August 10, 2009 at 7:58 am
jacroberts (8/10/2009)
Why should '>' be much further up the performance list to '! is twice as efficient as !< since it uses just one character :laugh:
August 10, 2009 at 7:59 am
This list hits on the right items but should be refined based on Gail Shaw's comments and a few other. I thought about passing this to our developers but because of a few descrepencies, will not.
August 10, 2009 at 8:03 am
Here's a question about one of the items. You mentioned that writing "master.dbo.proc_name" is better than just "proc_name", well, what about writing "master..proc_name"? My company does not specify the schema name so it's always "dbo". Is doing the ".." just a shortcut/default to "dbo"?
Thanks
Viewing 15 posts - 46 through 60 (of 244 total)
You must be logged in to reply to this topic. Login to reply