November 17, 2009 at 8:49 pm
Comments posted to this topic are about the item Speed Phreaks
Best wishes,
Phil Factor
November 18, 2009 at 6:24 am
Throughout the 30 plus years of my career I have listened to techies arguing the speed issue. I understand that for some businesses, this is crucial, but we should not lose sight that speed is not the main concern for every installed copy SQL Server in the world. Speed is important to us here, but data accuracy is far more the big concern. I don't care if a query takes an extra minute or two, just so long as I can depend on the data returned.
None the less, if you are going to have this competition and its purpose is educating the masses, I sure would like to see someone come out with a definitive list of "do's and don'ts" for effectively managing the speed issue. Seems like every time one SQL expert says "this" about speed and efficiency, another comes along and says "No, don't do that, do this..." about speed, and this "round and round" has gone on for years. Can someone step up and just list the 10 (or 20, 30?) "Commandments" for efficient querying?
(And Phil... Sorry, but you're a great writer and talented guy, and I cant let slip that you misspelled your own last name. You are Phil Factor, not Phil Fector... Unless of course we are talking alter-egos... Thanks for the post!)
November 18, 2009 at 7:40 am
Oops! I've corrected the spelling.
With the Speed Phreak challenge, the entries that give the wrong answer are eliminated from the final results. The test harness compares the result with a datum (correct answer) and flags up any inconsistencies.
The reason that we don't give hard and fast rules is because we still don't know them all. We are learning a lot as we go along. Peso and I were discussing the other way how we could write up our conclusions in a way that is accessible and interesting. It is at this point that things should get more interesting for the wider SQL Server community.
We often hear of die-cast performance rules. I find they are mostly right, but there is a danger in these rules being followed to literally. Rules that are good for a perticular version of SQL Server, or for a particular size of data, become poor guides in different circumstances. this is why I call it an art rather than a science.
Best wishes,
Phil Factor
November 18, 2009 at 7:51 am
blandry (11/18/2009)
Throughout the 30 plus years of my career I have listened to techies arguing the speed issue. I understand that for some businesses, this is crucial, but we should not lose sight that speed is not the main concern for every installed copy SQL Server in the world. Speed is important to us here, but data accuracy is far more the big concern. I don't care if a query takes an extra minute or two, just so long as I can depend on the data returned.None the less, if you are going to have this competition and its purpose is educating the masses, I sure would like to see someone come out with a definitive list of "do's and don'ts" for effectively managing the speed issue. Seems like every time one SQL expert says "this" about speed and efficiency, another comes along and says "No, don't do that, do this..." about speed, and this "round and round" has gone on for years. Can someone step up and just list the 10 (or 20, 30?) "Commandments" for efficient querying? ...[/i]
Nope.
Different situations have different answers. We don't live in a world of absolutes.
Here are a few that are generally the difference between slow and fast:
Avoid cursors
Don't use cursors
Use something other than cursors (Could just repeat variations on that for the next ten lines, but I think I've made my point)
Index appropriately, including the clustered index
Assign the primary key appropriately
Normalize for OLTP
Denormalize appropriately for OLAP
Don't mix OLTP and OLAP in the same tables
Use appropriate data types and avoid implicit conversions
Use SARGable arguments in your Where and Join clauses
Avoid UDFs
Keep Views simple and to the point, no excess joins nor columns
Know the differences between CTEs/Derived Tables, Temp Tables, and Table Variables and use them appropriately
Constraints can help the execution plan engine, so use them where appropriate
Understand parameter sniffing and the steps to handle it
Understand the benefits and drawbacks to both horizontal and vertical table partitioning
Avoid complex triggers
Those are the top ones that come to my mind as general rules for a performant database. Every one of them, even the cursors curse, has exceptions. This leads to the
ONE TRUE RULE FOR A PERFORMANT DATABASE:
Understand what you're doing when you design, build and code!
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 8:20 am
blandry is correct, obviously data accuracy is the most important thing. Period. If a reports runs in seconds but has the wrong values it's useless. Even after accuracy is achieved, speed is not the be all and end all. But there is a limit. If you have a nightly process that takes an hour to run and you optimize it down to 15mins , well done but it doesnt really help anyone. If you have a routine that is fired by every user , every minute and that takes 10secs and you get it to run in 2, then that makes everyone more productive. Your system will be smoother and more predictable.
These challenges are for 'fun' and education. You can learn a lot in what not to do as well what to do by just studying the code. And of course no-one can say that any particular query is the fastest possbile.
Rules are limiting as solutions are sometimes art, rather than science, but here's my general quick list.
Mostly pretty obvious
Understand the data and understand the question and then understand what data is relevant to the answer.
Minimize the amount of data io (Indexing and not re-reading tables)
Minimize the amount of calculations
Avoid Looping / RBAR
November 18, 2009 at 8:20 am
Cor, GSquared, that was rather good!
Best wishes,
Phil Factor
November 18, 2009 at 8:26 am
Phil Factor (11/18/2009)
Cor, GSquared, that was rather good!
Thanks Phil. The workbenches you and Robyn used to post on simple-talk.com were a key factor in me learning SQL Server and T-SQL, so that compliment from you means a lot to me.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 8:49 am
Sorry, GSquared. I have an anti-ditto (of a sort) to your list.
RE: Indexing. Yes, index. But don't OVER index.
Pet Peeve = developers add an index to an existing table just so their query will run faster, not realizing the table already has a nearly 1:1 column to index relationship. And it never occurs to them that sometimes indexes can be Bad Things.
"Query running slow? Why re-write it when we can slap an index on the table?"
@sigh.
Rant over. Soapbox away.
November 18, 2009 at 2:18 pm
Brandie Tarvin (11/18/2009)
Sorry, GSquared. I have an anti-ditto (of a sort) to your list.RE: Indexing. Yes, index. But don't OVER index.
Pet Peeve = developers add an index to an existing table just so their query will run faster, not realizing the table already has a nearly 1:1 column to index relationship. And it never occurs to them that sometimes indexes can be Bad Things.
"Query running slow? Why re-write it when we can slap an index on the table?"
@sigh.
Rant over. Soapbox away.
Which is why I summarized to "index appropriately".
If I remember correctly, I once ran into a 10-column table with 50 indexes on it.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 5:19 pm
I had at one time asked someone to fix a report that ran in 2 hours. They refused, until later that year when it was taking longer than 26 hours and wouldn't complete in time to refresh the reporting database. But at least they finally listened!
I once fixed a performance problem where the duration of "the problem" was completing in 200ms.
As it turns out, the 26-hour report used a UDF and that UDF was the problem. Not because UDFs are bad, the problem was they were using an inline UDF as a scalar UDF (very bad!). It got fixed and it ran in under an hour.
My speed suggestions:
1. Use SQL Profiler
Build a performance template... include Textdata, reads, cpu, writes, duration, dbid, spid, loginame for the events SQL: Completed, SP:Stmt Completed, RPC:Completed and filter by Reads > 50000. Store the trace information in a table on a dev box. Hint: When you build a query to get the trace info from the table use this for the textdata column as it converts tabs, carriage returns, and whatever 13 (linefeed?) to a space:
replace (replace (replace(convert (varchar (4000), textdata), char(13), ' '), char(10), ' '), char(9), ' ') as Textdata
Fix the performance problems in your trace. Sensory overload? Sort by Reads DESC. Find repeated queries, target those first. They probably show up together. 😉
When those are fixed, reduce the amount of reads. Repeat.
2. Every table gets a clustered index.
The reason SQL Server defaults the clustered index on the PK is because you primarily will join to a table on it's PK, or at least look up info by it. When you use the clustered index, you get lookups (any of the columns) for free. Hence, putting the clustered index on your PK will likely be your best option.
3. Avoid data conversions on a COLUMN
WHERE convert(varchar(12), ModifiedDate) = @TheDateIWant <-- VERY bad = table scan
WHERE ModifiedDate = convert(datetime, @TheDateIWant) <-- Fast = uses index on ModifiedDate
I once supported a Workflow system that put UPPER() around everything. It was a case-insensitive database. Long story, you get the point.
November 19, 2009 at 8:45 am
I think the problem with many of the "considerations" or "rules" that GQSquared listed is that they don't mean anything if you're ignorant. What is "appropriate" if you're new to indexing?
These competitions are good in that they get you to examine the ways that you can apply speed improvements in specific situations. If we could get people to read 100 of these, they might have better ideas of what is appropriate, why not to use a cursor, etc.
November 19, 2009 at 8:47 am
Steve Jones - Editor (11/19/2009)
I think the problem with many of the "considerations" or "rules" that GQSquared listed is that they don't mean anything if you're ignorant. What is "appropriate" if you're new to indexing?These competitions are good in that they get you to examine the ways that you can apply speed improvements in specific situations. If we could get people to read 100 of these, they might have better ideas of what is appropriate, why not to use a cursor, etc.
Which is exactly why I listed my one-true-rule on the subject.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 19, 2009 at 9:24 am
Unfortunately, a production database administrator isn't necessarily for the design of the database or for writing the SQL. Identifying performance issues is much more important and over half the battle, especially if you aren't responsible for fixing them.
November 19, 2009 at 9:59 am
Unfortunately, a production database administrator isn't necessarily for the design of the database or for writing the SQL.
Sure, this topic is more for the database developers in the community. Many DBAs will also take a lead in advising application developers on how to improve their code.
Best wishes,
Phil Factor
November 19, 2009 at 10:43 am
This topic is also for those of us serving in the "Jack-of-All-DBA-Jobs" role. @=)
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply