Select *

  • We have a third party vendor that uses a ton of "SELECT *" statements.

    To make it worse, they are using it without a WHERE clause also.

    Out of 600,000 queries in an hour over 90,000 of them use the SELECT * without a WHERE.

    This isn't in stored procedures either, it's ad-hoc.

    Here's their response:

    "Adding where clauses to all queries is appropriate unless, of course, all of the rows are truly required every time. There will be many cases where the optimizer will still choose to do a full table scan if the query is against a table with a small number of rows. But it is always best to go after only the columns that are needed with a where cause and let the database engine determine the best execution plan"

    Let's say they did need every row and every column of that table every time, would there still be a benefit of listing the columns out. I need some feedback that will help me understand this and also show them the best way.

    Am I just being a stickler for the no "SELECT *" and shouldn't be?

  • In the worst possible case,lets assume that the 90,000 Select *'s are required. In that case will not be feasible to run them as Stored Procs rather than inline sql's..just a thought. As a DBA you are entitled to ask as to why so many of the ad-hocs have Select * in them

    It doesn't make any difference by specifying the column names individually unless the number of columns specified is lesser then * itself..........

    I had this issue at my place and they were habituated to it..even some of the SP's had * in them and I made them change so many codes because they realised they didnt need all the columns from the tables..which makes me tell, a DBA should make some noise at least when we see somethings like these...

    I would appreciate any other views too..

    Thanks!!

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • While it is possible that it needs every row and column every time in those, it's sloppy coding nonetheless.

    Are you actually in a position to do anything about it if it turns out that it doesn't actually need that? If so, then do some load testing. Write a query that just pulls the exact columns and rows needed, and run it a few thousand times to gather speed data, then do the same with an open-ended one (* and no Where). Make sure to check how it affects locking behavior and all that.

    Most likely, if it's pulling a small table into the application, it won't matter which one gets used in terms of performance, locking, etc.

    Also, in most of these cases, where it's 3rd party software, you won't be able to do anything about it anyway, in which case you work out what you need to do, if anything, to minimize the impact on other applications, and go from there.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I want to preface my response by saying I don't know the specifics of your application or tables, but these are general "best practices" in my experience:

    1. Use stored procedures. This will allow SQL Server to cache your query plan, resulting in faster executions.

    2. "Select *" and "Select [insertFullColumnListHere]" are the same query; they will resolve with the same execution plan. But what happens when you begin to add additional columns to a table, perhaps for reporting purposes? Now their queries will return these columns too, which are not needed. Depending on how the application is written, it could even break it (I've seen it happen). Frankly, writing "Select *" is just lazy and does not offer any additional performance improvement.

    3. Are the "return all row" queries for look-up tables? If so, 90k calls an hour seems high. You'd be better off having them get the data once and cache it on the app end, perhaps renewing the data hourly or daily.

    HTH!

  • Using SELECT * is just lazy. Do all the queries that are being run need all the columns from all the rows everytime? that is the question I'd ask. I find it hard to believe that they actually do.

  • My understanding is that there was a different cost(with the engine) to the following options:

    TableA has 5 columns and 200 rows.

    I need all columns and all rows every time.

    Option 1:

    SELECT * FROM TableA

    Option 2:

    SELECT col1,col2,col3,col4,col5 FROM TableA

    If I'm wrong, please let me know.

    I don't understand why in a one hour period a Profiler trace gets 660757 rows. 67,900 of them do a SELECT * FROM TableName and that table name has 246 rows (only 3 columns though).

    To me, that's added bandwidth.

  • The cost is the same. Now, if you return fewer columns, the cost would probably be different.

    If that query is for the same table, is that table fairly static? If so, I'd like to re-iterate that the data should be cached in the application, which will improve database performance.

  • I don't believe there's going to be an actual difference in execution time for the two queries. Theoretically there might be, because of the one with the * having to look up more metadata, but really, once it's got an execution plan in the cache, that's not going to matter. So, it might cost a few extra microseconds the first time it runs, but after that, it's not going to matter.

    Does the data in that table change often? Or not at all? Or somewhere in between? If it doesn't change, or only changes rarely, caching the data on the client (or app server/web server, whichever is appropriate) might be a better solution than querying it every few seconds.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • yea, I just ran a test to see and it is the same.

    I need to ask the static question.

    As others have stated, this is a third-party application and I have little power in making change. They are very resilient to thinking things can be issues.

    They are using more ad-hoc than stored procs so query plans will suffer.

    They have never had the load that we are giving them and the application (mainly the database) is having trouble keeping up. There are tons of views(that have multiple joins) and benefit when querying the table directly but that doesn't seem to be something that can change.

    I'm pretty frustrated, but this discussion is at least helping me flush it out.

  • There is a small cost to the * for queries in that the engine will have to do a search to find the column list that it's returning. It's slight, but it's there. However, if you're moving umpty-ump rows and all the columns out of the database and across the wire on most of the queries, who's going to notice?

    That's horrifyingly bad design by the way. They took the easiest way out in terms of coding the db side of things, but you'll be paying for it in the long run. How well does it scale? Locking, blocking, deadlocks? Network I/O? It doesn't sound good any way you put it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Ok Jason: Do they never complain of blocking issues?

    Thanks!!

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • In a practical sense, there's not much you can do once you've already purchased the app. 99% of vendors are motivated by new sales first and foremost. Your request for optimization will likely get added to the bottom of the list, right underneath "candy-coloured UI themes", unless you can light a fire under them.

    If you're a VERY large customer, and you can get your internal people riled up enough to threaten to move to another software, you might get some action. Ultimately, as sad as it sounds, for most vendors you need to attach a dollar figure to a problem in order to get action. Especially something as "esoteric ;-)" as coding standards.

    Aaron Alton
    Blogging at The HOBT

  • Grant Fritchey (4/7/2009)


    How well does it scale? Locking, blocking, deadlocks? Network I/O? It doesn't sound good any way you put it.

    Not very well at all. The previous implementation of their app was a much smaller user base.

    Tons of locks. We've got the blocking and deadlocks as well. I've sent them profiler traces, SQL Error logs with the traces on to show them what is happening.

    Most fixes I've seen them do is add query hints (NOLOCK). Alot of their updates have TABLOCK and HOLDLOCK on them and that's not good.

    Plenty of stuff to mention but I'm just trying to propose things I actually think they will take the time to fix.

  • Jason Crider (4/7/2009)


    They have never had the load that we are giving them and the application (mainly the database) is having trouble keeping up. There are tons of views(that have multiple joins) and benefit when querying the table directly but that doesn't seem to be something that can change.

    This may actually work well in your favor. See if you can schedule a meeting with their VP of Development, Marketing, Sales, etc... whoever you can get that's high up. Explain to them the type of volume you're working with, the troubles you're having, and that you're not satisfied with their current solution. Also explain to them that you're an experienced DBA, and you'd like to work with them to improve their application. This will only benefit them in the long run.

    Also, as many large software companies rely on company referrals to secure new customers, tell them that once they've implemented some of the changes you've requested, you'd be happy to serve as a referral for new customers, or perhaps let them do a case study on your environment.

    Just a thought on how to perhaps approach this from a different perspective...

  • It sounds like you're going to be in pain for some time to come. Document everything and work with them as closely as possible. That's about all you can do.

    We had a similar situation here. It was an Access app that got popular. Unfortunately it was developed by an outside organization that had a lot more ego than skill. They ported the database straight to SQL Server and then started pointing fingers at us for the problems. It took years of very careful presentations interspersed with screaming matches to show them the error of their ways. They still think our team is incompetent, but they've been making the changes we suggest with less argument these days.

    Good luck.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply