Select *

  • I have some novice developers with big egos but small ears... We have some tables that are multi-million records and some of our developers, despite being told not to, frequently do a select * on these tables to look at data, fields, and even get a count (rather than doing count(*)).

    They need to be allowed to do select queries, but is there any way to prevent them from executing select *?

    I tried using the query governor, but that causes problems for one of the apps that uses the DB.

  • There isn't a way that I know of.

    If what they're running are simple ad hoc queries, it shouldn't matter too much. If they're doing that in production code, then the handling is to use both source control and code review.

    The number of rows will matter more to query performance than the number of columns, anyway. I'd be more worried about poorly written or missing Where clauses than about explicit column lists.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Maybe you could write a Logon trigger that does a SET ROWCOUNT 1000? 😀

    Of course there are a whole lot of potential problems with that...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • They will honestly to a select * with no where on huge tables and just let it run, pulling all rows and columns.

    The logon trigger is actually an interesting idea. Hmmm.

  • Vincent Central (9/30/2008)


    I have some novice developers with big egos but small ears... We have some tables that are multi-million records and some of our developers, despite being told not to, frequently do a select * on these tables to look at data, fields, and even get a count (rather than doing count(*)).

    They need to be allowed to do select queries, but is there any way to prevent them from executing select *?

    I tried using the query governor, but that causes problems for one of the apps that uses the DB.

    Yes... code reviews and "professional" fines. Either that, or a bucket of porkchops and a slingshot.;)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Are you sure you don't work for the same place I do?

    The best solution I've found is to allow them to do whatever they want with the dev systems. When they complain the dev system is slow I tell them to review their code or buy more hardware. The boss quickly falls down on the "free" review of code, usually with my help which removes much of the select * from 8,000,000 row tables. The code speeds up, I get the credit.

    On production systems I'm slowly (and it's a long fight) restricting the developer's access to read only. That means when they want to deploy code it passes under my nose. No where clause gets it rejected immediately, even if the table only has 3 rows.

    For those who scream they must have full access (and there are some), I do periodic "performance checks" on the SQL server capturing all the queries with duration > a few seconds. My first seach is for "SELECT *". When I find any, it is immediately lobbed back over the fence to be amended pronto. The production systems speed up, I get the credit.

  • They only have write access on Dev. If they do it on their Dev boxes and bring them to a crawl, I don't care. It's Production and test environments I'm bothered by. They need to do selects there "to get current data".

    Running Profiler filtering on "SELECT *" over a period of time, then doing more coaching 😉 of the offending developer is another good tip. Thanks.

  • Good idea on select * in the filtering on the profiler...............awesome then when it fires off

    send out an email every 5 minutes to the users.....after 1000 emails do not do select * that might do the trick..

  • one other thing you could do in SSMS is set the rowcount to be, say, 10000. that way every query they run will be limited by that. They will need to know you have done it so that if they do write queries where they are expecting 10000+ rows they need to do a set rowcount 0 at the start of the query.

    Other than that the profiler idea sounds good to me. Very hard to limit this sort of activity though 🙁

  • A BIG 'Name and Shame' poster in the common area works well...two strikes and you're named. Actually, this works for lots of things. 😉

  • Go for the big egos. Nobody likes to be shown up for writing dross, as well as the profiler log, you could also highlight where pulling shedloads of rubbish thaht is not required is blocking tables to other processes and e-mail the entire team with the user and query using something like this. Name and shame and they'll get the message eventually

    CREATE PROCEDURE FindOpenTransactions AS

    DECLARE @emailAddress VARCHAR(128)

    SET @emailAddress = 'developerTeam@wherever....'

    --Create the required Tables IF they do not exist

    IF NOT EXISTS (SELECT * FROM sysobjects WHERE type = 'u' and name = 'Open_transactions')

    BEGIN

    CREATE TABLE Open_transactions

    (

    spid int NULL,

    login VARCHAR(32) NULL,

    db VARCHAR(128) NULL,

    hostname VARCHAR(64) NULL

    )

    END

    IF NOT EXISTS (SELECT * FROM sysobjects WHERE type = 'u' and name = 'Open_transactions_history')

    BEGIN

    CREATE TABLE dbo.Open_transactions_history

    (

    Found_Date datetime NULL,

    spid int NULL,

    login VARCHAR(32) NULL,

    db VARCHAR(128) NULL,

    hostname VARCHAR(64) NULL,

    program_name VARCHAR(128) NULL,

    sqlCommandVARCHAR(2000) NULL

    )

    END

    CREATE TABLE #InputBuffer/*Can't use table variable to take result of DBCCINPUTBUFFER (growl)*/

    (

    eventtype nVARCHAR(30) NULL,

    params int NULL,

    eventinfo nVARCHAR(255) NULL

    )

    /*Get list of processes with open transactions on the server*/

    DECLARE curOpenTransProcs CURSOR

    FOR SELECT

    spid,

    dbid,

    hostname,

    loginame,

    program_name,

    sql_handle

    FROM master..sysprocesses

    WHERE Open_tran > 0

    DECLARE @spid int, @hostname VARCHAR(64), @login VARCHAR(32), @cmd VARCHAR(4000), @database VARCHAR(128), @program_name VARCHAR(128), @dbid int, @spidlist VARCHAR(2000), @wehavedata int

    DECLARE @sql_handle-2 binary(20)

    DECLARE @sqlCommand VARCHAR(2000)

    /*Set initial values for flag / spid list variables*/

    SET @spidlist = ''

    SET @wehavedata = 0

    OPEN curOpenTransProcs

    FETCH NEXT

    FROM curOpenTransProcs

    INTO @spid, @dbid, @hostname, @login, @program_name, @sql_handle-2

    WHILE @@fetch_status = 0

    BEGIN

    SET @wehavedata = 1

    SELECT @database = name FROM master..sysdatabases WHERE dbid = @dbid

    /*which processes with open transactions have been active since the last check

    (rather than bombard recipients with anything that's open at all)*/

    IF EXISTS

    (

    SELECT spid FROM open_transactions

    WHERE

    spid = @spid

    and

    login = @login

    and

    db = @database

    and

    hostname = @hostname

    )

    BEGIN

    /*Add current spid to the list of 'ones of interest'*/

    SELECT @spidlist = @spidlist + CONVERT(VARCHAR(32), @spid) + ','

    SELECT @sqlCommand = ''

    -- IF @sql_handle-2 <> 0x0

    -- BEGIN

    /*Was originally going to try and take SQL command info from

    ::fn_get_sql - but wasn't getting anything back. This, in theory

    should allow us to spot the exact locking statement from a batch -

    but, doesn't appear to return any value if, say, the issue's caused by

    a transaction where the work's complete but awaiting a rollback/commit

    command (say being run manually through QA). So - use scruffy

    DBCC INPUTBUFFER approach instead

    */

    DELETE FROM #Inputbuffer

    SELECT @cmd = 'DBCC INPUTBUFFER (' + STR(@spid) + ') WITH NO_INFOMSGS'

    --PRINT @cmd

    INSERT #Inputbuffer EXEC(@cmd)

    SELECT @sqlCommand = COALESCE(RTRIM(EventInfo), 'Unable to identify SQL command')

    FROM #InputBuffer

    -- SELECT @sqlCommand =

    -- SUBSTRING

    -- (text, 1, 2000)

    -- FROM

    -- ::fn_get_sql(@sql_handle)

    -- END

    /*Keep a list of problematic transactions for future reference */

    INSERT INTO Open_Transactions_History

    (

    Found_Date,

    spid,

    login,

    db,

    hostname,

    program_name,

    sqlCommand

    )

    VALUES

    (

    getdate(),

    @spid,

    @login,

    @database,

    @hostname,

    @program_name,

    @sqlCommand

    )

    -- SELECT @cmd = 'netsend ' + @hostname + '"Your machine has an OPEN transaction to ' + @database + '"'

    -- EXEC master..xp_cmdshell @cmd

    /*Send warning e-mail to appropriate recipients*/

    SELECT @cmd = 'Host: ' + @hostname + CHAR(13)

    SELECT @cmd = @cmd + 'Login: ' + @login + CHAR(13)

    SELECT @cmd = @cmd + 'Database: ' + @database + CHAR(13)

    SELECT @cmd = @cmd + 'SPID: ' + CONVERT(VARCHAR(6), @spid) + CHAR(13)

    SELECT @cmd = @cmd + 'Program: ' + CONVERT(VARCHAR(20), @program_name) + CHAR(13)

    SELECT @cmd = @cmd + 'SQL: ' + CHAR(13) + @sqlCommand + CHAR(13) + CHAR(13)

    SELECT @cmd = @cmd + 'Procedure: FindOpenTransactions'

    EXEC master..xp_sendmail @recipients = @emailAddress, @subject = 'Open Transaction warning', @message = @cmd

    END

    ELSE

    /*If it's a new proc with open transactions add it to the 'watch list'*/

    BEGIN

    INSERT INTO open_transactions

    (

    spid,

    login,

    db,

    hostname

    )

    VALUES

    (

    @spid,

    @login,

    @database,

    @hostname

    )

    SELECT @spidlist = @spidlist + CONVERT(VARCHAR(32),@spid) + ','

    END

    FETCH NEXT FROM curOpenTransProcs INTO @spid, @dbid, @hostname, @login, @program_name, @sql_handle-2

    END

    --Cleanup

    DROP TABLE #InputBuffer

    --PRINT 'spidlist ' + @spidlist

    IF @wehavedata = 1

    BEGIN

    /*Remove anything from the open transactions table where the spid no longer exists on the server*/

    SELECT @cmd = 'DELETE FROM Open_transactions

    WHERE spid not in ('

    + SUBSTRING( @spidlist, 1, LEN(@spidlist) -1) + ')' + CHAR(13)

    --PRINT @cmd

    EXEC (@cmd)

    END

    ELSE

    DELETE FROM Open_transactions

    CLOSE curOpenTransProcs

    DEALLOCATE curOpenTransProcs

    GO

  • Jane (10/1/2008)


    A BIG 'Name and Shame' poster in the common area works well...two strikes and you're named. Actually, this works for lots of things. 😉

    I wouldn’t set rowcount on login trigger, because it won’t take long until they figure out how to cancel it.

    I also don’t think that trying to “name and shame” will do some help. In fact I think that it will do more damage then good. In the described situation the dev team doesn’t think that the DBA’s recommendations are important, so they won’t care if the DBA will name developers that don’t fallow those recommendations. The thing that you can do is to try and educate them. Show them why there code is causing problems. Show them the code that they wrote, then show them your code. Let them see the difference in the code’s execution. Explain them about the code’s difference and why it affected the performance. From my experience most of the time that helps and developers are starting to come to you on there own initiative.

    I admit that I had one working place in the past with some development teams that just wouldn’t listen to me at all. Even when I showed them that there applications can be used to take control over the server with SQL Injection, they refused to modify the applications.. In that case after I realized that I lost the battle and the developers at this place don’t really care how there applications work and don’t want to improve there skills, I left the job and found a new one.

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Showing exactly why a SELECT * is bad is harder than it needs to be in the SQL Server world, partly due to lack of metrics.

    Going back a decade when I ran a DB2 mainframe shop, the metrics were well known. The same principals apply to SQL Server, even if some of the numbers are different.

    Every data item you include in a SELECT statement has a more or les fixed overhead to materialise in the result set. This means if you do a SELECT TOP 1 col1 FROM tablea, you materialise 1 data item. If you do SELECT TOP 5 col1,col2 FROM tablea you materialse 10 data items.

    A decade ago, the figures were that you could materialise 17 data items in about 1/100 second. It also took about 1/100 second to issue and complete a disk read, so the performance experts liked to say that selecting 17 data items slows your application down by the same amount as a single disk read.

    This makes it easier to go to developers with facts and figures. If you only need one column from a 17 column row, then doing a SELECT TOP 1 * FROM tablea will take the same elapsed time as SELECT TOP 17 col1 FROM tablea. Doing a SELECT * FROM tablea when tablea is 1m rows would have a minimum response time of 100 seconds, compared to SELECT col1 FROM tablea having a minimum response time of 5.8 seconds, assuming all of tablea is in memory.

    The figures for how many data items you can materialise in a given amount of time will definitely be different now, and will vary according to processor speed. However, even if you just make a best guess about this ratio, it can help educate people in the importance of only asking for what they can use.

    If you knew you could only eat one Shredded Wheat (tm) would you insist on having a shipping container full dumped on your plate, only to throw the rest away after eating the first one that was given. And then doing the same again, and again?

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Before everyone jumps on the "never SELECT *" bandwagon, I'd argue that there are cases where * is both useful and appropriate. One example is a view that provides a row-filtered subset of a table; if columns are added to the table (that's one of the advantages of a relational table, right?) it is cleaner (more generalized) to have the view reflect the current column composition of the table, as opposed to "hard coding" in the columns that existed in the "old days". Strikes me that hard coding column names is not far removed from hard coding character positions in a flat file, or worse, hard coding row numbers.)

    (OK, the view may need to be refreshed, but I think that is an artifact of the current SQL Server implementation.)

  • Jim Russell (10/2/2008)


    Before everyone jumps on the "never SELECT *" bandwagon, I'd argue that there are cases where * is both useful and appropriate. One example is a view that provides a row-filtered subset of a table; if columns are added to the table (that's one of the advantages of a relational table, right?) it is cleaner (more generalized) to have the view reflect the current column composition of the table, as opposed to "hard coding" in the columns that existed in the "old days". Strikes me that hard coding column names is not far removed from hard coding character positions in a flat file, or worse, hard coding row numbers.)

    (OK, the view may need to be refreshed, but I think that is an artifact of the current SQL Server implementation.)

    Jim, i could be wrong, but i think the debate is regarding the number of rows returned rather than being able to see all the columns. so doing a select top 1000 * from MyTable would not be an issue, whereas select * from MyTable where MyTable contains 20m rows could be a problem.

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply