Select *

Question

Select *

Vincent Central

SSC-Addicted

Points: 495
More actions
September 30, 2008 at 12:21 pm

#123844

I have some novice developers with big egos but small ears... We have some tables that are multi-million records and some of our developers, despite being told not to, frequently do a select * on these tables to look at data, fields, and even get a count (rather than doing count(*)).
They need to be allowed to do select queries, but is there any way to prevent them from executing select *?
I tried using the query governor, but that causes problems for one of the apps that uses the DB.

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

There isn't a way that I know of.

If what they're running are simple ad hoc queries, it shouldn't matter too much. If they're doing that in production code, then the handling is to use both source control and code review.

The number of rows will matter more to query performance than the number of columns, anyway. I'd be more worried about poorly written or missing Where clauses than about explicit column lists.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

Maybe you could write a Logon trigger that does a SET ROWCOUNT 1000? 😀

Of course there are a whole lot of potential problems with that...

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Vincent Central SSC-Addicted Points: 495 More actions · Answer 3

They will honestly to a select * with no where on huge tables and just let it run, pulling all rows and columns.

The logon trigger is actually an interesting idea. Hmmm.

Jeff Moden SSC Guru Points: 1004435 More actions · Answer 4

Vincent Central (9/30/2008)
I have some novice developers with big egos but small ears... We have some tables that are multi-million records and some of our developers, despite being told not to, frequently do a select * on these tables to look at data, fields, and even get a count (rather than doing count(*)).
They need to be allowed to do select queries, but is there any way to prevent them from executing select *?
I tried using the query governor, but that causes problems for one of the apps that uses the DB.

Yes... code reviews and "professional" fines. Either that, or a bucket of porkchops and a slingshot.;)

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

FNS SSCrazy Points: 2015 More actions · Answer 5

Are you sure you don't work for the same place I do?

The best solution I've found is to allow them to do whatever they want with the dev systems. When they complain the dev system is slow I tell them to review their code or buy more hardware. The boss quickly falls down on the "free" review of code, usually with my help which removes much of the select * from 8,000,000 row tables. The code speeds up, I get the credit.

On production systems I'm slowly (and it's a long fight) restricting the developer's access to read only. That means when they want to deploy code it passes under my nose. No where clause gets it rejected immediately, even if the table only has 3 rows.

For those who scream they must have full access (and there are some), I do periodic "performance checks" on the SQL server capturing all the queries with duration > a few seconds. My first seach is for "SELECT *". When I find any, it is immediately lobbed back over the fence to be amended pronto. The production systems speed up, I get the credit.

Vincent Central SSC-Addicted Points: 495 More actions · Answer 6

They only have write access on Dev. If they do it on their Dev boxes and bring them to a crawl, I don't care. It's Production and test environments I'm bothered by. They need to do selects there "to get current data".

Running Profiler filtering on "SELECT *" over a period of time, then doing more coaching 😉 of the offending developer is another good tip. Thanks.

TRACEY-320982 SSChampion Points: 13653 More actions · Answer 7

Good idea on select * in the filtering on the profiler...............awesome then when it fires off

send out an email every 5 minutes to the users.....after 1000 emails do not do select * that might do the trick..

Animal Magic SSChampion Points: 13964 More actions · Answer 8

one other thing you could do in SSMS is set the rowcount to be, say, 10000. that way every query they run will be limited by that. They will need to know you have done it so that if they do write queries where they are expecting 10000+ rows they need to do a set rowcount 0 at the start of the query.

Other than that the profiler idea sounds good to me. Very hard to limit this sort of activity though 🙁

bossimal SSC Journeyman Points: 89 More actions · Answer 9

A BIG 'Name and Shame' poster in the common area works well...two strikes and you're named. Actually, this works for lots of things. 😉

Andrew Gothard-467944 Hall of Fame Points: 3928 More actions · Answer 10

Go for the big egos. Nobody likes to be shown up for writing dross, as well as the profiler log, you could also highlight where pulling shedloads of rubbish thaht is not required is blocking tables to other processes and e-mail the entire team with the user and query using something like this. Name and shame and they'll get the message eventually

CREATE PROCEDURE FindOpenTransactions AS

DECLARE @emailAddress VARCHAR(128)

SET @emailAddress = 'developerTeam@wherever....'

--Create the required Tables IF they do not exist

IF NOT EXISTS (SELECT * FROM sysobjects WHERE type = 'u' and name = 'Open_transactions')

BEGIN

CREATE TABLE Open_transactions

(

spid int NULL,

login VARCHAR(32) NULL,

db VARCHAR(128) NULL,

hostname VARCHAR(64) NULL

)

END

IF NOT EXISTS (SELECT * FROM sysobjects WHERE type = 'u' and name = 'Open_transactions_history')

BEGIN

CREATE TABLE dbo.Open_transactions_history

(

Found_Date datetime NULL,

spid int NULL,

login VARCHAR(32) NULL,

db VARCHAR(128) NULL,

hostname VARCHAR(64) NULL,

program_name VARCHAR(128) NULL,

sqlCommandVARCHAR(2000) NULL

)

END

CREATE TABLE #InputBuffer/*Can't use table variable to take result of DBCCINPUTBUFFER (growl)*/

(

eventtype nVARCHAR(30) NULL,

params int NULL,

eventinfo nVARCHAR(255) NULL

)

/*Get list of processes with open transactions on the server*/

DECLARE curOpenTransProcs CURSOR

FOR SELECT

spid,

dbid,

hostname,

loginame,

program_name,

sql_handle

FROM master..sysprocesses

WHERE Open_tran > 0

DECLARE @spid int, @hostname VARCHAR(64), @login VARCHAR(32), @cmd VARCHAR(4000), @database VARCHAR(128), @program_name VARCHAR(128), @dbid int, @spidlist VARCHAR(2000), @wehavedata int

DECLARE @sql_handle-2 binary(20)

DECLARE @sqlCommand VARCHAR(2000)

/*Set initial values for flag / spid list variables*/

SET @spidlist = ''

SET @wehavedata = 0

OPEN curOpenTransProcs

FETCH NEXT

FROM curOpenTransProcs

INTO @spid, @dbid, @hostname, @login, @program_name, @sql_handle-2

WHILE @@fetch_status = 0

BEGIN

SET @wehavedata = 1

SELECT @database = name FROM master..sysdatabases WHERE dbid = @dbid

/*which processes with open transactions have been active since the last check

(rather than bombard recipients with anything that's open at all)*/

IF EXISTS

(

SELECT spid FROM open_transactions

WHERE

spid = @spid

and

login = @login

and

db = @database

and

hostname = @hostname

)

BEGIN

/*Add current spid to the list of 'ones of interest'*/

SELECT @spidlist = @spidlist + CONVERT(VARCHAR(32), @spid) + ','

SELECT @sqlCommand = ''

-- IF @sql_handle-2 <> 0x0

-- BEGIN

/*Was originally going to try and take SQL command info from

::fn_get_sql - but wasn't getting anything back. This, in theory

should allow us to spot the exact locking statement from a batch -

but, doesn't appear to return any value if, say, the issue's caused by

a transaction where the work's complete but awaiting a rollback/commit

command (say being run manually through QA). So - use scruffy

DBCC INPUTBUFFER approach instead

*/

DELETE FROM #Inputbuffer

SELECT @cmd = 'DBCC INPUTBUFFER (' + STR(@spid) + ') WITH NO_INFOMSGS'

--PRINT @cmd

INSERT #Inputbuffer EXEC(@cmd)

SELECT @sqlCommand = COALESCE(RTRIM(EventInfo), 'Unable to identify SQL command')

FROM #InputBuffer

-- SELECT @sqlCommand =

-- SUBSTRING

-- (text, 1, 2000)

-- FROM

-- ::fn_get_sql(@sql_handle)

-- END

/*Keep a list of problematic transactions for future reference */

INSERT INTO Open_Transactions_History

(

Found_Date,

spid,

login,

db,

hostname,

program_name,

sqlCommand

)

VALUES

(

getdate(),

@spid,

@login,

@database,

@hostname,

@program_name,

@sqlCommand

)

-- SELECT @cmd = 'netsend ' + @hostname + '"Your machine has an OPEN transaction to ' + @database + '"'

-- EXEC master..xp_cmdshell @cmd

/*Send warning e-mail to appropriate recipients*/

SELECT @cmd = 'Host: ' + @hostname + CHAR(13)

SELECT @cmd = @cmd + 'Login: ' + @login + CHAR(13)

SELECT @cmd = @cmd + 'Database: ' + @database + CHAR(13)

SELECT @cmd = @cmd + 'SPID: ' + CONVERT(VARCHAR(6), @spid) + CHAR(13)

SELECT @cmd = @cmd + 'Program: ' + CONVERT(VARCHAR(20), @program_name) + CHAR(13)

SELECT @cmd = @cmd + 'SQL: ' + CHAR(13) + @sqlCommand + CHAR(13) + CHAR(13)

SELECT @cmd = @cmd + 'Procedure: FindOpenTransactions'

EXEC master..xp_sendmail @recipients = @emailAddress, @subject = 'Open Transaction warning', @message = @cmd

END

ELSE

/*If it's a new proc with open transactions add it to the 'watch list'*/

BEGIN

INSERT INTO open_transactions

(

spid,

login,

db,

hostname

)

VALUES

(

@spid,

@login,

@database,

@hostname

)

SELECT @spidlist = @spidlist + CONVERT(VARCHAR(32),@spid) + ','

END

FETCH NEXT FROM curOpenTransProcs INTO @spid, @dbid, @hostname, @login, @program_name, @sql_handle-2

END

--Cleanup

DROP TABLE #InputBuffer

--PRINT 'spidlist ' + @spidlist

IF @wehavedata = 1

BEGIN

/*Remove anything from the open transactions table where the spid no longer exists on the server*/

SELECT @cmd = 'DELETE FROM Open_transactions

WHERE spid not in ('

+ SUBSTRING( @spidlist, 1, LEN(@spidlist) -1) + ')' + CHAR(13)

--PRINT @cmd

EXEC (@cmd)

END

ELSE

DELETE FROM Open_transactions

CLOSE curOpenTransProcs

DEALLOCATE curOpenTransProcs

GO

Adi Cohn-120898 SSC-Dedicated Points: 33944 More actions · Answer 11

Jane (10/1/2008)
A BIG 'Name and Shame' poster in the common area works well...two strikes and you're named. Actually, this works for lots of things. 😉

I wouldn’t set rowcount on login trigger, because it won’t take long until they figure out how to cancel it.

I also don’t think that trying to “name and shame” will do some help. In fact I think that it will do more damage then good. In the described situation the dev team doesn’t think that the DBA’s recommendations are important, so they won’t care if the DBA will name developers that don’t fallow those recommendations. The thing that you can do is to try and educate them. Show them why there code is causing problems. Show them the code that they wrote, then show them your code. Let them see the difference in the code’s execution. Explain them about the code’s difference and why it affected the performance. From my experience most of the time that helps and developers are starting to come to you on there own initiative.

I admit that I had one working place in the past with some development teams that just wouldn’t listen to me at all. Even when I showed them that there applications can be used to take control over the server with SQL Injection, they refused to modify the applications.. In that case after I realized that I lost the battle and the developers at this place don’t really care how there applications work and don’t want to improve there skills, I left the job and found a new one.

Adi

--------------------------------------------------------------
To know how to ask questions and increase the chances of getting asnwers:
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

EdVassie SSC Guru Points: 60481 More actions · Answer 12

Showing exactly why a SELECT * is bad is harder than it needs to be in the SQL Server world, partly due to lack of metrics.

Going back a decade when I ran a DB2 mainframe shop, the metrics were well known. The same principals apply to SQL Server, even if some of the numbers are different.

Every data item you include in a SELECT statement has a more or les fixed overhead to materialise in the result set. This means if you do a SELECT TOP 1 col1 FROM tablea, you materialise 1 data item. If you do SELECT TOP 5 col1,col2 FROM tablea you materialse 10 data items.

A decade ago, the figures were that you could materialise 17 data items in about 1/100 second. It also took about 1/100 second to issue and complete a disk read, so the performance experts liked to say that selecting 17 data items slows your application down by the same amount as a single disk read.

This makes it easier to go to developers with facts and figures. If you only need one column from a 17 column row, then doing a SELECT TOP 1 * FROM tablea will take the same elapsed time as SELECT TOP 17 col1 FROM tablea. Doing a SELECT * FROM tablea when tablea is 1m rows would have a minimum response time of 100 seconds, compared to SELECT col1 FROM tablea having a minimum response time of 5.8 seconds, assuming all of tablea is in memory.

The figures for how many data items you can materialise in a given amount of time will definitely be different now, and will vary according to processor speed. However, even if you just make a best guess about this ratio, it can help educate people in the importance of only asking for what they can use.

If you knew you could only eat one Shredded Wheat (tm) would you insist on having a shipping container full dumped on your plate, only to throw the rest away after eating the first one that was given. And then doing the same again, and again?

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

Jim Russell-390299 SSCrazy Eights Points: 9333 More actions · Answer 13

Before everyone jumps on the "never SELECT *" bandwagon, I'd argue that there are cases where * is both useful and appropriate. One example is a view that provides a row-filtered subset of a table; if columns are added to the table (that's one of the advantages of a relational table, right?) it is cleaner (more generalized) to have the view reflect the current column composition of the table, as opposed to "hard coding" in the columns that existed in the "old days". Strikes me that hard coding column names is not far removed from hard coding character positions in a flat file, or worse, hard coding row numbers.)

(OK, the view may need to be refreshed, but I think that is an artifact of the current SQL Server implementation.)

Animal Magic SSChampion Points: 13964 More actions · Answer 14

Jim Russell (10/2/2008)
Before everyone jumps on the "never SELECT *" bandwagon, I'd argue that there are cases where * is both useful and appropriate. One example is a view that provides a row-filtered subset of a table; if columns are added to the table (that's one of the advantages of a relational table, right?) it is cleaner (more generalized) to have the view reflect the current column composition of the table, as opposed to "hard coding" in the columns that existed in the "old days". Strikes me that hard coding column names is not far removed from hard coding character positions in a flat file, or worse, hard coding row numbers.)
(OK, the view may need to be refreshed, but I think that is an artifact of the current SQL Server implementation.)

Jim, i could be wrong, but i think the debate is regarding the number of rows returned rather than being able to see all the columns. so doing a select top 1000 * from MyTable would not be an issue, whereas select * from MyTable where MyTable contains 20m rows could be a problem.