SELECT * FROM…when administrators see that from developer code, we generally tend to cringe. Why? In a nutshell, it’s terrible on a few fronts. First, typically that SELECT * FROM that gets written (a lot of the times) lacks a WHERE clause. What’s the problem? Well, that pulls back every single row from the table.
Sure, that may not be too bad on a table with a few hundred rows but what about on a table with millions? That could cause a lot of performance problems since you’re trying to read all the data off disk (again, potentially). Secondly, do you really need all those rows and/or columns? It’s a waste of time and resources to pull back every column from a table if your application is only going to be using a few of them anyways.
So how do we prevent this? Well I recently learned an extremely evil way of preventing such a query. I’d like to start off with, this was NOT my idea. I learned this trick from Adam Jorgensen (Blog | Twitter). I’d also like to add this disclaimer:
DO NOT JUST GO DO THIS IN YOUR PRODUCTION ENVIRONMENT! I am not responsible for whatever evil you turn loose upon your environment. Always test things out in a development environment first and get proper approvals before making any changes.
Pure Evil Method
This method is actually evil in its simplicity. What we’ll be doing is adding a new column to the existing table. The “trick” is that this will be a computed column whose formula will cause an error, specifically a divide by zero error. As shown in screenshot below, create the new column on the table and call it something obvious like ‘DoNotSelectAll’. In the Column Properties window, under the Table Designer section, there is a property called Computed Colum Specification. In the formula section, enter (1/0). Save your table.
Now if I try to do my SELECT * on this table, I’ll get this lovely message:
Alright, we learned our lesson, now we’ll explicitly name our columns that we need:
Now this last query worked but notice how I didn’t put a WHERE clause so it pulls back all rows anyways? Yup, your users can still pull back everything, but at least they’re not doing a SELECT *. Also keep in mind, if you’re used to right-clicking that table in SSMS and selecting TOP 1000, with this column in place it will error for you as well. What are your alternative options?
LESS EVIL METHODS
Abstraction
Another way to control this kind of behavior is by not letting users hit base tables at all. You could create Views that have queries in them that limit rows returned. This way a user can do a SELECT * on a view, but the underlying code of the view itself is limiting row returns.
Depending on your situation, this could work and it could not. If the user needed very specific data returned that wasn’t in that limited pool of results could adversely affect whatever process they’re using the data for.
Another option is wrapping everything in stored procedures and granting users access to executing stored procedures rather than querying tables and views. On the one hand, could be good since you’re encapsulating the code. Users can pass parameters to stored procedures so you could make the queries somewhat dynamic.
Handbrake
In SQL Server 2008 they introduced a feature called the Resource Governor. This feature allows you to throttle resources on queries based on custom functions and groupings you specify. Yes, it’s an Enterprise Edition feature but it can be well worth it if you’re having resource-related issues due to runaway queries.
Now this feature will NOT prevent users from doing SELECT * –type queries, however you can throttle how much resource is allocated toward a query so you can at least control how badly it’ll affect you.
Security
My friend Brian Kelley (Blog | Twitter) will probably appreciate this one. Be stringent with the accesses you grant! Grant users only the accesses they need. Also, ff you don’t want users banging against your transactional systems directly, think about setting up a dedicated/isolated reporting environment and point the users there instead.
The reporting box you stand up doesn’t have to be (necessarily) as beefy as your transactional system and you can setup customized security on that database. This is especially helpful for when the transactional system is a vendor application which you can’t make any modifications to the code.
Do you have any other suggestions/tricks to help prevent crazy user queries? Let’s hear it in the comments!