Introduction
Anyone
can learn a little about creating and/or
accessing a database nowadays. Just take a minute to browse your local bookstore and
you will find a plethora of books on the subject. And while I feel that it
is wonderful that all of this knowledge is so easily attainable, I also find it frustrating. It
seems that many times what is being taught is often the "fast and easy" way
to do something with no mention of the repercussions of building actual applications in such
a manner. What I commonly
find missing from these textbooks is any consideration of the following core aspects of
professional database development and deployment:
SECURITY
PERFORMANCE
MAINTAINABILITY
Database access is a subject that
deserves consideration as part of the overall database architecture, and not
just a quick demonstration of the simplest and most easily described
methods. Applications designed without a data access forethought
can be the most difficult to secure, support, maintain and
enhance. Designing your application so that Stored Procedures are the only
database objects that are accessed is generally the best practice. In
fact, Stored Procedures should be used in SQL Server for Data
Manipulation Language (DML)
.
This premise is written with the following assumptions in
mind:
Assumptions... |
|
|
|
|
|
The simple fact is, using
Stored Procedures in conjunction with a well defined development standard to consolidate your
Insert, Update, Delete and Select statements will vastly improve the security, performance and
maintainability of your application at a minimal/negligable
cost.
Security
Permissions - Table/View Level
I have heard and read many times that views are a good way of
controlling access to particular database objects. Stored Procedures, however,
can provide the same and additional functionality. You can grant EXECUTE
permissions to a stored procedure without necessarily granting rights to
the objects referenced by the stored procedure. If stored procedures are used
for DML (insert, update, delete and select) they can provide the same functionality
provided in a view. Programs like Microsoft Access or Crystal Reports allow
users to pull information from stored procedures as if they were tables. In
addition, the user will be prompted with the parameters present in the stored
procedure, possibly eliminating the requirement for a "where clause" or a
filter to be applied to the data. (BTW - If a
user wants to do ad-hoc reporting, ideally this work should be done on a standby
server and not against a production machine.)
Permissions - Column/Row Level
By assigning permissions through Stored Procedures, not only can you control
how any statements that will alter the database are written, you can control
modifications on a column and/or row basis.
Data Integrity
Using Stored Procedures for DML makes it nearly
impossible for someone to forget the "where clause" and make a global update, or worse
yet, delete the data from an entire table. In addition, requiring the use
of DML for inserts, updates and deletes can help to prevent the addition or alteration of data by incorrectly written
ad-hoc SQL statements.
Encapsulation
In many cases, a company may not want all of it's own
developers, or client developers to know the exact schema of a proprietary
database. By using stored procedures for all DML this can be
accomplished while allowing development of a system to move
forward.
RI and Business Rules*
Depending upon where the referential integrity is enforced, using stored procedures in this way
can help keep the database under control. (You may not want
users adding new magazine articles to a table if the magazine itself has
already been printed.) You can also control RI between your system and other
databases (see Transactions).
Transactions*
By using stored procedures you can be assured that mutually dependent updates within
your database or between your database and others occur synchronously. In
other words, you can wrap a transaction around an entire update so that either
all of it occurs, or it all rolls back as one.
*Some type of Business Rules and Transactions may not be appropriately
placed in Stored Procedures depending upon the application design
and the decision of the layer in which to place this type of logic.
In many cases, some Business Logic arguably belongs in it's own object
layer(s), and transactions may be handled by entities other than just SQL Server
(like MTS). However, in many basic development environments, Business
Logic and Transactions are best segregated in the Stored Procedures.
These items do need to be considered early in development, and organized in a
centralized, predictable and manageable fashion.
Performance
In my experience, there are
basically five
ways in which data in a SQL Server is manipulated,
let's briefly look at each
:
1. Data Access Methods (ADO, RDO, DAO,
etc)
2. Dynamic/Embedded SQL Statements in Code such as
VB/ASP
3. Updateable
Views
4.
Dynamic-SQL Stored Procedures (non-compiled SQL)
5. Static Stored
Procedures
ADO (RDO, DAO, etc) Methods
If you use ADO, you will be familiar with the handy .AddNew,
.Update, and .Delete methods that are available. While these are very
convenient, and what is most commonly taught in the "learn it quick" books,
they are a very expensive means of getting at the data in a SQL Server.
Compared to using stored procedures, which exist right on the SQL
Server itself, these methods in ADO can generate large amounts of network traffic
and go about doing an update using cursor based logic (looping) instead of
joins. If you are dealing with large amounts of data, the
difference can be enormous.
Dynamic/Embedded SQL in Code such as VB/ASP
This can take two forms, hard
coded static SQL statements in an application (VB or ASP for example) or dynamic
series of SQL Strings that get pieced together based on logic also hard
coded in the application. The SQL then gets sent across the network to
the SQL Server where it must be parsed for syntax accuracy, compiled, a query plan
created, and then executed. While this is slightly faster than using
ADO's methods (above), it is still very burdensome, and when an
application gets large it becomes very difficult to maintain. In
addition, by embedding SQL into an application the responsibility for
optimizing the SQL Code is for all practical purposes no longer in the arena
of the DBA or SQL Developer who is specialized.
Updateable Views
While views can provide the same type of
insulation between an application and database tables as stored
procedures, views provide almost the same functionality as a table and
therefore circumvent many of the reasons for using stored procedures discussed
in this article. A view can easily be treated as a table and allows for
the use of ADO Methods, inefficient SQL code, forgotten where clauses,
embedded SQL etc. Views are great to use for ad-hoc reporting solutions,
preferably on a standby server and not on a production
machine.
- Dynamic-SQL Stored
Procedures (non-compiled
SQL)
Stored Procedure is one in which the SQL is built on the fly inside of the
stored procedure, and is dependent upon parameters passed. The SQL
statement is concatenated together and the final result is then executed.
While this can be a very powerful tool, it must be considered that writing
stored procedures in this way is difficult to debug, maintain and inefficient.
It is impossible for SQL Server to reuse query plans created by dynamic stored
procedures, and so each time a new query plan must be
created.
Static Stored Procedures
By using stored procedures
the syntax has already been checked, and a
compiled statement and
pre-defined query plan are ready to use immediately. There are many performance
benefits to having DML located in Stored Procedures, the top four
being that the code is Precompiled, that the SQL Server can then
perform multi-step processes without repeated calls across the network, that the SQL Code
can then be written by SQL Developers, DBAs, or in the
most basic cases a scripting tool (see Tools section at the end of this article)
- not by Webmasters or other types of non-SQL specialists, and
that Network Traffic and CPU Processor Time is decreased.
Precompiled Code
One
reason why stored procedures improve an application's performance is that the SQL
code is precompiled. When a stored procedure is saved in the database,
SQL Server parses the syntax for accuracy and saves the procedure's text in
the syscomments table. Upon the initial execution, the query processor checks to see
that the referenced objects exist. If they do, SQL Server compiles an execution
plan and stores it in memory in an area called the procedure cache.
As a result, SQL Server doesn't have to reconstruct the same query
plan over and over again, and already knows the indexes it will use and the
strategy it will take to return query resultset.
Multi-step Processes
Another performance gain from stored procedures is
that you can perform multi-step processes requiring conditional logic, entirely within a
stored procedure. You simply call the procedure and pass it
all the parameters required for the process. The procedure can then use Transact
SQL (T-SQL) conditional statements (CASE statements are designed for this
purpose) to make decisions rather than have the client application submit a
query to the server, retrieve the result set, make the decision, and finally
submit another query to perform the action. These additional round trips to
the server consume network bandwidth, not to mention processing
time.
Efficiently Written Code
When combined with the proper development process and
personnel, requiring all DML be contained within stored procedures can help
to assure that all SQL will be either written or reviewed by a
knowledgeable SQL person who understands the underlying index structures and
the concept of optimization. Obviously, poorly structured SQL can cause
performance issues as locks escalate and resources are consumed. One good example of this is
that select statements on heavily used tables that do not require up
to the second information can be given the "with (NOLOCK)" hint to allow a dirty read. If all requests
for information go through the same interface (that table's "select" stored procedure), locked records and
timeout issues in the application can be avoided.
- Reduced Network Traffic & CPU Processor Time
Calling a stored procedure only requires sending the stored procedure and
parameters across the network instead of an entire SQL String.
Plus, the stored procedure can decide what type of update needs to take place
instead of successive SQL statements traveling back and forth as conditional
instructions are followed from another location (IIS/ASP, COM or within a
client).
One additional
note: While it may take more code, it is worthwhile to create
parameters for your stored procedure calls explicitly in your code instead of
using the .Refresh ADO method. Doing so eliminates back and forth trips to the
server to provide a list of parameters and will speed things up even
more.
Maintainability
The last, and
probably most significant reason for using Stored Procedures for DML and making
data access it's own "layer" is that the level of complexity related to
database alterations is greatly decreased. This strategy has many benefits:
- It allows for the use of a consolidated data
access object naming convention.
- It places data
storage rules closer to the database itself.
- It isolates code changes required due to database
alteration.
- It promotes data access code
consolidation/componentization.
- It works with source code version control tracking
(
VSS)and development processes.
- It can
unify the method of error logging in all tables throughout a database, and
greatly decrease troubleshooting time.
Error Logging
Stored Procedures can call the raiserror function, andwrite to the log a custom message to assist in tracking and debugging a SQL application. Other
methods of error logging which can be accessed when doing updates via
stored procedure are writing information to meta tables and providing output parameters signaling success, failure,
or some other information. This strategy places the SQL related errors close
to the people in the infastructure who can fix them. An email
can even be sent to an administrator if need
be! Using Stored Procedures for DML, and including Error
Tracking, is a great way to provide consistent Error Handling for all tables
in your database.
- Database Structure Changes
By creating procedures to select, insert, update, and delete data,
you can shield any dependent applications from changes that are required within the
database. When stored procedures are used as the interface that developers use
to manipulate data, you can change the database's structure without affecting the
application programs. As long as the stored procedures contain the same parameters, return the same quantity and
types of columns, and no additional required parameters have been added, there is no need
to recompile and redistribute any client program when database alterations are
made. Stored Procedures are a great way to achieve a separation between
a database and dependent application programs.
Control Source
Code
Using stored procedures for DML allows you to create a modular library code designed
specifically around data access. The code can then be scripted and
versioned, allowing for the creation of a "data access" layer
that can more easily be defined, tracked, debugged and involved in
the development cycle as it's own entity. Since data access is
one of the biggest application performance problem areas (bottlenecks), it is
beneficial to treat it as a serious subject requiring highly trained
development resources and consideration.
DML Consolidation and
Modularity
Having stored procedures that providepredictable and specific DML functionality helps to prevent repeated code from
being written. This type of consolidation is not only efficient in
regard to development time, but makes an application smaller, more streamlined
and modular. Also, the act of SQL optimization only needs to be
done once per actual table event type, instead of multiple times for
essentially the same SQL with a different name or in a different location
in the application. Repeated code is generally undesirable in programming, and
SQL is no exception.
Defined Methodology
Especially in large development teams, if there is not a
development methodology (how do we name objects, what type of code belongs in
each object, what are the technologies we are going to use to create objects and
which are we not going to use, etc..) the entire development project
can become so scattered and
hodge-podge that the application becomes a nightmare to debug
and maintain. SQL and Data Access is no exception. Sure, it may be faster
and easier to just write SQL on the fly in whatever environment you happen to be
coding in at the time, but in the long run this can cause more
problems than the initial convenience. As in anything, a little planning up
front can make your life (or the next guy's) much easier.
If Stored Procedures are used
for DML, a defined data access methodology can prevail which makes
development, debugging, optimization, upgrades and migrations much more
simple. One example is defining a naming convention for all DML to help
immediately pinpoint the logical players tied to a table alteration (see
below)
.
Example DML Naming Conventions
Option 1
ins_company
ins_order
ins_order_detail
upd_company
upd_order
upd_order_detail
del_company
del_order
del_order_detail
Option 2
pi_company
pi_order
pi_order_detail
pu_company
pu_order
pu_order_detail
pd_company
pd_order
pd_order_detail
By requiring all inserts, updates,
deletes and selects occur through stored procedures named in such a way, a
database change becomes fairly straightforward. Otherwise, one would need to
go through the familiar task of changing views, stored procedures, asp code,
vb code, include files, Access databases, Crystal Reports,
etc...!
Summary
I am aware
that there are many ways to access data, but I am of the general opinion that
DML itself does not belong scattered all over an application. It is too
difficult to maintain, secure and nearly impossible to optimize. Plan
ahead when doing your development, otherwise most developers will unthinkingly follow the
shortest development path - sacrificing stability, efficiency and long term development efforts.
Tools
- Hemant Sahgal has posted
T/SQL Code that will generate insert, update and delete stored procedures for
every table in a database based on metadata. You can find these
scripts at the following link under:
Automatically Create Insert, Update and Delete
Queries.
I found that the
scripts do require some modification to work under SQL 2000 - mostly double
quote and single quote issues, but nothing too complex. It would be
fairly easy to write a script that stepped through each table in a database,
called a revised version of Sahgal's code and compiled stored procedures for
Insert, Update and Deletes automatically.
- Lockwood Technologies has a freeware VB - SQL
DMO program, including source code, called
Proc Blaster that generatesInsert, Update, Select and Delete Stored Procedures for each table in a
database.
Related Articles
Tuning for SQL Server Developers
Best Practices
SQL Server in an OLTP Environment
Tuning Tips for Creating VB Applications using SQL Server
MSDN - Coding Techniques and Programming Practices
MSDN - Designing Efficient Applications for Microsoft SQL Server
MSDN - Top Ten Tips: Accessing SQL Through ADO and ASP
instead of Dynamic SQL
Procedures with ASP, ADO and SQL Server
Books
SQL Server 2000 Resource Kit
Professional ASP Data Access
Beginning ASP Databases
Professional SQL Server 7