Developers love documentation. At least they love having documentation available
to them. But suggest that they might like to contribute to a documentation set then you can
expect every excuse under the sun.
- The code is self-documenting
- There is no established process that everyone conforms to so what documentation
exists is unclear.
- Code comments are not actually bound to the code itself and so comments are irrelevant
- The documentation is out-of-date as soon as it is written
- Everyone has a different copy of the documentation
- There are no tools to make this easy
- etc
All these points are valid but surmountable and I will deal with each of them in
turn
The myth of self documenting code
I think there are two conditions under which a developer would consider code to
be self-documenting
- They wrote it in the first place
- They wrote it quite recently
I used to work for a documentation company acting as a consultancy/software house
within a small team of developers who had to support numerous external customers. I
have always been diligent in commenting my code and a firm believer in refactoring
code down to the point where it cannot be refactored any more. Even so, I have had
occasions when I have had to dig into my own code for a support query and thought,
"Lord, how does this work"?
SQL is a relatively simple and descriptive language but even this can become complex
and obscure. Doubly so if you are on-call and get a telephone call at 03:00 in the
morning!
In short, I don't believe there is such a thing as self-documenting code.
Established process for documentation
It should be up to the lead developers to establish a process that must be adopted
by the organisation. Key to establishing a process is having a tool that will assemble
the documentation.
Precisely how code is documented will depend on the tool that is used to write it
and to assemble the documentation into a distributable form.
As a lead DBA the process I established was as follows
- New objects must have brief but clear comments that determine their intent
- If it is a table or view then what do the records in the table/view represent?
- If it is a field in a table then what is the intended use of the field?
- If it is a function or stored procedure then briefly comment on its purpose but
also note any peculiarities of its use. For example a stored procedure for carrying
out a purge on a large table might have a comment that it is not to be run during
the day.
- If it is parameter then state its intent and valid values.
- Legacy uncommented objects must gain comments for those items that are being changed.
It is optional but highly desirable to comment those items that are not being changed
so that data accumulates over time.
- Legacy commented objects should have their comments updated.
- The peer review process will reject any object creation/changes that do not include
comments
- Comments must be deployed with the code as it passes through the different environments.
The last point is important. It is no good accumulating database comments in the development environment if
it can be wiped out by someone restoring a database from an integration, testing or LIVE environment.
The reality is that commenting objects at the point of creation or modification
requires trivial effort and enforcing it in the peer review process ensures that
it gets done.
By limiting mandatory changes to only those legacy items that are being changed
by the developer then the legacy tasks are not onerous. In fact some developers
voluntarily chose to research the use of legacy systems and add comments to the
code. They did this because they knew that having changed something about the code it was likely that
any legacy bugs would be attributed to them so they wanted to make sure they were armed and ready when
the inevitable support call came in.
Code comments and documentation become out-of-date
It is a fair comment to say that code comments are not bound to the code. I am not
a fan of over-commenting code. If the code is so complex that it requires extensive
comments then it is probably too complex and needs refactoring.
The main thing is that the effort should go into the comments that can be assembled
by the documentation tool.
The other key points are as follows
- The documentation should be web-based so that everyone is looking at the same version
of the documentation
- An automated process should rebuild the documentation at regular intervals.
There are no tools to make this easy
For SQL Server this is emphatically not the case. The remainder of this document
is dedicated to describing how Red-Gate SQLDoc 2.0 can be used to satisfy the points
mentioned above.
Red-Gate SQLDoc 2.0
SQLDoc 2.0 is an inexpensive documentation tool for SQL Server and as its start-up
splash page says, it is ingeniously simple.
The product relies on grabbing extended properties for objects that are named MS_DESCRIPTION.
You can enter these for tables and fields from SQL Management Studio as the "Description"
property dialogue corresponds to the extended property.
There are three ways to maintain these properties
- From within the SQL Management Studio GUI (or Enterprise Manager for those of you
still using SQL Server 7 & 2000)
- From traditional SQL Scripts
- From within Red-Gate SQLDoc 2.0 itself of which more later
Database scripting for commenting objects
The MS_DESCRIPTION extended property for objects can be set
in scripts by using the sp_addextendedproperty and
sp_updateextendedproperty stored procedures.
In the project properties there is a check box to indicate whether
or not the SQL Creation Script should be included in the documentation set. If this
is checked then the sp_addextendedproperty syntax will
be revealed.
Books on line gives full details of the sp_addextendedproperty and sp_updateextendedproperty
stored procedures and the argument values which I will summarise below
- Property name. In this case MS_DESCRIPTION
- Property value
- Top level object type(Level 0) such as SCHEMA
- Top level object name(Level 0) such as HumanResources
- Middle level object type(Level 1) such as TABLE
- Middle level object name(Level 1) such as Employee
- Bottom level object type(Level 2) such as COLUMN
- Bottom level object name(Level 2) such as EmployeeID
The table below lists most of the ones of interest
Top Level 0 Type | Middle Level 1 Type | Bottom Level 2 Type |
---|---|---|
SCHEMA | TABLE | COLUMN |
VIEW | COLUMN | |
PROCEDURE | PARAMETER | |
FUNCTION | PARAMETER | |
TYPE |
| |
XML SCHEMA COLLECTION |
| |
TRIGGER |
|
|
The example script below is one I use for maintaining comments for tables and fields.
I tend to have an explicit documentation script within my projects that makes adding
or updating the database documentation easy.
If you need to use it with SQL2000 then change SCHEMA to USER
Because it is boilerplate code a peer review consists of two steps
- Making sure the CTRL + F5 (check syntax) for the script does not throw any errors
- Making sure that each item in the deployment has an entry in the script. Remember
the development standards for documenting objects have to be set and enforced.
DECLARE @v sql_variant , @ObjectName SYSNAME , @ObjectType SYSNAME , @PropertyName SYSNAME , @PropertyType SYSNAME , @SchemaName SYSNAME SET @ObjectName='ContractExpiry' SET @ObjectType=@ObjectType SET @v = N'The Contract Expiry relates to the lock in time for a particular contract rather than the actual expiry of a contract.' IF EXISTS(SELECT 1 FROM ::fn_listextendedproperty (NULL, 'SCHEMA', @SchemaName, @ObjectType, @ObjectName, default, default)) EXECUTE sp_updateextendedproperty N'MS_Description', @v, N'SCHEMA', @SchemaName, @ObjectType, @ObjectName, NULL, NULL ELSE EXECUTE sp_addextendedproperty N'MS_Description', @v, N'SCHEMA', @SchemaName, N'table', @ObjectName, NULL, NULL -- Repeat the following block of code for each property (column/constraint/index/parameter) SET @PropertyName = 'ContractExpiryID' SET @PropertyType = 'COLUMN' SET @v =N'A unique identifier for the contract expiry. This is intended to be the number of months until the contract expires.' IF EXISTS(SELECT 1 FROM ::fn_listextendedproperty (NULL, 'SCHEMA', @SchemaName, @ObjectType, @ObjectName, @PropertyType, @PropertyName)) EXECUTE sp_updateextendedproperty N'MS_Description', @v, N'SCHEMA', @SchemaName, @ObjectType, @ObjectName, @PropertyType, @PropertyName ELSE EXECUTE sp_addextendedproperty N'MS_Description', @v, N'SCHEMA', @SchemaName, @ObjectType, @ObjectName, @PropertyType, @PropertyName
Getting started with SQLDoc 2.0
Download the evaluation copy of Red-Gate SQLDoc 2.0, install and run it.
After the splash page has vanished SQL Doc will do one of three things depending
on the settings under the Tools\Application Settings menu
- Open the last project you worked on
- Present the open a project dialogue
- Create a new project
The first stage of any new project is to specify the connection which can be done
as shown in the dialogue box shown below
You should note that the documentation set is specified at the server level.
The "Documentation Setup" tab allows some basic information to be supplied
to the project. As you can see from the dialogue below this is a very simple set
of information.
If you look at the "Use application defaults" check box you can see
that the copyright notices and author have been picked up from default setting which
can be altered using the Tools\Application Options menu.
Once you click the OK button you will be faced by a dialogue box similar
to the one shown below though obviously your database server and databases will
be displayed.
You can choose the databases that you want to document and also expand the tree
structure to choose the objects within the database that you want to document. The
diagram below shows what is displayed for a table.
Where ever the
icon is shown this indicates that comments can be entered against the item associated
with that icon. To do so you simply have to click on the icon to bring up a dialogue
box as shown below
Once all the comments have been filled in and it is time to generate the documentation
set then you simply click the "Generate Documentation " icon to receive
the dialogue box shown below.
In my case I am only interested in generating web documentation but as you can see
I could generate a CHM file or MS Word Document which will give some additional
options
Both the "Document (.doc)" and "Help file(.chm) allow the filename to
be specified and also whether the filename should include the timestamp.
As you can see the "Document (.doc)" option also allows you to specify the
page size.
The "Web page(.html) option has the following features
- Allows the left hand navigation panel to be generated as an html document within
a frame or as part of each page.
- Will always place the documentation in a folder named after the server concatenated with
the ISO8601 timestamp for when the data was generated. I have contacted Red-Gate
and the ability to exclude the timestamp is on their TODO list. For now I have a method of getting around that as
will be shown later.
Generating the documentation set from the command line.
The first thing to do is to make sure that the PATH environment variable includes
the path to the SQLDoc.exe.
To do this open Windows Explorer and right-click on the "My Computer" icon
and navigate to the tab dialogue shown below.
Click on the Environment Variables button as highlighted and append the following
directory to those already in the PATH variable
Restart your computer and you should be able to run sqldoc /help from the command
line.
If that works then run the following from the command line.
sqldoc /help /verbose /html > "C:\Documents and Settings\<your profile>\My Documents\My Database Documentation\help.html"
This will list the command line switches and return codes into the help.html file for you to access locally.
To generate a documentation set for our example Adventureworks project we simply
type the following
sqldoc /project:adventureworks.sqldoc
By default SQLDoc will generate the documentation set using the settings that were
saved with the project however these project settings can be over-ridden from the
command line. The example below shows how to force a project to output in html format
with the navigation pane in its own frame.
sqldoc /project:adventureworks.sqldoc /filetype:html-frames
If we look at our output directory we will see a folder structure similar to the
one below
As you can see the tool creates a time stamped directory for your server with sub-folders
for the databases which in turn contain sub-folders for the type of database object.
This is by no means a definitive list, it very much depends on what objects you
asked to be documented.
Automating the documentation
We need to get to the stage where generating documentation is a "set it and
forget it" item. For us to do this there are some obstacles to overcome.
- The timestamp on the main folder
- The fact that the folder is named after the server
The 2nd point might not affect you but in my case I have over 250 databases
on one server so one big documentation set is likely to be unwieldy. I like to generate
my sets by line of business and purpose.
Renaming the document folder
I do this using a Windows command file. Obviously I don't know what the time
stamp on the folder will be before it is generated so I have to revert to using
the old DOS FOR command.
for /D %%a in (MyDBServer*) do rename %%a Adventureworks
In this example I am going to rename my server folder to Adventureworks. Obviously
there must not already be an Adventureworks folder so I need to put together some
commands to remove it if it already exists.
To remove a directory using the Windows command line two things must be true
- The directory must contain no files
- The directory must contain no sub-directories.
Looking at the folder structure for a database you can see similar sub-folder structures.
This suggests that being able to pass a directory name into a clean-up routine would
be a useful facility.
With this in mind I created a Windows command file called DocCleanUp.cmd with the
following commands in it.
echo off REM ************************************** REM * Remove any files in subdirectories * REM ************************************** for /D %%a in (%1) do del %%a /s/q REM ************************************** REM If a directory exists then remove it * REM ************************************** if exist %1\images (rd %1\images) if exist %1\tables (rd %1\tables) if exist %1\views (rd %1\views) if exist %1\scripts (rd %1\scripts) if exist %1\style (rd %1\style) if exist %1\programmability\Assemblies (rd %1\programmability\Assemblies) if exist %1\programmability\Database_Triggers (rd %1\programmability\Database_Triggers) if exist %1\Programmability\Functions\Scalar-valued_Functions (rd %1\Programmability\Functions\Scalar-valued_Functions) if exist %1\Programmability\Functions\Table-valued_Functions (rd %1\programmability\Functions\Table-valued_Functions) if exist %1\Programmability\Functions (rd %1\programmability\Functions) if exist %1\programmability\Stored_Procedures (rd %1\programmability\Stored_Procedures) if exist %1\programmability\Types\User-Defined_Data_Types (rd %1\programmability\Types\User-Defined_Data_Types) if exist %1\programmability\Types\User-Defined_Types (rd %1\programmability\Types\User-Defined_Types) if exist %1\programmability\Types\XML_Schema_Collections (rd %1\programmability\Types\XML_Schema_Collections) if exist %1\programmability\Types (rd %1\programmability\Types) if exist %1\programmability (rd %1\programmability) if exist %1\security\schemas (rd %1\security\schemas) if exist %1\security\roles\database_roles (rd %1\security\roles\database_roles) if exist %1\security\roles (rd %1\security\roles) if exist %1\security\users (rd %1\security\users) if exist %1\security (rd %1\security) if exist %1\storage\Full_Text_Catalogs (rd %1\storage\Full_Text_Catalogs) if exist %1\storage (rd %1\storage) if exist "%1\Support Files\SQLServer" (rd "%1\Support Files\SQLServer") if exist "%1\support files" (rd "%1\support files") if exist %1 (rd %1)
I stress that the batch commands above are for my particular databases. I have no synonyms, service broker objects etc
so you may have to add new lines to the command file to suit your own purposes.
The next step is to build a Windows command file that will actually call our DocCleanUp.cmd
file and rebuild the documentation step. I've called my BuildAdventureworks.cmd
and the source code is as follows.
cls cd\Documents and Settings\<Your Profile>\My Documents\My Database Documentationcall DocCleanup.cmd Adventureworks\Adventureworks call DocCleanup.cmd Adventureworks\SQLServerCentral call DocCleanup.cmd Adventureworks sqldoc /project:adventureworks.sqldoc /filetype:html-frames for /D %%a in (MyDBServer*) do rename %%a Adventureworks
You may notice that I call DocCleanUp.cmd 3 times. This is because SQLDoc only generates
sub-folders for databases if there is more than one database being documented.
There is some more code to add to this file in order to deploy it but I will come
back to this later.
Windows scheduled tasks
To rebuild the documentation on a regular basis it is simply a case of setting up
a Windows scheduled task to run the BuildAdventureworks.cmd file.
At this point it is worth mentioning that Red-Gate SQLDoc makes use of schema views
that require elevated privileges such as ALTER DATABASE permissions. To achieve
this I set up a Windows domain account with a strong password and made it a member of the
sysadmin group on the server.
The account is and passwords are known to the DBAs who are sysadmin members in any
case.
I used this account for the Windows Schedule Task.
How frequently should the documentation be rebuilt?
For active projects I set the documentation set to rebuild every night.
For databases where the changes are minor fixes and performance tuning I set them
to rebuild on a rolling week basis.
Copying the data to a web server
The final step is to copy the documentation set from where SQLDoc runs to a suitable
web server.
In my case I have the following setup
- An explicit documentation web server running IIS.
- A shared directory to which my files can be deployed
- An explicit web-site for database documentation with an easily remembered url such
- A set of virtual directories in the web site pointing at the shared folder and the
sub-directories for the explicit documetation set.
Virtual directories for common files
In every documentation set for Red-Gate SQLDoc there are 3 common folders
Common Folder | Description |
---|---|
Images | Holds the web icons used by SQLDoc |
Scripts | Holds the JavaScript for the navigation pane |
Style | Holds the stylesheets for the documentation |
In total they only contain 120KB of files so if these are duplicated in each documentation
set it is no big deal from a storage perspective.
If, however you want to modify the look and feel of your documentation site it makes
sense to store them once and point a virtual directory per documentation set to
each of these 3 folder. By doing so changing the style sheets or images will affect
your entire documentation site.
These files need only be copied to your web server manually as a one off with 3
commands such as
xcopy Adventureworks\Images \\DocumentationServer\DatabaseDocumentation\Images /q/y/i xcopy Adventureworks\style \\DocumentationServer\DatabaseDocumentation\style /q/y/i xcopy Adventureworks\scripts \\DocumentationServer\DatabaseDocumentation\scripts /q/y/i
Virtual directories for the documentation files
If I have gone to the bother of setting up common folders then I don't want
to copy these every time. This means I need a simple windows command file which
I shall call DocPublish.cmd which is shown below
echo off REM ******************************************************** REM * XCOPY only the necessary files for the documentation * REM ******************************************************** if exist %1\tables (xcopy %1\tables \\DocumentationServer\DatabaseDocumentation\%1\tables /q/e/y/i) if exist %1\views (xcopy %1\views \\DocumentationServer\DatabaseDocumentation\%1\views /q/e/y/i) if exist %1\programmability (xcopy %1\programmability \\DocumentationServer \DatabaseDocumentation\%1\programmability /q/e/y/i) if exist %1\security (xcopy %1\security \\DocumentationServer \DatabaseDocumentation\%1\security /q/e/y/i) if exist %1\storage (xcopy %1\storage \\DocumentationServer \DatabaseDocumentation\%1\storage /q/e/y/i) if exist "%1\Support Files" (xcopy "%1\Support Files" "\\DocumentationServer \DatabaseDocumentation\%1\Support Files" /q/e/y/i) if exist %1 (xcopy %1 \\DocumentationServer\DatabaseDocumentation\%1 /q/y/i)
I simply append 3 calls to the bottom of the BuildAdventureworks.cmd file
call DocPublish.cmd Adventureworks\Adventureworks call DocPublish.cmd Adventureworks\SQLServerCentral call DocPublish.cmd Adventureworks
Again, I must stress that the commands in DocPublish.cmd are for my particular databases. You may have to edit this file to support your
particular implementation.
One thing you do have to remember is that the domain account used for the Windows scheduled
task must be able to read/write to the \\DocumentationServer
share.
Configuring the web server
Configuring the web server is easy as for each documentation set there are three
steps.
- Set up a virtual folder for the documentation set
- Set the default document for the virtual folder to main.html
- Set up virtual folders within the documentation set for the three common folders
- Images
- Style
- Scripts
Examples are shown below
Other things to consider
There are a few other tips I should like to pass on to you as I have found them useful.
Set up hints
The SQLDoc project files are held in a central place and backed them up. This means that all DBAs know where they are held.
Although the projects are quick and easy to create it is much quicker to get them back from a backup than to have to go
through rebuilding them.
I have an explicit folder structure both on my web server and where I build the document store that is representative of how
the business thinks about the databases. The way you structure information is as important as the information itself.
I generate the documentation on a machine other than the web server and then XCOPY data across for the following reasons.
- The documentation set builds faster when it is built locally, there is no lag caused by writing across the network
- In the event of problems on the web server (I have an old version of IIS for my documentation box) the documentation set is still rebuilt, just the XCOPY fails.
- I exclude the three common folders so rebuilding direct to target is not desirable in any case
Marketing the documentation set
It is no good building a fabulous documentation set then sitting back thinking "if I build it they will come"
.
You need to broadcast the fact that there is a company wide resource describing what information is available. Think of the
documentation as a shop window for information.
The audience for database documentation is wider than you would think. In the old days database access was
limited to the IT department but now the whole business wants to know what is in the database.
- New IT recruits want a jump-start on where everything is held
- Business Analysts want to understand the basics of what information is stored and where
- Any MIS and data warehouse staff are going to be acutely interested in the database
- Any one accessing the data marts is going to be interested and these days that can go all the way up to the board room!
There will always be the die-hards who insist that documentation is for wimps and that you shouldn't waste your time.
Don't be put off. All it takes is one or two evangelists in each area to get hold of the documentation set and it will gain
a life of its own.
What next?
Two things spring to mind
- Integration with developer code documentation. JavaDoc/NDoc have <seealso>
</seealso> tags that can be used
to point to external sets. The code for the Data Access Layer can have comments that point to the database documentation.
- Search engine integration. I' a big fan of Thunderstone Webinator and
it strikes me that plugging a search engine into a documentation site would be a natural progression.