When you're doing any database development work, it won't be long before you need to seriously consider the documentation of your routines. Even if you are working solo and you operate a perfect source-control system, it is still a requirement that kicks in pretty soon, unless you have perfect recall. I like to keep the documentation for a database object with its script, where possible, so that it is easy to access. This usually means structured headers in comment blocks.
We tend to borrow techniques for documentation from procedural programmers, and there are certainly some common requirements. Both groups want to know at least who wrote and revised a routine, when, why, and what they did, even if the 'who' was yourself. I suspect that you’ll need, in addition, a comprehensive list of examples of use, together with the expected output, which can then become a quick-check test harness when you make a minor routine change.
Database routines are unlike other code in that a lot of the documentation work is done for you by the system. Information that you'd normally want to know, such as dependencies (which other database objects depend on a given object and which it, in turn, depends on) come for free, as does a host of other information such as the object type, the parameters and parameter types. The same is often true of outputs.
All this information can be placed, either manually or automatically, into structured headers or into the rather clumsy 'extended properties'. These headers have to conform to a standard, so that routines can be listed and searched. At a minimum, there should be agreement as to the headings. Many different corporate-wide standards exist but I don't know of any common shared standard for these various of documentation.
Many conventions for 'structured headers' take their inspiration from JavaDocs or from the C# XML comment blocks in Visual Studio. I've seen quite a few 'homebrew' solutions that extract structured headers from T-SQL routines, automatically add information such as name, schema, and object type, and store them in an XML file. From there on, things get murky. We don't have the database equivalent of Sandcastle, which takes the XML file and generates a formatted, readable, help file. However, one can easily do an XSLT transformation on the XML output to provide HTML pages of the data, all nicely formatted, or one can do corresponding transformations into a format compatible with HELP-file documentation systems. It is even possible to use Microsoft's SQLMetal to output an XML documentation file straight out of a SQL Server database, but you then have a tricky task in merging your header information into the output DBML file.
Should we, as a community, propose a common standard for T-SQL headers, and the information that should go into them? Should we use a range of extended properties? I like to use the /** … */ comment block together with XML-structured headers, as used by Visual Studio, using extended properties for objects such as tables that have no stored DDL.
However, is there a better way of doing this? Have I missed an existing shared standard that seems sensible?
Phil Factor (Guest Editor)