Introduction
I have a love hate affair with user defined types. I was first exposed to their benefits when I went on the original
Microsoft SQL 6.5 database design course. My first reaction was "WOW, this is a really cool idea" but now I am much more
circumspect in how I use them.
Their advantages are immediately apparent
- They greatly simplify the creation of the database schema
- They aid in standardisation across the schema.
When I was first learning about UDTs their disadvantages were not discussed and it was only through trial and
error that their limitations became apparent.
What is a user defined type?
The basics
Leaving aside the CLR data-types which are something different entirely at its simplest a UDT is simply a shortcut
to a built in data-type plus a definition of whether it is nullable or not.
So we could create a UDT of ty_rqdesc as follows
exec sp_addtype ty_rqdesc, 'VARCHAR(50)','NOT NULL'
Or using SQL 2005 syntax
CREATE TYPE [dbo].ty_rqdesc FROM VARCHAR(50) NOT NULL
This means we can create a table using our type such as
Using a UDT | Using standard SQL |
---|---|
CREATE TABLE dbo.SalesTerritory ( SalesTerritoryID INT NOT NULL , SalesTerritoryDescription ty_rqdesc ) GO | CREATE TABLE dbo.SalesTerritory ( SalesTerritoryID INT NOT NULL , SalesTerritoryDescription VARCHAR(50) NOT NULL ) GO |
Where ever we use our ty_rqdesc data-type it will always be a variable length 50 character field that does not accept NULL values.
Taking it a step further
There are two other optional things we can do with our user defined type
- Bind a rule to the type to act as a CHECK constraint
- Bind a DEFAULT to the type
For example we could create a rule and bind it to our datatype as follows
CREATE RULE dbo.rl_reqdString AS LEN(RTRIM(@value))>0 GO exec sp_bindrule 'rl_reqdString','ty_rqdesc'
What this now means is that our user defined type is a compulsory variable length 50 character field into which some entry must be made.
We cannot get away with entering an empty string into the field.
A real world example
Let us suppose that we want a user defined type to represent gender. Gender is a mandatory field in the database, 80% of our customers are female
and we are going to use ISO5218 encoding.
Code | Meaning |
---|---|
0 | Unknown |
1 | Male |
2 | Female |
9 | Not applicable |
As an aside, if you are going to store real world items in your database such as gender, countries, business classifications etc
I would always check to see if there is an international, or at least national encoding scheme for such data. In a world where
communications between organisations become ever more important, using an agreed standard makes far more sense than inventing
your own encoding scheme.
We would define our type as follows
exec sp_addtype 'ty_gender','TINYINT','NOT NULL' GO CREATE DEFAULT dbo.DEF_FEMALE AS (2) GO CREATE RULE dbo.gender_rule AS @gender IN (0,1,2,9) GO exec sp_bindrule 'gender_rule','ty_gender' exec sp_bindefault 'DEF_FEMALE','ty_gender' GO
Now let us look compare a table definition using standard SQL versus a user defined type as we did before. We will
only use our ty_gender type in the example for simplicity sake.
Using a UDT | Using standard SQL |
---|---|
CREATE TABLE dbo.SalesPerson ( SalesPersonID INT NOT NULL , SalesPersonName VARCHAR(50) NOT NULL , SalesPersonGender ty_gender ) GO | CREATE TABLE dbo.SalesPerson ( SalesPersonID INT NOT NULL , SalesPersonName VARCHAR(50) NOT NULL , SalesPersonGender TINYINT DEFAULT(2) NOT NULL CONSTRAINT CHK_SalesPersonGender CHECK (SalesPersonGender IN (0,1,2,9)) ) GO |
Advantages of user defined types
So far the advantages are obvious
- We can couple a base type to a rule and default to represent a more complex entity
- We have a shorthand implementation for the above
- We have a standardised definition for a unit of business information for use in our database
The overall advantage that user defined types do have is they make it easier to build in data integrity by default. My
experience with non-DBA designed databases is that data integrity is normally very poor. Instead of the database engine
being thought of as a data integrity engine it is used as a retrieval system for what is basically just a
glorified dumping ground for data.
What is not apparent at this stage are the disadvantages of using these user defined types
The disadvantages of using user defined types
If portability between platforms is important to you then user defined types may be a no-go area. This is even more
true of SQL 2005 with its potential to use CLR data-types.
The big bugbear for user defined types is that they are not easily altered and require a great deal of work to plan for
an alteration. The time saved in creating a schema using the user defined types is lost thrice over if the user defined type has to be altered.
A concrete example of a UDT problem
Let us go back to our ty_rqdesc example. It is our standard non-nullable VARCHAR(50) with a rule bound to it to prevent zero length strings.
I decide that the use of this type is inappropriate for our dbo.SalesTerritory table and decide I want to use a nullable VARCHAR(40) field instead. I use my ALTER TABLE
statement as follows
ALTER TABLE dbo.SalesTerritory ALTER COLUMN SalesTerritoryDescription VARCHAR(40) NULL
It seems to have worked however when I go to create a new record with a null entry or zero length string I get an error telling me that a rule
has been violated!
Although I have altered the data type the rule that was originally attached to our data type remains bound to the column.
The same is also true of any default that may have been originally bound to the user defined data type.
For the purposes of creation a user defined data type can encompass the type, nullability, default and check constraint. For the purposes of altering a datatype
of a column the user defined datatype is only regarded as the datatype and nullability.
The problem is compounded because if I want to unbind the rule from my user defined type then drop, recreate and reattach the rule in a more appropriate form
then I am going to be told that I cannot drop the rule because it is bound to a column. This means that there is more involved in altering a datatype
than is immediately apparent.
Unfair accusations?
To be fair if you use either sp_bindrule or sp_bindefault you are going to
run across the same problems. To amend a bound rule or default you are going to have to sp_unbindrule and
sp_unbindefault all occurrences.
Identifying all occurrences can be done using the query below however a suitable script has to be written to amend and rebind
the objects back onto their source.
-- Identify where the rule "gender_rule" is used SELECT OBJECT_NAME(id) AS TableName, C.name AS ColumnName, T.name AS TypeName FROM dbo.syscolumns AS C INNER JOIN dbo.systypes AS T ON C.xusertype = T.xusertype WHERE C.domain =OBJECT_ID('dbo.gender_rule') -- Identify where the default "DEF_FEMALE" is used SELECT OBJECT_NAME(id) AS TableName, C.name AS ColumnName, T.name AS TypeName FROM dbo.syscolumns AS C INNER JOIN dbo.systypes AS T ON C.xusertype = T.xusertype WHERE C.cdefault =OBJECT_ID('dbo.DEF_FEMALE')
The more I think about this the more this sounds like an internal SQL Server version of Don Peterson's
Lookup table madness where
an object orientated approach is grafted onto a non-object orientated system. In our case it is thinking that a table specific constraint
can be represented by a single generic object because that constraint "is a type of" gender constraint.
Retrospectively changing a column from a built in datatype to a user defined type
Let us suppose that I have created a table as follows
CREATE TABLE dbo.ASample( id SMALLINT IDENTITY(1,1) NOT NULL, sometext VARCHAR(30) NULL, CONSTRAINT PK_ASample PRIMARY KEY CLUSTERED (id ) )
I put some data into the table and run the following
ALTER TABLE dbo.ASample ALTER COLUMN sometext dbo.ty_rqdesc
If I run sp_help 'dbo.ASample' I can see that the field size and NULLability have
been addressed but any rule and default that have been bound to the user defined datatype have not been applied.
Other problems with user defined types
Sooner or later you are going to want to use your user defined type in a temporary table or table variable.
Firstly, you cannot use your user defined type in a table variable if it has a rule or default bound to it.
Msg 1710, Level 16, State 1, Line 1 Cannot use alias type with rule or default bound to it as a column type in table variable or return table definition in table valued function. Type 'ty_rqdesc' has a rule bound to it.
Secondly, before you can use your user defined type in a temporary table you have to create the type in TEMPDB. The only reliable
way of ensuring that your user defined type stays available to you after a reboot is to include the type in the MODEL database.
This means that if you need to alter your user defined type in your main database you will have to remember to amend it in MODEL and all
subsequently created databases. If you are not prepared to restart the MSSQLSERVER service this includes alterations to the TEMPDB
implementation of your user defined type.
Thirdly, if you are going to deploy your database to multiple servers then you must ensure that your user defined type is present
on all target servers. There is an exception to this rule. Replication gives the option to translate user defined types to their
native interpretation. I have to admit that I have never attempted this. I have had problems with replication where a feature
should work but does not for some reason. For example, telling the publication not to replicate DELETEs.
Lastly, a user defined type is really a template used at the point of creation and I cannot alter that template and expect it
to propagate the change all the way through my schema.
Conclusion
For most applications I feel that user defined types are more trouble than they are worth. If you are going to use them then
I would recommend limiting their use to types that are unlikely to change. Some examples are listed below
- Non-nullable DATETIME or SMALLDATETIME type defaulting to the GETDATE() value
- Non-nullable DATETIME or SMALLDATETIME type defaulting to the date portion (excluding the time) of the GETDATE() value
- Non-nullable numeric data types defaulting to zero
- Integer types with rules to allow YES/NO type responses. BIT fields are not part of the ANSI standard so I tend to use
TINYINT fields with defaults and rules attached to them instead.
- Email fields as VARCHAR(100) with a rule to check for the presence of an @ symbol and a . character.
- ISO5218 gender fields. Arguably this could be handled by a gender table and a foreign key but as this is most
likely to be a fixed set of values I would stick with a range that cannot be accidentally altered by DDL statements.
In the past I have used user defined types extensively and they undoubtedly allowed me to build a database schema very rapidly.
I was fortunate that I was in a position to spend a great deal of time designing the database and as a result the only schema changes
that were necessary in a 5 year period were to add new tables and the occasional field. I never had the pain of amending the
schema.
If amending a user defined datatype could be achieved in one command and the type automatically reapplied across the database
then I would be more enthusiastic about their general use but can you imagine the affect of a cascading data type upgrade on a
database with many hundreds of tables?