Enums in SQL Server

SQL Server Lookups and Enums

We are told in database design class that normalisation is a good thing. We

shouldn't repeat values, for many reasons including performance and

maintainability. This means most of us have reference/lookup tables to hold

lists of values. These tables will generally have 2 columns, a surrogate primary

key normally the smallest integer data type as possible and a description column

normally varchar(x).

create table PaymentType
  (PaymentTypeId          tinyint not null
,PaymentTypeDescription varchar(255) not null
,constraint  PK_PaymentType  primary key (PaymentTypeId)
)

What this means is that the related tables contain lots of foreign key columns that one there own have no meaning, they just contain a bunch of

numbers.

create table Order
(OrderId       int      not null
,OrderDate     datetime not null
...
,PaymentTypeId tinyint  not null
)

If you want to retrieve data from the related tables based on the lookup value you have two options. You can either hard code the surrogate keys in your code, or join to the lookup table and use the description.

select Order.OrderId, Order.OrderDate, Order.PaymentTypeId
  from Order
 where PaymentTypeId = 3 --Invoice type

select Order.OrderId, Order.OrderDate, Order.PaymentTypeId
  from Order
  join PaymentType on Order.PaymentTypeId = PaymentType.PaymentTypeId
 where PaymentTypeDescription = 'Invoice'

Neither of these is ideal. The first makes your code un-readable (unless you use comments) and will only work if you have inserted the data into the lookup table with the same surrogate keys. If you have used an identity column this becomes much more difficult, especially when you have multiple environments.

The second option adds a level of complexity to all of your queries and if you have multiple lookups can affect performance.

In SQL Server 2000 and SQL Server 2005 where my application needs to perform lookups like the examples above I change my lookup table to use a meaningful surrogate key. I do this by changing the surrogate key to a 4 character column. This size is a provides data storage the same as an integer and should allow for meaningful values.

create table PaymentType
(PaymentTypeCode        char(4)      not null
,PaymentTypeDescription varchar(255) not null

,constraint   PK_PaymentType  primary key (PaymentTypeCode)
)
create table Order
(OrderId         int      not null
,OrderDate       datetime not null
...

,PaymentTypeCode char(4)  not null
)

Your code can then look like this,

select Order.OrderId, Order.OrderDate, Order.PaymentTypeCode
  from Order
 where PaymentTypeCode = 'INVC'

This is still not ideal because even if you only have 10 lookup values you are still using 4 bytes to store the surrogate key, when you could have used a tinyint and only used 1 byte

Proposed Solution

What would be great would be able to have the best of both worlds, be able to

use meaningful in code whilst maintaining the performance of small integer

values.

If you have programmed outside of TSQL i.e. C# or VB then you should now be

muttering something along the lines of "what we need is enums". Spot on, that is

what I am suggesting "Allow literals and read only table data to be represented as enums"

select Order.OrderId, Order.OrderDate, Order.PaymentTypeCode
  from Order
 where PaymentTypeId = PaymentType.Invoice

If you think it would be good to be able to do the following then please vote

for the suggestion on connect, "Allow literals and read only table data to be represented as enums( https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=254293)"

About Simon Sabin

Simon is a Independent Consultant focusing on database architecture.

He specializes in performance SQL Server systems and recently on search technologies.

To keep up on Simons thoughts on SQL

Server and other database related topics, read his blog

You can also read about Simon in a

SQLServerCentral article by Steve Jones

Proposed Solution

About Simon Sabin

Rate

Share

Categories

Share

Rate

Enums in SQL Server

Proposed Solution

About Simon Sabin

Rate

Share

Categories

Share

Rate

Related content

Worst Practice - Defining Rows that Exceed The Max Length

Is XML the Answer?

SQL and Java Go on a Date

Normalizing Dates

DBMS vs File Management System