Enforcing Sort-Of Uniqueness
Introduction
Recently I noticed a post where someone was attempting to use a UDF as a check constraint on a table. No big deal, right? Well, it shouldn't be, but that wasn't what was interesting. The interesting thing was that this person wanted to enforce a unique column value, but only based on some other column. I thought this was an interesting problem and one that was worth tackling in more depth.
The Problem
In this case, the issue had to do with emails. Each email should be unique, something that is often enforced in many systems, with one caveat. There was an active flag set and if the active flag was zero, then the uniqueness need not be enforced. Makes sense when you've got a system that may want to re-use some PK type data, but archive or deactivate the older data.
For company emails, this makes sense. Suppose I work for a company (actually I do work for a company). I am listed as "sjones". Now I quit, but I have some information I created and they would like to know that I created that information. Some time later, a "Stephanie Jones" is hired and they want to reuse "sjones" as her login and email. If there is a PK setup for the email column, this will fail.
Now I know some of you will say, just make the PK email + active. This solves the issue.
Well it does, but suppose that after a couple months, Stephanie quits and Sam Jones is hired. If you wish to deactivate the "sjones" that refers to Stephanie and add an "sjones" for Sam Jones, you will run into problems. Yes I know you can continue to add columns until the PK is unique, but for the sake of argument, you might not have unique columns in this table. Or you might not want the overhead of a 5 or 6 key index. So what can you do?
Solution 1 - Triggers
There are a few choices for solving this problem. The first choice is to use a trigger to validate your business logic. In this case, I would argue this is business logic and not Referential Integrity (RI), so a trigger can handle this. Let's look at a sample table:
create table MyLogin ( MyID int identity(1, 1) not null , Email varchar( 80) , active tinyint )
This is a rather simple table, and if the PK were set on MyID, there would be no conflicts with duplicate emails. However, the business rule would require uniqueness among active emails, or those with MyLogin.active set to 1. So an index doesn't really work, but the following trigger would:
create trigger MyLogin_Insert on MyLogin for Insert as if ( select count(*) from MyLogin m where m.email = (select email from inserted) and m.active = 1 ) > 1 begin rollback transaction raiserror( 'Duplicate Email', 12, 1) end return go
In this trigger we check the count of emails that exist in the table that match the email that was just inserted. If there is more than one match (the newly inserted row will count as one row here), then we rollback the transaction. This prevents more than one email from being inserted. Let's look at how this works in practice.
NOTE: The Raiseerror is needed if you want this to report an error to the calling program
Suppose I run the following:
insert MyLogin select 'sjones@sqlservercentral.com', 1
This will insert a single row into the table. Now if I run this same line again, I will get:
Server: Msg 50000, Level 12, State 1, Procedure MyLogin_Insert, Line 13 Duplicate Email
However, I can continuously run the following line of T-SQL and it will continue to insert rows into the table.
insert MyLogin select 'sjones@sqlservercentral.com', 0
and it will add more rows.
Solution 2 - Constraints
The second way to handle this business logic is through the use of CHECK constraints that are bound to the table. A constraint is a check that the server applies to a table or column when the data is altered in that table or column through an insert or update. These checks can be applied to a table at table creation or sometime later. If they are added later, the dbo has the ability to ignore the existing data and not apply the check, but this is not recommended.
Now drop the trigger from the above table (drop trigger MyLogin_Insert) and let's build a constraint to enforce this business logic.
A CHECK constraint must evaluate to a boolean true or false to allow the insert to be allowed or not. Therefore, since this uses complex logic, I borrowed an idea from keymoo, who posted the original problem at SQL Server Central. I will use a user-defined function to validate the inserted data. First, let's built the UDF:
create function dbo.uniqueemail ( @email varchar( 80) returns tinyint) as declare @allow tinyint select @allow = 1 if (select count(*) from MyTable where email = @email and active = 1 ) > 1 select @allow = 0 return @allow
We now bind this to the table as follows:
alter table MyTable add constraint check( dbo.uniqueemail = 0)
Now if you repeat the pattern of inserts from Solution 1, you will find that you can only add a unique email address that is active, however, you can add as many inactive emails as you like that match.
Conclusions
I am sure that some of you out there will have other solutions for enforcing uniqueness among columns without using indexes, and I would like to hear them, please post them below (Your Opinion) or send them to me. The post I saw was intriguing and I enjoyed working with the subject. While this is not the way I would enforce uniqueness in general, in this particular case, it appears to be the best way to do this.
As always I welcome feedback on this article using the "Your Opinion" button below. Please also
rate this article.
Steve Jones
©dkRanch.net March 2002