An Approach to SQL Server 2005
Encryption
Overview
There is an abundance of information regarding SQL Server
encryption, too much almost.
When starting out, I just did not get it. I read and re-read many
articles, executed samples,
but it all just seemed very confusing. I'm not an encryption expert,
just a DBA trying to
figure out how to implement new 2005 functionality.
Now with quite a bit of time invested, a lot of
experimentation, and a lot of feedback
from my coworkers, I believe I finally have figured this out well
enough to have a plan of
attack. Here I want to try to explain it in a way that I hope is clear
and understandable
on the first reading.
I also look forward to feedback, especially if I've made some
fundamental error or have
missed an important consideration.
Simplest Approach
When I want to use SQL encryption, what I really want to do is
encrypt columns of data.
SQL Server doesn't offer an "encryption" attribute for columns, so what
we do is manually
encrypt the data on insert/update and then decrypt the data when
selecting. Let's use the
following table for test purposes:
create table MyTable
(
ID int not null identity(1,1) primary key clustered,
ClearText varchar(16) not null, --would not have this column in a real table
EncryptedText varbinary(68) not null
);
The simplest way to encrypt/decrypt data is with the
EncryptByPassphrase() and
DecryptByPassPhrase() functions. The problem with these is that we need
to supply the
passphrase every time we want to encrypt or decrypt the data.
Hard-coding the passphrase
into a stored procedures is no good, so we'll probably have to pass the
passphrase in as
a parameter to our data access stored procedures. Which means all
calling applications
will have to know about and protect the passphrase. All in all, I feel
this is not the
best way to proceed. My only point here is that we could do it as
simply as this, if we
chose to:
--passphrase approach
declare @s varchar(16);
set @s = 'TestText';
insert MyTable(ClearText, EncryptedText) values (@s, EncryptByPassPhrase('MyPassPhrase', @s));
--verify we can decrypt data
select ID,
ClearText,
cast(DecryptByPassPhrase('MyPassPhrase', EncryptedText) as varchar(16)) as "Decrypted",
EncryptedText --just to verify the data really is encrypted
from MyTable;
That's it. Please note that so far we have NOT had to deal
with certificates, master keys,
service master keys etc. Our DB may not even have a master key yet, no
problem.
Better Approach
The big problem with the above approach is dealing with the
passphrase. How are we going
to secure it? I don't want to have to worry about that if I don't have
to. Storing this passphrase inside a stored procedure is clearly no good.
Perhaps if we instead use the EncryptByKey() and
DecryptByKey() functions these problems go
away? Turns out the answer is "maybe"...it depends on how we encrypt
our key. If we encrypt
"by password" then we'll have the same fundamental problem as with the
passphrase approach, and
that password will be required for all data access.
But if we encrypt our symmetric key "by certificate" then
there is no password/passphrase
problem! We create our encryption key use the following syntax:
create symmetric key MyKey with algorithm=AES_256 encryption by certificate MyCert;
...oops, that doesn't work "Server: Msg 15151, Level 16, State
1, Line 2 Cannot find the
certificate 'MyCert', because it does not exist or you do not have
permission." I guess it
makes sense that we need to create the certificate first:
create certificate MyCert with subject = 'MyCertSubj';
...oops, that doesn't work either! "Server: Msg 15581, Level
16, State 1, Line 2 Please
create a master key in the database or open the master key in the
session before performing
this operation." I guess we need to create the DB master key first:
create master key encryption by password = 'SecretPassword';
That worked (finally!). Now we create the certificate, and
then the key:
create certificate MyCert with subject = 'MyCertSubj';
create symmetric key MyKey with algorithm=AES_256 encryption by certificate MyCert;
Now we have a key. We only need to change our example data
access code only a little bit:
a) add a step to open the key, and b) replace the passphrase functions
with their counterparts
EncryptByKey() and DecryptByKey():
--symmetric key approach; need to open the key once per session
open symmetric key MyKey decryption by certificate MyCert;
--eliminate old data, then insert new data
declare @s varchar(16);
set @s = 'TestText';
delete MyTable;
insert MyTable(ClearText, EncryptedText) values (@s, EncryptByKey(Key_GUID('MyKey'), @s));
--verify we can decrypt data
select ID,
ClearText,
cast(DecryptByKey(EncryptedText) as varchar(16)) as "Decrypted",
EncryptedText
from MyTable;
Note that nowhere in the data access code do we need to
specify a password...this is
definitely the way to go!
We should now be able to see why so many of the encryption articles discuss master keys etc...we must have one in order to create a certificate. I don't know much about master keys and I am pretty sure I don't need to know or care much about them, just need to have one.
Also note that unlike many other examples, the "create certificate" statement above does not use the "authorization" qualifier. That qualifier limits the certificate (and ultimately the entire encryption scheme) to a single user. For me that's a deal-bre aker. The way around this is to simply avoid the authorization qualifier, and instead control access to the certificate with grant statements.
Open Symmetric Key Has a Cost
There is definitely a cost associated with this statement. As long as we don't explicitly close our symmetric key, it stays open for the duration of our session. Several simplistic encryption approaches I've seen have an "open symmetric key statement" at the top of every insert/update/select stored procedure or workflow. This is definitely much more expensive than only opening the key when it's not open
already. That is to say, it's far cheaper to test if the key is open than it is to open it again.
In a very simple performance test I conducted, inserting a few thousand rows of encrypted data took 44sec when opening the key for every single-row insert, and took only 13sec when opening the key only as needed.
Here's one way to tell if a key is already open:
if not exists(select 1 from sys.openkeys where key_name = 'MyKey' and database_name = db_name() )
DecryptByKeyAutoCert()
This useful function is simply a combination of an "open symmetric key" statement and the DecryptByKey() function. It will open the key only if it has to, then does the decryption. I've verified this with simple timing tests, and I'm
positive this function does not open the key every time it's called. I see no downside to ALWAYS using this function instead of DecryptByKey() when using keys encrypted by a certificate.
Aside from using this to replace DecryptByKey() in our data access code, another pretty cool use for this function is that we can use it to create views which decrypt the data automatically, on-the-fly. No need to worry about user's
connection opening a key or not, it's automatic. Try it, it's easy!
There is no corresponding function for encrypting data...too bad.
How Big Does the Encrypted VarBinary
Column Need To Be?
While we can always get by with varbinary(max),
http://blogs.msdn.com/yukondoit/archive/2005/11/24/496521.aspx
basically says that we can use the following formula to determine
precisely how big to
make our varbinary encrypted data column (this page also warns us to
test the result
by encrypting the largest possible string and checking the size of the
encrypted result):
declare @PlainTextLen int, --length of the plain text
@UsesAuth tinyint, --1 if using the optional authenticator parameter, else 0
@BlockSize tinyint; --8 if using DES, else 16 if using AES
select @PlainTextLen = 16, --my plaintext is 16 chars long
@UsesAuth = 0, --not using authenticator
@BlockSize = 16; --AES algorithm
select @PlainTextLen as PlainTextLen,
((floor((8 + @PlainTextLen + (@UsesAuth * 20)) / @BlockSize) + 2) * @BlockSize) + 20 as CipherLen;
Optimization - Indexing Encrypted Data
If we ever need to use the encrypted data as primary search
criteria in queries
(assuming our data is not trivially small) then we will need to take
additional steps.
We can't simply add a DecryptByKeyAutoCert(...) = @param clause to our
WHERE clause
and expect decent performance...this will surely result in a clustered
index or table
scan every time.
One way to ensure adequate performance is to add an extra
indexed column to our
table, containing a hashed version of the PlainText. There are a few
good articles
published on this subject. It's not that simple! If we apply a simple
hashing function
(e.g. checksum()) to our plaintext data, then an attacker can use a
dictionary attack
to determine the plaintext corresponding to our hashed values. Some of
the articles
I've read suggest very complex solutions to this problem.
I don't think it has to be that complex. I believe that a
satisfactory solution to
this problem is to apply a hashing function to a subset
of our plaintext
only--this way a dictionary attack will yield at most only a subset of
the original
plaintext. We will need to determine what constitutes an acceptable
subset of plaintext
on a case-by-case basis. Obviously if the subset is large, we risk
giving away too
much of the plaintext to an attacker; if the subset is small, then the
value of
the index is reduced.
Thus a simple but effective algorithm for such a function is:
hash value =
checksum(subset(plaintext))
Whatever function we pick, we need to add a new indexed column
to our table,
populate it with the hash value on inserts, maintain it on updates, and
then
in our queries use both the new column and the encrypted column
(decrypted) in
our WHERE clause. The optimizer will use the index on the new column to
narrow
the overall search, and only matching rows will require decryption to
see which
rows ultimately match our criteria.
create table MyTable
(
ID int not null identity(1,1) primary key clustered,
ClearText varchar(16) not null, --would not have this column in a real table
EncryptedText varbinary(68) not null,
HashedText int not null
);
create index MyTable_HashedText on MyTable(HashedText);
--a UDF to implement hash function
create function dbo.fn_HashText(@Text varchar(16))
returns int
as
begin
return checksum(isnull(left(@Text, 6) + right(@Text, 4), ''))
end
go
--insert row
declare @s varchar(16);
set @s = 'TestText';
insert MyTable (ClearText, EncryptedText, HashedText)
select @s,
EncryptByKey(Key_GUID('MyKey'), @s),
fn_HashText(@s);
--search for row (assuming table is not trivially small)
declare @searchstring varchar(16);
set @searchstring = 'TextText';
select ID,
ClearText,
cast(DecryptByKey(EncryptedText) as varchar(16)) as "Decrypted",
EncryptedText
from MyTable
where HashedText = fn_HashText(@s)
and cast(DecryptByKey(EncryptedText) as varchar(16)) = @searchstring;
Moving Encrypted Data to Another
Server - Disaster Recovery
One thing we want to be sure of is that we can move our
encrypted data to another
server and still be able to decrypt it. For now let's call the server
currently
containing the encrypted data the "primary" server, and the server we
want to move
data to the "secondary" server.
The absolute simplest way to make this happen is to copy the
primary server's
service master key over to the secondary server. The only way to synch
service
master keys is to backup the key on the primary server and then restore
it on the
secondary server.
--on primary server: backup service master key
backup service master key to file = 'service.key' encryption by password = 'MyPwd';
--on secondary server: restore service master key
restore service master key from file = 'service.key' decryption by password = 'MyPwd';
Then when we restore/log-ship/replicate data from the primary
server to the
secondary, the same encryption/decryption mechanisms that work on the
primary
server will work on the secondary. For log-shipping, this is the only
way that
will let us access data in a standby secondary database.
Note: We don't have to know anything about these service
master keys, just that
they're synched!
Clearly, it's best to do this when the secondary server is
relatively new and there
are not any dependencies on the service master key. Once someone starts
using encryption
functionality on that server, we can no longer change the service
master key without
consequences. If we cannot replace the secondary server's service
master key, then we
must tackle things at the DB level. That is no big deal, but is beyond
the scope of
this article.
Proposed Approach
Now I'll sum all this up and present what I believe is the
best way to implement
data encryption in SQL 2005.
Setup:
- Create a master key, if one does not already exist.
- Create a certificate, do not specify the authorization
qualifier.
- Create a symmetric key, being sure to specify the
key_source and identity_value
parameters--this way we will be able to recreate the exact same key in
a different
DB or on another server. We must choose an encryption algorithm; I
typically use DES
for playing around on my workstation, and AES_256 for production.
- Create a DB role for our application (or whatever).
- Grant control access on the certificate to the DB role.
Users in this role will
be able to automatically access the certificate.
- Grant view definition access on the symmetric key to the DB
role.
- At this point all users in the DB role will be able to
encrypt and decrypt
data. Use role membership and standard TSQL grants (e.g. to views or
stored procedures)
to achieve more granular control as desired.
- If the encrypted data will ever be the primary search
criteria in queries, write
a UDF hash function, add an indexed column containing the hash value,
and reference that
column in queries. See above for detail.
- If there is any requirement to restore, log-ship, or
replicate this data to another
server, synchronize service master keys on both servers. See above for
detail.
- Save scripts for the above in source control. In the event
that master keys are lost,
we can always use these to gain access to encrypted data.
if not exists (select 1 from sys.symmetric_keys where name = '##MS_DatabaseMasterKey##')
create master key encryption by password = 'SecretPassword';
if not exists (select 1 from sys.certificates where name = 'MyCert')
create certificate MyCert with subject = 'MyCertSubj';
if not exists (select 1 from sys.symmetric_keys where name = 'MyKey')
create symmetric key MyKey with algorithm=AES_256,
key_source='SomeUniqueString', identity_value='SomeOtherUniqueString',
encryption by certificate MyCert;
create role TestRole;
grant control on certificate::MyCert to MyRole;
grant view definition on symmetric key::MyKey to MyRole;
Data Access:
- To insert/update/encrypt data, check if the key is already
open and if not, open it.
Then perform the insert using EncryptByKey().
- To select/decrypt data, use DecryptByKeyAutoCert() and cast
the result to the proper datatype.
--open symmetric key if needed before insert/update
if not exists(select 1 from sys.openkeys where key_name = 'MyKey' and database_name = db_name())
open symmetric key MyKey decryption by certificate MyCert;
insert ... EncryptByKey(key_guid('MyKey'), @ClearText) ...
--but no need to open symmetric key before select
select ... convert(varchar(16), DecryptByKeyAutoCert(cert_id('MyCert'), null, EncryptedText)) ...