Dynamic Data Masking (DDM) is a new feature to the on-premise SQL Server instance beginning in SQL Server 2016. The purpose of DDM is to limit the ability of non privileged users to see certain information in a database. Administrative users can determine which fields are masked, and how they are masked, without requiring changes to application code. The masking is also consistently applied, no matter how the data is accessed.
This is a feature that was first introduced in Azure SQL Database, where it received testing by users, and has migrated to the on-premise product. I would guess that many other new features will follow this path, and it is a good reason to keep an eye on the changes in Azure and test the features to understand how they may help you in the future when they are released on-premise.
This article looks at how DDM works, how you can set it up, and manage permissions. Future articles will look in more depth at the way in which this feature works. The code for this article is available as a download in the Resources section below.
Adding Masking to a Column
Let's begin by creating a table with a mask over some of the data. I'll add a mask to one of the fields to start with in the table definition. Note that the way we do this is with a "MASKED WITH ()" format after the data type, but before the NULL and default options. Inside of the parenthesis we include FUNCTION = '', which specifies our function. Inside of the quotes, we specify the mask. This looks as such for a CREATE TABLE statement.
CREATE TABLE MyTable ( MySSN VARCHAR (10) MASKED WITH (FUNCTION = 'default()') DEFAULT ('0000000000' ) , MyName VARCHAR (200) DEFAULT ( ' ') , MyEmail VARCHAR (250) DEFAULT ( '') , MyInt int ) GO INSERT dbo. MyTable ( MySSN , MyName, MyEmail , MyInt) VALUES ( '1234567890', 'Steve Jones', 'SomeSteve@SomeDomain.com', 10 )
If I query this table as myself, I see a normal table. I get all the data, as it was inserted. This is because I am a privileged user. Those users with dbo privileges (db_owner or sysadmin roles), will not see masked data.
Now I want to see a normal user. I'll create one without a login, but it works the same with any user. I, of course, need to grant normal SQL Server permissions to see the data in the table.
CREATE USER mytest WITHOUT LOGIN GRANT SELECT ON mytable TO mytest
When I now query the table with this user, I see a different set of data.
We can see that the first column contains masked data. Only x's appear in place of the data. This achieves what I want, which is hiding data from my non-privileged users. Note that the data is not changed on disk. The data is only masked when returned to non-privileged users.
I can see this occurring in the last part of the execution plan. I need to grant my user rights to view the plan, but when I do, I see this plan for the user, with the same query above.
There are other types of masks I can define on the table. There is a custom mask allowing control over what is shown, an email mask for email addresses, and a random mask for numbers. We will examine those in more detail in another article.
Let's add masking to another column in this table. In this case, let's add a mask to the MyEmail column and use the email mask. In this case, we use the same format as we did with the CREATE STATEMENT. We used MASKED WITH (FUNCTION='mask()') format.
ALTER TABLE dbo.MyTable ALTER COLUMN MyEmail VARCHAR(250) MASKED WITH (FUNCTION='email()') GO
Now if I query the table as my user, I'll see that both the MySSN and MyEmail columns are masked.
I can certainly mask multiple columns in a table, each one separate from the others. Here I'll apply different masks to columns in a table.
CREATE TABLE MySecondTable ( MyEmail VARCHAR( 250) MASKED WITH (FUNCTION= 'email()') , MySSN VARCHAR (10) MASKED WITH (FUNCTION ='default()') , MyID INT MASKED WITH (FUNCTION ='random(1,4)') ) GO INSERT MySecondTable VALUES ( 'myname@mydomain.com', '1234567890', 100 ) , ( 'abrother@mycorp.com' , '0123456789' , 555) , ( 'somesister@somecompany.org' , '9876543210' , 999)
Now let's query the table with our normal user (mytest).
As we can see, I get various masks from the different rows, each applied to the data in that particular row. We will discuss the masks and defaults more in the next article, but feel free to experiment with them and view the impact of masking.
If you execute this query multiple times, you will see different, random, values for the MyID column.
Allowing Users to Unmask Data
There is a new permission in SQL Server 2016 for DDM. This is the UNMASK permission, which is granted like any other permission. Let's examine how this works. I'll create a new user with the same permissions as my existing user. I will then query the table.
As you can see, the data is masked as we expect. Note the random values are different than above, which is what we want from random values.
Now we will grant the UNMASK permission to the user and then requery the table.
Now we can see that the data appears as it would for a privileged user. All the data is "unmasked" for the NewTester user.
There is a downside to this, however. The UNMASK permission is granted database wide to users. There is no granularity by table or column. If a user has UNMASK, they can view all data in tables for which they have SELECT permission as it is stored in the database. We can see this by querying the first table with Newtest.
Removing Masks
If I decide that data no longer needs to be masked, the mask is removed with a simple ALTER TABLE statement.
ALTER TABLE dbo.MySecondTable ALTER COLUMN MySSN DROP MASKED;
Once I do this, users will no longer see masked data, as shown by the user MyTest.
Note that the data for the MySSN column is unmasked, but the data for MyEmail and MyID is still masked.
Conclusion
Dynamic Data Masking is a nice new feature designed to make protecting data from non-privileged users a bit easier for organizations. This can be implemented in the database, without any application code changes, allowing you to mask sensitive data from application users with a minimum of cost and effort.
I would also caution you that this is not really a security feature. The data stored on disk, and in your tables, is not changed in any way. This is still plain text data, and if your users have the ability to query the system, they can still potentially query your data and discover it's value.
There are limitations in this first version of the feature, most notably around the UNMASK permission. In future articles, we will examine the masks, defaults, capabilities, and limitations of DDM.
The code for this article is available as a download in the Resources section below.