Test Data Manager (TDM) is a suite of products from Redgate that make it easy to build dev and test databases in seconds. It’s a nice rewrite of a number of pieces of technology that we have sold for years, and it was launched at the PASS Data Community Summit in 2023.
I’ve been working with a few customers and sales engineers as they evaluate the fit for TDM in their environment. TDM is great once it’s running and can create a lot of agility for development teams as well as help them build better tested, higher quality software.
One of the challenges we’ve found is that the setup can be complex, and the knowledge required to get up to speed is high. There are a lot of moving parts to get this working in a way that makes it seem simple for the end users (usually developers).
In the spirit of ingeniously simple software, one of our engineers, Alex Yates, built a mini-PoC system using PowerShell that’s available. This post shows how you can get started to demo subsetting and masking in less than 10 minutes.
Getting Started
There are really a few things you need, but the tool does most of the work:
- dbatools
- Redgate tools (and a Redgate account)
- git
- make sure c:windowstemp exists
If you have dbatools installed, we import the module, if not, we download and install it for you. For the tools, we’ll download them and start a trial or you can get these tools and a license from your account executive. These tools run on various platforms, so ping your rep or sales@red-gate.com.
You should have git installed, and windows temp should be there.
Next, get the repo from here: https://github.com/alex-yates-redgate/TDM-AutoMasklet
Git makes this super simple. Just clone this down.
Configuration
There isn’t much to configure. In the repo, open run-auto-masklet.ps1 and look at the first 15 lines. These are where you might change things.
The local instance is set in line 2. If you have a named instance, use that. I’ve included a config file below that I used on a named instance.
That’s it. The repo includes a copy of Northwind in an install script to create the full sized database as Northwind_FullRestore. The subsetter will then move a portion of data to Northwind_Subset. If you want to change these names, you can do that.
Running the Tool
This tool can run run as a normal user, but if you need dbatools installed (it will do that), then it needs to run as Admin. I added that as a requirement above, so you don’t need to run this as an admin.
Here’s what the tool does:
- Get dbatools
- Get the latest versions of the subsetter and anonymize.
- authorize you, and start a trial if a license isn’t assigned to your Redgate ID.
- drop the two databases (Northwind_FullRestore and Northwind_subset by default)
- Creates the two databases with schema (and data for the full restore)
- pauses with output
- runs the subsetter to move a portion of data to the Northwind_Subset database
- pauses
- runs the classification process against Northwind_Subset to classify columns
- pauses
- Runs the masked against Northwind_Subset to mask data
Here’s the first set of output, showing the config and first part of the process.
Here is the first pause. You can see there are db create notes and then an explanation of what to see:
I like that this gives the subset command, which takes some getting used to. The TDM GUI hides this, but every customer has wanted to customize things, so this is a helpful way to do the PoC.
The subsetter does a lot, as you can see below, but basically it map out the database and then starts to determine which data needs to move. In this case, lines 10 and 11 of the source scripts shows that we are subsetting dbo.Orders with the OrderID<10260.
When this is complete, we get another message that explains what happened. We get some telemetry as well with the time taken here.
We also see the next part of the process, which is classifying the data. Again, we see the command for this, and you see this runs quickly.
Lastly, the next pause tells us there is a classification file at a particular location. We get the path if we want to look or edit the file.
Then the masker runs, and we see that 5 tables are masked. We get telemetry and below the results you see, there is more info on what’s happened and what to look for in the databases.
Checking the Tool
Once the execution is complete, I decided to look at the two databases. In SSMS, I had a vertical tab group to compare things.
First, subsets. I’ll count orders, order details, products, and employees. You can see the original db on the left and the subset on the right. Less data.
Not super impressive, but imagine there were a factor of 1000 on the left. That would be cool.
What about masking? Let’s check.
The Shippers, Suppliers, Employees, Customers, and Ordere tables were masked. Let’s look at Shippers. We can see the phone number is masked.
Checking Employees, I see less as there is a subset here, but I see data masked.
You should see similar results, and what’s more, you can alter the various config files or filters to test how your changes work.
I’m a big believer in sandboxes for learning and experimenting. This gives you a nice sandbox. You can change the various files or script and then re-run the tool in a couple minutes to see your changes.
Here are the CLI docs you might use to change things:
- Classification – https://documentation.red-gate.com/testdatamanager/command-line-interface-cli/anonymization/classification
- Masking – https://documentation.red-gate.com/testdatamanager/command-line-interface-cli/anonymization/masking
- Subsetting – https://documentation.red-gate.com/testdatamanager/command-line-interface-cli/subsetting/subsetting-configuration/subsetting-configuration-file
Give it a go and see what you think.
If you want to see a video version of this, check this out: