I recently needed to generate thousands of test session records for our enterprise VPN management product. I've done this more or less manually in the past and found that it's practically impossible to generate realistic ISP and VPN session data without spending a couple days writing and tweaking SQL scripts. This time I decided to search the web (again) for some product to perform this difficult and esoteric task.
There are several products out there, so I applied my usual snap judgment questions: how much? Does the product or the web site look like 1998? Does it require a scientist? Is there more than one screen shot? Fortunately after this exacting first cut, I found that Red Gate had just put out a beta of a new product, SQL Data Generator.
I was more than pleasantly surprised that the beta release installed and launched perfectly. At least half the beta software out there never gets that far.
Some of the fields I need to generate include date added and modified, phone numbers, GUIDs, OS types, and various integers keyed to look up tables. All of these data types, and plenty more, are provided right out of the box (and that's in a beta!). Many of these default "generators", as they are called, are built from regular expressions, and the expressions can be edited in the same dialog in which they are chosen. That's very flexible and easy.
As you would expect, there's also an all-purpose regular expression generator for custom expressions. The final release version of the tool, which I recently downloaded, has a powerful extension to the regular expression generator. With one click you can add values from other columns in the target table. Figure 1 demonstrates how I inserted a value from the Date Added column, and also shows the regular expression for the data in the Access Number column.
Figure 1
SQL Data Generator can also generate columns based on other columns, something that several people on their boards for the beta requested. (I was also pleased with how responsive they were to people on their boards). In my work, the best examples of dependent columns are the Date Added and Date Modified columns. The software will generated a Date Added, within a specified range, then also generate a Date Modified column based on an offset from the Date Added column. The offset can be days, or just hours and minutes. Can I say it? Sweet.
Figure 2
If you have some proprietary data that cannot be randomly generated, there is a "SQLColumn" data generator that will draw data from another database, based on a SELECT statement. Data can also be drawn from a list in a text file. The external data can be shuffled or looped, which is essential for distributing look up table values across thousands of generated records.
The generators all have features to control the uniqueness and percentage of nulls in the generated data, as do most similar products. In addition to that, you'll notice a Seed value. By default, each column has a different seed, so the generated values "begin" in different spots. However, if you want to put random nulls in the same record across two columns, then you can simply set the Seed value to be the same for each column. (The online Red Gate help says it better, so you can read that for a complete description.)
If, with all that, you still can't get the data you want (and would you be a real developer if you could?), then SQL Data Generator will run any number of SQL scripts before or after the generation occurs. Before the offset feature was added, I used a script to increment my Date Modified column, based on the Date Added column.
I couldn't figure out how to change the script name, and maybe that's not implemented (even though I asked for it, oh well), but that has a pretty minor impact on generating data. I did hit a bug after clicking the Import button; the app froze. But I couldn't reproduce it, so it doesn't count, right?
As to performance it's as fast as it needs to be, I think. You can generate tens of thousands of records in less than a minute.
There are plenty of other basic and advanced features, some I haven't even used yet. If you're looking for a data generator you should give this one a try. I haven't bothered downloading anything else because this worked so well right way, and I'm lazy, err, busy. In the spirit of full disclosure, I have been interviewed by a Red Gate product manager about my opinions on the product, and I asked for and received a free license. It gave me the impetus and "managerial rationale" for writing the review. I do hope you find SQL Data Generator as useful as I have.
---
Michael Gaertner, the author of this review, is the portal architect for ROVA, Inc. ROVA provides enterprise software to Fortune 1000 companies that reduces the cost and complexity of managing secure mobile workforce solutions.