So, you’ve created a database and an application and want to see how they operate under a substantial load of data? A common problem I find on development projects is that the database lacks enough test data to properly tune the application or database for where it will be six months down the road. Time after time, development groups build a large application without ever testing it against large amounts of data. When the application goes to production, it may be ill-equipped to handle the real-world stress of that data.
Datatect 1.6 from Banner Software provides an easy solution for loading large amounts of sample test data into your database. It can load data into nearly any OLE DB data source, including Oracle, Sybase, and, of course, SQL Server. It can also write data to an ASCII flat file for later use by an extract program. Loading this kind of data the old-fashioned way could take weeks: traditionally, you would either have someone key the data in by hand or write a script to generate it.
During this review, I tested Datatect 1.6 by loading a SQL Server 2000 database with 1 million records in the parent tables and 20 million in the child tables. All the data loads were performed against the Northwind database. I have evaluated other data-loading programs in the past, and this product was the best by far. While other products such as Quest Software’s DataFactory had a cleaner interface, DataFactory was much harder to use and was not as stable with SQL Server. It worked fine loading data into Oracle and DB2 but struggled with SQL Server 2000 (it did work fine with SQL Server 7.0). Banner Software’s Datatect, however, handled the high-stress job of loading 128 million total records into SQL Server 2000 with ease. This amount of data equated to 50 GB in the Northwind database. Other products I tried locked up after a few thousand records.
Datatect made loading data into child tables easy. As you define how you would like to load each table, it automatically reads the table’s properties and recommends how to load the data. For example, if a column had a foreign key constraint on it, Datatect would automatically detect this and load the column with values chosen randomly from the parent table (a sketch of the kind of lookup involved follows this paragraph). Of course, the more columns like this you select, the slower the product will load data, because it has to run a select statement against the parent table. You can also set the chance that a column will be left NULL, as shown in the screenshot below.
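To put the performance cost in perspective, every foreign key value has to come from a row that already exists in the parent table. Datatect’s internal query isn’t documented, so the following is only a T-SQL sketch of that kind of lookup, using Northwind’s Categories table as the parent:

    -- Pick one existing parent key at random; a generator has to do something
    -- like this (or cache the parent keys) for every foreign key column it fills.
    SELECT TOP 1 CategoryID
    FROM Categories
    ORDER BY NEWID()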
As you can see in the above screenshot, I’m loading the Products table in the Northwind database. When you load a table that has an identity column on it (like the Products table), make sure you uncheck the identity column, or Datatect will try to force data into the column and you will receive an error. While Datatect does read SQL Server properties such as data types and nullability, it does not detect whether a column has the identity property.
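The error comes from SQL Server itself, which rejects explicit values for an identity column unless you specifically allow them. A quick illustration against the Products table (the error text below is from memory, so treat it as approximate):

    -- ProductID is an identity column, so supplying a value for it fails:
    INSERT INTO Products (ProductID, ProductName) VALUES (1, 'Chai')
    -- Msg 544: Cannot insert explicit value for identity column in table
    -- 'Products' when IDENTITY_INSERT is set to OFF.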
As you select a table, you can set a column to any number of items. Datatect ships with a number of pre-defined lists of businesses, names, and cities (among other lists). You can set a column to a fixed value or to a random string of numbers or characters. You also have the option of adding your own customized lists of values; for example, you could export data from a different source into a flat file and then import it into the tool. Datatect also allows for advanced scripting, where data can be massaged as it enters the system. Keep in mind that the more advanced you make your data load, the slower it will run.
Another item to watch for is when you add a new table to your project. When you click New on the Table Specs tab, you are given a list of the tables and views that are available to add to your project. Be careful not to add a SQL Server view on this screen, or you will receive an error when trying to load it. The option should remain, though, in case the user wants to load a distributed partitioned view.
To load a table, right-click on the table and select Generate to Table. You will then be presented with the screen below, where Datatect asks how many records you would like to load into the table and how often you would like to commit the records. If you’re loading a large database and you have ample RAM, it may make sense to increase the commit size. In my data loads of over a million records, I set the commit size to 10,000. You also have the option on this screen to purge the old records before loading or to simply append your data to the end of the table.
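If the commit-size setting is new to you, it simply controls how many inserts are grouped into one transaction. The T-SQL below is not how Datatect works internally, just a rough sketch of what a 10,000-row commit size amounts to:

    -- Commit every 10,000 rows instead of once per row or once at the very end.
    DECLARE @i int
    SET @i = 0
    BEGIN TRANSACTION
    WHILE @i < 1000000
    BEGIN
        INSERT INTO Products (ProductName, UnitPrice) VALUES ('Sample product', 10.00)
        SET @i = @i + 1
        IF @i % 10000 = 0
        BEGIN
            COMMIT TRANSACTION
            BEGIN TRANSACTION
        END
    END
    COMMIT TRANSACTION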
Before you click OK, make sure your database is already sized for what it will be after the data load (or at least your best guess). If your database is set to grow automatically, you will run into a problem with Datatect if the database has to grow during the load. For example, I ran into this when my 20 GB database was set to grow by 10%. When the autogrow feature was triggered, the database spent three minutes growing by 10%, which timed out the Datatect application. This happens because SQL Server takes a series of locks that prevent data from being loaded while the database is growing. This is not Datatect’s issue, but rather SQL Server locking.
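The simplest way to sidestep the autogrow problem is to grow the data file yourself before starting the load. The logical file name below is an assumption (run sp_helpfile to see yours), but the command itself is standard T-SQL:

    -- Pre-size the Northwind data file to roughly the expected post-load size
    -- so autogrow never fires in the middle of the load.
    ALTER DATABASE Northwind
    MODIFY FILE (NAME = Northwind, SIZE = 51200MB)   -- roughly 50 GB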
If you’re looking for a fast way to load your test data, Datatect is your answer. Because it automatically detects your foreign keys, its interface was much easier to use than its rivals’. The only interface issue I had was that there was no easy way to load all the tables in your project in one execution; instead, you had to load each table individually. This keeps you from starting a data load before you leave for the weekend and having it done when you get back, although there are ways around it with some of the available scripting features. Other data-loading products did have this feature built into the GUI but became buggier as more child tables were added.
If you expect a data load of a few million records to execute in an hour, don’t hold your breath. Complicated data loads with millions of records can take a long time to run. For example, my 60-million-record data load took the better part of a weekend to complete. Yes, I’m afraid I have nothing better to do on a weekend than watch a data load.
Datatect 1.6 makes loading data easy. After years of the old-fashioned method of writing scripts to load my sample data, I was able to save days of script writing by using this product. Datatect is sure to offer even the most advanced users ample features and time-saving techniques for loading data. By using this product, you can make sure your application’s performance holds up before it ever enters production.
Datatect
Pricing: $995.00 per copy. Discount schedules are available for 5-packs, 10-packs, etc.
Banner Software, Inc.
Email: banner@ns.net
Phone: 800-336-9969 (U.S. and Canada); 916-481-1157 (outside U.S. and Canada)
Web site: www.datatect.com