DTS Load Speeds

Question

DTS Load Speeds

philcart

SSC-Forever

Points: 47798
More actions
August 29, 2002 at 6:53 pm

#105262

Hi All
Just wondering if anyone can give some idea on DTS speeds given different execution methods. I've hunted around but couldn't find anything that helped. If I had a server to work on at the moment I'd run up a test case myself.
Environment
Server1 - File Storage area with SQL Client Tools
Server2 - SQL Server 2k with Datamart and Staging tables
Option 1
DTS Package saved as storage file and executed via DTSRun to push data into staging tables and then perform transformations to load datamart.
Option 2
As above but DTS package is stored on SQL Server ( as storage file or in msdb ?? )
Option 3
DTS Package is stored and executed from SQL Server.
Thanks for any feedback provided
Phill
--------------------
Colt 45 - the original point and click interface

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Antares686 SSC Guru Points: 125444 More actions · Answer 1

The fastest is to always execute at the server itself. If you do from a client machine the data wil cross the client to the server so extra traffic on network. Beyond that it is a matter of various issues. Speed of server, process running at time, index structures, number of records to insert, db options currently configured, type of recovery method, speed of hard drives, placement and size of both log and data files in relation to each other, and various other bits here and there affect overall performance.

"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

philcart SSC-Forever Points: 47798 More actions · Answer 2

quote:
The fastest is to always execute at the server itself. If you do from a client machine the data wil cross the client to the server so extra traffic on network. Beyond that it is a matter of various issues. Speed of server, process running at time, index structures, number of records to insert, db options currently configured, type of recovery method, speed of hard drives, placement and size of both log and data files in relation to each other, and various other bits here and there affect overall performance.
"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

The record volume will be in the vicinity of the 100,000's. There is no "client" involved, just two servers.

In option 1, the data and the DTS packages would reside on Server1. The data would be pushed across to the database then transformed using ExecuteSQL tasks.

In option 2, the data would reside on Server 1, but the DTS package would reside on Server2. The data would be pulled across to the server.

The data size, network speed, server speed, indexes, etc... will be the same for both options. What I would like to know is which would be faster and why.

Thanks

Phill

--------------------
Colt 45 - the original point and click interface

Antares686 SSC Guru Points: 125444 More actions · Answer 3

In that situation both should provide approximately the same result as for speed. If you test with conditions almost exactly the same the lag between each would be minimal if any noticeable at all.

"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

scotttr SSC Eights! Points: 965 More actions · Answer 4

If you are looking at speed then also look at how your DTS packages are developed. Here are some tips:

1) Develop tasks that you can accomplish using the "Execute SQL Task" as opposed to using a "Transform Data" or "Data Driven Query" Task. The addition processing that is involved with those tasks can affect performance. Even more so if you are using Active X scripting and the multi phase data pump functionality to do error checking.

2) Look at your transforms under the "transformations" tab of your "transform data" tasks

(thats the black line that you see between your connections. There are three things to remember 1. Try to minimize the number of column mappings to reduce the number of datapump operations on a row by row basis. 2. When designing column mappings, use many to many column mappings when possible to optimize performance of the data pump and to decrease the number of data pump operations. 3. Group common transformations together because each line between your columns represents a data pump operation. You can see performance benefits if you delete the sinsle column mappings, group common transformations together, and implement a many-tomany mapping.