Performance tuning

  • We have a SSIS package that runs a web service that is a vendor's API.

    Basically the process it passes a list of bus routes, and then for each routes, it find stops in routes, calcualte the corners direction, and finally get students' stop detailed information and save in database. so there are a lot of traffic between web server and database server.  the job sometimes run over 10 hours esp. when beginning of school year which has most routes to process.

    I would like to tune and reduce the time of the job run in some way.

    I run this SSIS on dev, test, prod enviroment, same exact databases, but the one currently runs slowest is test server, then prod, then dev. for some reason dev environemnt is the fastest, and run 2 hours less than other environment, even prod has more CPU, RAM.

    Is there any way that can increase the speed of the SSIS/Web service run?

    I noticed the faster dev SQL server has 1 socket, 4 virtual processors, but slow test SQL server has 4 sockets and 4 virtual processors, I read online it should not make any difference, does it?

    For RAM dev has 8 GB, test has 24 GB. so test has more, but slower.

  • Have you analysed execution logs to determine where the slowness occurs?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • all this web service is to process each route one by one for stops and students, and street corners. After one route, process another one. It repeats same process for each route, student. I think the slowness is because it constant traffice between networks, but not 100% percent sure why dev enviroment runs faster than test environemt 2hour ahead of time for same db. For the web service itself I cannot change code since it is verdors web service and API.

  • sqlfriend wrote:

    all this web service is to process each route one by one for stops and students, and street corners. After one route, process another one. It repeats same process for each route, student. I think the slowness is because it constant traffice between networks, but not 100% percent sure why dev enviroment runs faster than test environemt 2hour ahead of time for same db. For the web service itself I cannot change code since it is verdors web service and API.

    So your answer to my question is 'no' then?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Yes, I did analyze the execution log, the slowess happened on each route processed for each stop location.

  • sqlfriend wrote:

    Yes, I did analyze the execution log, the slowess happened on each route processed for each stop location.

    That sounds like your entire process.

    Doesn't your SSIS package contain multiple steps? Eg, get data from Web service, deserialise JSON, stage results, transform results, merge transformed results into database etc etc?

    Each one of these steps takes a certain amount of time. Out of all of them, is there any one step which seems to take much longer than all of the others?

    • This reply was modified 3 years, 3 months ago by  Phil Parkin. Reason: Fix typo

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • All the steps in ETL runs pretty quick except this only task called execute process task which calls a web application .exe which is a wrapper that actually called vendors web service.

    It is the web service that runs slow, however dev and test have exactly same databases and application, but dev job step runs 3 hours faster than test server,  the total hours of the job is 10 hrs.

  • sqlfriend wrote:

    however dev and test have exactly same databases and application

    That never turns out to be the truth.  At  the very least, they live on different hardware, they have different NICs and connections.  They usually have different memory.

    And even all those were as near perfect as possible, they almost always have different data with varying conditions in the indexes and statistics, etc, etc.  Heh... then check out the configuration settings.

    Find out what is obtrusively different and you'll know the answer to your question as to why one is slow and the other is not.

    My bet is that it's going to be related to the data itself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sqlfriend wrote:

    All the steps in ETL runs pretty quick except this only task called execute process task which calls a web application .exe which is a wrapper that actually called vendors web service.

    It is the web service that runs slow, however dev and test have exactly same databases and application, but dev job step runs 3 hours faster than test server,  the total hours of the job is 10 hrs.

    Web Services can be notoriously slow if you are processing individual records through them.  First are your dev/test/prod servers all calling the same vendor service or do they have separate services as well, it's common for a vendor to have a dev/test service that simply doesn't process as much as their prod does(or just returns dummy values)?  Second can you batch up the calls to the web service, IE instead of calling 1 service for each route for example call the service with 100 routes sometimes that can save significant over head of having to go through the web?

  • I did see in the .net code the route list is passed in batch to the application, but when it process detailed information in each route, it has to process each route one by one: some .net code like

    foreach (int routeId in routeList)

    {

    for (int i = 1; i < 6; i++)....

    ...}

     

    I also see in .net code:

    private DataSet GetRouteData(int jobId, int routeId)

    {

    DataSet dataSet = null;

    if (jobId != 0)

    {

    while (_vtStudentService.GetJobStatus(jobId) == VT.StudentService.JobStatusType.jsPending)

    {

    // wait...

    Thread.Sleep(1);

    }

    Could that also contributes the slowness ? I am not a .net developer, so it is really hard for me to do code side tuning.

    • This reply was modified 3 years, 3 months ago by  sqlfriend.
  • How many total service calls is it making?  And how long does each one take?

  • SSIS only has one execute task that  calls the web service.  I run today, there are 2000 routes to process, it runs about 8 hours on dev, and 10 hours on test server.

     

    • This reply was modified 3 years, 3 months ago by  sqlfriend.
  • sqlfriend wrote:

    SSIS only has one execute task that  calls the web service.  I run today, there are 2000 routes to process, it runs about 8 hours on dev, and 10 hours on test server.

    There are literally dozens of possible reasons for such differences between the two boxes.  To be honest and from what little I've seen and heard, it would actually be a waste of time to try to figure it out.

    If it were me, I'd spend the time on replacing the Web Service code and rewriting the T-SQL.

     

    You talk about "routes"... what does all this actually do?  What's a "route"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sqlfriend wrote:

    SSIS only has one execute task that  calls the web service.  I run today, there are 2000 routes to process, it runs about 8 hours on dev, and 10 hours on test server.

     

    So once again are your dev and test servers hitting the same end point for the service?  And what is the exact orchestration, a single execute task that runs and does what, calls the service and generates some files or something that load to the DB?  What is the processing time on the execute task making the service calls?

  • unfortunately it is not something simple that T-SQL can handle. There are complicated logic to calculate stop corners according to time and direction of the bus etc, that is why the vendor created the web service API. And we don't know their logic of how they do it.

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply