Hardcoded filter vs filter by parameter

  • I am trying to set a date filter on several  large transaction tables to limit the years loaded.

    When I use a hardcoded filter like:

    DATEFROMPARTS ( DATEPART(yyyy, GETDATE()) - 4, 1, 1 ) to get running year and previous 4 years

    the dataflow starts reading data immediately and data is loaded in 10 seconds.

    When I use a parameter, based on a variable, then it takes at least a minute. I'm wondering whether method introduces some sort of overhead or delay and whether I can try another approach?

     

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • This is very likely a case where the use of a statement level OPTION (RECOMPILE) will work perfectly and very quickly.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Adding  the  option (recompile) slows it down even further , taking an additional 30 seconds to start the dataflow

  • blom0344 wrote:

    I am trying to set a date filter on several  large transaction tables to limit the years loaded.

    When I use a hardcoded filter like:

    DATEFROMPARTS ( DATEPART(yyyy, GETDATE()) - 4, 1, 1 ) to get running year and previous 4 years

    the dataflow starts reading data immediately and data is loaded in 10 seconds.

    When I use a parameter, based on a variable, then it takes at least a minute. I'm wondering whether method introduces some sort of overhead or delay and whether I can try another approach?

    can you post the SQL itself (from + where clauses) as well as the datatype of the column you are filtering on, and the datatype you defined on your SSIS package for that filter.

  • select

    a.dataareaid

    ,a.ledgerdimension

    ,a.inventtranspostingtype

    ,a.inventtransorigin

    ,a.voucher

    ,a.transdate

    from

    Inventtransposting a

    where a.transdate >= ?

    transdate is a datetime type

  • This could just be normal parameter sniffing issues - there are a couple ways to handle this from SSIS.

    First - unless you really need to be able to parameterize the date, then just hard-code it.  A better option to calculate the first of the year 4 years ago would be:

    WHERE transdate >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()) - 4, 0)

    This returns a DATETIME value instead of returning a DATE and implicitly converting the date to a datetime.

    Another option would be to define the parameter in your code:

    DECLARE @startDate datetime = ?;

    SELECT ...
    FROM ...
    WHERE transdate >= @startDate;

    If you always want to pull data in full years back - another option would be to pass in the number of years:

    DECLARE @yearsBack int = ?;

    SELECT ...
    FROM ...
    WHERE transdate >= DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()) - @yearsBack, 0);

    --==== Or this
    SELECT ...
    FROM ...
    WHERE transdate >= DATETIMEFROMPARTS(year(getdate()) - @yearsBack, 1, 1, 0, 0, 0, 0);

    Depending on your requirements - there could be other options.  If possible, creating a stored procedure on the source system that you can call to extract the data would be ideal.  That way you can control the input parameters to the procedure to ensure optimal performance on the source system.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • blom0344 wrote:

    Adding  the  option (recompile) slows it down even further , taking an additional 30 seconds to start the dataflow

    Heh... now that's interesting.  And, I just noticed... this is for use in SSIS.  I don't use SSIS so my apologies for jumping in here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well, hardcoding would be an option, though this would require some maintenance issues if customers decides to need to look further into the past.

    The source system is an Axapta ERP and I'm not allowed to add stored procedures to suit my needs.

    Normally the slight 1 minute delay would not be such an issue, but this is an high frequency process I'm trying to optimze, because the original setup causes performance drop in the ERP itself.

    I'm going to experiment somemore and  @jeff: though not applicable to this case , your reaction was still informative!

     

  • blom0344 wrote:

    Well, hardcoding would be an option, though this would require some maintenance issues if customers decides to need to look further into the past.

    The source system is an Axapta ERP and I'm not allowed to add stored procedures to suit my needs.

    Normally the slight 1 minute delay would not be such an issue, but this is an high frequency process I'm trying to optimze, because the original setup causes performance drop in the ERP itself.

    I'm going to experiment somemore and  @jeff: though not applicable to this case , your reaction was still informative!

    Not sure if you were responding to me or @JeffModen - but I provided multiple options.  One of those is a fixed hard-coding - and all of the other options utilize a parameter.

    How you implement the parameter all depends on the desired result.  If the desired result is to always pull from the first of a year through current date, then passing in the number of years back works quite well.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Does it make any difference if you make the data flow source a variable set to an expression that includes the date variable? The example below uses a MinDate string variable. I think converting the variable to dt_str allows it to work with a datetime variable too.

    "select a.dataareaid,a.ledgerdimension,a.inventtranspostingtype,a.inventtransorigin,a.voucher,a.transdate from Inventtransposting a where a.transdate >=  CAST('" +  (DT_STR,20,1252)@[User::MinDate] + "' AS DATETIME)"
  • @ Jeffrey Williams:

    Yes, I responded to your input and I answered Jeff Moden in the same post.

    Sorry, if that caused confusion.

    All solutions work from a technical perspective, but passing the value from variable to parameters causes delay.

    The delay I cannot affort, so maybe I end up with the hardcoded solution.

    I'm going to experiment somemore and then give feedback on the results

     

  • I tried incorporating Jeffrey's suggestion, but the ancient VS version I have to work with keeps crashing in the expression builder.

    So, I decided to use the hardcoded solution and some additional maintenance in the future

     

  • blom0344 wrote:

    I tried incorporating Jeffrey's suggestion, but the ancient VS version I have to work with keeps crashing in the expression builder.

    So, I decided to use the hardcoded solution and some additional maintenance in the future

    Okay - now I am confused.  What is the version of Integration Services you are deploying to - if anything greater than 2017 then you can and should be using VS 2019 or higher.  If you are deploying to something less than 2017 then you need to get that system upgraded.

    Why are you even in the expression builder?  All of the solutions I provided are just looking for a parameter to be passed - in the case of passing a date, the parameter is passed into a defined variable instead of passed directly to the SQL statement.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I guess you haven't met our customers. This is  Microsoft Visual Basic 2010 and i'm stuck with it.

    Passing the parameter is the cause of the delay it seems. Hard-coding provides the best response sofar, so I'm going to implement that solution.

    But still thanks for the input given.

     

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply