How to create a Stored Procedure from this Code

  • Hi all,

    I have a process written in SQL that is around 4000 Line, the process is basically 25 section of SQL code that pull data from different data source then do some data validation, aggregation, calculation and finally feed a DW tables.

    I am trying to automate this code and put everything in one or 2 stored procedure but the challenge is every section is pretty much relying on the output from the previous section. I am running the code manually right now to monitor the output from each section and make sure there is no errors.

    Any thoughts on best practice to automate this huge code.

    Greatly appreciated.

    Thanks

  • Any thoughts on best practice to automate this huge code.

    Best practice is to break up the code into small chunks, each of which does essentially one thing, and not to create a code Behemoth.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin (3/2/2016)


    Any thoughts on best practice to automate this huge code.

    Best practice is to break up the code into small chunks, each of which does essentially one thing, and not to create a code Behemoth.

    Then, of course, you would need a 'master' proc to drive the entire process. This would call the smaller procs you have created.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil - Do you mean a Master SP that will call 25 smaller SP? assuming I am turning each section (Chunk) to a SP.

    Thanks,

  • oneteabag (3/2/2016)


    Phil - Do you mean a Master SP that will call 25 smaller SP? assuming I am turning each section (Chunk) to a SP.

    Thanks,

    Precisely.

    Takes more setting up initially, but far easier to debug and maintain going forward.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • That's a lot of work indeed..

    Thanks,

  • The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I haven't used SSIS before, will it handle this giant 4000 Line Code?

  • Hugo Kornelis (3/2/2016)


    The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.

    IMO that's an overkill if everything is in the same server and can be accomplished in pure T-SQL.

    SSIS would require additional resources for O&M that aren't always available.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (3/2/2016)


    Hugo Kornelis (3/2/2016)


    The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.

    IMO that's an overkill if everything is in the same server and can be accomplished in pure T-SQL.

    SSIS would require additional resources for O&M that aren't always available.

    I agree, if we're talking about a single database.

    But if there are multiple DBs involved, even on the same instance, I'd use SSIS to avoid the creation of explicit cross-database dependencies.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Hugo Kornelis (3/2/2016)


    The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.

    The previous recommendation would be ideal for building it in SSIS also...the 'master sp' becomes SSIS. I would even consider this solution is all sources were from the same database and the destination was the same database - if for no other reason than the fact that I can control the batch and commit sizes without having to write code in SQL.

    With that said - I would guess that the OP is pulling the data from multiple sources over a linked server (although he didn't specifically state that - he did state multiple sources). Additionally, the data validation - aggregation - calculation could be simplified in SSIS and performed in the pipeline between the source and destination, improving processing time.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Phil Parkin (3/2/2016)


    Luis Cazares (3/2/2016)


    Hugo Kornelis (3/2/2016)


    The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.

    IMO that's an overkill if everything is in the same server and can be accomplished in pure T-SQL.

    SSIS would require additional resources for O&M that aren't always available.

    I agree, if we're talking about a single database.

    But if there are multiple DBs involved, even on the same instance, I'd use SSIS to avoid the creation of explicit cross-database dependencies.

    I'd like to have cross-database dependencies to analyze if any change to the schema would impact current code.

    There's not enough information to give solid advise.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • oneteabag (3/2/2016)


    I haven't used SSIS before, will it handle this giant 4000 Line Code?

    If you decide to use SSIS (and note that some important remarks about this have been made by others), you would not use your existing code, You would build an SSIS package from scratch, using the native SSIS tools.

    Using SSIS just to call an existing stored procedure that does all the work will not help you at all, that will just add a layer of complexity.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Luis Cazares (3/2/2016)


    Phil Parkin (3/2/2016)


    Luis Cazares (3/2/2016)


    Hugo Kornelis (3/2/2016)


    The alternative, since this appears to be some type of ETL process to feed a DW table, would be to create an SSIS package.

    That tool is specifically built for this job.

    IMO that's an overkill if everything is in the same server and can be accomplished in pure T-SQL.

    SSIS would require additional resources for O&M that aren't always available.

    I agree, if we're talking about a single database.

    But if there are multiple DBs involved, even on the same instance, I'd use SSIS to avoid the creation of explicit cross-database dependencies.

    I'd like to have cross-database dependencies to analyze if any change to the schema would impact current code.

    There's not enough information to give solid advise.

    I'm surprised that you think this. Cross-database dependencies limit scale-ability (because you can't move the DBs to separate instances without refactoring) and it's all too easy to create what I think of as 'circular-reference syndrome', where database A depends on database B and vice versa, meaning that you cannot recreate one database, from script, unless the other already exists.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • To Clarify it is just One database we are talking about.

    So the question is : would I create one SSIS with 4000 line of SQL code or I need to break this to 25 small SSIS For each Section?

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply