SQL Data Flow Tool

  • Hi SSC,

    I'm looking for a tool which helps me achieve the following:

    • Understand what happens inside a stored procedure  and in which order

    Here's a partial example:

        SET @NoOfRecordsPendingBefore = 
    (
    SELECT COUNT(*)
    FROM RRawData (NOLOCK)
    WHERE R_Status_ID = @STATUS_REJECTED
    )

    SET @NoOfRecordsInFile =
    (
    SELECT COUNT(*)
    FROM RRawData (NOLOCK)
    WHERE R_Status_ID = @STATUS_NOT_PROCESSED
    )

    SET @NoOfRecordsWithNrNULL =
    (
    SELECT COUNT(*)
    FROM RRawData (NOLOCK)
    WHERE R_Status_ID = @STATUS_NOT_PROCESSED
    AND S_No IS NULL
    )

    -- Error handling
    SET @Error=@@Error
    IF @Error <> 0 BEGIN
    SET @ImportRStepNo = 2
    END

    I'd like to see what happens to @STATUS_REJECTED throughout which @ImportRStepNo is happening inside a certain procedure.

    Issue for me is really there is over 1800 variable calls within one stored procedure and I have to understand under which conditions (like @ImportRStepNo = 2 ) which variables might change to a different value at this specific point of execution.

    What we want to achieve is being able to simplify checking (being done a lot) and in the process improve performance / get rid of RBAR.

    another partial example:

    if (select count(*) from @R_Import) > 0
    begin

    declare @id integer
    set @id = (select AI_ID from @R_Import)

    declare @updated bit
    set @updated = 0

    /* Did Values change? */
    if (select AI_RKE from @R_Import) <> @AI_RKE
    begin
    update R_Import set AI_RKE=@AI_RKE where AI_ID=@id
    set @updated = 1
    end

    if (select AI_UTY from @R_Import) <> @AI_UTY
    begin
    update R_Import set AI_UTY=@AI_UTY where AI_ID=@id
    set @updated = 1
    end

    if (select AI_ANR from @R_Import) <> @AI_ANR
    begin
    update R_Import set AI_ANR=@AI_ANR where AI_ID=@id
    set @updated = 1
    end

    if (select AI_BRA from @R_Import) <> @AI_BRA
    begin
    update R_Import set AI_BRA=@AI_BRA where AI_ID=@id
    set @updated = 1
    end

    if (select AI_1IV from @R_Import) <> @AI_1IV
    begin
    update R_Import set AI_1IV=@AI_1IV where AI_ID=@id
    set @updated = 1
    end

    if (select AI_IVS from @R_Import) <> @AI_IVS
    begin
    update R_Import set AI_IVS=@AI_IVS where AI_ID=@id
    set @updated = 1
    end

    if (select [AI_IVS-TMJ] from @R_Import) <> @AI_IVSTMJ
    begin
    update R_Import set [AI_IVS-TMJ]=@AI_IVSTMJ where AI_ID=@id
    set @updated = 1
    end

    if (select AI_MY from @R_Import) <> @AI_MY
    begin
    update R_Import set AI_MY=@AI_MY where AI_ID=@id
    set @updated = 1
    end

    if (select AI_WK from @R_Import) <> @AI_WK
    begin
    update R_Import set AI_WK=@AI_WK where AI_ID=@id
    set @updated = 1
    end

    if @updated = 1
    begin
    update R_Import set AI_UpdateDate=getdate() where AI_ID=@id
    end
    end
    ELSE
    --insert statement definition

    i think @updated is rather obvious, it's used to determine wether we're talking about an existing record where one of the columns might have been changed or a new record.  But I need to understand for example what happened to @AI_MY later on, under which conditions might @AI_MY get another change in value down the stored procedure call.

     

    I really hope there is some sort of tool that can help me understand relations / conditions for any value going through that stored procedure.

  • No tools that I know of that will help here.

     

  • I think the tool you are looking for already exists in SQL Server - comments.

    If the code is difficult to understand, it should be commented.

    Failing that, PRINT or SELECT to print out debug notices showing you the value of a variable.

    Both of your examples look pretty easy to follow though.  In the second chunk of code you provided, @AI_MY never changes.

    If you find a variable and want to know WHERE it could change, do a search for the variable name and an = sign with and without a space (my opinion, you should ALWAYS have spaces around = signs, but that is just my preference).  for example, search for:

    @AI_MY =

    Then read those lines and you should be able to tell which ones are doing a comparison and which are doing assignment.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Funny thing is, there are many comments in this procedure but there's aswell about 3300 LOC in just this one procedure and comments start back in 2008, I can't even find an employee who can tell me what was when agreed on because of why.

    I was told Code Analysis Tools could help with this task (to avoid me reading through procedures for weeks) which I'm still uncertain would work with SQL Code even if it wasn't inside SQL Server but like scripted out .sql files.

    Our goal is essentially replicate the behaviour of existing stored procedures just recode them to actually be fast and not rely on RBAR.

  • you out of luck really. For SQL the tools that do "code analysis" are basically looking to see if it uses constructs that should be replaced with something else, and not trying to explain what the whole code does.

    these tools (like SqlCop for example) can tell you as a minor example

    • Code

      Procedures with SP_

      VarChar Size Problems

      Decimal Size Problem

      Undocumented Procedures

      Procedures without SET NOCOUNT ON

      Procedures with SET ROWCOUNT

      Procedures with @@Identity

      Procedures with dynamic sql

      Procedures using dynamic sql without sp_executesql

    but although this is helpful it does not really do what you are asking.

    so you do need to go through each proc and each line of code to try and figure out what it is doing - manual work, requires good thought process and ability to quickly identify how things can be replaced.

    for example that block of code you gave us can be replaced with a single update statement - but most likely that is not what the real change should be and we can't obviously help there as we don't have the full code.

     

     

  • Yes indeed, it was just a small portion of what the code consists of - and yes I'd make that part a single update, too -  mostly it consist of cursors and checks and there are comments which describe most of those checks but yet we have a feeling we might be able to simplify a lot but we essentially miss the big picture what happens all the way through and why at what point.

    I'll most likely come back here with some code snippets but I'm not sure if it makes sense to actually post the whole thing. All my research pointed to tools like SqlCop which is nice to have and useful on it's own but you're right - far from what I was looking for - anyways thanks again to everyone for confirming my suspicion - there is no help except hard manual legwork 😉

  • Something that seems a bit obvious to me but hasn't actually been posted in this thread is that you really need to simplify your proc. Anything that is 3300 lines long is going to be difficult to troubleshoot at the best of times but more importantly, is way too large to fit into cache so you will likely never get that benefit either.

    Take the discrete parts of the proc that can be turned into procedures on their own and do so. Then call them in the outer proc. You get the benefit of modularity and the disparate parts can be held in cache to allow that performance gain.

     

    • This reply was modified 4 years, 7 months ago by  rlDawson_SLC.
  • rlDawson_SLC wrote:

    Something that seems a bit obvious to me but hasn't actually been posted in this thread is that you really need to simplify your proc. Anything that is 3300 lines long is going to be difficult to troubleshoot at the best of times but more importantly, is way too large to fit into cache so you will likely never get that benefit either.

    Take the discrete parts of the proc that can be turned into procedures on their own and do so. Then call them in the outer proc. You get the benefit of modularity and the disparate parts can be held in cache to allow that performance gain.

    simplifying the proc is exactly what the OP is trying to do so that has been explicitly mentioned

    understanding what the proc does and the dependencies/data flow within the proc is what the OP as asking if there was a tool to help with

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply