How do you find where a table/view/whatever is being used in SSIS?

  • Let's say you have a decent sized SSIS project, using Visual Studio, and you want to find where a particular object (table, view, stored procedure, ...) is being used. You've got tasks nested within containers to various levels (greater than 2). How?

    I'll let you initially assume that tasks and containers have decent names, but if you're bored, please feel free to answer the harder problem of trying to find an object when tasks have default names ("OLE DB Source" or "OLE DB Destination", for example).

    My personal opinion: this is where the visual metaphor breaks down. But what are the alternatives?

  • This is a good example of why proper naming of objects it a good idea!

    To find where a table is being referenced you could open the code page - right click the package in solution explorer.

    You can then search for the table name. Ctrl+F will open the find string dialog box

  • Yes, but how do you get back to the proper position in the visual editor?

    And I'll add a related question: How do you do a code review? If somebody is making a change to an existing package, how do you determine what has really changed? I've looked at the code, and every time you open a container, the XML changes! And it's not like you can UNDO opening a container.

  • Tom Bakerman (5/20/2009)


    Yes, but how do you get back to the proper position in the visual editor?

    ...

    If you mean: "How do I get back to design mode?" the answer is that you close the (new) window you have just opened. If not, please define "proper position".

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Come on, give me a little credit. I mean: Ok, I find the database object I'm interested in in the XML. How do you find that same location in the visual representation? Why? Maybe I want to find out where a particular column is getting updated, or what table is used as the source data for another table. Or worse yet, I actually want to change the table. Surely you're not recommending editing the XML directly (although I'm game)?

    Remember my beginning premise: Tasks may be two or more levels deep in containers.

    I ask this question because, although I'm new to SSIS, I've been developing software for a long time, and I've inherited some of the worst code I've seen in over 20 years. If I want to make a change to a database object, it takes me days to determine what might break, and even then I'm not 100% sure until I run the package (which takes a couple of hours).

  • My question was an honest one - as you're only new here, how am I supposed to make any assessment of your experience and knowledge? If I treated everyone like they have 20 years in the business, 90% of them would not understand my answers, so I start off simple.

    Now that you have spelled it out, I know what you mean. It's an excellent question, but it's not an available feature, as far as I know.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • You're right. My apologies.

    So, how do you recommend doing "code reviews"? Trust the author to remember everything that has changed, and point it out to you? :crazy:

  • You could take a peek at the underlying XML, since a .dtsx file is just XML under the covers. However, it's not exactly user friendly - open it up in a text editor and you'll see what I mean. You can edit this XML file directly, but this is highly discouraged.

    As far as tracking specific changes to a package, you probably won't find a good way to do this. You can store your packages in a version control system, but it won't likely describe the specific changes. If this is a critical need, you might try using third party tools to compare versions of the file, but again, you're back to looking at the XML with this method.

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • Accepted 🙂

    In future, you could consider building your packages entirely in code. Then you would get the review features you desire, but it adds a significant layer of complexity to the whole process.

    But I do not know any short-cuts that will give you what you want. You might like to check out 'SSIS Documenter' - that might help you analyse existing packages

    Phil

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Tom Bakerman (5/20/2009)


    Come on, give me a little credit. I mean: Ok, I find the database object I'm interested in in the XML. How do you find that same location in the visual representation? Why? Maybe I want to find out where a particular column is getting updated, or what table is used as the source data for another table. Or worse yet, I actually want to change the table. Surely you're not recommending editing the XML directly (although I'm game)?

    Remember my beginning premise: Tasks may be two or more levels deep in containers.

    I ask this question because, although I'm new to SSIS, I've been developing software for a long time, and I've inherited some of the worst code I've seen in over 20 years. If I want to make a change to a database object, it takes me days to determine what might break, and even then I'm not 100% sure until I run the package (which takes a couple of hours).

    Read through the XML... every object name has to be unique within it's current scope even if they are:

    OLE DB Source

    OLE DB Source 1

    OLE DB Source 2

    Alternatively, spend a day going through the package renaming everything properly and adding comments (right click in the package and choose "add annotation) to the package where appropriate.

  • Ok SSIS veterans. Thanks for your answers so far. But please help a veteran software implementor (>20 years) to understand why using SSIS is a good thing for anything outside of moving data from a source to a destination. That includes the question of why it is good to do any sort of transformation in SSIS. Perhaps I'm jaded by the introduction I've received into the tool, but I just don't see how burying data manipulation inside a visual tool helps in the long run (and remember, , that something like 60-70% of the lifetime of a software package is spent in maintenance).

    I've looked at some of the "Best Practices" sites to try to find what I'm missing here, but they've tended to lean towards what to do for best package performance, not package design.

    And for laughs, here's what I had to discover yesterday:

    We are moving our client-facing server to a 3rd party host. While testing the new configuration settings, I killed a package that was taking an exceedingly long time (normally takes 10 minutes, I stopped it after 1.5 hours). What was going on? Turns out that one step was moving all of the sales data (4.5 million rows, about 18M) just to use it to find the latest sales date! (Oh, I didn't mention that here in New Zealand we are somewhat of a 3rd world country when it comes to technology infrastructure. Our outbound link is 256K maximum. You do the math). Anyway, it took me an additional 2 hours of searching to find out where this table was being used.

  • Problem is Tom, you've not had much of an exposure to SSIS yet

    You don't have to use the data flow components to move data

    You can still use T-SQL and procedures and call these from a "Execute SQL" task

    Why do this? Because when SSIS is used properly it gives you a controlled wrapper around the processes.

    Errors can be handled elegantly.

    http://www.sqlservercentral.com/articles/SSIS/65758/

    http://blogs.conchango.com/jamiethomson/archive/2005/06/11/SSIS_3A00_-Custom-Logging-Using-Event-Handlers.aspx

    Sources can be verified

    http://blogs.conchango.com/jamiethomson/archive/2005/10/10/SSIS-Nugget_3A00_-Verify-a-data-source-before-using-it.aspx

    Sources can be dynamic and rely on other criteria

    http://www.sqlservercentral.com/articles/Video/64262/

    Processes can be made dependant on 1, 2, 3, 4 or more other processes and there are process containers and loops

    http://www.sqlservercentral.com/articles/whatsnewwithssis/2234/

    Many different data sources can be handled

    configurations can be easily changed (SSIS configurations)

    http://www.sqlservercentral.com/articles/SSIS/66500/

    Checkpoints can be used to gracefully restart a process

    http://www.sqlservercentral.com/articles/Integration+Services/65473/

    And with the scripting the only limit to how good an SSIS package can be is down to you imagination

    Unfortunately I think you've only seen the dark, lazy side to SSIS

  • Samuel, it sounds to me like you're saying the conclusion I've come to:

    Process control, and movement of data from source to destination - SSIS good

    Data manipulation - SSIS bad, or more like, Just because you can, doesn't mean you should do data manipulation within SSIS packages.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply