Find data source of nested view columns?

  • I've got a problem that I've been coming up with nothing but dead ends on. I have a database with fairly complex views where the views are nested several levels deep. We need to know where in the actual tables this data is coming from and I have a couple hundred of these to deal with. We're currently tracing them manually to get the answer which has several people working full time to read through the code.

    What I've been attempting to do is to automate the process of finding the ultimate source of each view data column at the bottom of the tree. I've come up with several false positives. Specifically, I've looked pretty heavily at INFORMATION_SCHEMA.VIEW_COLUMN_USAGE and found that it's missing more data then it contains regarding this problem. I've also looked at INFORMATION_SCHEMA.COLUMNS, sys.views, sys.sql_dependencies, and sys.sql_expression_dependencies as well as several others and turned up nothing that can connect a column in a view with the data source for that column with any reliability.

    Probably the best results I've gotten so far is with this query below, but it's missing more view columns then it contains. It's definitely not working. Does anyone have any ideas? Basically, this problem boils down to "How can you get the underlying source/column for a given view column?" If I can get that, the problem is solved because the rest is basically just recursion through the tree.

    I might also mention that I've tried outputting the query plan as XML and reading the XML query plan for the answer, which did not seem to be working. At least it certainly was not returning the table and column that the view was getting the data from.

    SELECT A.Table_Name AS [View],

    B.TABLE_NAME AS [Source],

    C.TABLE_TYPE AS [Source Type],

    B.COLUMN_NAME AS [Source Column]

    FROM INFORMATION_SCHEMA.TABLES AS A

    LEFT OUTER JOIN INFORMATION_SCHEMA.VIEW_COLUMN_USAGE AS B

    ON A.TABLE_SCHEMA = B.VIEW_SCHEMA

    AND A.TABLE_NAME = B.VIEW_NAME

    LEFT OUTER JOIN INFORMATION_SCHEMA.TABLES AS C

    ON B.TABLE_SCHEMA = C.TABLE_SCHEMA

    AND B.TABLE_NAME = C.TABLE_NAME

    WHERE A.Table_Type = 'VIEW'

    ORDER BY A.Table_Name, B.TABLE_NAME, B.COLUMN_NAME

  • I've done something similar recently, although I was looking for linked server names in stored procedure and view definitions. Here is what I did. It may be more complex than you require, since it looks in all databases, but you can play about with it. Apologies if the syntax isn't 100% correct, but I've removed a whole load of joins from the end where I also try to ascertain what jobs are referencing the objects.

    John

    USE tempdb;

    IF OBJECT_ID('tempdb..#Dependencies') IS NOT NULL

    DROP TABLE #Dependencies;

    CREATE TABLE #Dependencies (

    DBName sysname NOT NULL

    ,ReferencingObj sysname NOT NULL

    ,LinkedServer sysname NULL

    ,ReferencedDB sysname NULL

    ,ReferencedObj sysname NOT NULL

    );

    -- Populate temp table with dependencies for all databases

    DECLARE @DBName sysname, @FetchStatus int, @sql nvarchar(max);

    DECLARE DBs CURSOR

    FOR SELECT name FROM sys.databases

    WHERE database_id > 4

    AND name NOT IN ('ReportServer', 'ReportServerTempDB')

    AND state_desc = 'ONLINE';

    OPEN DBs;

    FETCH NEXT FROM DBs INTO @DBName;

    SELECT @FetchStatus = [fetch_status]

    FROM sys.dm_exec_cursors(@@SPID)

    WHERE name = 'DBs';

    WHILE @FetchStatus = 0

    BEGIN

    SET @sql = 'SELECT ''' + @DBName + ''', o.name COLLATE Latin1_General_BIN, r.referenced_server_name COLLATE Latin1_General_BIN, r.referenced_database_name COLLATE Latin1_General_BIN, r.referenced_entity_name COLLATE Latin1_General_BIN FROM '

    + @DBName + '.sys.objects o JOIN ' + @DBName + '.sys.sql_expression_dependencies r ON o.object_id = r.referencing_id';

    INSERT INTO #Dependencies

    EXEC (@SQL);

    FETCH NEXT FROM DBs INTO @DBName;

    SELECT @FetchStatus = [fetch_status]

    FROM sys.dm_exec_cursors(@@SPID)

    WHERE name = 'DBs';

    END;

    CLOSE DBs;

    DEALLOCATE DBs;

    WITH Dependencies2 (DBName, ReferencingObj, LinkedServer, ReferencedDB, ReferencedObj) AS ( -- Where ReferencedDB is null, replace with DBName

    SELECT DBName, ReferencingObj, LinkedServer, COALESCE(ReferencedDB,DBName), ReferencedObj

    FROM #Dependencies

    )

    , LServers AS ( -- This is the recursive part

    -- Anchor member definition

    SELECT * FROM Dependencies2

    WHERE LinkedServer IN ('LinkedServer1','LinkedServer2')

    AND ReferencedObj <> 'sysdiagrams'

    UNION ALL

    -- Recursive member definition

    SELECT d.DBName, d.ReferencingObj, d.LinkedServer, d.ReferencedDB, d.ReferencedObj

    FROM LServers l JOIN Dependencies2 d

    ON d.ReferencedDB = l.DBName AND d.ReferencedObj = l.ReferencingObj

    )

    SELECT * FROM LServers

    OPTION (MAXRECURSION 5) -- or whatever level of nesting you think you have

  • You might try SQL Search from Red Gate[/url].

    And before anyone freaks, it's completely free.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply