Determine if a table column is used?

    Does anyone know of a way to determine whether a column in a table is used or not?

    I have a table with 926 columns (I didn't build this), of which about 850+ appear to have NULL values for every record.

    Is there a way to determine which columns are 100% unused, without actually querying each table and column to see whether any given row contains a value?

    Ex:

    SELECT COUNT(1) FROM [dbo].[TableWithManyColumns] WHERE [Column] IS NOT NULL;

    We're trying to see if we can remove all the unused columns in this table (and dozens of other similar ones).

    Thanks in advance!

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

    Try this. Notice how the count is zero for the column containing all NULLs:

    IF OBJECT_ID('tempdb..#tmp', 'U') IS NOT NULL
        DROP TABLE #tmp;

    CREATE TABLE #tmp
    (
        c1 VARCHAR(10)
      , c2 VARCHAR(10)
      , c3 VARCHAR(10)
    );

    INSERT #tmp (c1, c2, c3)
    VALUES
      ('a', 'b', NULL)
    , ('a', 'b', NULL)
    , ('a', NULL, NULL)
    , ('a', 'b', NULL)
    , ('a', 'b', NULL)
    , (NULL, 'b', NULL);

    SELECT
        COUNT(t.c1)
      , COUNT(t.c2)
      , COUNT(t.c3)
    FROM #tmp t;

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

    DECLARE @query nvarchar(max);

    SET @query =
    (
        SELECT STUFF(COL_ONE, 1, 10, '')
        FROM
        (
            SELECT ' UNION ALL SELECT DISTINCT ' + CHAR(39) + COLUMN_NAME + CHAR(39)
                 + ' FROM <your table> '
                 + ' HAVING SUM(CASE WHEN ' + COLUMN_NAME + ' IS NOT NULL THEN 1 ELSE 0 END) = 0'
            FROM INFORMATION_SCHEMA.COLUMNS
            WHERE TABLE_NAME = '<your table>'
            FOR XML PATH('')
        ) TESTY(COL_ONE)
    );

    EXEC sp_executesql @stmt = @query;

    Replace <your table> with whatever table you are interested in; it should give you a list of the columns that are all NULLs.

  • The DATA VALUES are absolutely meaningless in determining if you can drop a column or not!!! You must do a FULL AND COMPLETE impact analysis, searching EVERY piece of code for ANY reference to a column you wish to drop. And I mean every piece of code: entire application(s), scripts/batch files and processes, sql agent job text, ALL programmable objects in sql server (quite a few there), index definitions, etc., etc.

    Note that ORM and dynamic SQL references can still slip through the cracks, and you can wind up breaking something.
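    As a starting point for the SQL Server side of that impact analysis (it won't catch application code, ORM-generated SQL, or dynamic SQL), you can search the definitions of programmable objects for a column name. A rough sketch, using a hypothetical column name as a placeholder:

    ```sql
    -- Sketch: find programmable objects (procs, views, functions, triggers)
    -- whose definition mentions a given column name. 'SomeColumn' is a
    -- placeholder. LIKE matching is crude: it can return false positives
    -- (comments, similarly named columns) and misses references assembled
    -- at runtime by dynamic SQL.
    SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name
         , OBJECT_NAME(m.object_id)        AS object_name
    FROM sys.sql_modules AS m
    WHERE m.definition LIKE '%SomeColumn%';
    ```

    You would still need to repeat the search across application source, job steps, SSIS packages, and so on.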

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

    It's strange that there would be about 850 columns containing nothing but NULL. Apparently the original DBA has long since left and is no longer around to ask for guidance.

    If the columns truly are useless, and yet applications and downstream processes still depend on them (perhaps they're doing SELECT *), then one thing you can do is copy the table to a new name minus the 850 columns, drop the original table, and then create a view having the same name as the original table. The view can select the 76 useful columns and return the remaining 850 useless columns as a value of NULL.

    For example:

    CREATE VIEW MyBigTable
    AS
    SELECT
        col1
      , ...
      , col76
      , NULL AS col77
      , ...
      , NULL AS col926
    FROM MyBigTableNew;
    GO

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

    Or you could leave the columns as is and add page compression, which in this case would probably shrink the table down to 10% or less of its original size.
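    For reference, enabling page compression is a single rebuild. A sketch, reusing the OP's example table name (an Enterprise Edition feature at the time of this thread):

    ```sql
    -- Sketch: rebuild the table with page compression. This is an offline
    -- rebuild unless ONLINE = ON is also specified; the table name is the
    -- OP's example, not a real object.
    ALTER TABLE [dbo].[TableWithManyColumns]
        REBUILD WITH (DATA_COMPRESSION = PAGE);
    ```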

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (11/4/2016)


    Or you could leave the columns as is and add page compression, which in this case would probably shrink the table down to 10% or less of its original size.

    Enterprise Edition only, so maybe what, 1-5% of the installed user base can do this. 🙁

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks for the comments guys (not sure why I'm not being notified about them)...

    The table design is horrendous... in fact, this is a rather small table 🙂 It's a transaction processing system that I guess was originally built/ported over from an older mainframe system that used flat files, etc., so it is what it is. There are so many columns that they had to break it out into 3 tables, because there are 2,400+ columns in it (which exceeds SQL Server's limit of 1,024 columns per non-wide table) - so there's a MyBigTable_Primary, MyBigTable_Secondary, and MyBigTable_Tertiary :w00t: just to accommodate all the columns. You can imagine how that SELECT * looks...

    In any case, several options work to get this information, but all involve actually querying the table.

    ZZartan's example works great on smaller tables with minimal rows, but not so well on thick tables with 100+ million records:

    DECLARE @query nvarchar(max);

    SET @query =
    (
        SELECT STUFF(COL_ONE, 1, 10, '')
        FROM
        (
            SELECT ' UNION ALL SELECT DISTINCT ' + CHAR(39) + COLUMN_NAME + CHAR(39)
                 + ' FROM <your table> '
                 + ' HAVING SUM(CASE WHEN ' + COLUMN_NAME + ' IS NOT NULL THEN 1 ELSE 0 END) = 0'
            FROM INFORMATION_SCHEMA.COLUMNS
            WHERE TABLE_NAME = '<your table>'
            FOR XML PATH('')
        ) TESTY(COL_ONE)
    );

    EXEC sp_executesql @stmt = @query;

    I found another version, which I believe ScottPletcher posted here, that works much better with regard to speed:

    USE COREISSUE
    GO

    DECLARE @tablename sysname
    SET @tablename = N'CCard_Secondary' --<<-- put table name here

    DECLARE @sql varchar(max)
    SET @sql = NULL

    SELECT @sql = ISNULL(@sql + ', ' + CHAR(13) + CHAR(10), '') + SPACE(4) +
        'SUM(CASE WHEN [' + CAST(column_name AS varchar(128)) + '] IS NULL THEN 0 ELSE 1 END) ' +
        'AS [' + CAST(column_name AS varchar(128)) + ']'
    FROM INFORMATION_SCHEMA.columns
    WHERE table_name = @tablename
        AND IS_NULLABLE = 'YES'
        AND COLUMNPROPERTY(OBJECT_ID(@tablename), column_name, 'IsComputed') = 0
    ORDER BY ordinal_position

    SET @sql = 'SELECT' + CHAR(13) + CHAR(10) + @sql + CHAR(13) + CHAR(10) +
        'FROM [' + CAST(@tablename AS varchar(128)) + ']'

    EXEC(@sql)

    But again, all involve reading the table.

    I was hoping for some way "under the covers" to tell this but I guess not 🙁

    Appreciate everyone's comments/suggestions - thanks!

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • Eric M Russell (11/4/2016)


    Or you could leave the columns as is and add page compression, which in this case would probably shrink the table down to 10% or less of its original size.

    Another option is to change the all-NULL columns to sparse columns; that wouldn't break any code either.

    😎
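    For reference, marking an all-NULL column as sparse is a one-line change per column. A sketch using a hypothetical column name (SPARSE requires the column to be nullable, and the data type must be restated):

    ```sql
    -- Sketch: mark a nullable, all-NULL column as SPARSE so its NULLs take
    -- almost no row space. Column name and type are placeholders; repeat per
    -- column (or generate the statements from INFORMATION_SCHEMA.COLUMNS).
    ALTER TABLE [dbo].[TableWithManyColumns]
        ALTER COLUMN SomeNullColumn VARCHAR(10) SPARSE NULL;
    ```

    Note that some data types (text, image, timestamp, geometry, etc.) can't be made sparse.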

  • MyDoggieJessie (11/7/2016)


    Thanks for the comments guys (not sure why I'm not being notified about them)...

    [...]

    In any case, several options work to get this information, but all involve actually querying the table.

    [...]

    But again, all involve reading the table.

    I was hoping for some way "under the covers" to tell this but I guess not 🙁

    Appreciate everyone's comments/suggestions - thanks!

    I'd be interested to hear why you discounted my suggestion of using COUNT(). Did you find scenarios where it did not provide the expected results?

    My limited testing suggests that the query plan it produces is a little cleaner than that produced by Scott's suggestion.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

    Because going that route with 100+ million rows, an insert into a temp table with several hundred columns wouldn't play well with tempdb, and I don't think it would perform any faster. A good method for smaller tables, though, for sure.

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • MyDoggieJessie (11/7/2016)


    Because in that route, while working with 100+ million rows and doing an insert into a temp table on several hundred columns wouldn't play well with tempdb and I don't think it would perform any faster. A good method for smaller tables though for sure.

    Aha, you misunderstood my intention. I used a temp table merely to illustrate the principle with an example.

    Scott's query produces syntax similar to the following:

    SELECT C1Total = SUM(CASE WHEN c1 IS NULL THEN 0 ELSE 1 END)
    -- etc., etc.

    My suggestion was to use this instead:

    SELECT C1Total = COUNT(c1)
    -- etc., etc.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

    Hey Phil! Thanks for clarifying, and yes, I did misunderstand what you were intending. Both methods work and perform about the same; your method omits an extra Compute Scalar operation - thanks!

    ______________________________________________________________________________
    Never argue with an idiot; they'll drag you down to their level and beat you with experience.

  • TheSQLGuru (11/4/2016)


    Eric M Russell (11/4/2016)


    Or you could leave the columns as is and add page compression, which in this case would probably shrink the table down to 10% or less of its original size.

    Enterprise Edition only, so maybe what, 1-5% of the installed user base can do this. 🙁

    OK, when I read this, I assumed it was Enterprise Edition.

    I have a table with 926 columns (I didn't build this) ...

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eirikur Eiriksson (11/7/2016)


    Eric M Russell (11/4/2016)


    Or you could leave the columns as is and add page compression, which in this case would probably shrink the table down to 10% or less of its original size.

    Another option is to change the all null columns to sparse columns, wouldn't break any code either.

    😎

    Sparse columns can't be used with page compression. Compression also applies to the used columns, so it might offer more storage savings, assuming that's the goal here.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 15 posts - 1 through 15 (of 17 total)
