how to select only non null columns out of 100 plus columns table

  • Hi

    I have 100 plus columns table and right of the bat I see 90 columns are null so I want focus only on non null columns for example see if there are any dups or simply analyze data... so how to filter non-null columns ?

  • Do you mean you want to use some code to identify the columns that only contain nulls so that you can skip them?  If you know which columns are always NULL, then just omit those from your query.

  • This might point you in the right direction. 'Address' is the name of the table you are interested in.
    use AdventureWorks
    go
    select *
    from sys.all_columns c
      join sys.all_objects o on c.object_id=o.object_id
    where o.name='Address' and c.is_nullable=0
    order by column_id

  • You can use a CROSS APPLY to adjust the row to make it much easier to analyze values.  I can't give you more details than this right now because your initial request is too vague.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • If you've got SSIS around you can use the Data Profiling Task to find out the columns that have mostly or all NULL values. Then you can build your query off those results.

  • here is an aproach, try it out 🙂


    use tempdb;
    go
    create table test_20170407(id int identity(1,1) , col1 int null, col2 int null, col3 int null, col4 int null, col5 int null, col6 int null)
    insert into test_20170407 (col1, col2, col3, col4, col5, col6)
    values (null, 1, null, null, null, null)
            ,(null, 2, 1, null, null, null)
            ,(null, 3, null, null, 2, null)
            ,(null, 4, null, null, null, null)
            ,(null, 5, null, null, null, null)
            ,(null, 6, 1, null, 2, null)
            ,(null, 7, null, null, null, null)
            ,(null, 8, 1, null, null, null)
    SELECT    * FROM test_20170407

    DECLARE @query VARCHAR(MAX) /* @query is null at this point, so in the first ocurrence of coalesce the comma is ommited */
    SELECT    @query = COALESCE(@query + '+ ', '') + 'case ISNULL(COUNT(' + c.name + '), 0) WHEN 0 THEN '''' ELSE '',' + c.name + ''' END'
    FROM    sys.all_columns c
    INNER    JOIN
            sys.all_objects o
            ON c.object_id = o.object_id
    WHERE    o.name='test_20170407'
    AND        c.is_nullable=1

    set @query = '
    declare @innerquery2 VARCHAR(MAX)
    select @innerquery2 = ' + @query + ' from test_20170407
    set @innerquery2 = ''select id'' + @innerquery2 + '' from test_20170407 ''
    print @innerquery2
    exec(@innerquery2)
    '
    exec(@query)

    /*
    -- if you dont want id as first collumn, or all columns are nullable, cut the first comma with RIGHT(@innerquery, LEN(@innerquery)-1)
    --
    set @query = '
    declare @innerquery VARCHAR(MAX)
    select @innerquery = ' + @query + ' from test_20170407
    set @innerquery = ''select '' + RIGHT(@innerquery, LEN(@innerquery)-1) + '' from test_20170407 ''
    print @innerquery
    exec(@innerquery)
    '
    exec(@query)
    */
    drop table test_20170407

  • and yet again, you can dinamically create a view with only the columns that have no nulls
    (you must run it by steps, if in the same batch the table is not create when you try to create the view)

    use tempdb;
    go

    --STEP 1
    create table test_20170407(id int identity(1,1) , col1 int null, col2 int null, col3 int null, col4 int null, col5 int null, col6 int null)
    insert into test_20170407 (col1, col2, col3, col4, col5, col6)
    values (null, 1, null, null, null, null)
            ,(null, 2, 1, null, null, null)
            ,(null, 3, null, null, 2, null)
            ,(null, 4, null, null, null, null)
            ,(null, 5, null, null, null, null)
            ,(null, 6, 1, null, 2, null)
            ,(null, 7, null, null, null, null)
            ,(null, 8, 1, null, null, null)

    SELECT    * FROM test_20170407

    --STEP 2
    DECLARE @query VARCHAR(MAX) -- @query is null at this point, so in the first ocurrence of coalesce the comma is ommited
    SELECT    @query = COALESCE(@query + '+ ', '') + 'case ISNULL(COUNT(' + c.name + '), 0) WHEN 0 THEN '''' ELSE '',' + c.name + ''' END'
    FROM    sys.all_columns c
    INNER    JOIN
            sys.all_objects o
            ON c.object_id = o.object_id
    WHERE    o.name='test_20170407'
    AND        c.is_nullable=1

    set @query = '
    declare @innerquery2 VARCHAR(MAX)
    select @innerquery2 = ' + @query + ' from test_20170407
    set @innerquery2 = ''
    create view test_20170407_vw
    as
    select id'' + @innerquery2 + '' from test_20170407
    GO
    ''
    print @innerquery2
    --exec(@innerquery2)
    '
    exec(@query)

    select * from test_20170407_vw

    --STEP 3
    --dont forget to drop the test table and view
    --drop table test_20170407
    --drop view test_20170407_vw

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply