Query that returns only columns with data

  • I have a table with a large amount columns. Instead of trying to manually go thru the table and determine which columns have data. Is there a way to make a select that will return only the columns with data in them (omitting any columns w/out data). Without data, I am referring to all nulls or empty strings, etc.

  • tskelley (9/26/2013)


    I have a table with a large amount columns. Instead of trying to manually go thru the table and determine which columns have data. Is there a way to make a select that will return only the columns with data in them (omitting any columns w/out data). Without data, I am referring to all nulls or empty strings, etc.

    You can't write a query that conditionally has a column based on the content of the column. The list of columns is defined in the select portion of the query.

    You could do with this with dynamic sql but I would question what you are really trying to do here.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I was hoping someone know of a tool or trick that would accomplish this. If not, would need something using dynamic sql to loop thru each column, check for any non-null or non-empty columns, then provide that column as output to build a select list.

  • Why would you need this?

    To achieve this, you would need to build a dynamic query and the process might be very inefficient. If you're trying to determine which columns are unused, then it might probably be part of a solution and further analysis should be made.

    Example:

    create table #Example

    (ColID int,

    COL1 int,

    Col2 datetime,

    Col3 datetime,

    Col4 varchar(20),

    Col5 int);

    INSERT #Example

    values

    (1, 201, DATEADD(mi,-3,getdate()-1), DATEADD(mi,+60,getdate()-1), NULL, NULL),

    (2, 202, DATEADD(mi,-3,getdate()), DATEADD(mi,+75,getdate()), 'LABC2681', NULL),

    (3, 203, DATEADD(mi,-3,getdate()), DATEADD(mi,+63,getdate()), 'LABC3123', NULL),

    (4, 204, DATEADD(mi,-3,getdate()), DATEADD(mi,+65,getdate()), 'LABC4765', NULL),

    (5, 205, DATEADD(mi,-1,getdate()), DATEADD(mi,+78,getdate()), 'LABC1521', NULL);

    DECLARE @SQL varchar(8000)

    SELECT @SQL = 'SELECT ' + STUFF( CASE WHEN COUNT(ColID) > 0 THEN ',ColID ' ELSE '' END +

    CASE WHEN COUNT(Col1) > 0 THEN ',Col1 ' ELSE '' END +

    CASE WHEN COUNT(Col2) > 0 THEN ',Col2 ' ELSE '' END +

    CASE WHEN COUNT(Col3) > 0 THEN ',Col3 ' ELSE '' END +

    CASE WHEN COUNT(Col4) > 0 THEN ',Col4 ' ELSE '' END +

    CASE WHEN COUNT(Col5) > 0 THEN ',Col5 ' ELSE '' END , 1, 1, '') +

    ' FROM #Example'

    FROM #Example

    EXEC( @SQL)

    DROP TABLE #Example

    EDIT: As an option, you could change the COUNT(Col) for a SUM(LEN(Col))

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Here is some dynamic sql to generate some more dynamic sql to generate the required sql...:crazy:

    select

    '

    select stuff(cast(list as varchar(max)),1,2,''select '') + '' from '+tablename+'''

    from (

    '+select1+' for xml path(''''),type) a(list) where a.list is not null'

    from (

    select

    'select distinct '', ''+x.name

    from '+quotename(schema_name(t.schema_id))+'.'+quotename(t.name)+'

    cross apply (

    '

    + ca.list.value('(text())[1]','varchar(max)')

    + ') x(colid,name)

    ' as select1,quotename(schema_name(t.schema_id))+'.'+quotename(t.name) as tablename

    from sys.tables t

    cross apply (

    select 'select '+cast(c.column_id as varchar(20))+', '''+quotename(c.name)+''' where '+

    case

    when c.system_type_id in (type_id(N'date'),type_id(N'datetime2')) then 'cast(cast('+quotename(c.name)+' as datetime) as int)'

    when c.precision<>0 then 'cast('+quotename(c.name)+' as int)'

    else 'isnull(datalength('+quotename(c.name)+'),0)'

    end + '<>0 ' + case when c.column_id = max(c.column_id) over (partition by c.object_id) then '' else 'union all

    ' end

    from sys.columns c

    where c.object_id = t.object_id

    order by c.column_id

    for xml path(''),type

    ) ca(list)

    ) a

    Basically, if you run that in your database, it will generate a series of sql statements like this:

    (mine was run in a fresh install ReportServer database)

    select stuff(cast(list as varchar(max)),1,2,'select ') + ' from [dbo].[Catalog]'

    from (

    select distinct ', '+x.name

    from [dbo].[Catalog]

    cross apply (

    select 1, '[ItemID]' where isnull(datalength([ItemID]),0)<>0 union all

    select 2, '[Path]' where isnull(datalength([Path]),0)<>0 union all

    select 3, '[Name]' where isnull(datalength([Name]),0)<>0 union all

    select 4, '[ParentID]' where isnull(datalength([ParentID]),0)<>0 union all

    select 5, '[Type]' where cast([Type] as int)<>0 union all

    select 6, '[Content]' where isnull(datalength([Content]),0)<>0 union all

    select 7, '[Intermediate]' where isnull(datalength([Intermediate]),0)<>0 union all

    select 8, '[SnapshotDataID]' where isnull(datalength([SnapshotDataID]),0)<>0 union all

    select 9, '[LinkSourceID]' where isnull(datalength([LinkSourceID]),0)<>0 union all

    select 10, '[Property]' where isnull(datalength([Property]),0)<>0 union all

    select 11, '[Description]' where isnull(datalength([Description]),0)<>0 union all

    select 12, '[Hidden]' where cast([Hidden] as int)<>0 union all

    select 13, '[CreatedByID]' where isnull(datalength([CreatedByID]),0)<>0 union all

    select 14, '[CreationDate]' where cast([CreationDate] as int)<>0 union all

    select 15, '[ModifiedByID]' where isnull(datalength([ModifiedByID]),0)<>0 union all

    select 16, '[ModifiedDate]' where cast([ModifiedDate] as int)<>0 union all

    select 17, '[MimeType]' where isnull(datalength([MimeType]),0)<>0 union all

    select 18, '[SnapshotLimit]' where cast([SnapshotLimit] as int)<>0 union all

    select 19, '[Parameter]' where isnull(datalength([Parameter]),0)<>0 union all

    select 20, '[PolicyID]' where isnull(datalength([PolicyID]),0)<>0 union all

    select 21, '[PolicyRoot]' where cast([PolicyRoot] as int)<>0 union all

    select 22, '[ExecutionFlag]' where cast([ExecutionFlag] as int)<>0 union all

    select 23, '[ExecutionTime]' where cast([ExecutionTime] as int)<>0 union all

    select 24, '[SubType]' where isnull(datalength([SubType]),0)<>0 union all

    select 25, '[ComponentID]' where isnull(datalength([ComponentID]),0)<>0 ) x(colid,name)

    for xml path(''),type) a(list) where a.list is not null

    And that query is what I think you need...when it is run against my data, currently it produces this:

    select [CreatedByID], [CreationDate], [ExecutionFlag], [ItemID], [ModifiedByID], [ModifiedDate], [PolicyID], [PolicyRoot], [Type] from [dbo].[Catalog]

    ...which as you see has only 9 of the original 25 columns selected - the rest are empty OR zero.

    Now, I made a choice that if a non-string column contains the value zero, I would count that as "not having a value" - this is arbitrary and would be different for you presumably.

    Maybe the concept of using cross apply to select each column's name where the column is not "empty" (for want of a better word) and using that distinct list to build the select is all you need?

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply