Counting non-blank columns

Question

Counting non-blank columns

ajrendall

SSC-Addicted

Points: 450
More actions
April 8, 2010 at 4:45 am

#220290

Hello everyone,
I have a challenge where I need to count the number of rows that have valid data in them across a number of columns. Valid data in this instance is defined as not being blank for character columns, not being 0 for numeric and being more recent than the start of the century for dates.
At the moment I am using something like the second method in the following example:
IF OBJECT_ID('tempdb..#test_data_table','U') IS NOT NULL DROP TABLE #test_data_table
IF OBJECT_ID('tempdb..#test_count_table','U') IS NOT NULL DROP TABLE #test_count_table
SELECT 'Data' C1,'Data' C2,'Data' C3,1 C4 INTO #TEST_DATA_TABLE UNION ALL
SELECT 'Data','Data','Data',1 UNION ALL
SELECT 'Data','','Data',0 UNION ALL
SELECT 'Data','Data','',1 UNION ALL
SELECT '','Data','Data',0 UNION ALL
SELECT '','','Data',1 UNION ALL
SELECT '','Data','',1 UNION ALL
SELECT '','','',0
--One way (creates a result table)
SELECT
SUM(CASE WHEN C1!='' THEN 1 ELSE 0 END) C1Count,
SUM(CASE WHEN C2!='' THEN 1 ELSE 0 END) C2Count,
SUM(CASE WHEN C3!='' THEN 1 ELSE 0 END) C3Count,
SUM(CASE WHEN C4!=0 THEN 1 ELSE 0 END) C4Count
INTO #test_count_table
FROM #test_data_table
SELECT 'C1Count',C1Count
FROM #test_count_table
UNION
SELECT 'C2Count',C2Count
FROM #test_count_table
UNION
SELECT 'C3Count',C3Count
FROM #test_count_table
UNION
SELECT 'C4Count',C4Count
FROM #test_count_table
--Another way (direct)
SELECT 'C1Count',COUNT(C1)
FROM #test_data_table
WHERE C1!=''
UNION
SELECT 'C2Count',COUNT(C2)
FROM #test_data_table
WHERE C2!=''
UNION
SELECT 'C3Count',COUNT(C3)
FROM #test_data_table
WHERE C3!=''
UNION
SELECT 'C4Count',COUNT(C4)
FROM #test_data_table
WHERE C4!=0
What I am trying to find out is if there is a more efficient way to do this as while I'm currently testing on some small tables, I will have to run this on some very large ones too (10 million+ rows). I want to get away from using dynamic SQL if possible and would really like a solution where I can make it as set-based as possible.
The results need to be in the form presented by either of the above methods (column names as row labels and counts in a column)
Any suggestions?
Thanks in a advance,
Andrew

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply

Jack Corbett SSC Guru Points: 184394 More actions · Answer 1

I think you have the best way to do this.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

ajrendall SSC-Addicted Points: 450 More actions · Answer 2

I feared this may be the case.

:satisfied:

Thanks for the help.

Jack Corbett SSC Guru Points: 184394 More actions · Answer 3

I think the first method is better then the second with the unions because you are only scanning the source table once. If you have to cross tab/pivot it, check the articles on that in my signature.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

ajrendall SSC-Addicted Points: 450 More actions · Answer 4

Thanks again.

I had already read most of those excellent articles, but forgotten that I had :hehe:. I've now re-read them fully.

A.

Jeff Moden SSC Guru Points: 1004414 More actions · Answer 5

As a side bar, I'm not sure why you want to avoid dynamic SQL for this... it's not slow when used for batch processess... just for bazillions of single row GUI hits. Since it's not public facing, it's not likely that you need to worry about SQL Injection but preventing it isn't that hard, either.

Written correctly, dynamic SQL isn't difficult to troubleshoot, either.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Paul White SSC Guru Points: 150468 More actions · Answer 6

SELECT U.Name,

U.Value

FROM (

SELECT C1Count = COUNT(ASCII(C1)),

C2Count = COUNT(ASCII(C2)),

C3Count = COUNT(ASCII(C3)),

C4Count = COUNT(NULLIF(C4, 0))

FROM #test_data_table

) S

UNPIVOT (

Value

FOR Name

IN (C1Count, C2Count, C3Count, C4Count)

) U;

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi