April 16, 2013 at 7:57 am
Hugo Kornelis (4/16/2013)
Good question, mediocre explanation - no errors, but complicating things more than needed.There are three variations of COUNT:
* COUNT(*) - returns the number of rows, regardless of content.
* COUNT(expression) - evaluates expression for each row, and counts returns the number of non-NULL results. Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation. The last one is too - it uses a complex expression that always returns NULL; COUNT returns the number of non-NULL results, which is 0.
* COUNT(DISTINCT expression) - as the previous one, but only the number of distinct values is returned. So if the same value is returned multiple times, it will be counted as 1. Rarely used in practise. The version in this question that used a constant expression is legal syntax, but will never be used in practise, as it will by definition return 1, except if the source of the query has no rows.
Thanks for the explanation Hugo.
I have a small doubt on
Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation.
COUNT((columnname) and COUNT(1) returns the number of non-NULL results ??
create table #temp (SomeText varchar(20))
insert into #temp
select 'SomeText1'
union all
select NULL
union all
select 'SomeText2'
union all
select 'SomeText3'
union all
select 'SomeText3'
select COUNT(*) from #temp
select COUNT(1) from #temp
select COUNT(SomeText) from #temp
drop table #temp
The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.
April 16, 2013 at 8:01 am
Hugo Kornelis (4/16/2013)
Good question, mediocre explanation - no errors, but complicating things more than needed.There are three variations of COUNT:
* COUNT(*) - returns the number of rows, regardless of content.
* COUNT(expression) - evaluates expression for each row, and counts returns the number of non-NULL results. Both COUNT(columnname) and COUNT(1) are commonly used versions of this variation. The last one is too - it uses a complex expression that always returns NULL; COUNT returns the number of non-NULL results, which is 0.
* COUNT(DISTINCT expression) - as the previous one, but only the number of distinct values is returned. So if the same value is returned multiple times, it will be counted as 1. Rarely used in practise. The version in this question that used a constant expression is legal syntax, but will never be used in practise, as it will by definition return 1, except if the source of the query has no rows.
Great question and great explanation Hugo. We can always COUNT() on you. :hehe:
April 16, 2013 at 8:04 am
kapil_kk (4/15/2013)
For the last query:select COUNT(convert(int,NULL)) from #temp
It will return 0 because when you execute the query:
SELECT CONVERT(int,NULL) it will return NULL and as the return type of COUNT is INT so NULL will implicitly
converted to '0'.
Hope it will clear to you now Yogi π
This isn't quite right Kapil. It is not an implicit conversion. What happens is you count NULL which will be zero. It doesn't actually matter what datatype you use, it just needs to have a datatype. It could just as easily be varchar or uniqueidentifier. The point is that you are casting NULL as a datatype and counting the number of NULL which we know will be ignored in a COUNT so the result is 0.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 β Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 16, 2013 at 8:39 am
rals (4/16/2013)
(...)The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.
COUNT(*) returns the number of rows (after evaluating WHERE and GROUP BY if they are in the query) - by definition.
COUNT(1) is a specific use of the generic form COUNT(expression). The expression here is a very simple one - the constant 1. The definition of COUNT(expression) is to evaluate expression for each row - that will result in the value 1 for each row - and then count the number of non-NULL results. Since 1 is never NULL, the result of COUNT(1) will indeed always be equal to COUNT(*), allthough the "official" way to get there is a lot more complicated.
(And before you ask - the optimizer is smart enough to know that COUNT(1), just as COUNT(0) or COUNT('T'), or even COUNT(non-nullable column), are all equal to COUNT(*) and will use the same code path for all these versions).
And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).
April 16, 2013 at 9:09 am
Great explanation Hugo. Thanks.
Even I always use COUNT(1) though few times I heard both COUNT(*) and COUNT(1) are same. But today I more clear and thanks for you.
π
April 16, 2013 at 10:05 am
Thanks for a nice one, Yogesh!
April 16, 2013 at 10:24 am
Thanks for the question
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
April 16, 2013 at 10:33 am
I thought the original poster's explanation was great: I found it very simple to understand and I liked the way each count statement was addressed individually.
April 16, 2013 at 11:01 am
Rose Bud (4/16/2013)
I thought the original poster's explanation was great: I found it very simple to understand and I liked the way each count statement was addressed individually.
The problem with the original explanation is this:
COUNT(convert(int,NULL)) - It will give count as 0 always. Don't know why it is giving 0 count however COUNT() function counts NULL values also.
The problem is that convert(int,NULL) is not a row but a column value, and the count function - although it counts rows all of whose values are null - doesn't count null values when it's counting the number of non-null occurences of a column. There is a big difference between count(*) )or count(<non-null constant>) from <something> and count(X) from <something> - the first is counting rows and doesn't care about nulls - a row is a row even if it contains nothing but nulls, the second doesn't see any nulls anyway, while the third (which is what was in question here, in this particular case) is counting non-null values.
Tom
April 16, 2013 at 11:01 am
Thanks for the great question!
If you want retunr 5? here you go,
select COUNT(@@ERROR) FROM #temp
select COUNT(0) from #temp
select COUNT(100000000000000000000000000000) from #temp
select COUNT('ZZZZZZZZZZZZZZZZZZZZZZZZZZZZ') from #temp
select COUNT(-5.0) from #temp
select COUNT(+5.0) from #temp
select COUNT('') from #temp
select COUNT($10.0) from #temp
select COUNT(\.00) from #temp
--------------------------------------
;-)βEverything has beauty, but not everyone sees it.β β Confucius
April 16, 2013 at 12:12 pm
Hugo Kornelis (4/16/2013)
And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).
+1
You always seem to turn even the simplest of these Qotds to something so insightful that it forces me to read the discussions on every qotd...
You are awesome..!!!
___________________________________________________________________
If I can answer a question then anyone can answer it..trying to reverse the logic.. :hehe:
April 16, 2013 at 12:38 pm
Mike Hays (4/16/2013)
Interesting, learned something, thanks for the good question....
Same Here!
Not all gray hairs are Dinosaurs!
April 16, 2013 at 9:28 pm
Sean Lange (4/16/2013)
kapil_kk (4/15/2013)
For the last query:select COUNT(convert(int,NULL)) from #temp
It will return 0 because when you execute the query:
SELECT CONVERT(int,NULL) it will return NULL and as the return type of COUNT is INT so NULL will implicitly
converted to '0'.
Hope it will clear to you now Yogi π
This isn't quite right Kapil. It is not an implicit conversion. What happens is you count NULL which will be zero. It doesn't actually matter what datatype you use, it just needs to have a datatype. It could just as easily be varchar or uniqueidentifier. The point is that you are casting NULL as a datatype and counting the number of NULL which we know will be ignored in a COUNT so the result is 0.
Thanks for correcting me Sean
_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
April 16, 2013 at 11:47 pm
Hugo Kornelis (4/16/2013)
rals (4/16/2013)
(...)The above code returns 5, 5, 4. Based on the output it looks like COUNT(*) and COUNT(1) returns the number of rows, regardless of content.
COUNT(*) returns the number of rows (after evaluating WHERE and GROUP BY if they are in the query) - by definition.
COUNT(1) is a specific use of the generic form COUNT(expression). The expression here is a very simple one - the constant 1. The definition of COUNT(expression) is to evaluate expression for each row - that will result in the value 1 for each row - and then count the number of non-NULL results. Since 1 is never NULL, the result of COUNT(1) will indeed always be equal to COUNT(*), allthough the "official" way to get there is a lot more complicated.
(And before you ask - the optimizer is smart enough to know that COUNT(1), just as COUNT(0) or COUNT('T'), or even COUNT(non-nullable column), are all equal to COUNT(*) and will use the same code path for all these versions).
And for the sake of completeness - the use of COUNT(1) instead of COUNT(*) to count the number of rows is very common. Long ago (I think it was in the days of SQL Server 6.0, maybe even before that), the parser used to first expand the * in COUNT(*) to a full column list, which would then be still ignored. Using COUNT(1) was more efficient because it avoided the hit on the metadata tables to get that columnlist. That has long since changed - but the myth lives on, and many people still use COUNT(1), claiming it to be better then COUNT(*).
I was not aware of this things..
Thanks a lot Hugo for telling new things π
_______________________________________________________________
To get quick answer follow this link:
http://www.sqlservercentral.com/articles/Best+Practices/61537/
April 17, 2013 at 5:09 am
Thanks for the question. In general, I don't favor questions the require 57 answers to be evaluated. However, this one was OK as the code involved wasn't too complex.
[font="Verdana"]Please don't go. The drones need you. They look up to you.[/font]
Connect to me on LinkedIn
Viewing 15 posts - 16 through 30 (of 46 total)
You must be logged in to reply to this topic. Login to reply