March 3, 2006 at 9:16 am
Hello Mike,
this is a very good article, everything correctly describes issues I have to deal with almost every day when working with reports.
It would be a good idea if you would like to write a part 2 dealing with joins and nulls. The example may be when you have customers and orders and you would like to return all customers (or a count of customers) who did not place the orders. Sometimes these queries could be complex, especially when the specific order is required.
Regards,Yelena Varsha
March 3, 2006 at 10:16 am
Hi Steve and Yelena,
Thanks for the feedback! I'll look into writing an article on JOINs in the future. I'm actually working on a couple of other articles now, but I'll draft up a couple of ideas for later. My feeling is that you can't over-explain JOINs. Especially now that SQL 2005 kicks it up a notch with CTE's.
Thanks again
March 6, 2006 at 7:10 am
To expand a bit on the count(*) versus count(1) discussion, what does count(1) mean?
The fact that count(1) returns the same value as count(*) is an artifact of the parser and the query optimizer. In fact, count(0), count(2), and count(20) also return the same value on both of the SQL implementations I've tested this morning.
I suspect that the suggestion that count(1) is faster than count(*) came from a poorly constructed performance test on some version of some SQL implementation.
Keith W. Hare
March 6, 2006 at 7:34 am
The point I wanted to make was that "select count(x) from table" gives you the number of non-null values returned by the statement "select x from table". This applies when x is a column and also when x is something else (except *). So count(*) isn't so special after all, you may think of it as nothing more than count(1) (or count(0) if you prefer). I certainly don't claim it is faster...
I acknowledge that count(*) is part of the ansi syntax, as Mike also points out. I don't want to throw away count(*), only point out that SQL (on SQL Server 2000) is equally powerful without it.
March 7, 2006 at 7:47 am
The point I'm making is that COUNT(1) is not defined as standard syntax, COUNT(*) is. COUNT(1) [or COUNT(100), etc.] doesn't have a standard behavior defined, so that its behavior could change or even be removed in future versions of SQL Server... That being the case, I don't see COUNT(constant) as being powerful... I think it's probably more along the lines of a proprietary undocumented feature that should probably be avoided in favor of well-documented/standard features and syntax. I think "more powerful" implies that COUNT(constant) achieves something above and beyond what can be done with the standard ANSI COUNT(*) syntax... I don't see that as the case though...
March 7, 2006 at 1:07 pm
As I tried to say in my previous post, I think I understand your point.... I only wanted to say that on SQL Server 2000 you don't lose anything (except perhaps performance) by not using count(*) (the quote "more powerful" must be from somewhere else ). I think expressions such as count(1) or count('') or even count(case when column = 1 then null else 1 end) are pretty straightforward (but not necessarily very useful) generalizations of the count(column) syntax, and I would be very surprised if this behaviour changes in future versions of SQL Server (no guarantees, however ). Personally I stick with count(*) in queries but sometimes think of it as count(1). This could very well be because I have never read the ansi standard and have never worked with other database systems...
March 8, 2006 at 6:06 pm
Oops you're right, you said SQL was "just as powerful" without COUNT(*), not "more powerful". Sorry about that. Because COUNT(constant) doesn't have a guaranteed behavior, it's possible that ANSI could define a specific behavior for it in the future that differs from the MS implementation; or they could define that it should throw an exception, or any other number of strange things that could break code down the road.
On another topic, one of the items I should have mentioned in the article is COUNT(DISTINCT column), which is also defined by ANSI. This variation returns the count of DISTINCT rows for a particular column. Like COUNT(column), this one eliminates NULLs before returning the result. I think this variation tends to get less use than it's cousins COUNT(*) and COUNT(column), but it can be useful on occasion.
Thanks again.
September 6, 2006 at 12:01 pm
I ran into a snag on the SUM() section, especially the statement "we can rest assured that NULLs are eliminated from our result"
I tried to use SUM() as suggested but it returns NULLS.
I then tried to use a CASE statement like this (but it failed to eliminate NULLS as well):
SELECT SUM(CASE WHEN someField IS NULL THEN 0) AS SumOfSomeField FROM someTable
To get it to work I needed to change the CASE logic like this:
SELECT SUM(CASE WHEN someField IS NOT NULL THEN someField ELSE 0) AS SumOfSomeField FROM someTable
I can't understand why T-SQL behaves this way. But then again it is a Microsoft product so I'm not surprised
September 6, 2006 at 2:07 pm
The only time (that I know of) that SQL returns NULLs with SUM() is if there are no rows to SUM. This is the case if: 1) There are no rows in your table, and 2) If all the rows you are SUMming are NULL [in which case they are eliminated, and it's the same thing as SUMming no rows].
The SUM of no rows is not a known value. AFAIK that's standard ANSI procedure. The only other thing I can think of without seeing your table and your test data, is that you might have set some strange non-ANSI NULL feature.)
BTW, how does this work out for you:
SELECT SUM(COALESCE(somefield, 0)) FROM sometable
September 24, 2006 at 8:39 pm
Hi Mike ,
happen to stumble across these articles of yours..they are really informative & well explained.
thanks for making the technology a better tool to work with...
Priyaranjan Mohanty
September 26, 2006 at 7:10 pm
Thank you Priyaranjan, I appreciate the feedback!
March 27, 2008 at 11:39 pm
greate one , very useful for aggregate function,
thanks
April 4, 2008 at 4:33 am
Very good! Thank you.
I was waiting for someone to address the fact that nulls are grouped together by a group by clause -- a special case exception to the "null is never equal null" rule? (Not equal, just the same as...)
April 4, 2008 at 12:10 pm
April 5, 2008 at 4:32 pm
Mike,
I've started reading your article and am not clear why the query: SELECT COUNT(region) FROM Suppliers WHERE Region IS NOT NULL returns 0
1. WHERE clause eliminates every row in which region is not NULL.
Therefore, I only have 9 rows where Region IS Not Null
3. The COUNT(region) function eliminates every row in which region is NULL.
I would still expect 9 rows still to be returned back after the select query is parsed.
If you could please elaborate on why 0 rows are returned back from the query.
Viewing 15 posts - 16 through 30 (of 31 total)
You must be logged in to reply to this topic. Login to reply