Understanding NULL in SubQueries

  • Hi, I have the following extract from a subquery.

    WHERE EXISTS (SELECT NULL

    FROM tblCustomerData b

    I need to know how the 'NULL' actually fits in to this query and how it effects the output.

    Thanks

    Scott

  • It will generate the same number of NULL rows as the number of rows in the table. Since this will return a count of records higher than 0 it will me the exists check and thus the where. However if your Select statement has a where clause that would generate no rows, then no NULL rows will generate and a 0 record count will occurr, thus does not exist.

  • I'm really sorry..i'm just as confused.. Can you explain the following query for me

    USE Northwind

    GO

    SELECT CategoryName

    FROM Categories

    WHERE EXISTS (SELECT NULL)

    ORDER BY CategoryName ASC

    GO

    and how the (SELECT NULL) comes into play.

    Thanks again for your time

    Scott

  • Mr. Travis is correct in that it will return the number of rows, all having NULL to be checked by the EXISTS() function, matching the parameters of the query contained within the EXISTS(). If the EXISTS() clause gets any rows back, it evaluates to TRUE. So it doesn't matter if the query had SELECT null, SELECT 1, or SELECT <column name> there. EXISTS is just looking for the existence of at least one row.

    I'm assuming that this is part of a correlated subquery, where part of a predicate (a statement that evaluates to True, False, or Unknown as is EXISTS() in this case) is dependent upon the SELECT clause. What happens is the outer query is built (the one with the SELECT CustomerID FROM Customers) and the subquery (within the EXISTS() clause) is executed for each row in the outer query in order to generate a final result set.

    An example from the Northwind sample database:

    SELECT DISTINCT CustomerID 
    
    FROM Customers
    WHERE EXISTS(SELECT NULL FROM Orders WHERE Customers.CustomerID = Orders.CustomerID)

    This will return all CustomerID for which I have an order. This query should return 89 records. However, if I was to look at the number of rows in the Customers table, I'd see there are 91.

    Now, in this particular case I can do the same thing with a JOIN:

    SELECT DISTINCT Customers.CustomerID
    
    FROM Customers JOIN Orders ON Customers.CustomerID = Orders.CustomerID

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

  • quote:


    USE Northwind

    GO

    SELECT CategoryName

    FROM Categories

    WHERE EXISTS (SELECT NULL)

    ORDER BY CategoryName ASC

    GO

    and how the (SELECT NULL) comes into play.


    In this particular case, SELECT NULL is ALWAYS going to generate a row. It's the same as doing SELECT 1. In QA, apart from any other query, if we execute SELECT NULL we'll get back a record set with a single row with a single field and it'll be NULL.

    For this particular query, the SELECT NULL in combination with EXISTS doesn't do anything at all. It is there to demonstrate that when the EXISTS() clause gets back something, anything, it evaluates to true, even if what it gets back is a single row with a field that has a value of NULL. Only in the case where no rows are returned would it evaluate to false. For instance:

    SELECT CategoryName
    
    FROM Categories
    WHERE EXISTS (SELECT NULL FROM Categories WHERE 1 = 0)
    ORDER BY CategoryName ASC

    I've done the opposite of the SELECT NULL. I've ensured that there is never a row returned, because 1 never equals 0.

    Hope that helps.

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

  • Now i'm coming to understand the principles of this. My original query i was checking out is as follows for finding duplicates in a table;

    SELECT Surname, Address1, Address2, Address3

    FROM tblCustomerData

    WHERE EXISTS (SELECT NULL FROM tblCustomerData b

    WHERE b.Surname = tblCustomerData.Surname AND b.Address1 = tblCustomerData.address1

    AND b.address2 = tblCustomerData.address2 AND b.address3 = tblCustomerData.address3

    AND b.Town = tblCustomerData.Town AND b.county = tblCustomerData.county AND b.Postcode = tblCustomerData.PostCode

    GROUP BY b.surname, b.address1, b.address2,b.address3

    HAVING tblCustomerData.CustomerNo < MAX(b.CustomerNo))

    so because it produces a Null value it therefore makes the EXIST() statement true, therefore makes the outer query be outputted on the screen.

    I'm with it..AT LAST!! bit of a strange concept..I'm still re-reading the posts to get it lodged into my head.

    Thanks for your help again..most grateful

    Scott

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply