SELECT records from table 1 that does not exist in table 2 via UNION

  • Sorry if this seems too beginner; I have been stuck at this for days and for some reason couldn't figure out the answer.

    Suppose I have tbl Customers and tbl Orders.  The relationship is that 1 customer can have many or no orders.

    I have to use UNION, to retrieve customer records that does not have any orders. I can get the answer via LEFT JOIN, but just can't seem to figure how to do it with UNION.

    The best that I can do is this

    SELECT DISTINCT CustomerID FROM (
    SELECT CustomerID, null as OrderID FROM [Customers] as c
    UNION
    SELECT CustomerID, OrderID FROM [Orders] as o
    ) WHERE OrderID IS NOT null

    Thank you.

    • This topic was modified 3 years, 3 months ago by  technoob.
  • Could you clarify why you say you have to use UNION?

  • I agree with irgp, it makes no sense to use UNION here. At least not with the information you have shared so far.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • I am pretty sure you cannot do this with UNION.  UNION is for combining 2 data sets, not for filtering or reducing a data set.  Using your query, you would get the same results if you completely removed the first part of the UNION because your WHERE clause is filtering that out.

    A UNION is not designed to filter data.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • I was given this question in a coding interview, and was told that I can only use union... since then I've been trying to figure a way to do so.

  • Maybe it was a trick question where the expected answer was "you can't do that" or "that's a completely meaningless thing to do". Remember, the purpose of interview questions are not only to test your technical skills, but also to test how you are as a person. It is not only the answer as such that counts, but also how you make that answer.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Erland Sommarskog wrote:

    Maybe it was a trick question where the expected answer was "you can't do that" or "that's a completely meaningless thing to do". Remember, the purpose of interview questions are not only to test your technical skills, but also to test how you are as a person. It is not only the answer as such that counts, but also how you make that answer.

    Heh... I was thinking the same thing.  One of those questions to find out if you could say "No... that won't work, here's why, and here's what you need to do instead".

    Either that or someone came up with something "spec-hackular" (a "special hack" and can't imagine what it might be but who knows?) that they're entirely too proud of and are showing off at the expense of candidates like I've seen too many people do.  I can see a couple of ways to do it with UNION ALL but not all by itself and not just as a UNION.

    I'd love to see the spec-hack the does it using only UNION, if someone has actually pulled it off.  At the very least, it would demonstrate a different way of thinking and, like I said, who knows?  Maybe they found a better way.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Generally most prospective employers will give feedback after an interview. Why not ask for feedback and mention that you were confused by the question as you've never heard of anyone using a union query for the purpose. If you don't get a reply you have lost nothing, if you do then you may gain some very valuable experience

  • Maybe they meant using a set operator. In which case you would use EXCEPT not UNION.

    SELECT c.CustomerID 
    FROM [Customers] c
    EXCEPT
    SELECT o.CustomerID
    FROM [Orders] o

    Other ways I can think of achieving the desired result:

    SELECT c.CustomerID 
    FROM [Customers] c
    WHERE NOT EXISTS(SELECT *
    FROM [Orders] o
    WHERE o.CustomerId = c.CustomerId);


    SELECT c.CustomerID
    FROM [Customers] c
    LEFT JOIN [Orders] o
    ON o.CustomerId = c.CustomerId
    WHERE c.CustomerId IS NULL;


    SELECT c.CustomerID
    FROM [Customers] c
    OUTER APPLY (SELECT *
    FROM [Orders] o
    WHERE o.CustomerId = c.CustomerId) o
    WHERE o.CustomerId IS NULL;

    SELECT c.CustomerID
    FROM [Customers] c
    WHERE c.CustomerID NOT IN (SELECT c.CustomerID
    FROM [Orders] o)

    But a UNION is not suitable as it can only add rows to a query not remove the ones that don't have a matching row in [Orders].

  • Here's a way, though it would be pretty stupid to do it.

    SELECT x.CustomerID
    FROM (SELECT c.CustomerID, 0 Col1
    FROM [Customers] c
    UNION
    SELECT o.CustomerID, 1 Col1
    FROM [Orders] o) x
    GROUP BY x.CustomerId
    HAVING COUNT(*) = 1;
  • Jonathan AC Roberts wrote:

    Here's a way, though it would be pretty stupid to do it.

    SELECT x.CustomerID
    FROM (SELECT c.CustomerID, 0 Col1
    FROM [Customers] c
    UNION
    SELECT o.CustomerID, 1 Col1
    FROM [Orders] o) x
    GROUP BY x.CustomerId
    HAVING COUNT(*) = 1;

    I though of that bit of code and the MAX ROW_NUMBER version as well.  I dismissed those because I'm under the (possibly mistaken) impression that the original party (not the original poster) wanted a UNION solution without the likes of GROUP BY or ROW_NUMBER() or anything else other than a couple of SELECTS.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • So, the next question is, why do they want to know if you know this?  Is there a non-obvious performance advantage?   I wouldn't believe so but who knows?

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • >> Suppose I have TABLE Customers and TABLE Orders. The relationship is that 1 customer can have many or no orders. <<

    The conventional SQL forms for over 30 years has been that you post some data and some DDL when asking for help.

    Is this what you might have posted if you had been polite? Please notice the use of references to relate with tables together. That's what the R in RDBMS stands for 🙂 . Notice that the tables have keys; this is not an option. Here is a skeleton schema

    CREATE TABLE Customers

    (customer_id CHAR(16) NOT NULL PRIMARY KEY, -- credit card number as key!

    ...);

    CREATE TABLE Orders

    (order_id CHAR(20) NOT NULL PRIMARY KEY,

    customer_id CHAR(16) NOT NULL

    REFERENCES Customers,

    PRIMARY KEY (order_id, customer_id),

    ...);

    CREATE TABLE Orders_Details

    (order_id CHAR(20) NOT NULL REFERENCES Orders,

    item_gtin CHAR(15) NOT NULL, --- look up GTIN if you don't know what it is

    PRIMARY KEY (order_id, item_gtin),

    ...);

    >> I have to use UNION, to retrieve customer records [sic: rows are nothing like records] that does not have any orders. I can get the answer via LEFT OUTER JOIN, but just can't seem to figure how to do it with UNION. <<

    Is this some sort of weird class problem? This is a needless constraint. SQL has said oriented operators that can be quite fast, as well as very compact.

    (SELECT customer_id FROM Customers

    EXCEPT

    SELECT customer_id FROM Orders) AS X (virgin_customer_id)

    You might want to look up the UNION [ALL], INTERSECT [ALL], and EXCEPT [ALL] operators

    Please post DDL and follow ANSI/ISO standards when asking for help. 

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply