Outer joins to inner joined tables - CTEs best?

  • Hi there, this is my first posting...

    I've come from a world of simple SQL against a Foxpro db, and we are now converting a whole heap of Crystal Reports to new reports where we're doing as much as we can in tSQL (having ported our app to SQL Server from Foxpro) before letting the new reports software loose on the results.

    I have been discovering the joys of inline SELECTs, and other ways to avoid subreports within main reports.... and even being able to place comments in my SQL is a massive leap forward! 😀

    I was a bit stumped that I couldn't do the following, though :

    (A INNER JOIN B) LEFT OUTER JOIN (C INNER JOIN D)

    - where A & B represent our main customer table and a few joins for other fields, for a report which will be one line per customer.

    On each customer's line we also requre some totals from the C & D tables, if there are some.

    No matter what I tried, it always seemed to return roughly half of the customers, seemingly ignoring my polite request for the OUTER JOIN so that we'd still get a line for each customer. This was handled with a subreport in the original Crystal Report - but even though we've got a way to achieve subreports I'd rather get it sorted in the main SQL if possible.

    If it was just the one total we needed, the inline SELECT would be fine, but with 3 totals required from C & D it seems ridiculous (and slow) to have 3 inline SELECTs being almost identical. Is there a way to return more than one field from the additional SELECT? Probably not. (although we have done something like this before now, where we glued together several fields into a string to get a WHERE.. IN (SELECT... working, where the old report suppressed certain rows based on specific criteria - which was fun)

    In the end I discovered CTEs (thanks!!!!) and was able to define E beforehand as the C & D join, and do the one OUTER JOIN to that. Result!!

    It seems that an outer join only works reliably with one more single table...?

    However, having poked around these forums a bit more today, I found out about temporary tables, and tried my E join as a (SELECT) after my OUTER JOIN. That still didn't give me the entire list of customers, whereas the CTE method does. Without posting the full details, can anyone spot any glaring errors I've made, or suggest ways to get the outer joins working as I require?

    At the moment I'm delighted with the CTE approach, and it absolutely made my day to discover how powerful it is. But if there's a better way, I'd love to hear it! I'm hoping that if I hang around here long enough, I'll pick up so much know-how that I'll look back on this first query and cringe 😛 (and hopefully if I climb the ladder high enough I'll try to help others up too - it's only fair)

    So, if you needed

    (A INNER JOIN B) LEFT OUTER JOIN (C INNER JOIN D)

    with every row from (A INNER JOIN B) and also any totals from (C INNER JOIN D) - using COALESCE to handle the NULLs - what would you do?

    Many thanks

  • HI there,

    It will be hard for us to tell why your CTE returned different results to your tempory table, without actually seeing the code for the creation of the two objects.

    There are a few methods you could use for your problem some of which you have already come across:

    CTE

    Table Variables

    Temp Tables

    Sub-Queries

    All of the above could be used to create a sub set of data namely E (a inner join b) for example, and in turn you can then join onto this data set to other tables/sub-queries to gather results.

    I hope this helps.

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Thanks, I'm not sure I should post any of this company's code online, so I'll play it safe and not do that.

    I was just wondering if I'd overlooked anything that was obvious to anyone more experienced - in the way I wanted the OUTER JOIN to work.

    Cheers

  • HI There,

    It's still going to be pretty hard to give advise without knowing the relationship between the 4 tables.

    Could you give us an example such as.

    CREATE TABLE [TableA]

    (....)

    CREATE TABLE [TableB]

    (....)

    CREATE TABLE [TableB]

    (....)

    CREATE TABLE [TableB]

    (....)

    With the data for each col1 something simple for us to work with , as well as the replationship between the different tables, i.d tableA.Col1 = TableB.Col6 etc.

    And then if you have time what you want your results set to be .

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • a648 (7/28/2008)


    I was a bit stumped that I couldn't do the following, though :

    (A INNER JOIN B) LEFT OUTER JOIN (C INNER JOIN D)

    - where A & B represent our main customer table and a few joins for other fields, for a report which will be one line per customer.

    On each customer's line we also requre some totals from the C & D tables, if there are some.

    No matter what I tried, it always seemed to return roughly half of the customers, seemingly ignoring my polite request for the OUTER JOIN so that we'd still get a line for each customer.

    This syntax works as far as I know. What is almost certainly happening is that your Customer table(A) records are being filtered out either by the "(A INNER JOIN B)" or by WHERE conditions (which act like inner join conditions). Try executing just the first inner join with your WHERE conditions and see what you get back.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • The follow SQL code demonstrates that this syntax does in fact work correctly:

    Select count(*)

    From (sys.columns A INNER JOIN sys.objects B on A.object_id = b.object_id)

    LEFT OUTER JOIN

    (sys.endpoints C INNER JOIN sys.events D on D.object_id = C.endpoint_id)

    ON D.object_id = A.object_id

    If you just execute the fist two lines and then execute the whole query, you will see that they both return exactly the same number of rows. (Note that the second Inner Join (highllighted) probably does not return any rows).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks all, I think I've figured out where I went wrong.

    I was filtering my C & D results in the main WHERE clause, which obviously affects the total number of rows returned. Simple after a good night's sleep 😀

    I had :

    (A INNER JOIN B) LEFT OUTER JOIN (C INNER JOIN D)

    WHERE a=value and b<>filter and d<>filter - etc.

    ... when what I really wanted was something like :

    (A INNER JOIN B) LEFT OUTER JOIN (C INNER JOIN D WHERE... )

    With the tempory table method, it wasn't working because I was still filtering in the main WHERE clause, but it DID work when I moved those filters into a new WHERE clause in the tempory table's SELECT.

    With the CTE method I tried, I had already put the relevant WHERE clause in the CTE instead of the main SELECT's WHERE, which is why it worked.

    Many thanks. From now on I'll use either the temp table method, or a CTE, depending on how the mood takes me... or if there are any great advantages to either way?

    Cheers

  • FYI: In general the proper palce to put OUTER JOIN conditions is in the OUTER JOIN's ON clause. Though it will also work (and sometimes is faster) to put them inside a derived table subselect or in a CTE.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks - learning all the time here....

    My complete query (2600 rows) takes me 6 seconds with no CTEs - featuring an inline SELECT for one outer join SUM, and a temp table for the three other SUMs from the other outer join.

    The neat version with two CTEs displays with just 1 second on the clock 🙂

    All I have to do now is satisfy myself that CTEs are safe to use in general - do they scale up well, have other performance issues, etc.

    cheers

  • CTE's are as safe to use as anything. They scale up just fine (assuming they are not recursive). The real limitations are in the overall query, not in CTE's vs. Views, Subqueries or anything else.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • By the way, if you have SQL Server 2005, you should be posting in the SQL2005 forums. 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I too was delighted to learn about and use CTE's -- subqueries give me a headache!

    But I have found that there are restrictions that come with CTE's (and I'm not sure yet what they all are.) But, for example, if I need to declare local variables I run afoul of restrictions on the CTE syntax. Also, testing or timing just parts of the query can get messy.

    For those reasons, I have started using #temp tables in preference to CTE/s. (And have probably just not hit any unusual restrictions of those yet.)

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply