query join, two many-to-many relationships

  • Hi, I'm hoping someone can help me with a query that I am so far unsuccessful with. I need to avoid nested SELECT's because I need to keep it compatible with MySQueaL 4.0

    I'm trying to get all the members that fall into one specific Activity category, AND many business categories (Activities and Business Categories have their own table which isn't shown below). I can get it to work as an OR:

    SELECT DISTINCT Member.id, Member.name FROM (Member LEFT JOIN Business ON Member.id=Business.id) LEFT JOIN Activity ON Member.id=Activity.id WHERE Activity.Code = 'Coordinator' And Business.code IN ('code1','code2');

    But I can't figure out how I can get all the members that match multiple Business.Code records. Any takers?

    ('id' field always represents the member.id)

     

  • Here you want a member that has Activity code Coordinator and Business codes code1 AND code2. The way to do this is to LEFT JOIN to Business twice, in effect to two 'copies' of Business, once to look for code1 and once to look for code2:

    SELECT DISTINCT Member.id, Member.name FROM
    ((Member LEFT JOIN Activity ON Member.id=Activity.id)
    LEFT JOIN Business AS Business1 ON Member.id=Business1.id)
    LEFT JOIN Business AS Business2 ON Member.id=Business2.id
    WHERE
    Activity.Code='Coordinator' AND Business1.code='code1' AND Business2.code='code2'

    As you can see, depending on how many conditions you want to apply you will have to join out to another instance of whatever table each time. This code, which doesn't use subqueries, but only outer joins and table aliasing, should be simple enough to work in MySQL.

     

  • A more efficient way would be to use a full outer join. This will return records not matching either side of the middle table.

    SELECT DISTINCT M.id, M.name

    FROM Member M

    FULL OUTER JOIN Activity A

    ON M.id=A.id

    FULL OUTER JOIN Business B

    ON M.id=B.id

    WHERE A.Code='Coordinator'

    AND B.code='code1'

    AND B.code='code2'


    ------------------------------
    The Users are always right - when I'm not wrong!

  • First off you may have the query wrong, because of your WHERE conditions you are actually inherently getting the results of an inner join.

    Based on yours then if you wrote

    SELECT

     DISTINCT

     Member.id,

     Member.name

    FROM

     Member

    INNER JOIN

     Business

    ON

     Member.id = Business.id

    INNER JOIN

     Activity

    ON

     Member.id = Activity.id

    WHERE

     Activity.Code = 'Coordinator' And

     Business.code IN ('code1','code2');

    you will get the same results.

    The reason is for Acitvity.code and Business.Code to have a value, then matching records have to exist in each of those tables.

    Thus an INNER JOIN resultset is the net effect.

    However if this is the desired effect then this may perform better

    SELECT

     DISTINCT

     Member.id,

     Member.name

    FROM

     Member

    INNER JOIN

     Business

    ON

     Member.id = Business.id AND

     Business.code IN ('code1','code2')

    INNER JOIN

     Activity

    ON

     Member.id = Activity.id AND

     Activity.Code = 'Coordinator';

    Which may perform faster as it is working on minimalizing the data at each join step by way of the extra condition. This should still work with MySQL.

    Now the questions I have which seems to need to be asked are:

    1) Are you wanting only the folks who have a match in either or a match in both?

    2) Is Business.code a condition that it has to be both or can be either?

    3) What is your expected result set supposed to look like?

    4) If the following

    Member

    --------------

    id name

    1 bob

    2 carol

    3 dave

    4 jim

    Business

    --------------

    id code

    1 code1

    1 code2

    2 code2

    3 code1

    4 code1

    4 code2

    Activity

    --------------

    id code

    1 Leader

    2 Coordinator

    3 Coordinator

    4 Coordinator

    What whould you expect the output to be and why?

  • Hi AKM, that did the trick, thanks!

    Antares686, I would expect Jim only because he matches all three criteria, AKM's code returns what I was expecting!

    So I'm new to the db 'stuff' side of things, what indices would be best on the business and activity tables? Right now there is only a single index on each, covering both columns, would it not be better to also have an index on the respective 'code' columns? Or am I totally off-base on how indices work? (I inherited this schema recently, never was a 'db' guy before this)

  • As far indexes a covering index with both ID and Code is fine. You would not need to add one specifically for Code as the one with ID and Code would contain more unique values statistically and be choosen anyway.

    Also, now knowing your expected results this may perform a bit better as it removes the extra join and with GROUP BY like this DISTINCT is not needed.

    Note this assumes Code values per ID cannot be duplicated in either table.

    SELECT

     Member.[id],

     Member.[name]

    FROM

     Member

    INNER JOIN

     Business

    ON

     Member.[id] = Business.[id] AND

     Business.code IN ('code1','code2')

    INNER JOIN

     Activity

    ON

     Member.[id] = Activity.[id] AND

     Activity.Code = 'Coordinator'

    GROUP BY

     Member.[id],

     Member.[name]

    HAVING

     COUNT(Member.[id]) = 2;

  • So I've played with this on-and-off for the past few days, I see that it works but I don't quite understand why. Can you offer an explanation? I find it interesting, but I'm not comfortable using code that I don't understand.

    What if I had 5 business codes that I wanted to match, would it then be

    Business.code IN ('code1'....'code5')

    and

    HAVING COUNT(Member.id) = 5;

    ?

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply