Select Distinct - but only on a joined table

  • CREATE TABLE #tblContacts

    (

    ContactID int,

    ContactName varchar(50)

    )

    GO

    INSERT INTO #tblContacts (ContactID, ContactName)

    SELECT 1, 'Fred' UNION ALL

    SELECT 2, 'Jim'

    CREATE TABLE #tblEmails

    (

    EmailID int,

    EmailAddress varchar(100),

    ContactID int,

    LocationID int

    )

    GO

    INSERT INTO #tblEmails (EmailID, EmailAddress, ContactID, LocationID)

    SELECT 1, 'Fred@CompanyA.cam', 1, 25 UNION ALL

    SELECT 2, 'Fred@CompanyB.cam', 1, 37 UNION ALL

    SELECT 3, 'Fred@CompanyB.cam', 1, 21 UNION ALL

    SELECT 4, 'Jim@CompanyA.cam', 2, 15 UNION ALL

    SELECT 5, 'Jim@CompanyC.cam', 2, 32

    SELECT ContactName, EmailAddress, EmailID

    FROM #tblContacts

    INNER JOIN #tblEmails ON #tblContacts.ContactID = #tblEmails.ContactID

    DROP Table #tblContacts

    DROP Table #tblEmails

    GO

    The code above returns 5 records - 3 records for Fred and 2 records for Jim. #tblEmails contains a list of email addresses for people at different locations (organisations - but that is not relevant).

    But, where someone has the same email address listed twice, I only want to return one of the email addresses. I don't care which one.

    At the moment, the data returned in the code above is:

    Fred_____Fred@CompanyA.cam____1

    Fred_____Fred@CompanyB.cam____2 <------- one of these needs to go, don't care which one

    Fred_____Fred@CompanyB.cam____3 <------- one of these needs to go, don't care which one

    Jim______Jim@CompanyA.cam____4

    Jim______Jim@CompanyC.cam____5

    What do I need to do to get the data returned to be:

    Fred_____Fred@CompanyA.cam____1

    Fred_____Fred@CompanyB.cam____2

    Jim______Jim@CompanyA.cam____4

    Jim______Jim@CompanyC.cam____5

    Thanks for any help. I have used (been shown by people on here) RANK() before to select the Top 1 of a group of records in a joined table. But, here, I don't want the Top 1 - I want, in Fred's case - 2 out of 3, I want the DISTINCT records - based on Email Address - in the #tblEmails table - but also returning EmailID which is unique for every row.

  • I can think on several different ways to accomplish this. I'm not sure which one is the best on your scenario. Could you test and tell us which one and why?

    SELECT DISTINCT ContactName, EmailAddress

    FROM #tblContacts c

    INNER JOIN #tblEmails e ON c.ContactID = e.ContactID;

    SELECT ContactName, EmailAddress

    FROM #tblContacts c

    INNER JOIN(SELECT DISTINCT ContactID, EmailAddress

    FROM #tblEmails) e ON c.ContactID = e.ContactID;

    SELECT ContactName, EmailAddress

    FROM #tblContacts c

    CROSS APPLY(SELECT DISTINCT EmailAddress

    FROM #tblEmails

    WHERE c.ContactID = ContactID) e ;

    WITH cteEmails AS(

    SELECT ContactID,

    EmailAddress,

    ROW_NUMBER() OVER( PARTITION BY ContactID, EmailAddress ORDER BY (SELECT NULL)) rn

    FROM #tblEmails

    )

    SELECT DISTINCT ContactName, EmailAddress

    FROM #tblContacts c

    INNER JOIN cteEmails e ON c.ContactID = e.ContactID

    WHERE rn = 1;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thank you for your reply. My apologies - I am an idiot. I failed to mention I need the EmailID returned as well. I have modified my code.

    Edit again. I have added EmailID in to the CTE you provided, and this works perfectly. Thank you again.

  • You just need to change some things and the 4 options are still available to test. 😉

    SELECT c.ContactName, e.EmailAddress, MIN( e.EmailID) EmailID

    FROM #tblContacts c

    INNER JOIN #tblEmails e ON c.ContactID = e.ContactID

    GROUP BY c.ContactName, c.ContactID, e.EmailAddress;

    SELECT c.ContactName, e.EmailAddress, e.EmailID

    FROM #tblContacts c

    INNER JOIN(SELECT ContactID, EmailAddress, MIN( EmailID) EmailID

    FROM #tblEmails

    GROUP BY ContactID, EmailAddress) e ON c.ContactID = e.ContactID;

    SELECT c.ContactName, e.EmailAddress, e.EmailID

    FROM #tblContacts c

    CROSS APPLY(SELECT EmailAddress, MIN( EmailID) EmailID

    FROM #tblEmails

    WHERE c.ContactID = ContactID

    GROUP BY ContactID, EmailAddress) e ;

    WITH cteEmails AS(

    SELECT ContactID,

    EmailAddress,

    EmailID,

    ROW_NUMBER() OVER( PARTITION BY ContactID, EmailAddress ORDER BY (SELECT NULL)) rn

    FROM #tblEmails

    )

    SELECT c.ContactName, e.EmailAddress, e.EmailID

    FROM #tblContacts c

    INNER JOIN cteEmails e ON c.ContactID = e.ContactID

    WHERE rn = 1;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply