Join Issue

  • Hi,

    I have query related to joining. If I am using a OR condition to join 2tables then I am getting more than the expected number of records if the value matches with both conditions. For example.

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    (s.name1=d.name1 or s.name2=d.name2)

    The above query will produce 2 records if the source record matches by both the joining condition.

    Question is, How do I prevent this.

    Regards,
    Pravasis

  • Heres a fairly literal approach. However, I'm sure theres more interesting ways to go about it!

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name1=d.name1

    union

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name2=d.name2

    where s.id not in

    (

    select s.id

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name1=d.name1

    )

  • Could you tell me the purpose of this query? I'm not clear about this why you use OR condition to join 2 tables.

    How to prevent it? => It means that you want to return one record in Test_Join_Source table although it matches with 2 records in Test_Join_Destination?

  • Dung Dinh (5/28/2012)


    Could you tell me the purpose of this query? I'm not clear about this why you use OR condition to join 2 tables.

    How to prevent it? => It means that you want to return one record in Test_Join_Source table although it matches with 2 records in Test_Join_Destination?

    No, The problem is even if there is one record in source and one record in destination and you use a OR join condition then it will give you 2 records rather than 1!!

    Regards,
    Pravasis

  • patrickmcginnis59 (5/28/2012)


    Heres a fairly literal approach. However, I'm sure theres more interesting ways to go about it!

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name1=d.name1

    union

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name2=d.name2

    where s.id not in

    (

    select s.id

    from dbo.Test_Join_Source s

    inner join dbo.Test_Join_Destination d

    on

    s.name1=d.name1

    )

    Thanks Patrick.

    But if I have more than 2 joins then it would be a cumbersome process. I was thinking of more sophisticated way of handing this problem.

    Regards,
    Pravasis

  • I try to test with your case

    Create table

    CREATE TABLE [dbo].[Test_Join_Source](

    [id] [int] NULL,

    [name1] [varchar](50) NULL,

    [name2] [varchar](50) NULL

    )

    CREATE TABLE [dbo].[Test_Join_Destination](

    [id] [int] NULL,

    [name1] [varchar](50) NULL,

    [name2] [varchar](50) NULL,

    [desc] [nchar](10) NULL

    )

    Insert data

    INSERT INTO Test_Join_Source(id,name1,name2)

    VALUES(1,'test1','test2')

    INSERT INTO Test_Join_Destination(id,name1,name2,[desc])

    VALUES(1,'test1','test2','testing')

    Then I run your query. Here is result

    idname1name2desc

    1test1 test2 testing

    No 2 records as you see.

    Let me know if I miss anything

  • I got the same results as Dung Dinh when I set up a similar test scenario, I only got one row from the resulting query. You might want to look at your results a bit closer or try to create a test script including DDL and test data that shows the behavior you are trying to avoid.

  • OP has two records as it matches to two different records in table she JOINs to, one record matches on ONE column and another record matches on SECOND column.

    OP need to supply ddl, sample data and clear expected results and it will attract the relevant help.

    The link at the bottom of my signature leads to the "must read" article for this sort of questions...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Try the following.

    drop table [dbo].[Test_Join_Source2]

    drop table [Test_Join_Destination2]

    CREATE TABLE [dbo].[Test_Join_Source2](

    [id] [int] NULL,

    [name1] [varchar](50) NULL,

    [name2] [varchar](50) NULL

    )

    CREATE TABLE [dbo].[Test_Join_Destination2](

    [id] [int] NULL,

    [name1] [varchar](50) NULL,

    [name2] [varchar](50) NULL,

    [desc] [nchar](10) NULL

    )

    INSERT INTO Test_Join_Source2(id,name1,name2)

    VALUES(1,'test1','test2')

    INSERT INTO Test_Join_Destination2(id,name1,name2,[desc])

    VALUES(1,'test1','test4','testing')

    INSERT INTO Test_Join_Destination2(id,name1,name2,[desc])

    VALUES(2,'test1','test2','testing2')

    select s.id, s.name1,s.name2,d.[desc]

    from dbo.Test_Join_Source2 s

    inner join dbo.Test_Join_Destination2 d

    on (s.name1=d.name1 or s.name2=d.name2)

    Regards,
    Pravasis

  • I'm sure that with the sample data and your query. The result is always 2 records

    idname1name2desc

    1test1 test2 testing

    1test1 test2 testing2

    The 2 records are distinguished. I think that no way to get one record as your expectation.

  • That is exactly what I've described (even a bit more).

    It returns two rows as single record in Test_Join_Source2 matches two different records in Test_Join_Destination2:

    it matches the first one on name1 and matches the second one on name1 and on name2.

    Therefore two records returned.

    I guess OP wants something like "priority"-join:

    select s.id, s.name1,s.name2,COALESCE(d0.[desc], d1.[desc],d2.[DESC])

    from dbo.Test_Join_Source2 s

    LEFT join dbo.Test_Join_Destination2 d0

    on (s.name1=d0.name1) AND (s.name2 = d0.name2)

    LEFT join dbo.Test_Join_Destination2 d1

    on (s.name1=d1.name1) AND (s.name2 != d0.name2)

    LEFT join dbo.Test_Join_Destination2 d2

    on (s.name2=d2.name2) AND (s.name1 != d1.name1)

    The above can be written in more elegant way, but I'm about to go home 🙂

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Dung Dinh (5/28/2012)


    I'm sure that with the sample data and your query. The result is always 2 records

    idname1name2desc

    1test1 test2 testing

    1test1 test2 testing2

    The 2 records are distinguished. I think that no way to get one record as your expectation.

    If you use 'testing' for both the records still you would get 2 records.

    basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.

    Regards,
    Pravasis

  • pravasis (5/28/2012)


    Dung Dinh (5/28/2012)


    I'm sure that with the sample data and your query. The result is always 2 records

    idname1name2desc

    1test1 test2 testing

    1test1 test2 testing2

    The 2 records are distinguished. I think that no way to get one record as your expectation.

    If you use 'testing' for both the records still you would get 2 records.

    basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.

    I don't think anyone

    would expect one record in the result set

    based on query you have build. That is a "cool" idea, but it has nothing to do with how JOINs do work in T-SQL. Have you tried to read BoL on this subject?

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Maybe, I use a simple query to return the expected results. Something like "priority" JOIN

    CREATE TABLE #temp

    (

    id int,

    name1 varchar(50)

    name2 varchar(50)

    desc varchar(1000)

    )

    INSERT INTO #temp

    SELECT s.id,s.name1,s.name2,d.[desc]

    FROM Test_Join_Source s

    INNER JOIN Test_Join_Destination d ON s.name1 = s.name1

    INSERT INTO #temp

    SELECT s.id,s.name1,s.name2,d.[desc]

    FROM Test_Join_Source s

    INNER JOIN Test_Join_Destination d ON s.name2 = s.name2

    LEFT JOIN #temp tmp ON s.id = tmp.id

    WHERE tmp.id IS NULL

    SELECT * FROM #temp

    DROP TABLE #temp

  • Eugene Elutin (5/28/2012)


    pravasis (5/28/2012)


    Dung Dinh (5/28/2012)


    I'm sure that with the sample data and your query. The result is always 2 records

    idname1name2desc

    1test1 test2 testing

    1test1 test2 testing2

    The 2 records are distinguished. I think that no way to get one record as your expectation.

    If you use 'testing' for both the records still you would get 2 records.

    basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.

    I don't think anyone

    would expect one record in the result set

    based on query you have build. That is a "cool" idea, but it has nothing to do with how JOINs do work in T-SQL. Have you tried to read BoL on this subject?

    I thought there is a concept of "short-circuiting" in sql server. Try the following.

    SELECT 1 WHERE (1 = 1) OR (1 / 0 = 0)

    even if there is a division by zero still it works, because the second part never gets evaluated. Thats how probably all programming languages work to improve the performance.

    based on this understanding I thought the second part will only be evaluated if we don't have a match for the first part.

    Regards,
    Pravasis

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply