May 28, 2012 at 8:23 am
Hi,
I have query related to joining. If I am using a OR condition to join 2tables then I am getting more than the expected number of records if the value matches with both conditions. For example.
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
(s.name1=d.name1 or s.name2=d.name2)
The above query will produce 2 records if the source record matches by both the joining condition.
Question is, How do I prevent this.
Regards,
Pravasis
May 28, 2012 at 9:04 am
Heres a fairly literal approach. However, I'm sure theres more interesting ways to go about it!
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name1=d.name1
union
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name2=d.name2
where s.id not in
(
select s.id
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name1=d.name1
)
May 28, 2012 at 9:10 am
Could you tell me the purpose of this query? I'm not clear about this why you use OR condition to join 2 tables.
How to prevent it? => It means that you want to return one record in Test_Join_Source table although it matches with 2 records in Test_Join_Destination?
May 28, 2012 at 9:16 am
Dung Dinh (5/28/2012)
Could you tell me the purpose of this query? I'm not clear about this why you use OR condition to join 2 tables.How to prevent it? => It means that you want to return one record in Test_Join_Source table although it matches with 2 records in Test_Join_Destination?
No, The problem is even if there is one record in source and one record in destination and you use a OR join condition then it will give you 2 records rather than 1!!
Regards,
Pravasis
May 28, 2012 at 9:24 am
patrickmcginnis59 (5/28/2012)
Heres a fairly literal approach. However, I'm sure theres more interesting ways to go about it!
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name1=d.name1
union
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name2=d.name2
where s.id not in
(
select s.id
from dbo.Test_Join_Source s
inner join dbo.Test_Join_Destination d
on
s.name1=d.name1
)
Thanks Patrick.
But if I have more than 2 joins then it would be a cumbersome process. I was thinking of more sophisticated way of handing this problem.
Regards,
Pravasis
May 28, 2012 at 9:32 am
I try to test with your case
Create table
CREATE TABLE [dbo].[Test_Join_Source](
[id] [int] NULL,
[name1] [varchar](50) NULL,
[name2] [varchar](50) NULL
)
CREATE TABLE [dbo].[Test_Join_Destination](
[id] [int] NULL,
[name1] [varchar](50) NULL,
[name2] [varchar](50) NULL,
[desc] [nchar](10) NULL
)
Insert data
INSERT INTO Test_Join_Source(id,name1,name2)
VALUES(1,'test1','test2')
INSERT INTO Test_Join_Destination(id,name1,name2,[desc])
VALUES(1,'test1','test2','testing')
Then I run your query. Here is result
idname1name2desc
1test1 test2 testing
No 2 records as you see.
Let me know if I miss anything
May 28, 2012 at 9:39 am
I got the same results as Dung Dinh when I set up a similar test scenario, I only got one row from the resulting query. You might want to look at your results a bit closer or try to create a test script including DDL and test data that shows the behavior you are trying to avoid.
May 28, 2012 at 9:48 am
OP has two records as it matches to two different records in table she JOINs to, one record matches on ONE column and another record matches on SECOND column.
OP need to supply ddl, sample data and clear expected results and it will attract the relevant help.
The link at the bottom of my signature leads to the "must read" article for this sort of questions...
May 28, 2012 at 9:50 am
Try the following.
drop table [dbo].[Test_Join_Source2]
drop table [Test_Join_Destination2]
CREATE TABLE [dbo].[Test_Join_Source2](
[id] [int] NULL,
[name1] [varchar](50) NULL,
[name2] [varchar](50) NULL
)
CREATE TABLE [dbo].[Test_Join_Destination2](
[id] [int] NULL,
[name1] [varchar](50) NULL,
[name2] [varchar](50) NULL,
[desc] [nchar](10) NULL
)
INSERT INTO Test_Join_Source2(id,name1,name2)
VALUES(1,'test1','test2')
INSERT INTO Test_Join_Destination2(id,name1,name2,[desc])
VALUES(1,'test1','test4','testing')
INSERT INTO Test_Join_Destination2(id,name1,name2,[desc])
VALUES(2,'test1','test2','testing2')
select s.id, s.name1,s.name2,d.[desc]
from dbo.Test_Join_Source2 s
inner join dbo.Test_Join_Destination2 d
on (s.name1=d.name1 or s.name2=d.name2)
Regards,
Pravasis
May 28, 2012 at 10:17 am
I'm sure that with the sample data and your query. The result is always 2 records
idname1name2desc
1test1 test2 testing
1test1 test2 testing2
The 2 records are distinguished. I think that no way to get one record as your expectation.
May 28, 2012 at 10:21 am
That is exactly what I've described (even a bit more).
It returns two rows as single record in Test_Join_Source2 matches two different records in Test_Join_Destination2:
it matches the first one on name1 and matches the second one on name1 and on name2.
Therefore two records returned.
I guess OP wants something like "priority"-join:
select s.id, s.name1,s.name2,COALESCE(d0.[desc], d1.[desc],d2.[DESC])
from dbo.Test_Join_Source2 s
LEFT join dbo.Test_Join_Destination2 d0
on (s.name1=d0.name1) AND (s.name2 = d0.name2)
LEFT join dbo.Test_Join_Destination2 d1
on (s.name1=d1.name1) AND (s.name2 != d0.name2)
LEFT join dbo.Test_Join_Destination2 d2
on (s.name2=d2.name2) AND (s.name1 != d1.name1)
The above can be written in more elegant way, but I'm about to go home 🙂
May 28, 2012 at 10:24 am
Dung Dinh (5/28/2012)
I'm sure that with the sample data and your query. The result is always 2 recordsidname1name2desc
1test1 test2 testing
1test1 test2 testing2
The 2 records are distinguished. I think that no way to get one record as your expectation.
If you use 'testing' for both the records still you would get 2 records.
basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.
Regards,
Pravasis
May 28, 2012 at 2:36 pm
pravasis (5/28/2012)
Dung Dinh (5/28/2012)
I'm sure that with the sample data and your query. The result is always 2 recordsidname1name2desc
1test1 test2 testing
1test1 test2 testing2
The 2 records are distinguished. I think that no way to get one record as your expectation.
If you use 'testing' for both the records still you would get 2 records.
basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.
I don't think anyone
would expect one record in the result set
based on query you have build. That is a "cool" idea, but it has nothing to do with how JOINs do work in T-SQL. Have you tried to read BoL on this subject?
May 28, 2012 at 10:36 pm
Maybe, I use a simple query to return the expected results. Something like "priority" JOIN
CREATE TABLE #temp
(
id int,
name1 varchar(50)
name2 varchar(50)
desc varchar(1000)
)
INSERT INTO #temp
SELECT s.id,s.name1,s.name2,d.[desc]
FROM Test_Join_Source s
INNER JOIN Test_Join_Destination d ON s.name1 = s.name1
INSERT INTO #temp
SELECT s.id,s.name1,s.name2,d.[desc]
FROM Test_Join_Source s
INNER JOIN Test_Join_Destination d ON s.name2 = s.name2
LEFT JOIN #temp tmp ON s.id = tmp.id
WHERE tmp.id IS NULL
SELECT * FROM #temp
DROP TABLE #temp
May 29, 2012 at 1:48 am
Eugene Elutin (5/28/2012)
pravasis (5/28/2012)
Dung Dinh (5/28/2012)
I'm sure that with the sample data and your query. The result is always 2 recordsidname1name2desc
1test1 test2 testing
1test1 test2 testing2
The 2 records are distinguished. I think that no way to get one record as your expectation.
If you use 'testing' for both the records still you would get 2 records.
basically if you have one record in source, no matter how many records you have in destination you would expect one record in the result set. If it finds a match with condition one then it should skip the second condition.
I don't think anyone
would expect one record in the result set
based on query you have build. That is a "cool" idea, but it has nothing to do with how JOINs do work in T-SQL. Have you tried to read BoL on this subject?
I thought there is a concept of "short-circuiting" in sql server. Try the following.
SELECT 1 WHERE (1 = 1) OR (1 / 0 = 0)
even if there is a division by zero still it works, because the second part never gets evaluated. Thats how probably all programming languages work to improve the performance.
based on this understanding I thought the second part will only be evaluated if we don't have a match for the first part.
Regards,
Pravasis
Viewing 15 posts - 1 through 15 (of 25 total)
You must be logged in to reply to this topic. Login to reply