How to get spare no from categories 1 table where not exist per same spare no on table categories 2?
so i need to make select statement query select spare no from table categories 1 that have different categories
on table categories 2 per same spare no
as example spare no 1350 have category 5902 on table categories1 but on table categories2
i have same same spare no 1350 but have different categories as 7090 and 4020
then i select or display this spare no from table categories 1
as example spare no 1200 have category 5050 on table categories1 but on table categories2
i have same same spare no 1200 but have same categories as 5050 on table categories 2
so i don't need it or don't need to display it because it exist same sapre no and same category on table categories 2
so How to make select query give me expected result below ?
create table #categories1
(
catId int identity(1,1),
SpareNo int,
CategoryId int,
)
insert into #categories1(SpareNo,CategoryId)
values
(1200,5050),
(1350,5902),
(1700,8070),
(1990,2050),
(7000,2030)
create table #categories2
(
catId int identity(1,1),
SpareNo int,
CategoryId int,
)
insert into #categories(SpareNo,CategoryId)
values
(1200,5050),
(1200,5090),
(1200,5070),
(1350,7090),
(1350,4020),
(1700,8612),
(1990,7575),
(1990,2050),
(7000,4200),
(7000,4500)
expected result :
catId SpareNo CategoryId
2 1350 5902
3 1700 8070
5 7000 2030
March 28, 2021 at 8:10 pm
select c1.*
from #categories1 c1
where not exists(select 1
from #categories2 c2
where c1.SpareNo=c2.SpareNo
and c1.CategoryId=c2.CategoryId);
catIdSpareNoCategoryId
213505902
317008070
570002030
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
There are several methods for doing this. Steve Collins has correctly identified what most people identify as the "Best Method" and it frequently is.
There are 3 methods that I'll try based on "conditions" because, as with all else in SQL Server, "It Depends". The first two methods below are the most common methods with the first method being that commonly referred to as "Best Method".
--===== 1. WHERE NOT EXISTS method
SELECT c1.*
FROM #categories1 c1
WHERE NOT EXISTS (SELECT *
FROM #categories2 c2
WHERE c1.SpareNo = c2.SpareNo
AND c1.CategoryID = c2.CategoryId)
;
GO
--===== 2. LEFT OUTER JOIN method
SELECT c1.*
FROM #categories1 c1
LEFT JOIN #categories2 c2
ON c1.SpareNo = c2.SpareNo
AND c1.CategoryID = c2.CategoryId
WHERE c2.SpareNo IS NULL
;
GO
--===== 3. FULL OUTER JOIN method
SELECT c1.*
FROM #categories1 c1
FULL JOIN #categories2 c2
ON c1.SpareNo = c2.SpareNo
AND c1.CategoryID = c2.CategoryId
WHERE c2.SpareNo IS NULL
;
GO
For the small examples of data that you've provided, all of the methods produce an awful lot of reads in the absence of indexes.
With no indexes on the tables, here's the performance (duration is in uSecs). Method 1 is clearly the "Best Method" in this case.
It's doubtful that your real tables have no indexes on them. As some will be quick to point out, for these queries, a Clustered PK on the SpareNo and CategoryID columns would likely be the best and it would also guarantee that you don't have dupes in those two columns. Here are the run results after we add that Clustered PK to both tables.
Again, the Duration is in uSecs. All 3 queries resulted in about the same duration and they also enjoy a reduction in Reads... especially the 3rd query that uses the FULL outer join. It actually uses an order of magnitude fewer reads because it now uses a MERGE join instead of a LOOP join like the other two. That could certainly change in the presence of more rows, so "It Depends" and you'd need to test all 3 against the amount of data you may have or expect.
You also have an IDENTITY column for CatID. If, like so many other folks do (and, as Scott Pletcher will remind us, should not be an "automatically" assumed general practice), they make CatID is the PK, we can add a non-clustered unique index to the SpareNo and CategoryID columns and we get the following for performance.
All 3 are still a bit faster than with no indexes but the uncommonly used 3rd method with Full Outer Join still consumes an order of magnitude fewer READs than the other two.
Again, "It Depends" a whole lot on the actual number of rows you're dealing with and it also may depend on whether the composite indexes on the SpareNo and CategoryID columns are unique or not. Give all 3 methods with all 3 index configurations a shot and see which one wins for your insitu row counts.
And thank you much for the "Readily Consumable" example data you included in your original post. It makes life a whole lot easier for us.
Speaking of that and shifting gears a bit, I have a celebrity "follower" that thinks that everyone that posts readily consumable data with table definitions that don't include a PK or that do include IDENTITY columns as the PK needs to suffer through one of his interminable not-so-passive-aggressive ad hominem attacks. Don't take anything personally if he does. He just doesn't know any better even after repeated pleas to not do so. It seems that he uses my posts a whole lot to target such people and this site won't allow me to disable his ability to follow. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply