December 8, 2016 at 1:11 pm
Ok, say I have two tables table1 and table2 and I want to find records from table1 that don't exist in table2 or vise-versa. I could use EXCEPT like this:
-- sample data
DECLARE @table1 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
DECLARE @table2 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
INSERT @table1(content) VALUES (12), (15), (20);
INSERT @table2(content) SELECT content FROM @table1;
--DELETE TOP(1) FROM @table1;
--DELETE TOP(1) FROM @table2;
--solution
(
SELECT id, content FROM @table1
EXCEPT
SELECT id, content FROM @table2
)
UNION ALL
(
SELECT id, content FROM @table2
EXCEPT
SELECT id, content FROM @table1
);
I know there is a more efficient way to do this where I don't need to make two trips to each table. I saw someone here (SQLServerCentral) do this differently (I'm 99% sure it was Scott Peltcher). Anyone have a better way to do this?
-- Itzik Ben-Gan 2001
December 8, 2016 at 1:33 pm
Would a FULL OUTER JOIN help?
I changed your sample data to get some results.
-- sample data
DECLARE @table1 TABLE (id int identity, content int NOT NULL, primary key(id, content));
DECLARE @table2 TABLE (id int identity, content int NOT NULL, primary key(id, content));
INSERT @table1(content) VALUES (12), (15), (20);
INSERT @table2(content) VALUES (13), (15), (20);
--DELETE TOP(1) FROM @table1;
--DELETE TOP(1) FROM @table2;
--solution
(
SELECT id, content FROM @table1
EXCEPT
SELECT id, content FROM @table2
)
UNION ALL
(
SELECT id, content FROM @table2
EXCEPT
SELECT id, content FROM @table1
);
SELECT ISNULL( t1.id, t2.id) AS id,
ISNULL( t1.content, t2.content) AS content
FROM @table1 t1
FULL OUTER
JOIN @table2 t2 ON t1.id = t2.id AND t1.content = t2.content
WHERE t1.id IS NULL
OR t2.id IS NULL;
Or maybe using HAVING:
SELECT *
FROM (
SELECT id, content FROM @table1
UNION ALL
SELECT id, content FROM @table2
) x
GROUP BY id, content
HAVING COUNT(*) = 1;
December 8, 2016 at 1:56 pm
That FULL JOIN solution was what I'd seen before - The HAVING solution works just as well. Excellent work, thanks Luis!
-- Itzik Ben-Gan 2001
December 8, 2016 at 2:05 pm
Alan.B (12/8/2016)
Ok, say I have two tables table1 and table2 and I want to find records from table1 that don't exist in table2 or vise-versa. I could use EXCEPT like this:
-- sample data
DECLARE @table1 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
DECLARE @table2 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
INSERT @table1(content) VALUES (12), (15), (20);
INSERT @table2(content) SELECT content FROM @table1;
--DELETE TOP(1) FROM @table1;
--DELETE TOP(1) FROM @table2;
--solution
(
SELECT id, content FROM @table1
EXCEPT
SELECT id, content FROM @table2
)
UNION ALL
(
SELECT id, content FROM @table2
EXCEPT
SELECT id, content FROM @table1
);
I know there is a more efficient way to do this where I don't need to make two trips to each table. I saw someone here (SQLServerCentral) do this differently (I'm 99% sure it was Scott Peltcher). Anyone have a better way to do this?
Just wondering if you can determine that if the PK value in each table is identical, for a given pair of records (1 from each table), that you can then guarantee that both records are identical? The reason I ask is that if not all fields are identical for a given primary key value, then this query might not be entirely what you actually want, as it will supply values for records that don't have a primary key field match. If you could guarantee that a record in one table is always identical to one in the other table with the same PK value, then this becomes a lot easier and probably a lot better performance-wise. Think something like this:
WITH SOURCE_DATA AS (
SELECT DISTINCT COALESCE(T1.PK_FLD, T2.PK_FLD) AS PK_FIELD
FROM TABLE_1 AS T1
FULL OUTER JOIN TABLE_2 AS T2
ON T1.PK_FLD = T2.PK_FLD
WHERE T1.PK_FLD IS NULL
OR T2.PK_FLD IS NULL
)
SELECT D.PK_FIELD,
COALESCE(T3.ADDL_FLD1, T4.ADDL_FLD1) AS ADDL_FLD1,
...
FROM SOURCE_DATA AS D
LEFT OUTER JOIN TABLE_1 AS T3
ON D.PK_FIELD = T3.PK_FLD
LEFT OUTER JOIN TABLE_2 AS T4
ON D.PK_FIELD = T4.PK_FLD
ORDER BY D.PK_FIELD;
If you do need to see records that are different, AND you need to know the difference between EXISTS but is different, and NOT EXISTS, then you may only need to add some logic to what you already have. Just figured another idea to look at might not hurt...
EDIT: realized my thinking was way ahead of my typing and corrected a rather obvious dumb statement.
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
December 8, 2016 at 2:36 pm
sgmunson (12/8/2016)
Alan.B (12/8/2016)
Ok, say I have two tables table1 and table2 and I want to find records from table1 that don't exist in table2 or vise-versa. I could use EXCEPT like this:
-- sample data
DECLARE @table1 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
DECLARE @table2 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
INSERT @table1(content) VALUES (12), (15), (20);
INSERT @table2(content) SELECT content FROM @table1;
--DELETE TOP(1) FROM @table1;
--DELETE TOP(1) FROM @table2;
--solution
(
SELECT id, content FROM @table1
EXCEPT
SELECT id, content FROM @table2
)
UNION ALL
(
SELECT id, content FROM @table2
EXCEPT
SELECT id, content FROM @table1
);
I know there is a more efficient way to do this where I don't need to make two trips to each table. I saw someone here (SQLServerCentral) do this differently (I'm 99% sure it was Scott Peltcher). Anyone have a better way to do this?
Just wondering if you can determine that if the PK value in each table is identical, for a given pair of records (1 from each table), that you can then guarantee that both records are identical? The reason I ask is that if not all fields are identical for a given primary key value, then this query might not be entirely what you actually want, as it will supply values for records that don't have a primary key field match. If you could guarantee that a record in one table is always identical to one in the other table with the same PK value, then this becomes a lot easier and probably a lot better performance-wise. Think something like this:
WITH SOURCE_DATA AS (
SELECT DISTINCT COALESCE(T1.PK_FLD, T2.PK_FLD) AS PK_FIELD
FROM TABLE_1 AS T1
FULL OUTER JOIN TABLE_2 AS T2
ON T1.PK_FLD = T2.PK_FLD
WHERE T1.PK_FLD IS NULL
OR T2.PK_FLD IS NULL
)
SELECT D.PK_FIELD,
COALESCE(T3.ADDL_FLD1, T4.ADDL_FLD1) AS ADDL_FLD1,
...
FROM SOURCE_DATA AS D
LEFT OUTER JOIN TABLE_1 AS T3
ON D.PK_FIELD = T3.PK_FLD
LEFT OUTER JOIN TABLE_2 AS T4
ON D.PK_FIELD = T4.PK_FLD
ORDER BY D.PK_FIELD;
If you do need to see records that are different, AND you need to know the difference between EXISTS but is different, and NOT EXISTS, then you may only need to add some logic to what you already have. Just figured another idea to look at might not hurt...
EDIT: realized my thinking was way ahead of my typing and corrected a rather obvious dumb statement.
Thank you sir - interesting.
I only included the primary key to create a clustered index on my table variable but you gave me something to think about. Thanks again!
-- Itzik Ben-Gan 2001
December 8, 2016 at 3:07 pm
Alan.B (12/8/2016)
sgmunson (12/8/2016)
Alan.B (12/8/2016)
Ok, say I have two tables table1 and table2 and I want to find records from table1 that don't exist in table2 or vise-versa. I could use EXCEPT like this:
-- sample data
DECLARE @table1 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
DECLARE @table2 TABLE (SomeId int identity, content int NOT NULL, primary key(id, content));
INSERT @table1(content) VALUES (12), (15), (20);
INSERT @table2(content) SELECT content FROM @table1;
--DELETE TOP(1) FROM @table1;
--DELETE TOP(1) FROM @table2;
--solution
(
SELECT id, content FROM @table1
EXCEPT
SELECT id, content FROM @table2
)
UNION ALL
(
SELECT id, content FROM @table2
EXCEPT
SELECT id, content FROM @table1
);
I know there is a more efficient way to do this where I don't need to make two trips to each table. I saw someone here (SQLServerCentral) do this differently (I'm 99% sure it was Scott Peltcher). Anyone have a better way to do this?
Just wondering if you can determine that if the PK value in each table is identical, for a given pair of records (1 from each table), that you can then guarantee that both records are identical? The reason I ask is that if not all fields are identical for a given primary key value, then this query might not be entirely what you actually want, as it will supply values for records that don't have a primary key field match. If you could guarantee that a record in one table is always identical to one in the other table with the same PK value, then this becomes a lot easier and probably a lot better performance-wise. Think something like this:
WITH SOURCE_DATA AS (
SELECT DISTINCT COALESCE(T1.PK_FLD, T2.PK_FLD) AS PK_FIELD
FROM TABLE_1 AS T1
FULL OUTER JOIN TABLE_2 AS T2
ON T1.PK_FLD = T2.PK_FLD
WHERE T1.PK_FLD IS NULL
OR T2.PK_FLD IS NULL
)
SELECT D.PK_FIELD,
COALESCE(T3.ADDL_FLD1, T4.ADDL_FLD1) AS ADDL_FLD1,
...
FROM SOURCE_DATA AS D
LEFT OUTER JOIN TABLE_1 AS T3
ON D.PK_FIELD = T3.PK_FLD
LEFT OUTER JOIN TABLE_2 AS T4
ON D.PK_FIELD = T4.PK_FLD
ORDER BY D.PK_FIELD;
If you do need to see records that are different, AND you need to know the difference between EXISTS but is different, and NOT EXISTS, then you may only need to add some logic to what you already have. Just figured another idea to look at might not hurt...
EDIT: realized my thinking was way ahead of my typing and corrected a rather obvious dumb statement.
Thank you sir - interesting.
I only included the primary key to create a clustered index on my table variable but you gave me something to think about. Thanks again!
Glad I could help. Enjoy!
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply