December 9, 2013 at 1:12 pm
I have 2 tables. Each one has the following columns: SSN, PKT and FICE_NBR.
I need to be able to retrieve the rows from table A where the combo of SSN and PKT and FICE_NBR in table A do not equal the combo of SSN and PKT and FICE_NBR in table B. Each of the tables has approx. 4 million rows.
I need to be able to include all of the columns from table A in the result set. The columns from table B are not required in the result set.
Any thoughts on the best way to write the SQL for this query?
Any help would be appreciated.
December 9, 2013 at 1:14 pm
I forgot to mention that I need to exclude duplicate rows from the result set.
Thanks.
December 9, 2013 at 3:17 pm
gary.morey (12/9/2013)
I have 2 tables. Each one has the following columns: SSN, PKT and FICE_NBR.I need to be able to retrieve the rows from table A where the combo of SSN and PKT and FICE_NBR in table A do not equal the combo of SSN and PKT and FICE_NBR in table B. Each of the tables has approx. 4 million rows.
I need to be able to include all of the columns from table A in the result set. The columns from table B are not required in the result set.
Any thoughts on the best way to write the SQL for this query?
Any help would be appreciated.
you can go like this:
SELECT SSN , PKT , FICE_NBR
FROM A
WHERE SSN + PKT + FICE_NBR
NOT IN (
SELECT SSN + PKT + FICE_NBR
FROM B
)
and for the duplicates a DISTINCT is enough i think
like:
SELECT DISTINCT SSN , PKT , FICE_NBR
FROM BLAH ...
December 9, 2013 at 4:10 pm
You could use EXCEPT
SELECT SSN , PKT , FICE_NBR
FROM A
EXCEPT
SELECT SSN , PKT , FICE_NBR
FROM B
December 9, 2013 at 5:53 pm
Luis Cazares (12/9/2013)
You could use EXCEPT
SELECT SSN , PKT , FICE_NBR
FROM A
EXCEPT
SELECT SSN , PKT , FICE_NBR
FROM B
This?
with baserecs as (
SELECT SSN , PKT , FICE_NBR
FROM TableA
EXCEPT
SELECT SSN , PKT , FICE_NBR
FROM TableB
)
select a.*
from TableA a inner join baserecs br on (a.SSN = br.SSN and a.PKT = br.PKT and a.FICE_NBR = br.FICE_NBR);
December 10, 2013 at 1:42 am
SELECT DISTINCT *
FROM Tablea a
WHERE NOT EXISTS (
SELECT 1
FROM Tableb b
WHERE b.SSN = a.SSN
AND b.PKT = a.PKT
AND b.FICE_NBR = a.FICE_NBR
)
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 10, 2013 at 1:49 am
Aram Koukia (12/9/2013)
gary.morey (12/9/2013)
I have 2 tables. Each one has the following columns: SSN, PKT and FICE_NBR.I need to be able to retrieve the rows from table A where the combo of SSN and PKT and FICE_NBR in table A do not equal the combo of SSN and PKT and FICE_NBR in table B. Each of the tables has approx. 4 million rows.
I need to be able to include all of the columns from table A in the result set. The columns from table B are not required in the result set.
Any thoughts on the best way to write the SQL for this query?
Any help would be appreciated.
you can go like this:
SELECT SSN , PKT , FICE_NBR
FROM A
WHERE SSN + PKT + FICE_NBR
NOT IN (
SELECT SSN + PKT + FICE_NBR
FROM B
)
and for the duplicates a DISTINCT is enough i think
like:
SELECT DISTINCT SSN , PKT , FICE_NBR
FROM BLAH ...
1. A row in table a with SSN = 1 and PKT = 2 would match a row in table b with SSN = 2 and PKT = 1 and matching FICE_NBR. This method is not recommended where the matching columns can generate an arithmetic addition.
It can break with character data too, less likely if an unexpected character e.g. '|' separates the columns.
2. It's not SARGable. In the unlikely event that it works, performance will always be poor.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 10, 2013 at 10:46 am
Yes, That's right
In cases like this, when I concat column data together I always use a special character as a separator to make sure that kind of conflict won't happen.
December 10, 2013 at 5:45 pm
[font="Arial Black"]Personally, I'd like to know why in the hell anyone would be storing SSNs in plain text![/font]
--Jeff Moden
Change is inevitable... Change for the better is not.
December 11, 2013 at 6:35 am
Jeff Moden (12/10/2013)
[font="Arial Black"]Personally, I'd like to know why in the hell anyone would be storing SSNs in plain text![/font]
Are you assuming combo = combination = concatenation 😀
Far away is close at hand in the images of elsewhere.
Anon.
December 11, 2013 at 6:44 am
The SSN is stored as a hashed value.
December 11, 2013 at 6:50 am
The SSN is stored as a MD5 hashed value.
The combination of SSN, PKT and FICE_NBR in table A does not have a match on the combination of SSN, PKT and FICE_NBR in table B.
December 11, 2013 at 6:54 am
gary.morey (12/11/2013)
The SSN is stored as a MD5 hashed value.The combination of SSN, PKT and FICE_NBR in table A does not have a match on the combination of SSN, PKT and FICE_NBR in table B.
You were probably expecting matches, Gary. Can you post the query you used?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 11, 2013 at 6:58 am
SELECT *
FROM Temp_BigTable
WHERE SSN + PKT + FICE_NBR
NOT IN (
SELECT SSN + PKT + FICE_NBR
FROM shi.raw33
)
December 11, 2013 at 8:19 am
gary.morey (12/11/2013)
The SSN is stored as a MD5 hashed value.The combination of SSN, PKT and FICE_NBR in table A does not have a match on the combination of SSN, PKT and FICE_NBR in table B.
Ah, thanks, Gary. I feel much better now.
Shifting gears a bit, you might want to consider something a little bit more than simple MDF hashing. A simple billion row rainbow table (which is nothing on the machines today) would quickly unravel all of the SSNs. Even adding a salt to the MD5 would be better although that would likely still fail an audit.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply