Query to find matching rows

  • hello-

    I'm working with a database that contains a table with IDs to create Relationships and Reciprocal records in the database. The table with the IDs for the Relat and Recip values look like this:

    UniqueID RelatID RecipID

    21 1 2

    22 2 1

    53 3 8

    54 4 7

    55 5 9

    96 6 9

    97 7 4

    My goal is to find the "matching" rows, which in the example above would be Rows with the UniqueID 21, 22 (they "match" based on the RelatID/RecipID 1-2 and 2-1) and 54 and 97 (they matched on the RelatID/RecipID 4-7 and 7-4) Potentially, I would like to leave only 1 of the matching rows, so my final table would be: (eliminate rows with uniqueId 22 and 97)

    UniqueID RelatID RecipID

    21 1 2

    53 3 8

    54 4 7

    55 5 9

    96 6 9

    any help will be greatly appreciated!

    thank you!!

  • Is this truly in MS Access?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Yes. However, the table can be loaded in SQL server as well, so anything I could do in SQL would be great too. And if so, I can just dump the final clean table in Acces to do the rest of the work. thx u! -m

  • If you can load into SQL Server prior to final load in Access, then this article will be of great help.

    http://jasonbrimhall.info/2011/03/21/dedupe-data-cte/

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Chelo (7/25/2011)


    hello-

    I'm working with a database that contains a table with IDs to create Relationships and Reciprocal records in the database. The table with the IDs for the Relat and Recip values look like this:

    UniqueID RelatID RecipID

    21 1 2

    22 2 1

    53 3 8

    54 4 7

    55 5 9

    96 6 9

    97 7 4

    My goal is to find the "matching" rows, which in the example above would be Rows with the UniqueID 21, 22 (they "match" based on the RelatID/RecipID 1-2 and 2-1) and 54 and 97 (they matched on the RelatID/RecipID 4-7 and 7-4) Potentially, I would like to leave only 1 of the matching rows, so my final table would be: (eliminate rows with uniqueId 22 and 97)

    UniqueID RelatID RecipID

    21 1 2

    53 3 8

    54 4 7

    55 5 9

    96 6 9

    any help will be greatly appreciated!

    thank you!!

    The following code will do exactly as you want. Since you're brand new to these forums, please take the time to also look at the article in the first link in my signature line below. It WILL help you get "tested code" answers.

    As usual, most of the details for how the code works are in the comments in the code. 😉

    --=============================================================================

    -- Create some test data so we can test the solution code we post.

    -- This is NOT a part of the solution but it IS one way to post "Readily

    -- consumable" data.

    --=============================================================================

    --===== Conditionally drop the test table to make reruns easier in SSMS

    IF OBJECT_ID('tempdb..#MyHead','U') IS NOT NULL DROP TABLE #MyHead;

    GO

    --===== Create the test table on the fly with known data

    SELECT *

    INTO #MyHead

    FROM (

    SELECT 21,1,2 UNION ALL

    SELECT 22,2,1 UNION ALL

    SELECT 53,3,8 UNION ALL

    SELECT 54,4,7 UNION ALL

    SELECT 55,5,9 UNION ALL

    SELECT 96,6,9 UNION ALL

    SELECT 97,7,4

    ) d (UniqueID,RelatID,RecipID)

    ;

    --=============================================================================

    -- Now, solve the problem. Details about how it works are in the comments.

    -- This uses a bit of a "Black Arts" technique that a lot of people don't

    -- know about. You CAN delete from the original source table by deleting

    -- from the final cascaded CTE if you're careful, much as you might do in

    -- a VIEW. That also means that you only have to "dip" the table once.

    --=============================================================================

    WITH

    cteSwap AS

    ( --=== Make RelatID and RecipID in the same order for all rows so we can

    -- compare apples to apples

    SELECT UniqueID,

    Compare1 = CASE WHEN RelatID > RecipID THEN RecipID ELSE RelatID END,

    Compare2 = CASE WHEN RelatID > RecipID THEN RelatID ELSE RecipID END

    FROM #MyHead

    ),

    cteEnumerate AS

    ( --=== Then number the rows as Jason suggested in his blog.

    -- Notice how we drag UniqueID along for the ride to maintain a "connection"

    -- with the original table.

    SELECT UniqueID,

    RowNum = ROW_NUMBER() OVER(PARTITION BY Compare1, Compare2 ORDER BY UniqueID)

    FROM cteSwap

    ) --=== Delete the dupes from the cascaded CTE. Yeah... most people don't know

    -- that you can actually do this. ;-)

    DELETE cteEnumerate

    WHERE RowNum > 1

    ;

    --===== Show that we did what we wanted to do

    SELECT *

    FROM #MyHead

    ORDER BY UniqueID

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff-

    thank you so much for the answer. That is exactly what I was looking for! I tested and worked like a charmed. Now, I've got to implement it on my 'true' tables.

    I read the link you mention. Thank you for pointing that out!

    Again, many thanks! 🙂

    marcelo

  • You bet, Marcelo. Thank you for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This can be done fairly easily in Access as well. I am including an Access db with a single query to do this.


    If you don't stand for something, you'll fall for anything!,

    Don Urquhart

  • Hi- thank you for sending the accdb. sorry for the late response though. It actually worked well too, just like the sql script I also got.

    thanks again for your help! M

  • @Don

    that was magically done:cool:

    I suppose it could be further refined into:

    SELECT Reciprocals.*, Reciprocals_1.UniqueID

    FROM Reciprocals LEFT JOIN Reciprocals AS Reciprocals_1 ON Reciprocals.RelatID = Reciprocals_1.RecipID

    WHERE Reciprocals.UniqueID<Reciprocals_1.UniqueID OR Reciprocals_1.UniqueID Is Null

    so as to leave out the Field0 column in the output

  • hi! thanks for the response. I actually tried both options on a larger data set and found that if a "relationship" records ins't included if it doesn't have its "Reciprocal" record. While I was able to extract those cases, the sql script posted above does the job for all records no matter what. thx! m

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply