SELECT DISTINCT TOP 1

  • Good Morning Everyone,

    I have a query that I am designing where I want to identify certain records based off of unique combinations of fields (and only one record of it).

    Here is a sample table:

    CREATE TABLE #trades (

    acct_num char(8) NOT NULL,

    record_type char (1) NOT NULL,

    trade_nbr char(4) NOT NULL,

    capacity char (1),

    secy_type char (2),

    market char (1),

    )

    INSERT #trades (acct_num, record_type, trade_nbr, capacity, secy_type, market)

    SELECT '12345678', 'B', '1234', '1', '3A', 'A' UNION ALL

    SELECT '12345677', 'B', '5678', '2', '9B', 'B'UNION ALL

    SELECT '92345670', 'B', '4987', '1', '01', 'C'UNION ALL

    SELECT '12345675', 'S', '1394', 'A', '9B', 'A' UNION ALL

    SELECT '92345670', 'S', '4765', '1', '01', 'B'UNION ALL

    SELECT '92345671', 'S', '8744', 'C', '3A', 'C'UNION ALL

    SELECT '92345670', 'B', '4652', '1', '01', 'A'UNION ALL

    SELECT '92345673', 'S', '4798', '2', '9B', 'B'UNION ALL

    SELECT '02345672', 'S', '5542', 'C', '01', 'C'UNION ALL

    SELECT '02345673', 'B', '6549', '1', '3A', 'A'

    I would like the query to select rows that have a unique combination of capacity, secy_type, and record_type only. So there would be a row for (1,3A), (2,9B), (1,01), (A,9B), etc., but still display the account, trade_nbr, and market information.

    This is the query I have come up with, but it still puts out duplicates (I made this query by trolling this board and picking up pieces here and there (so I'm sure there are things that don't belong or weird coding):

    SELECT A.acct_num,

    A.record_type,

    A.trade_nbr,

    A.market,

    A.secy_type,

    A.capacity

    FROM #trades A

    WHERE A.acct_num IN (

    SELECT TOP 1 B.acct_num

    FROM #trades B

    WHERE (B.secy_type = A.secy_type)

    AND (B.capacity = A.capacity)

    AND(B.record_type = A.record_type)

    )

    ORDER BY secy_type, capacity, record_type

    It appears the problem is that when two rows share an account number, it will list both rows (even though capacity and secy_type are duplicated), ex. acct_num '92345670'

    How can I correct this?

  • It appears the problem is that when two rows share an account number, it will list both rows (even though capacity and secy_type are duplicated), ex. acct_num '92345670'

    Your result set shows that even though capacity and secy_type are duplicated the rest of the data isn't, no line is a completely duplicated line. If the capacity and secy_type are duplicated what do you want to display in the other ie 1st record_type, 1st act_num etc..

    acct_numrecord_typetrade_nbrmarketsecy_typecapacity

    92345670B4987C011

    92345670B4652A011

    92345670S4765B011

    02345672S5542C01C

    12345678B1234A3A1

    92345671S8744C3AC

    12345677B5678B9B2

    92345673S4798B9B2

    12345675S1394A9BA

    Facts are stubborn things, but statistics are more pliable - Mark Twain
    Carolyn
    SQLServerSpecialists[/url]

  • You can use Group By clause and Aggregate functions to solve these duplicates.

    Try this

    SELECTA.acct_num, A.record_type, A.trade_nbr, A.market, A.secy_type, A.capacity

    FROM #trades A,

    (

    SELECT acct_num, record_type, MAX(trade_nbr) trade_nbr FROM #trades

    GROUP BY acct_num, capacity, secy_type, record_type

    ) B

    WHERE A.acct_num = B.acct_num

    AND A.record_type = B.record_type

    AND A.trade_nbr = B.trade_nbr

    ORDER BY A.secy_type, A.capacity, A.record_type

  • Carolyn's got it on the nose. There are two dupe pairs (on capacity, secy_type and record_type) in the sample data:

    SELECT '12345678', 'B', '1234', '1', '3A', 'A' UNION ALL --

    SELECT '02345673', 'B', '6549', '1', '3A', 'A' UNION ALL --

    --

    SELECT '92345670', 'B', '4987', '1', '01', 'C' UNION ALL --

    SELECT '92345670', 'B', '4652', '1', '01', 'A' UNION ALL --

    What are the rules for deciding which one to discard (or return)?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks guys,

    I'm totally indifferent on which row the query chooses when capacity, secy_type, and record_type is the same - that's why I was hoping SELECT TOP 1 would work.

    Hope that helps.

  • It helps a lot - if the data doesn't matter to you, then don't use it. Your query will be a lot simpler. Just group by your three required columns.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Well, ultimately I do neet acct_num. The purpose of the query is that I want to get a list of each unique capacity, secy_type, and record_type combination, then pull up an account (acct_num) with that combination in it (in order to double check that these fields are being populated correctly). So on my main table, when I run a query to indentify how many combinations there are, I get about 143. Then when I add the accounts in, so that I can double check, I get 360 records (because of the problem outlined earlier).

    I'm wondering if I just need to kick out all duplicate accounts earlier in the process...

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply