SQL help to match only 3 values

  • Hello Experts,that
    Needed some help with a query, the requirement is to identify all PID IN TABLE1 that has exactly any 3 unique gcode values (not less not more).
    Any help is greatly appreciated. Below is the test table, data and expected result.

    DECLARE @table1 TABLE (
    pid varchar(5),
    gcode varchar(5)
    )

    DECLARE @table2 TABLE(
    id int,
    gcode varchar(5)
    )

    insert @table1 (pid,gcode)
    values('p1','g1'),
    ('p1','g2'),
    ('p1','g5'),
    ('p2','g3'),
    ('p2','g4'),
    ('p2','g1'),
    ('p3','g1'),
    ('p3','g3'),
    ('p3','g6'),
    ('p3','g1'),
    ('p4','g2'),
    ('p4','g5'),
    ('p5','g3'),
    ('p5','g4'),
    ('p5','g6'),
    ('p6','g2'),
    ('p6','g2')

    insert @table2 (id,gcode)
    values('1','g1'),
    ('2','g2'),
    ('3','g3'),
    ('4','g4'),
    ('5','g5'),
    ('6','g6'),
    ('7','g7')

    Expected result

    pid,gcode

    p2,g3
    p2,g4
    p2,g1
    p5,g3
    p5,g4
    p5,g6

    Thanks

  • Tried HAVING yet?

  • SELECT pid
    FROM @table1
    GROUP BY pid
    HAVING COUNT(DISTINCT gcode) = 3

    I don't understand why you don't have p1 and p3 in your result set.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher - Thursday, March 7, 2019 2:19 PM

    SELECT pid
    FROM @table1
    GROUP BY pid
    HAVING COUNT(DISTINCT gcode) = 3

    I don't understand why you don't have p1 and p3 in your result set.

    While p1 is a valid point, the case of p3 is because p3 has a fourth entry, and thus not JUST 3 unique values.  Let's try this:

    SELECT pid
    FROM @table1
    GROUP BY pid
    HAVING COUNT(DISTINCT gcode) = 3
        AND COUNT(gcode) = 3

    This will still pull p1, of course, so OP, please explain why p1 is not in your result set...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thanks Scott, that was my bad, yes p1 and p3 should be there.  This works.  In real world I am going against millions of records with lot of columns in each table, hopefully i won't have any performance issues.

    Thanks

  • It looked you also wanted to list the specific values found.

    That's easy for two values, much trickier for 3 (3 numeric values is easy, but not 3 char values). 

    But if the gcode values really are very short, we can probably derive the middle value by using a formula.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • OH yes, the gcode is actually 14 character value (for testing I minimized it).  Is there a way.

    Also, I have a requirement for 2 as well on the same set.

  • For 3 varchar values, it's tricky, and I'd have to do some testing/experimenting first. 

    For 3 numeric values, it's easy to list all 3.

    For only 2 values of either type, it's super easy:

    SELECT pid, MIN(gcode) AS gcode_min, MAX(gcode) AS gcode_max
    FROM @table1
    GROUP BY pid
    HAVING COUNT(DISTINCT gcode) = 2 /*AND COUNT(*) = 2 /*if applicable*/*/

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Ha! Sounds good!

    Thank you very much!

  • This will work with most (all?) data types.  I used DENSE_RANK() to filter out duplicates, but ROW_NUMBER() would work in a similar fashion.

    WITH CTE AS
    (
        SELECT pid, gcode, DENSE_RANK() OVER(PARTITION BY pid ORDER BY gcode) + DENSE_RANK() OVER(PARTITION BY pid ORDER BY gcode DESC) AS cnt
        FROM @table1
    )
    SELECT pid, gcode
    FROM CTE
    WHERE cnt = 4
    ORDER BY pid, gcode

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply