Dedupe ordered by value

  • Hello All

    I have a table which contains a postcode, source and a distance field

    Postcode,Source,Distance

    HX1 1AA,A,200

    HX1 1AA,B,220

    HX1 1AA,C,100

    HX1 1AA,D,120

    HX1 1AB,D,90

    HX1 1AC,D,300

    HX1 1AD,E,100

    HX1 1AD,E,200

    HX1 1AD,E,152

    I need to create a list containing a unique set of postcodes, with the applicable source and distance.

    The end Results of the above would need to be:

    HX1 1AA,C,100

    HX1 1AB,D,90

    HX1 1AC,D,300

    HX1 1AD,E,100

    I have a list of approx 500,000 records I neded to be perform this query on. Can anybody suggest the best way to do this.

    Any help would be gratefully appreciated...

    Thanks

    (Originally posted in incorrect forum)

  • Can you describe the rules for choosing one row over the others?

    HX1 1AA,A,200

    HX1 1AA,B,220

    HX1 1AA,C,100

    HX1 1AA,D,120

    In this case, you've chosen the row which has the smallest value for distance. Is that the rule?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The rule is the one with the lowest distance. Sorry for not making this clear

    Thanks

  • No problem.

    DROP TABLE #Sample

    CREATE TABLE #Sample (Postcode VARCHAR(10), [Source] CHAR(1), Distance INT)

    INSERT INTO #Sample (Postcode, [Source], Distance)

    SELECT 'HX1 1AA', 'A', 200 UNION ALL

    SELECT 'HX1 1AA', 'B', 220 UNION ALL

    SELECT 'HX1 1AA', 'C', 100 UNION ALL --

    SELECT 'HX1 1AA', 'D', 120 UNION ALL

    SELECT 'HX1 1AB', 'D', 90 UNION ALL --

    SELECT 'HX1 1AC', 'D', 300 UNION ALL --

    SELECT 'HX1 1AD', 'E', 100 UNION ALL --

    SELECT 'HX1 1AD', 'E', 200 UNION ALL

    SELECT 'HX1 1AD', 'E', 152

    SELECT

    Postcode,

    [Source],

    Distance

    FROM (

    SELECT

    Postcode,

    [Source],

    Distance,

    MyRow = ROW_NUMBER() OVER(PARTITION BY Postcode ORDER BY Distance)

    FROM #Sample

    ) d

    WHERE MyRow = 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Fantastic!! Works a treat 🙂

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply