Update Column but with Distinct Values

  • Hello All,

    I currently have a table with the DDL below;


    CREATE TABLE #Test (Id INT, CarID INT, CarStatus CHAR(1));
    INSERT INTO #Test (Id, CarId, CarStatus)
    SELECT 1, 1000, 1 UNION ALL
    SELECT 2, 2000, 1 UNION ALL
    SELECT 3, 3000, 1 UNION ALL
    SELECT 4, 3000, 1 UNION ALL
    SELECT 5, 2000, 1 UNION ALL
    SELECT 6, 1000, 1 UNION ALL
    SELECT 7, 1000, 1 UNION ALL
    SELECT 8, 2000, 1 UNION ALL
    SELECT 9, 3000, 1;

    I would like to update the CarStatus to 2 but for only the distinct values of CarId. Notice the CarId repeats between 1000, 2000, 3000.

    I only need to update the CarStatus for the first 3 unique CarId values. Result should look something like this

     

    CREATE TABLE #TestOutput (Id INT, CarID INT, CarStatus CHAR(1));
    INSERT INTO #TestOutput (Id, CarId, CarStatus)
    SELECT 1, 1000, 2 UNION ALL
    SELECT 2, 2000, 2 UNION ALL
    SELECT 3, 3000, 2 UNION ALL
    SELECT 4, 3000, 1 UNION ALL
    SELECT 5, 2000, 1 UNION ALL
    SELECT 6, 1000, 1 UNION ALL
    SELECT 7, 1000, 1 UNION ALL
    SELECT 8, 2000, 1 UNION ALL
    SELECT 9, 3000, 1;
     

    Thank you

  • This will work with the sample data that you provided

    WITH cteData AS (
    SELECT t.Id
    , t.CarID
    , t.CarStatus
    , rn = ROW_NUMBER() OVER (PARTITION BY t.CarID ORDER BY t.Id)
    FROM #Test AS t
    )
    UPDATE cteData
    SET cteData.CarStatus = '2'
    WHERE rn = 1;
  • This should work, I think:

    UPDATE t
    SET t.CarStatus = '2'
    FROM #Test t
    INNER JOIN (
    SELECT CarID, MIN(Id) AS Id
    FROM #Test
    GROUP BY CarID
    HAVING COUNT(*) > 1
    ) t2 ON t2.Id=t.Id

    • This reply was modified 3 years, 2 months ago by  kaj. Reason: Changed SET CarStatus = 2 to SET CarStatus = '2'
  • Thank you. Both solutions worked very well.

  • kaj wrote:

    This should work, I think:

    UPDATE t
    SET t.CarStatus = '2'
    FROM #Test t
    INNER JOIN (
    SELECT CarID, MIN(Id) AS Id
    FROM #Test
    GROUP BY CarID
    HAVING COUNT(*) > 1
    ) t2 ON t2.Id=t.Id

    Careful now... that's only going to mark cars that have more than one occurrence.  An "only" occurrence is still a "first" occurrence.   Your code won't find an "only"first occurrence.

    Of course, that depends on what the OP actually needs but that's what I got out of his definition of the problem even though not explicitly stated so.

    Try your code with the following data and see what I mean.

    DROP TABLE IF EXISTS #Test;
    CREATE TABLE #Test (Id INT, CarID INT, CarStatus CHAR(1));
    INSERT INTO #Test (Id, CarId, CarStatus)
    SELECT 0, 4000, 1 UNION ALL --Should be a 2 by definition because it's the first occurance of 4000.
    SELECT 1, 1000, 1 UNION ALL
    SELECT 2, 2000, 1 UNION ALL
    SELECT 3, 3000, 1 UNION ALL
    SELECT 4, 3000, 1 UNION ALL
    SELECT 5, 2000, 1 UNION ALL
    SELECT 6, 1000, 1 UNION ALL
    SELECT 7, 1000, 1 UNION ALL
    SELECT 8, 2000, 1 UNION ALL
    SELECT 9, 3000, 1 UNION ALL
    SELECT 10, 5000, 1 --Should be a 2 by definition because it's the first occurance of 5000.
    ;

    Your code also makes two scans of the table instead of just one.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    kaj wrote:

    This should work, I think:

    UPDATE t
    SET t.CarStatus = '2'
    FROM #Test t
    INNER JOIN (
    SELECT CarID, MIN(Id) AS Id
    FROM #Test
    GROUP BY CarID
    HAVING COUNT(*) > 1
    ) t2 ON t2.Id=t.Id

    Careful now... that's only going to mark cars that have more than one occurrence.  An "only" occurrence is still a "first" occurrence.   Your code won't find an "only"first occurrence.

    Of course, that depends on what the OP actually needs but that's what I got out of his definition of the problem even though not explicitly stated so.

    Try your code with the following data and see what I mean.

    DROP TABLE IF EXISTS #Test;
    CREATE TABLE #Test (Id INT, CarID INT, CarStatus CHAR(1));
    INSERT INTO #Test (Id, CarId, CarStatus)
    SELECT 0, 4000, 1 UNION ALL --Should be a 2 by definition because it's the first occurance of 4000.
    SELECT 1, 1000, 1 UNION ALL
    SELECT 2, 2000, 1 UNION ALL
    SELECT 3, 3000, 1 UNION ALL
    SELECT 4, 3000, 1 UNION ALL
    SELECT 5, 2000, 1 UNION ALL
    SELECT 6, 1000, 1 UNION ALL
    SELECT 7, 1000, 1 UNION ALL
    SELECT 8, 2000, 1 UNION ALL
    SELECT 9, 3000, 1 UNION ALL
    SELECT 10, 5000, 1 --Should be a 2 by definition because it's the first occurance of 5000.
    ;

    Your code also makes two scans of the table instead of just one.

    Granted. But my code does mark the first of a non-unique group of rows. That was what I got out of the OPs description. Why else put the word unique in there. In effect I thought he wanted to mark all but the latest CarID row with a two, so that only the latest entry for each CarID has a CarStatus of one. 🙂

    As for the two scans: Guilty as charged! So if this becomes a big table and there isn't a decent index on the table, it might become a problem.

     

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply