Idetify repeated values from a table

  • Hi all,

    I have a issue in identifying the repeated values from the table

    IDRangeIDLengthExtendedValueMeasure

    161650912 369 17.2

    261650912 369 17.2

    361650912 370 17.2

    461650912 370 17.2

    561650912 371 17.2

    661650912 371 17.2

    761650912 372 17.2

    861650912 372 17.2

    961650912 373 22.6

    1061650912 373 22.6

    1161650912 374 22.6

    If the measure value is repeated and the count of them is greater or equal to 8 then i want to flag and delete them , I have tried doing a self join and compare row by row, I am able to identify duplicates but I am unable to flag for the condition that the count is greater than or equal to 8 , any help would much be appreciated

  • nm.rajesh (1/14/2016)


    If the measure value is repeated and the count of them is greater or equal to 8 then i want to flag and delete them

    Do you want to delete all of them or only 9th, 10th, etc. occurences?

    _____________
    Code for TallyGenerator

  • Want to flag all the occurrences which are in sequential order and then delete them in later stages for count Greater than or equal to 8

  • It seems like you need a "running total update".

    See the article describing how to do it:

    http://www.sqlservercentral.com/articles/T-SQL/68467/

    _____________
    Code for TallyGenerator

  • I might be missing something, but why not this:

    Setup:

    USE tempdb;

    IF EXISTS ( SELECT *

    FROM sys.objects

    WHERE object_id = OBJECT_ID(N'dbo.SomeMeasurement')

    AND type IN (N'U') )

    DROP TABLE dbo.SomeMeasurement;

    GO

    CREATE TABLE dbo.SomeMeasurement

    (

    ID INT,

    RangeID INT,

    Length INT,

    ExtendedValue INT,

    Measure DECIMAL(10, 4)

    );

    INSERT INTO dbo.SomeMeasurement

    (ID, RangeID, Length, ExtendedValue, Measure)

    VALUES (1, 61650, 912, 369, 17.2),

    (2, 61650, 912, 369, 17.2),

    (3, 61650, 912, 370, 17.2),

    (4, 61650, 912, 370, 17.2),

    (5, 61650, 912, 371, 17.2),

    (6, 61650, 912, 371, 17.2),

    (7, 61650, 912, 372, 17.2),

    (8, 61650, 912, 372, 17.2),

    (9, 61650, 912, 373, 22.6),

    (10, 61650, 912, 373, 22.6),

    (11, 61650, 912, 374, 22.6),

    (12, 61650, 912, 372, 17.2),

    (13, 61650, 912, 372, 17.2),

    (14, 61650, 912, 372, 17.2);

    Solution:

    USE tempdb;

    BEGIN TRAN;

    SELECT ROW_NUMBER() OVER (PARTITION BY Measure ORDER BY ID) AS row_num,

    *

    FROM dbo.SomeMeasurement;

    WITH cte

    AS (

    SELECT ROW_NUMBER() OVER (PARTITION BY Measure ORDER BY ID) AS row_num,

    *

    FROM dbo.SomeMeasurement

    )

    DELETE FROM cte

    WHERE cte.row_num > 8;

    SELECT ROW_NUMBER() OVER (PARTITION BY Measure ORDER BY ID) AS row_num,*

    FROM dbo.SomeMeasurement;

    ROLLBACK;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • But the measure values needs to be sequential . Like 17.4 in the example has to be repeated more than 8 times in sequence and then mark only those .

    Here is what I have done but this would give me row numbers 1 to 8 or what ever but now I want is to get only the set which has all values from 1 to 8 or greater .

    With NewCte as

    (

    SELECT

    case when a.Measure = b.Measure then row_number () over ( partition by a.Measure order by a.Measure) else 0 END Flag,

    a.id aid,b.id bid,a.RangeID, a.Length, a.ExtendedValue,a.Measure as aMeasure, b.Measure bMeasure

    FROM SomeMeasurement A INNER JOIN SomeMeasurement B

    ON a.id = b.id-1

    WHERE a.Measure = b.Measure

    )

    SELECT * FROM NewCte

  • Using Orlando's test data, maybe:

    WITH Groups

    AS

    (

    SELECT ID, Measure

    ,ROW_NUMBER() OVER (ORDER BY ID)

    - ROW_NUMBER() OVER (PARTITION BY Measure ORDER BY ID) AS Grp

    FROM dbo.SomeMeasurement

    )

    ,GroupOrders

    AS

    (

    SELECT *

    ,ROW_NUMBER() OVER (PARTITION BY Measure, Grp ORDER BY ID) AS rn

    FROM Groups

    )

    DELETE GroupOrders

    WHERE rn >= 8;

  • nm.rajesh (1/14/2016)


    But the measure values needs to be sequential . Like 17.4 in the example has to be repeated more than 8 times in sequence and then mark only those .

    Here is what I have done but this would give me row numbers 1 to 8 or what ever but now I want is to get only the set which has all values from 1 to 8 or greater .

    With NewCte as

    (

    SELECT

    case when a.Measure = b.Measure then row_number () over ( partition by a.Measure order by a.Measure) else 0 END Flag,

    a.id aid,b.id bid,a.RangeID, a.Length, a.ExtendedValue,a.Measure as aMeasure, b.Measure bMeasure

    FROM SomeMeasurement A INNER JOIN SomeMeasurement B

    ON a.id = b.id-1

    WHERE a.Measure = b.Measure

    )

    SELECT * FROM NewCte

    I am still a little unclear but it seems like you want to delete rows 9 through n. See if this gets you close:

    WITH cte

    AS (

    SELECT LAG(Measure, 8) OVER (PARTITION BY Measure ORDER BY ID) AS lag8,

    ROW_NUMBER() OVER (PARTITION BY Measure ORDER BY ID) AS row_num,

    *

    FROM dbo.SomeMeasurement

    )

    SELECT *

    FROM cte

    WHERE cte.lag8 > 0;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply