Data clean up

  • Hi all,

    I'm trying to clean up some warehouse tables. The records are in SETS defined by a Begin and End date range. Each SET has to have a total PCT (percent) equal 100. The data has some duplicate records with different date ranges (that follow the previous set) that I want to collapse with a min and max type query, but I can't seem to find a way to correctly do what I want to achieve.

    See example in image below. The left side is the old bad data, the right side (good) is what I'm trying to get. Is this possible with a query?
    Thanks for any guidance!

  • Without consumable data, but maybe a simple SELECT DISTINCT...?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thanks for the reply Thom,

    Distinct won't work because I need to have the MIN of BEGDT and MAX of ENDDT for two contiguous date ranges. For example, on the left side, the first set of 3 records and the next set of 3 records are exact duplicates, except for the dates. I want to have one set of 3 records, with the min begin date, and max end date for the two sets.

  • Jim Bishop - Wednesday, March 15, 2017 10:11 AM

    Thanks for the reply Thom,

    Distinct won't work because I need to have the MIN of BEGDT and MAX of ENDDT for two contiguous date ranges. For example, on the left side, the first set of 3 records and the next set of 3 records are exact duplicates, except for the dates. I want to have one set of 3 records, with the min begin date, and max end date for the two sets.

    Please provide your sample data in consumable format if you would like a working and tested solution. See the link in my signature for details.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Sorry, hope this is helpful. Thanks! 🙂

    CREATE TABLE #myTable
    (
    ID int,
    ITEM char(1),
    PCT int,
    BEGDT datetime,
    ENDDT datetime
    );

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 20, '1/1/2010', '12/31/2010');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'B', 40, '1/1/2010', '12/31/2010');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'C', 40, '1/1/2010', '12/31/2010');
    ---------------------
    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 20, '1/1/2011', '12/31/2011');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'B', 40, '1/1/2011', '12/31/2011');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'C', 40, '1/1/2011', '12/31/2011');
    ---------------------
    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 30, '1/1/2012', '12/31/2012');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'B', 20, '1/1/2012', '12/31/2012');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'C', 50, '1/1/2012', '12/31/2012');
    ---------------------
    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 100, '1/1/2013', '12/31/2013');
    ---------------------
    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 100, '1/1/2014', '12/31/2014');
    ---------------------
    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'A', 20, '1/1/2015', '12/31/2015');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'B', 40, '1/1/2015', '12/31/2015');

    INSERT INTO #myTable (ID, ITEM, PCT, BEGDT, ENDDT)
    VALUES (1, 'C', 40, '1/1/2015', '12/31/2015');

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply