grouping

  • Hi is there a way that I can group this data below to have a different group for each

    consecutive same percent_b.

    For example the first three 'A100' would be group 1, 'A69' group 2, 'A86' group 3 and the last 'A100' group 4. The reason I want that is because I want to get the minimum date value of the very last group.

    Can someone help please?

    Thanks

    Stan

    Clt Part Date_a Val_b percent_b

    08477A3689 2000-09-08 00:00:00.00027418.7 A100

    08477A3689 2000-09-09 00:00:00.00027418.4 A100

    08477A3689 2000-09-10 00:00:00.00027418.2 A100

    08477A3689 2001-05-01 00:00:00.00027973.38 A69

    08477A3689 2002-05-01 00:00:00.00028818 A69

    08477A3689 2003-05-01 00:00:00.00030179.48 A69

    08477A3689 2003-10-17 00:00:00.00037833.7 A86

    08477A3689 2004-05-01 00:00:00.00038971.55 A86

    08477A3689 2005-05-01 00:00:00.00040235.84 A86

    08477A3689 2006-05-01 00:00:00.00041436.91 A86

    08477A3689 2006-09-29 00:00:00.00047903.94 A100

    08477A3689 2007-05-01 00:00:00.00053147.43 A100

    08477A3689 2008-05-01 00:00:00.00055266.75 A100

    08477A3689 2009-05-01 00:00:00.00056929.32 A100

    08477A3689 2010-05-01 00:00:00.00058628.43 A100

    08477A3689 2011-05-01 00:00:00.00059651.55 A100

    08477A3689 2012-05-01 00:00:00.00060839.1 A100

    08477A3689 2012-06-04 00:00:00.00063855 A100

    08477A3689 2013-05-01 00:00:00.00066432 A100

  • This will get you the data grouped as required.

    -- sample data

    DECLARE @yourtable TABLE (Clt char(5), Part char(5), Date_a datetime, Val_b decimal(8,2), percent_b varchar(6));

    -- only did the first three groups for the sake of time...

    INSERT @yourtable VALUES

    ('08477','A3689','2000-09-08 00:00:00.000','27418.7','A100'),

    ('08477','A3689','2000-09-09 00:00:00.000','27418.4','A100'),

    ('08477','A3689','2000-09-10 00:00:00.000','27418.2','A100'),

    ('08477','A3689','2001-05-01 00:00:00.000','27973.38','A69'),

    ('08477','A3689','2002-05-01 00:00:00.000','28818','A69'),

    ('08477','A3689','2003-05-01 00:00:00.000','30179.48','A69'),

    ('08477','A3689','2003-10-17 00:00:00.000','37833.7','A86'),

    ('08477','A3689','2004-05-01 00:00:00.000','38971.55','A86'),

    ('08477','A3689','2005-05-01 00:00:00.000','40235.84','A86'),

    ('08477','A3689','2006-05-01 00:00:00.000','41436.91','A86');

    -- solution

    WITH groups AS

    (

    SELECT

    *,

    DENSE_RANK() OVER (ORDER BY percent_b) AS group_nbr,

    ROW_NUMBER() OVER (PARTITION BY percent_b ORDER BY Date_a) AS date_rank

    FROM @yourtable

    )

    SELECT * FROM groups

    WHERE group_nbr = (SELECT MAX(group_nbr) FROM groups) AND date_rank = 1;

    Change the SELECT * to SELECT date_a if you only want to retrieve the date.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan, I believe that you missed the important part of the sample data. 😀

    -- sample data

    DECLARE @yourtable TABLE (Clt char(5), Part char(5), Date_a datetime, Val_b decimal(8,2), percent_b varchar(6));

    -- only did the first three groups for the sake of time...

    INSERT @yourtable VALUES

    ('08477','A3689','2000-09-08 00:00:00.000','27418.7','A100'),

    ('08477','A3689','2000-09-09 00:00:00.000','27418.4','A100'),

    ('08477','A3689','2000-09-10 00:00:00.000','27418.2','A100'),

    ('08477','A3689','2001-05-01 00:00:00.000','27973.38','A69'),

    ('08477','A3689','2002-05-01 00:00:00.000','28818','A69'),

    ('08477','A3689','2003-05-01 00:00:00.000','30179.48','A69'),

    ('08477','A3689','2003-10-17 00:00:00.000','37833.7','A86'),

    ('08477','A3689','2004-05-01 00:00:00.000','38971.55','A86'),

    ('08477','A3689','2005-05-01 00:00:00.000','40235.84','A86'),

    ('08477','A3689','2006-05-01 00:00:00.000','41436.91','A86'),

    ('08477','A3689','2006-09-29 00:00:00.000','47903.94','A100'),

    ('08477','A3689','2007-05-01 00:00:00.000','53147.43','A100'),

    ('08477','A3689','2008-05-01 00:00:00.000','55266.75','A100'),

    ('08477','A3689','2009-05-01 00:00:00.000','56929.32','A100'),

    ('08477','A3689','2010-05-01 00:00:00.000','58628.43','A100'),

    ('08477','A3689','2011-05-01 00:00:00.000','59651.55','A100'),

    ('08477','A3689','2012-05-01 00:00:00.000','60839.1','A100'),

    ('08477','A3689','2012-06-04 00:00:00.000','63855','A100'),

    ('08477','A3689','2013-05-01 00:00:00.000','66432','A100');

    -- solution

    WITH groups AS

    (

    SELECT

    *,

    ROW_NUMBER() OVER (ORDER BY Date_a) -

    ROW_NUMBER() OVER (PARTITION BY percent_b ORDER BY Date_a) group_nbr

    FROM @yourtable

    )

    SELECT TOP 1 *

    FROM groups

    ORDER BY group_nbr DESC, Date_a;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks Alan but that doesn't really work for me.

    Because what I want is to get for each part number the last (max date_a) percent_b that was entered

    And the earliest date where is the subsequent percent_b.

    INSERT @yourtable VALUES

    ('08477','A3689','2000-09-08 00:00:00.000','27418.7','A100'),

    ('08477','A3689','2000-09-09 00:00:00.000','27418.4','A100'),

    ('08477','A3689','2000-09-10 00:00:00.000','27418.2','A100'),

    ('08477','A3689','2001-05-01 00:00:00.000','27973.38','A69'),

    ('08477','A3689','2002-05-01 00:00:00.000','28818','A69'),

    ('08477','A3689','2003-05-01 00:00:00.000','30179.48','A69'),

    ('08477','A3689','2003-10-17 00:00:00.000','37833.7','A86'),

    ('08477','A3689','2004-05-01 00:00:00.000','38971.55','A86'),

    ('08477','A3689','2005-05-01 00:00:00.000','40235.84','A86'),

    ('08477','A3689','2006-05-01 00:00:00.000','41436.91','A86');

    ('08477','A3689','2004-05-02 00:00:00.000','38971.57','A100'),

    ('08477','A3689','2005-05-03 00:00:00.000','40235.87','A100'),

    ('08477','A3689','2006-05-04 00:00:00.000','41436.97','A100');

    ('08477','A3699','2003-10-17 00:00:00.000','37833.7','A86'),

    ('08477','A3699','2004-05-01 00:00:00.000','38971.55','A86'),

    ('08477','A3699','2005-05-01 00:00:00.000','40235.84','A86'),

    ('08477','A3699','2006-05-01 00:00:00.000','41436.91','A86');

    ('08477','A3699','2007-05-12 00:00:00.000','38971.57','A101'),

    ('08477','A3699','2008-05-13 00:00:00.000','40235.87','A101'),

    ('08477','A3699','2009-05-14 00:00:00.000','41436.97','A101');

    For instance in the example above I would get

    1 - part = A3689 and percent_b = 'A100' and the earliest date for that group = '2004-05-02 00:00:00.000'

    2 - part = A3699 and percent_b = 'A101' and the earliest date for that group = '2007-05-12 00:00:00.000'

    We want that earliest date because we want to know since when we start using that percentage

    Let me know if that is clear now,

    thanks

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply