"Tag" highest value in the month

  • I'm new to SQL and trying to figure out how I can place a "Y" (for yes) for the highest value of the month.  Right now, I have a simple table that contains a date and a numeric value... 

    How can I create a third column the will place a "Y" for the highest value and a "N" on the rest of the rows for the month, the table will hold additional months eventually, so I only want it to tag for each month

    Date   Value  Highest
    11/1    123      N
    11/2    130      N
    11/3    140      Y
    12/1    150     N
    12/2    160     N
    12/3    180    Y

    Thank you in advance

  • Hi,
    Welcome to SQL Central

    create table calendar
    (
    [date] date,
    value int);

    insert into calendar values ('11/1/2017',123);
    insert into calendar values ('11/2/2017',130);
    insert into calendar values ('11/3/2017',140);
    insert into calendar values ('12/1/2017',150);
    insert into calendar values ('12/2/2017',160);
    insert into calendar values ('12/3/2017',180);

    SELECT [date], 
           value, 
           CASE 
             WHEN rnk = 1 THEN 'Y' 
             ELSE 'N' 
           END AS highest 
    FROM  (SELECT value, 
                  [date], 
                  Rank() 
                    OVER( 
                      partition BY Month([date]) 
                      ORDER BY [date] DESC)AS rnk 
           FROM   calendar)cal 
    ORDER  BY [date] ;

    Saravanan

  • Simple SQL Tips - Friday, December 8, 2017 6:04 PM

    I'm new to SQL and trying to figure out how I can place a "Y" (for yes) for the highest value of the month.  Right now, I have a simple table that contains a date and a numeric value... 

    How can I create a third column the will place a "Y" for the highest value and a "N" on the rest of the rows for the month, the table will hold additional months eventually, so I only want it to tag for each month

    Date   Value  Highest
    11/1    123      N
    11/2    130      N
    11/3    140      Y
    12/1    150     N
    12/2    160     N
    12/3    180    Y

    Thank you in advance

    A useful idiom is a report period calendar. It gives a name to a range of dates. The worst way would be to use temporal math; it tells the world you do not think in sets or understand declarative programming yet. Here is a skeleton:

    CREATE TABLE Month_Periods
    (month_name CHAR(10) NOT NULL PRIMARY KEY,
    month_start_date DATE NOT NULL,
    month_end_date DATE NOT NULL,
    CONSTRAINT date_ordering
    CHECK (month_start_date <= month_end_date),
    etc);

    These report periods can overlap; a fiscal quarter will be contained in the range of its fiscal year. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. Remember that when you design report name column.

    CREATE TABLE Daily_Foobars
    (foo_date DATE NOT NULL PRIMARY KEY,
    foo_score INTEGER NOT NULL);

    You can now pull out monthly data with a simple BETWEEN predicate. But instead of actually treating the data as if you are still using punchcards, and have to punch a column with a bit flag like we did in the 1960s, you can simply create a view of the highest daily foobar within each month.

    WITH X1
    AS
    (SELECT D.foo_date, D.foo_score, M.month_name
    FROM Daily_Foobars AS D,
    WHERE D.foo_date BETWEEN M.month_start_date <= M.month_end_date),
    X2
    AS
    (SELECT foo_date, foo_score, month_name,
       MAX(foo_score) OVER (PARTITION BY month_name)
       AS foo_score_max
    FROM X1)
    SELECT foo_date, foo_score, month_name
    FROM X2
    WHERE foo_score = max_foo_score;

    This version will handle ties if two or more dates both reached the maximum scores. Using RANKs and ROW_NUMBER() approaches will not; they assume an ordering, which is not the way RDBMS is supposed to work.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • First of all, thank you for the assistance.  After running the code and re-reading my post.  I realized in my first post, the numbers I was using happened to be in descending order and misstated what I'm trying to do. Hope the code below helps.

    create table AMCheck ([date] date, AMValue int, Highest varchar(1))

    insert into AMCheck values('11/01/2017','120',null)
    insert into AMCheck values('11/02/2017','113',null)
    insert into AMCheck values('11/03/2017','107',null)
    insert into AMCheck values('11/04/2017','91',null)
    insert into AMCheck values('11/05/2017','104',null)
    insert into AMCheck values('11/06/2017','126',null)
    insert into AMCheck values('11/07/2017','113',null)
    insert into AMCheck values('11/08/2017','119',null)
    insert into AMCheck values('11/09/2017','105',null)
    insert into AMCheck values('11/10/2017','102',null)
    insert into AMCheck values('11/11/2017','103',null)
    insert into AMCheck values('11/12/2017','118',null)
    insert into AMCheck values('11/13/2017','109',null)
    insert into AMCheck values('11/14/2017','121',null)
    insert into AMCheck values('11/15/2017','102',null)
    insert into AMCheck values('11/16/2017','102',null)
    insert into AMCheck values('11/17/2017','89',null)
    insert into AMCheck values('11/18/2017','111',null)
    insert into AMCheck values('11/19/2017','133',null)
    insert into AMCheck values('11/20/2017','105',null)
    insert into AMCheck values('11/21/2017','104',null)
    insert into AMCheck values('11/22/2017','91',null)
    insert into AMCheck values('11/23/2017','124',null)
    insert into AMCheck values('11/24/2017','93',null)
    insert into AMCheck values('11/25/2017','102',null)
    insert into AMCheck values('11/26/2017','113',null)
    insert into AMCheck values('11/27/2017','117',null)
    insert into AMCheck values('11/28/2017','118',null)
    insert into AMCheck values('11/29/2017','124',null)
    insert into AMCheck values('11/30/2017','122',null)
    insert into AMCheck values('12/01/2017','117',null)
    insert into AMCheck values('12/02/2017','112',null)
    insert into AMCheck values('12/03/2017','107',null)
    insert into AMCheck values('12/04/2017','116',null)
    insert into AMCheck values('12/05/2017','103',null)
    insert into AMCheck values('12/06/2017','110',null)
    insert into AMCheck values('12/07/2017','119',null)
    insert into AMCheck values('12/08/2017','110',null)
    insert into AMCheck values('12/09/2017','117',null)

    So when I run a select statement I would get all the rows back and they would have a N' in "highest" column except for 
    Date             AMValue    Highest
    11/19/2017   133            Y
    12/07/2017   119            Y
    Since those two days have the highest AMValue in their respective month (Note. since December isn't finished, it could be possible that another date be higher than the 9th but that's all I have for now for ).  Please understand that this is just a snippet of data, as my data grows I will have more months that need to do the same thing.  

    Just realized this... It could be possible for the highest AMValue to be repeated in the same month, so those other dates would need to be tagged with a "Y" also if possible.

    I hope this makes sense. 

    Please understand I'm new and still trying to understand the correct way of asking and providing as much information as possible.

  • Hi ,
    Your SQL formatting looks very good. What tools your using for paste the query with color and proper formatting.

    Solution:

    WITH cte
    AS (SELECT
    SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5) AS [MM / YY],
    MAX(amvalue) AS max_amvalue
    FROM AMCheck
    GROUP BY SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5))
    SELECT
    [DATE],
    AMVALUE,
    max_amvalue,
    CASE
      WHEN
      ISNULL(max_amvalue, 0) = 0 THEN 'N'
      ELSE 'Y'
    END
    AS HIGHEST
    FROM AMCheck a
    LEFT OUTER JOIN cte b
    ON a.amvalue = b.max_amvalue;

    Saravanan

  • Simple SQL Tips - Sunday, December 10, 2017 12:51 AM

    First of all, thank you for the assistance.  After running the code and re-reading my post.  I realized in my first post, the numbers I was using happened to be in descending order and misstated what I'm trying to do. Hope the code below helps.

    create table AMCheck ([date] date, AMValue int, Highest varchar(1))

    insert into AMCheck values('11/01/2017','120',null)
    insert into AMCheck values('11/02/2017','113',null)
    insert into AMCheck values('11/03/2017','107',null)
    insert into AMCheck values('11/04/2017','91',null)
    insert into AMCheck values('11/05/2017','104',null)
    insert into AMCheck values('11/06/2017','126',null)
    insert into AMCheck values('11/07/2017','113',null)
    insert into AMCheck values('11/08/2017','119',null)
    insert into AMCheck values('11/09/2017','105',null)
    insert into AMCheck values('11/10/2017','102',null)
    insert into AMCheck values('11/11/2017','103',null)
    insert into AMCheck values('11/12/2017','118',null)
    insert into AMCheck values('11/13/2017','109',null)
    insert into AMCheck values('11/14/2017','121',null)
    insert into AMCheck values('11/15/2017','102',null)
    insert into AMCheck values('11/16/2017','102',null)
    insert into AMCheck values('11/17/2017','89',null)
    insert into AMCheck values('11/18/2017','111',null)
    insert into AMCheck values('11/19/2017','133',null)
    insert into AMCheck values('11/20/2017','105',null)
    insert into AMCheck values('11/21/2017','104',null)
    insert into AMCheck values('11/22/2017','91',null)
    insert into AMCheck values('11/23/2017','124',null)
    insert into AMCheck values('11/24/2017','93',null)
    insert into AMCheck values('11/25/2017','102',null)
    insert into AMCheck values('11/26/2017','113',null)
    insert into AMCheck values('11/27/2017','117',null)
    insert into AMCheck values('11/28/2017','118',null)
    insert into AMCheck values('11/29/2017','124',null)
    insert into AMCheck values('11/30/2017','122',null)
    insert into AMCheck values('12/01/2017','117',null)
    insert into AMCheck values('12/02/2017','112',null)
    insert into AMCheck values('12/03/2017','107',null)
    insert into AMCheck values('12/04/2017','116',null)
    insert into AMCheck values('12/05/2017','103',null)
    insert into AMCheck values('12/06/2017','110',null)
    insert into AMCheck values('12/07/2017','119',null)
    insert into AMCheck values('12/08/2017','110',null)
    insert into AMCheck values('12/09/2017','117',null)

    So when I run a select statement I would get all the rows back and they would have a N' in "highest" column except for 
    Date             AMValue    Highest
    11/19/2017   133            Y
    12/07/2017   119            Y
    Since those two days have the highest AMValue in their respective month (Note. since December isn't finished, it could be possible that another date be higher than the 9th but that's all I have for now for ).  Please understand that this is just a snippet of data, as my data grows I will have more months that need to do the same thing.  

    Just realized this... It could be possible for the highest AMValue to be repeated in the same month, so those other dates would need to be tagged with a "Y" also if possible.

    I hope this makes sense. 

    Please understand I'm new and still trying to understand the correct way of asking and providing as much information as possible.

    Cool... readily consumable test data goes a long way with me.

    This problem is easily solved by the use of an "updateable CTE".  The WHERE clause in the following will limit updates to only the current and previous month (to allow for month end final adjustments).  As the comment says, comment that out for the "first ever" run.  Of course, this assumes that anything prior to the previous month will never be updated ever again.  If there's a chance of the data older than the previous month being updated EVER, then just don't include the WHERE clause at all.


       WITH cteMonthMax AS
    (
     SELECT  MonthMaxAMValue = MAX(AMValue) OVER (PARTITION BY DATEDIFF(mm,0,[Date]))
            ,[Date]
            ,AMValue
            ,Highest
       FROM AMCheck
      WHERE [Date] >= DATEADD(mm,DATEDIFF(mm,0,GETDATE())-1,0) --Comment out this line for the "first ever" run
    )
     UPDATE cmm
        SET cmm.Highest = IIF(cmm.AMValue = cmm.MonthMaxAMValue,'Y','N')
       FROM cteMonthMax cmm
    ;

    I will state that I agree with what Joe Celko implies on this one ... this is a terrible idea because it can and will allow for bad data if someone makes an update to the table and either forgets to run the job/proc or the job hasn't executed yet.  It would be far better to build a function or view or stored procedure to produce the results you desire because they would always be up to date at the time of whenever ran a query against one of those objects.

    That, notwithstanding, I'm not so sure with the use of a periodic calendar table here because it just doesn't seem necessary.  Temporal math rocks!  😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,  After reading up on CTE and understanding what you provided, that's exactly what I needed.  Old data will never change therefore I won't use the where clause but nice to know about it.  Thank you very much,

    saravanatn - As far as the formatting all I did was copy the code into the message area, highlighted it and then clicked the button "titled" SQL Code at the bottom of the message box.  Thank you for your assistance unfortunately the code returns inaccurate results.  It tags the row for 11/08/17 with a 'Y' when it shouldn't since it's not the highest record for November.

  • Hi,

    You are right. I made a small mistake.Below query gives accurate results. But you can Jeff Moden query as his query is always efficient.


    WITH cte
    AS (SELECT
    SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5) AS [MM / YY],
    MAX(amvalue) AS max_amvalue
    FROM AMCheck
    GROUP BY SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5))
    SELECT
    [DATE],
    AMVALUE,
    max_amvalue,
    CASE
      WHEN ISNULL(max_amvalue, 0) = 0 THEN 'N'
      ELSE 'Y'
    END AS HIGHEST
    FROM AMCheck a
    LEFT OUTER JOIN cte b
    ON a.amvalue = b.max_amvalue
    AND [MM / YY] = SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5)
    ORDER BY HIGHEST DESC

    Saravanan

  • Simple SQL Tips - Monday, December 11, 2017 7:27 AM

    Jeff,  After reading up on CTE and understanding what you provided, that's exactly what I needed.  Old data will never change therefore I won't use the where clause but nice to know about it.  Thank you very much,

    saravanatn - As far as the formatting all I did was copy the code into the message area, highlighted it and then clicked the button "titled" SQL Code at the bottom of the message box.  Thank you for your assistance unfortunately the code returns inaccurate results.  It tags the row for 11/08/17 with a 'Y' when it shouldn't since it's not the highest record for November.

    If old data never changes, then you actually do need to use the WHERE clause, possibly tightening it up a lot.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • saravanatn - Monday, December 11, 2017 7:42 AM

    Hi,

    You are right. I made a small mistake.Below query gives accurate results. But you can Jeff Moden query as his query is always efficient.


    WITH cte
    AS (SELECT
    SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5) AS [MM / YY],
    MAX(amvalue) AS max_amvalue
    FROM AMCheck
    GROUP BY SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5))
    SELECT
    [DATE],
    AMVALUE,
    max_amvalue,
    CASE
      WHEN ISNULL(max_amvalue, 0) = 0 THEN 'N'
      ELSE 'Y'
    END AS HIGHEST
    FROM AMCheck a
    LEFT OUTER JOIN cte b
    ON a.amvalue = b.max_amvalue
    AND [MM / YY] = SUBSTRING(CONVERT(varchar(8), [date], 3), 4, 5)
    ORDER BY HIGHEST DESC

    Thank you for the kind words. :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply