How do I get last record in a month when no guarantee month exists of unique dates

  • okay in a pickle

    following table global_usage

    ID varchar (contains alphanumeric values,not unique)

    Territory (combined with ID unique)

    Total_Used int can be null

    Date_ date (date of the import of the data)

    ID Territory Total_Used Date_

    ACASC CAL071287 2014-06-01

    ACASC CAL071287 2014-08-01

    ACASC CAL071288 2014-09-01

    ACASC CAL071190 2014-11-01

    ACASC CAL071288 2014-11-01

    ACASC CAL071NULL 2014-12-01

    ACASC CAL071190 2014-12-14

    ACASC CAL071NULL 2015-01-01

    ACASC CAL071286 2015-01-01

    ACASC CAL071286 2015-01-22

    ACASC CAL071164 2015-02-01

    ACASC CAL071165 2015-02-01

    ACASC CAL071164 2015-03-01

    Now the problem,per month I need the most recent value so I'm expecting

    ACASC CAL071287 2014-06-01

    ACASC CAL071287 2014-08-01

    ACASC CAL071288 2014-09-01

    ACASC CAL071288 2014-11-01

    ACASC CAL071190 2014-12-14

    ACASC CAL071286 2015-01-22

    ACASC CAL071165 2015-02-01

    ACASC CAL071164 2015-03-01

    Anyone got any idea,I've tried a few thing like group,having even row_number() but I keep getting wrong results

  • You're on the right lines with ROW_NUMBER

    WITH CTE AS (

    SELECT ID ,Total_Used , Date_,

    ROW_NUMBER() OVER(PARTITION BY YEAR(Date_),MONTH(Date_) ORDER BY Date_ DESC,Total_Used DESC) AS rn

    FROM global_usage)

    SELECT ID ,Total_Used , Date_

    FROM CTE

    WHERE rn=1;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • The answer is going to be row_number partitioning by a datepart of month.

    I'd have given you a finished answer, except:

    1) You didn't give us a sample in a consumable manner (DDL for table and insert for the test data), which means more work to help you.

    and

    2) Your data doesn't follow the rules you wrote. You have 2 rows in November with the same ID and territory, and being that they are from the same date and it is not a date time, how are we meant to tell which was the 'last' one?

  • You can't do SET-wise operations because theoretically SETs do not allow duplicates - duplicates will be ambiguous. Assuming your table is a HEAP, you must cursor through the rows using FAST_FORWARD and replace quantity for ID+Territory+Date_ when matched. I don't know if the previous person's reply containing row_number() will guarantee you the correct ordering of rows.

  • You need something to differenciate the MAX and MIN row in some way. I added a column to show which aggregate is used.

    Your sample data in a consumable way:

    DECLARE @sampleData TABLE(

    ID VARCHAR(5) NOT NULL

    , Territory VARCHAR(12)

    , Total_Used int

    , [Date] DATE

    );

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',287,'2014-06-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',287,'2014-08-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',288,'2014-09-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',190,'2014-11-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',288,'2014-11-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',NULL,'2014-12-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',190,'2014-12-14');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',NULL,'2015-01-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',286,'2015-01-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',286,'2015-01-22');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',164,'2015-02-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',165,'2015-02-01');

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES ('ACASC','CAL071',164,'2015-03-01');

    A possible solution using a multi-column unpivot:

    SELECT ID, Territory, Total_Used, unpvt.aggType, unpvt.[Value]

    FROM (

    SELECT ID, Territory, Total_Used, MAX([Date]) AS MaxDate, MIN([Date]) AS MinDate

    FROM @sampleData

    WHERE Total_Used IS NOT NULL

    GROUP BY ID, Territory, Total_Used

    ) AS data

    CROSS APPLY (VALUES('MAX', MaxDate),('MIN', MinDate)) AS unpvt(aggType, Value)

    -- Gianluca Sartori

  • Try

    DECLARE @sampleData TABLE(

    ID VARCHAR(5) NOT NULL

    , Territory VARCHAR(12)

    , Total_Used int

    , [Date] DATE

    );

    INSERT INTO @sampleData(ID,Territory,Total_Used,[Date]) VALUES

    ('ACASC','CAL071',287,'2014-06-01')

    , ('ACASC','CAL071',287,'2014-08-01')

    , ('ACASC','CAL071',288,'2014-09-01')

    , ('ACASC','CAL071',190,'2014-11-01')-- which is 'latest'?

    , ('ACASC','CAL071',288,'2014-11-01')--

    , ('ACASC','CAL071',NULL,'2014-12-01')

    , ('ACASC','CAL071',190,'2014-12-14')

    , ('ACASC','CAL071',NULL,'2015-01-01')

    , ('ACASC','CAL071',286,'2015-01-01')

    , ('ACASC','CAL071',286,'2015-01-22')

    , ('ACASC','CAL071',164,'2015-02-01')-- which is 'latest'?

    , ('ACASC','CAL071',165,'2015-02-01')--

    , ('ACASC','CAL071',164,'2015-03-01');

    WITH t AS (

    SELECT ID, Territory, Total_Used, [Date]

    , YEAR([Date]) as [Year], MONTH([Date]) as [Month]

    , ROW_NUMBER() OVER(PARTITION BY ID, Territory, YEAR([Date]), MONTH([Date]) ORDER BY [Date] DESC) AS rn

    FROM @sampleData)

    SELECT ID, Territory, Total_Used, [Year], [Month], [Date] as LastDate

    FROM t

    WHERE rn=1

    ORDER BY ID, Territory, [Year], [Month];

  • Txn all for your answers they have been very insightful,sorry I didn't present the data in any consumable way but its been a terrible 2 weeks.

    I'd been ill for 3 days last week (week where I was normally the only on call)(now before you say couldn't you work from home,let say I couldn't sit upright without getting my head spinning so much I nearly fainted)I spend monday-tuesday tying to catch up those 3 days & then came this assignment for which I had drop anything (so as I'm writing this I'm currently working on 3 calls simultaneously)

    Yes I know,I created the tables ect... but this was how the data was presented to me & from what I understand the system this data is coming from has no finer for date times,so sometimes it happens the same ID Territory appears twice for the same date, the lowest Total_Used should be shown.

    Txn again for your answers you all have been potential life savers

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply