Convert YTD data to Monthly data

  • Please can anyone help with converting a table with year-to-date (YTD) to monthly data.

    Test data:

    CREATE TABLE ytdExample (
    [Year]VARCHAR(2)
    ,[Month]VARCHAR(2)
    ,[SiteCode]VARCHAR(5)
    ,[AccountCode]VARCHAR(4)
    ,[YTDValue]INT
    )

    INSERT INTO ytdExample ([Year],[Month],[SiteCode],[AccountCode],[YTDValue])
    VALUES
    ('19','01','SITE1','ACC1','1')
    ,('19','02','SITE1','ACC1','2')
    ,('19','03','SITE1','ACC1','4')
    ,('19','04','SITE1','ACC1','5')
    ,('19','05','SITE1','ACC1','7')
    ,('19','06','SITE1','ACC1','10')
    ,('19','07','SITE1','ACC1','12')
    ,('19','08','SITE1','ACC1','13')
    ,('19','09','SITE1','ACC1','14')
    ,('19','10','SITE1','ACC1','16')
    ,('19','11','SITE1','ACC1','17')
    ,('19','12','SITE1','ACC1','19')
    ,('19','01','SITE1','ACC2','3')
    ,('19','02','SITE1','ACC2','10')
    ,('19','03','SITE1','ACC2','11')
    ,('19','04','SITE1','ACC2','12')
    ,('19','05','SITE1','ACC2','15')
    ,('19','06','SITE1','ACC2','18')
    ,('19','07','SITE1','ACC2','20')
    ,('19','08','SITE1','ACC2','21')
    ,('19','09','SITE1','ACC2','22')
    ,('19','10','SITE1','ACC2','24')
    ,('19','11','SITE1','ACC2','25')
    ,('19','12','SITE1','ACC2','26')

    I have the following test data (above) that looks like the following:

    ytd-example1

    The data is in YTD format and ideally what i am looking for is an additional column that converts to monthly data, for example:

    ytd-example2

    I have seen several examples of converting monthly data tables into YTD but nothing that i can un-pick to achieve what i require above.

    I'm sure this shouldn't be as hard as my novice brain thinks it is but i am struggling.

    Please can you help.

    Thanks in advance.

  • First, thank you for posting the readily consumable data that you did.  It made it a pleasure to help.

    The following code will do the trick...

     SELECT * 
    ,MonthlyValue = YTDValue-LAG(YTDValue,1,0) OVER (PARTITION BY SiteCode,AccountCode,[Year] ORDER BY [Month])
    FROM dbo.ytdExample
    ORDER BY SiteCode,AccountCode,[Year],[Month]
    ;

    Please see the following link for the MS documentation on the subject.  There's also a LEAD function that you should also read about.

    https://docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15

    The following Google search will also scare up some interesting articles on the subject.

    https://www.google.com/search?q=LAG+function+in+SQL+Server

     

    • This reply was modified 3 years, 11 months ago by  Jeff Moden. Reason: Added partitioning by year, which is a bit necessary for YTD to Monthly calculations. :D

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The first thing we need to do is correct your DDL. By definition, a table must have a key; this is not an option! Since all of your columns are NULL, there is no way that it is possible for this non-table to have a key.

    A year-month combination is called an interval data type in SQL, but Microsoft does not yet support these things, so we need to fake it. You might also want to read the ISO standards on temporal data types. The year is defined as four digits, not one or two as you have. I've also never seen a variable length 'ACCount code in any 'ACCounting system which I worked on the last 50 years. Splitting the month away from the year is a design flaw called attribute splitting. Both those columns are part of a single measurement on a temporal scale.

    I always have flashbacks and I see someone putting a comma on the front of each line. We used to do this back in the 1960s with punchcards so that we could rearrange the deck easily or reuse the cards. It's really a bad formatting practice

    CREATE TABLE Bikini_Sales

    (sales_month CHAR(10) NOT NULL PRIMARY KEY

    CHECK (sales_year_month LIKE '[12][09][09][09]-[09][09]-00'),

    account_code CHAR(4) NOT NULL,

    ytd_sales_count INTEGER NOT NULL DEFAULT 0 );

    I decided to make this little more interesting by giving the table a funny name. The site ID is constant in all the roads, so let's just drop it in this example. I am using the MySQL convention of putting double zeros in the month and date fields (the field is a part of a column that has partial meaning in itself, and it is not a column). This is not an ANSI/ISO standard yet, but it is up for consideration and is found in use because the popularity of MySQL. The advantages are it is language independent, and sorts with the ISO – 8601 formatted dates. Why did you enter numeric values as character strings? Do you really like the extra overhead's might cause? Did you notice the key, the default and the constraint? Most of the work in SQL is done in the DDL and if you have good clean DDL. The DML will practically write itself

    INSERT INTO Bikini_Sales

    VALUES

    ('2019-01-00', 'ACC1', 1),

    ('2019-02-00', 'ACC1', 2),

    ('2019-03-00', 'ACC1', 4),

    ('2019-04-00', 'ACC1', 5),

    ('2019-05-00', 'ACC1', 7),

    ('2019-06-00', 'ACC1', 10),

    ('2019-07-00', 'ACC1', 12),

    ('2019-08-00', 'ACC1', 13),

    ('2019-09-00', 'ACC1', 14),

    ('2019-10-00', 'ACC1', 16),

    ('2019-11-00', 'ACC1', 17),

    ('2019-12-00', 'ACC1', 19),

    ('2019-01-00', 'ACC2', 3),

    ('2019-02-00', 'ACC2', 10),

    ('2019-03-00', 'ACC2', 11),

    ('2019-04-00', 'ACC2', 12),

    ('2019-05-00', 'ACC2', 15),

    ('2019-06-00', 'ACC2', 18),

    ('2019-07-00', 'ACC2', 20),

    ('2019-08-00', 'ACC2', 21),

    ('2019-09-00', 'ACC2', 22),

    ('2019-10-00', 'ACC2', 24),

    ('2019-11-00', 'ACC2', 25),

    ('2019-12-00', 'ACC2', 26);

    Having to compute backwards from the year to date total is a little weird. One of the rules of a good database design is that you don't store computations, especially computations that go over different levels of aggregations. You will find that you have to redo them or undo them like you have now.

    CREATE VIEW Monthly_Bikini_Sales

    AS

    SELECT account_code, sales_month,

    ytd_sales_count - LAG (ytd_sales_count,1,0)

    OVER (PARTITION BY account_code ORDER BY sales_month)

    FROM Bikini_Sales;

    The lead () and lag () functions are used for this sort of thing. One looks at previous rows in an ordering and the other one looks ahead within the partition. The over() is called a Windows clause.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • It's probably not a "table", Joe.  It's probably something that only looks like a table used to hold the results of a reporting query from somewhere.  Having VARCHAR(2) for both the month and year (should be CHAR(2), though) is strong evidence of that not to mention that it does have a running total buried in it.  It might not even exist... it might just be that that OP simply tried to make it easier for his problem to be understood.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply