Update Rolling Date Flags In a Table

  • The analysts I work with have created multiple tables where they structure them so there is data in majority of the table but then they tack on these date fields at the end which are designed to show a varying time period. For example, there might be columns to show current month, prior month, current quarter, prior quarter, current year, prior year, prior 90 days, etcโ€ฆ which are then given the value 1 or 0 based on a date provided. The problem comes with how the updates are handled to these columns. Currently they either perform a massive update to these columns for all records (tables can be as large as 17 million rows) after the data portion is inserted or they truncate the table and insert the data back into it using case statements to handle updating the date flags accordingly. Essentially the โ€œdataโ€ portion of these tables will not change once it is inserted. The only thing that changes are the date flags at the tail end. Is there an efficient way to update all of these date flags without having to update the dates that might not have changed from the prior day, month, year, etc...?

  • Hey Ron, it looks like you've had quite a few reads on your post but no one has ventured to answer your question yet. It may help everyone if you posted your table DDL and some sample data. It would make it easier for everyone to dig into your questions...

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Sorry for the rather vague post. Here is some data that will hopefully provide more information. Say you start with table abc as follows CREATE TABLE [dbo].[abc](

    Product varchar(50) NULL,

    AppNumber varchar(20) NULL,

    AcctNumber bigint NULL,

    Date1 datetime NULL,

    .

    .

    .

    .

    [Today] [int] NULL,

    [MTD] [int] NULL,

    [QTD] [int] NULL,

    [YTD] [int] NULL,

    [DayPrior] [int] NULL,

    )

    Data in that table is cumulative (holds history from beginning of time) and there is a check to delete records where AppNumber exists and has a Date1 that falls between the first day of the current month and the last day of the current month. New data is then inserted into that table for all of the fields except the date flag fields at the end which are given null values during the insert. This is where the insert comes in to populate those fields Update dbo.abc

    Set Today = case when Date1 = @date then 1 else 0 end

    ,MTD = case when Date1 between @MTD_Start and @MTD_End then 1 else 0 end

    ,QTD = case when Date1 between @QTD_Start and @QTD_End then 1 else 0 end

    ,YTD = case when Date1 between @YTD_Start and @YTD_End then 1 else 0 end

    ,DayPrior = case when Date1 = dateadd(d, -1, @date) then 1 else 0 end

    Is there an efficient way to update those fields that were not deleted but might need updated such as MTD or QTD?

  • Thanks Ron. You stated your mail problem in the first sentence in your first post. "The Analysts I work with created tables...". Let me guess, they added these flags on the tables so they could select data out for reporting by day, MTD, QTD, YTD, etc and aggregate the values using those flags. Bad idea to have flag values like this in your table, especially considering that you have to constantly keep these up to date. Like you said, your data does not change, but these flag values are ever-changing. And as you've learned, this becomes unmanageable as your table becomes large.

    I would recommend dropping the flags from this table. I would changes your Date1 column to an integer date with the format of YYYYMMDD and then create yourself a date dimension (table) that is keyed with the same integer date key. Your dates dimension should contain all of your Day, Week, Month, Quarter, Year, etc date attributes that you want to aggregate your values on. You do all of your date math on the dimension and then join the dimension back into your transactions table to get the appropriate rows. Make sense?

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Welcome to the life of "make the analysts lives better, but don't add additional steps to their existing processes." ๐Ÿ™‚ I'm sure we have all been there. Anyway, breaking the fields out into something that resembles fact and dimension tables was an idea I threw out to them but it wasn't well received. It looks like I'm going to have to make a harder sell on that. I wanted to make sure I wasn't overlooking something to keep their tables intact like using a merge. Either way I don't think they would see the type of performance they are expecting with the current structure. Thanks for the help, John!

  • Glad to help. I'd definitely get your sales hat on. It's hard when analysts and business folks try to dictate data structures and how we store and serve up data to them. It's outside of the scope of their expertise. If they were good at it, they'd be data architects and not data analysts. In your case, if your analysts knew anything about data modeling, you would not be asking this question because your existing model would have never been created. Analysts think in terms of spreadsheets & pivots, not tables & relationships.

    I'm of the mind that the end users should not know or care how the data is stored in the database. All they need is a reliable result set. With that said, I know some analysts are knowledgeable enough to query for their own results so we run into issues when we need to change schema and people have saved off queries/reports that will no longer work as-is when schema changes.

    With this problem in mind, I'd recommend creating views for your end users. You can re-model this solution on the back end in the database, but then give them a view that serves it up to them with the old table format. You could even go as far as to give your tables a new name and name the view the same name as the old table. If the view has the same name and structure, and gives the same results, they'll never know the difference! Whether the results came from a table or view should not matter to them as long as they can press F5 and get what they need. Just my $.02.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John Rowan (4/7/2016)


    Glad to help. I'd definitely get your sales hat on. It's hard when analysts and business folks try to dictate data structures and how we store and serve up data to them. It's outside of the scope of their expertise. If they were good at it, they'd be data architects and not data analysts. In your case, if your analysts knew anything about data modeling, you would not be asking this question because your existing model would have never been created. Analysts think in terms of spreadsheets & pivots, not tables & relationships.

    I'm of the mind that the end users should not know or care how the data is stored in the database. All they need is a reliable result set. With that said, I know some analysts are knowledgeable enough to query for their own results so we run into issues when we need to change schema and people have saved off queries/reports that will no longer work as-is when schema changes.

    With this problem in mind, I'd recommend creating views for your end users. You can re-model this solution on the back end in the database, but then give them a view that serves it up to them with the old table format. You could even go as far as to give your tables a new name and name the view the same name as the old table. If the view has the same name and structure, and gives the same results, they'll never know the difference! Whether the results came from a table or view should not matter to them as long as they can press F5 and get what they need. Just my $.02.

    Well said, John. I agree with every word.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • John Rowan (4/7/2016)


    With this problem in mind, I'd recommend creating views for your end users. You can re-model this solution on the back end in the database, but then give them a view that serves it up to them with the old table format. You could even go as far as to give your tables a new name and name the view the same name as the old table. If the view has the same name and structure, and gives the same results, they'll never know the difference!

    Incredible. That's exactly what I did with one of our horribly built audit tables a couple of years ago. Users really kicked up a fuss at the proposed rework. The view with the original table name solved it all.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/8/2016)


    John Rowan (4/7/2016)


    With this problem in mind, I'd recommend creating views for your end users. You can re-model this solution on the back end in the database, but then give them a view that serves it up to them with the old table format. You could even go as far as to give your tables a new name and name the view the same name as the old table. If the view has the same name and structure, and gives the same results, they'll never know the difference!

    Incredible. That's exactly what I did with one of our horribly built audit tables a couple of years ago. Users really kicked up a fuss at the proposed rework. The view with the original table name solved it all.

    Thanks Jeff...it's good to know that my proposed solutions are exactly what you'd do! I use this type of scenario often to explain to users/analysts who argue with proposed remodeling efforts. I ask them why they care if what they put in their FROM clause is a table, view, function, etc. When it hits them that it doesn't matter as long as they get the results they want, I can use the concept of abstraction to keep them from thinking they should be involved in database architecture level decisions. Then, when they come up and say, "We need a new table for a new process and we want it to have these columns..." I can design it any way I want (cuz it could be 2, 3, 4 tables) and then just create the view the way the asked for the original table. Bam! All smoke and mirrors baby...

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John Rowan (4/8/2016)


    Jeff Moden (4/8/2016)


    John Rowan (4/7/2016)


    With this problem in mind, I'd recommend creating views for your end users. You can re-model this solution on the back end in the database, but then give them a view that serves it up to them with the old table format. You could even go as far as to give your tables a new name and name the view the same name as the old table. If the view has the same name and structure, and gives the same results, they'll never know the difference!

    Incredible. That's exactly what I did with one of our horribly built audit tables a couple of years ago. Users really kicked up a fuss at the proposed rework. The view with the original table name solved it all.

    Thanks Jeff...it's good to know that my proposed solutions are exactly what you'd do! I use this type of scenario often to explain to users/analysts who argue with proposed remodeling efforts. I ask them why they care if what they put in their FROM clause is a table, view, function, etc. When it hits them that it doesn't matter as long as they get the results they want, I can use the concept of abstraction to keep them from thinking they should be involved in database architecture level decisions. Then, when they come up and say, "We need a new table for a new process and we want it to have these columns..." I can design it any way I want (cuz it could be 2, 3, 4 tables) and then just create the view the way the asked for the original table. Bam! All smoke and mirrors baby...

    It's a similar argument I use for using stored procedures instead of embedded managed code. It's a highly beneficial level of abstraction that every object oriented programmer/user should embrace instead of railing against.

    It's also one of the reasons why I don't prefix my object names with the object type. It's a bugger having a view named "tbl_Audit". ๐Ÿ˜€

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply