Consistent Data Presentation

  • tabinsc (4/20/2011)


    I would argue that sometimes you may have redundant data in your database to make reporting easier and faster.

    Right. There are all kinds of edge cases that can be sighted to buck just about any best practice. That's why I tried to word the question loosely so the answer would be obvious while still leaving room for the edge cases.

    "...which of these practices are most likely..."

    Still, having redundant data may make reporting "easier and faster", in some cases. But easier and faster does not necessarily equate to "consistent data presentation". 😉

  • tabinsc (4/20/2011)


    I would argue that sometimes you may have redundant data in your database to make reporting easier and faster.

    There is a difference between redundant data in one table (like a dimension) and redundant data between different tables.

    I would argue that the last one should be avoided, where the first one is ideal for reporting.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I haven't proven this correct yet, but I am thinking redundant data in two tables makes sense if you need that data to join the tables properly and avoid table/index scans in the execution plan. You could create indexes on the redundant columns to force index seeks on both tables.

    Tony
    ------------------------------------
    Are you suggesting coconuts migrate?

  • tabinsc (4/20/2011)


    I haven't proven this correct yet, but I am thinking redundant data in two tables makes sense if you need that data to join the tables properly and avoid table/index scans in the execution plan. You could create indexes on the redundant columns to force index seeks on both tables.

    My initial reaction to this statement was going to be "well that explains the percentage of incorrect answers". :hehe:

    But to answer more seriously, I'd say it sounds like another edge case scenario. In general, I would try to use a primary key foreign key relationship to join the tables, or a cross-reference table that uses a key from each table make the connection, or any other method that doesn't involve storing, indexing and joining on redundant data with no constraints that ensure data integrity.

  • Excellent straight forward question.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • I was worried this was going to be tricky as well, but the selectable answers pretty much eliminated any of those doubts.

    Off topic, but one thing that bothers me in database design is when there are unnecessary columns in a table which contribute to confusion. For example, we have a third-party database with a table called [Procedures] and another table called [ProcedureFees]. Simple enough -- except that the [Procedures] table contains columns like "Fee1" and "Fee2". :crazy:

    These columns, I believe, are from a previous iteration of the database, but were never cleaned up. I am 99% sure that the fees are pulled from the [ProcedureFees] table, but there's still that 1% which makes me wonder if, in their in-line code or in one of the many stored procedures, the fee is pulled from the [Procedures] table.

    Like I said, off topic, but one of my pet peeves.

  • I was surprised that as many as 5-7% of people missed this. It seemed to be a question that anyone with logic could answer.

    But a great question because it reminds us how to do design correctly!

    Peter Trast
    Microsoft Certified ...(insert many literal strings here)
    Microsoft Design Architect with Alexander Open Systems

  • Great Question Dave!

    I am not suprised by the percentage of correct answers at this time.

    Some truths realy are self evident.

    :smooooth:

  • SanDroid (4/20/2011)


    Great Question Dave!

    I am not suprised by the percentage of correct answers at this time.

    Some truths realy are self evident.

    😎

    Emoticon fail... lol... so I fixed it.

    Peter Trast
    Microsoft Certified ...(insert many literal strings here)
    Microsoft Design Architect with Alexander Open Systems

  • Koen Verbeeck (4/20/2011)


    Dave62 (4/20/2011)


    The inspiration for this question came from:

    ...

    The number of Points (won) - 599 - for me near the top of the Question of the Day Breakdown table does not match my Score - 603 - at the bottom of the Question of the Day table. I have no idea why they differ. It may have nothing to do with the answers in this question. But it did get me thinking about designing for data consistency so I thought I'd submit the question.

    Ah, so I'm not the only whose points don't match 😀

    It is not necessarily a redundancy issue, if those points are calculated facts. But that means they are calculated twice, and one of them has an error in the calculation...

    Could it be due to points being given back for questions we missed in the past due to bad wording of QOTD questions? Maybe there are two places it needed to be updated but only one was.

  • Peter Trast (4/20/2011)


    SanDroid (4/20/2011)


    Great Question Dave!

    I am not suprised by the percentage of correct answers at this time.

    Some truths realy are self evident.

    😎

    Emoticon fail... lol... so I fixed it.

    Ummmm thanks Peter, but that is the "cool" emoticon, I meant to put "smooooth" or :smooooth:

  • Yes, this was easy to answer. But I suspect that among all those who answered correctly, there would be a good percentage who may not follow the stated practices, especially the one about redundant processes. It's so easy to just code up your calculation or validation without looking to see if it's already done somewhere else, maybe somewhere not isolated as a separate module.

    The lesson of this QOD may not so much be "know these best practices", but "think about whether you and your team follow these best practices". Otherwise, it would be like asking "Does a stop sign mean come to a full stop and checking for clear passage before proceeding?". I'd guess that 95% of drivers would answer "yes" on a license test, but how many actually do that?

  • I think it's a good question, in that it's extremely obvous from the options given what the required answers are - so much so in fact that it's difficult to understand how 14% came up with wrong answers - and the answers are a sound theoretical response to the question.

    But in the real world there are often considerations that tend to lead to different answers, and I'm surprised that out of nearly a thousand people who have answered none have taken a serious stab at raising those considerations.

    On the database side, the page referenced makes the comment that one should normalise first and then denormalise to make it work; why denormalise (introduce redundncies in the data)? Answer: so that it will be possible to obtain answers in an acceptable time / with acceptable use of CPU resources. This is pretty common, not a strange edge case. And how often do we see a UNIQUE constraint or index where none of the affected columns is nullable? Every time we see that, we know that there is deliberately introduced redundancy in the schema (but maybe this one is a rare edge case).

    Also on data redundancy, what if what I am doing is logging messages? This can introduce redundancy for all sorts of reasons - for example the messages can be those passing between systems that rely on a "tell me thrice" rule to assist in sanity checks, they can be passing through an extremely noisy channel where there is no back channel to provide acknowledgement or flow control so that each message is not only redundantly coded but transmitted several times, and even with clean channels in both directions the message collection will often be redundant and a bulk redundancy-elimination scheme to sort out common substrings across different messages will be a performance nightmare if it is designed to eliminate all redundancy.

    On modularity, what if I need to use macros or in-line functions to obtain performance? Now the same code appears in many places in the delivered product, but I don't lose any consistency provided my build and update system ensures that when something changes all affected object modules are included in the rebuild. It can be an awful pain doing it manually without an automatic rebuild system, of course, but not doing it can mean unacceptable delays instead of acceptable response times for end users. The overhead of context switching between separate modules can be crippling.

    edit: correct English

    Tom

  • Great question. Makes one think back to how design should be in a perfect world. And I didn't even have to use elimination becasue they were definitely not and definitely So answers. thanks

  • SanDroid (4/21/2011)


    Peter Trast (4/20/2011)


    SanDroid (4/20/2011)


    Great Question Dave!

    I am not suprised by the percentage of correct answers at this time.

    Some truths realy are self evident.

    😎

    Emoticon fail... lol... so I fixed it.

    Ummmm thanks Peter, but that is the "cool" emoticon, I meant to put "smooooth" or :smooooth:

    Thus the irony of my comment... 🙂

    Peter Trast
    Microsoft Certified ...(insert many literal strings here)
    Microsoft Design Architect with Alexander Open Systems

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply