Three Rules for Database Development

  • Comments posted to this topic are about the item Three Rules for Database Development

  • We can define [font="Arial Black"]whatever [/font]standards, rules, frameworks, processes, etc. [font="Arial Black"]that the group wants[/font], but once we decide, we all need to work along in the manner we've prescribed. That's worked well for me, and when everyone follow the rules we've agreed on, things run smoother.

    I've found that such "committee decisions" can be horribly wrong especially when people who might not know as much about databases outnumber those that do (think lynch mob, gang rape, and Lemming mentality combined). Everyone working the same way has some very nasty ramifications if they're all doing the same wrong thing. 😉

    Rule 1 for database development should be, "Find someone that actually knows how to design databases correctly and then follow their rules."

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/16/2012)


    "Find someone that actually knows how to design databases correctly and then follow their rules."

    That will be a real world implementation of Steve's rules then...

    cloudydatablog.net

  • I'm definitely happy with agreeing rules in a group and then following those rules. As long as the rules are what I want to be done and not anyone else's shoddy, half-baked ideas. 😛 Fortunately I generally have the loudest and most authoritative sounding voice...

    We use a shared database model - I can see why it might be a problem, but it's always worked for us.

  • The idea of saying that one should never do shared database development is ridiculous, especially nowadays when you can have the best of both worlds and use database source control whilst doing either shared or single-user database-development, or both. The reason for doing your development work on your own server is merely to allow you to work on different versions of the database 'at once', which means that you can re-create a version from source control in your own private server and fix a bug or add a feature, without affecting your colleagues, or damaging other versions. I've always advocated doing both types of database development, depending on the task. The private databases are fine for prototyping, trying things out and doing sandbox development, whilst the shared model gives us extremely rapid integration, since we have schemas that prevent developers from banging heads too badly whilst sharing, source control allow us to roll back from a development mistake, virtualization to allow us to work on as many versions as we can tolerate.

    I don't understand why there should be objection to the mixed-model of database development. I understand the build and integration problems that come from database application developments that chose to ignore the best-practice of having defined interfaces between application and database, but to try to anathemise or prescribe the perfectly reasonable practice of shared database development as a consequence, is like insisting that everyone should use crutches merely because you've shot yourself in the foot.

    Best wishes,
    Phil Factor

  • Dalkeith (9/17/2012)


    Jeff Moden (9/16/2012)


    "Find someone that actually knows how to design databases correctly and then follow their rules."

    That will be a real world implementation of Steve's rules then...

    Possibly. I've found that's normally not true though. I've found that even if a company does have someone who really knows how to design databases, they are, many times, simply not listened to because of schedule or notions by unqualified people on what level of normalization is actually needed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/16/2012)


    I've found that such "committee decisions" can be horribly wrong especially when people who might not know as much about databases outnumber those that do (think lynch mob, gang rape, and Lemming mentality combined). Everyone working the same way has some very nasty ramifications if they're all doing the same wrong thing. 😉

    Rule 1 for database development should be, "Find someone that actually knows how to design databases correctly and then follow their rules."

    This isn't about technical design decisions. It's more about process. We agree with naming conventions, we agree that there is a QA process. We agree developers will turn over unit tests or edge cases. We agree that new tables get a review by the group, or by a person.

    We decide to work in a way that complements, rather than each person deciding how they like to work.

  • Both methods have pros and cons, and probably both methods together are best. The last development job I had worked well this way.

    When a developer is starting to modify or add code, and needs to change table structures, he is going to need to work on a database in order to make those changes. If during that time someone else is working on another bit of code that just happens to rely on the table structure, the second developer might end up chasing his tail trying to figure out why something isn't working, only to find out it was because developer one made a change. Not very efficient. We can't say the code should have been checked in because 1 is still working on it. In this scenario I think working on your own private copy UNTIL you get things stable is best.

    Whenever system regression testing occus, it should be on a shared copy. You don't want to test on your own copy and miss changes others have made. So whomever is doing that type of work needs to have a version containing all the changes.

    As with most things, it depends. Teamwork is key. After that, as long as everyone understands the guidelines, you can work through issues, but in the end I don't see how you can reliably test things without using a shared copy (or whatever name you want to give it) that contains all the code and changes that developers are finished with. Um, I mean, that developers THINK they are finished with. Testing sometimes finds bugs. Occasionally. Once in a while. Maybe only one or two...

    Dave

  • My set of three rules is somewhat different - I say

    0) Develop on a full size set of data that is as similar to production as possible. In particular, every wart, mole, blemish, and data violation that exist in production data should exist in your development and test environments.

    2) and 3) are fine.

    1) is something I personally believe in, but if your production environment is "large", 0) tends to cause 1) to be expensive. I believe in it if and only if each developer has their environment, and they can make at least one backup/snapshots of it, and restore them on their own. That lets them create wonky test cases as often as they like, and operate in a "frozen" environment when debugging certain tricky interaction effects.

    1) also requires a shared integration development/test environment, of course.

  • Some wonder why I install SQL Server on my local box and develop against that version first. It is obvious, never develop the initial product in a shared environment is critical. I also like to have complete administrative control of the data server for the early life of the project, that way I can do what I need when I need it, and it is controlled only on my box. SO I guess I go one step further, never develop on a shared data server and only on one that you have complete control over. If possible.

    The others a fine.

    M.

    Not all gray hairs are Dinosaurs!

  • Miles Neale (9/17/2012)


    Some wonder why I install SQL Server on my local box and develop against that version first. It is obvious, never develop the initial product in a shared environment is critical. I also like to have complete administrative control of the data server for the early life of the project, that way I can do what I need when I need it, and it is controlled only on my box. SO I guess I go one step further, never develop on a shared data server and only on one that you have complete control over. If possible.

    The others a fine.

    M.

    I can understand if it's a single use, small app. That said- with large databases with a team of 20 developing against it, having 20 copies of the DB each being modified "under full control" of 20 separate devs each with their own opinions is chaos. Without knowing early on that the SAME component needs to satisfy two separate enhancements, all you will end up with is a deployment tug of war.

    Instead - we've embraced CI and solid check-out/check-in processes. Check out before you modify, check in once you're done and if need be, synchronize and/or normalizae your changes with any source control conflicts that may arise. At least you can ensure that the relevant devs are talking to each other (along with the instructions to come find an architect if some major conflict is found).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt, you are absolutely correct. With many hands on one database there is chaos. But that is not the case here. I have a luxury or two that most do not have.

    Not all gray hairs are Dinosaurs!

  • Steve Jones - SSC Editor (9/17/2012)


    Jeff Moden (9/16/2012)


    I've found that such "committee decisions" can be horribly wrong especially when people who might not know as much about databases outnumber those that do (think lynch mob, gang rape, and Lemming mentality combined). Everyone working the same way has some very nasty ramifications if they're all doing the same wrong thing. 😉

    Rule 1 for database development should be, "Find someone that actually knows how to design databases correctly and then follow their rules."

    This isn't about technical design decisions. It's more about process. We agree with naming conventions, we agree that there is a QA process. We agree developers will turn over unit tests or edge cases. We agree that new tables get a review by the group, or by a person.

    We decide to work in a way that complements, rather than each person deciding how they like to work.

    Ah... got it. Thanks for the feedback, Steve.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There are a variety of reasons not to develop on a shared database. As others have already mentioned, when you are developing locally you can avoid all sorts of collisions between developers. If you are using MS database projects to do your development, you have a complete copy of your schema under source control and can check out objects and isolate your code changes. It also means that your code does not need to compile when you save an instance of it, which is helpful if you need to shelve your current work. In addition you get static code review and much better ability to handle refactoring your database.

    The key factor is unit testing. To conduct efficient testing, you should be able to scrap the data, autogenerate more, rename and recreate objects. Database projects allow you to incorporate database unit tests within the same solution so that you can have a tightly integrated base of code.

    Database projects allow for an easy way to version up or version down any schema. Your team can choose the best way to go about this, but I think more importantly, the implementation of schema should be a separate thing from the definition of that schema. It allows for a much more Agile development environment in which are can be constantly at the ready to implement your code changes.

  • I may just be old school, but I can't STAND not having an integrated development environment amongst all the developers. I guess the idea is that your database needs a 'trunk', just like everyone else's front end stuff.

    How do you test for parallelism on your local box? How do you test the real effects? How can you see what your IO waits will really look like and not just make the excuse "Well it's running off my local IDE, the server will be better"? I work against half tera to terabye level systems in some cases, am I supposed to constantly restore to the 'base state' on my local system? Hell no.

    I'm also not a particular fan of TFS, SourceSafe, or the rest in the database world. I'm VERY frickin' biased against this, particularly because of the idea of Continuous Integration and Databases do not meld well. Oh, sure, store yourself a version control so you can see what was and what is, no problem. It's also rather handy for lockouts so multiple developers don't step on the same object simultaneously. We used to use Excel spreadsheets for that.

    However, in my opinion, TFS/SS/etc do not belong in the actual construction of the change scripts though. I don't want to 'have to inherit' off the TFS system for databases, particularly schema and view level changes where you're talking possible massive data manipulations that might be impossible to back out.

    The shared dev environment contains all the currently working and used code from all developers, allowing for their modifications to directly affect yours during the workflow and development cycle. Allowing dependent changes to affect the optimization testing of items downstream.

    I've never worked with a QA department that had a chance in hell of testing for that level of optimization or integration of multiple components. They're mostly concerned with end results, not how its achieved and they certainly don't have the expertise to figure this stuff out. It's our jobs as developers to do that and without that integrated dev environment we don't have a shot in heck of pulling it off.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply