Bad habit #2: Prod data -->> dev, test databases

  • Bad habit: we generally copy our production databases into test and dev environments to supply "real world" data for testing. Our developers complain that if we used test data they could not verify (short of production) if their fix actually resolved the issue.

    How do you deal with this?

    TIA,

    Barkingdog

  • I've been losing this fight for years. Just recently we had the financial people get involved. When they found out that the developers couldn't guarantee consistent tests since the production data was changing all the time, the finance people insisted on developing test data that ensures repeatable tests. But it's still a struggle becuase we only convinced one team of developers. Now we have to convince the rest.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • This shouldn't be considered a bad habit.

    In a three tier deployment model the "test" better saying UAT version of any given database should be a production size database refreshed from production either on a fix schedule or on demand.

    In a four tier deployment model most likely both UAT and Stage versions should be production size, etc.

    Having said that there is no reason for Development version to be nothing but a token version - Development is for development, not for testing.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Also be aware that depending on where you live and the laws in your country, using certain production data for software testing can be illegal. Here in the UK in can mean a breach of the data protection act if 'sensitive' information is used for testing, though it is grey area it is worth considering depending on what sort of information your dbs hold.

  • Barkingdog (5/13/2009)


    Bad habit: we generally copy our production databases into test and dev environments to supply "real world" data for testing. Our developers complain that if we used test data they could not verify (short of production) if their fix actually resolved the issue.

    How do you deal with this?

    TIA,

    Barkingdog

    We do more or less the same thing, although the refresh only happens once a quarter or so. Once we do the move back to test, we clear out all credit card information, etc.

  • There needs to be controls on the production data. If it must be used for testing, and sometimes it must be, then it needs to be controlled, and you shouldn't allow any developer access to it. It should be a limited number, and you note it so if there are issues, you know who had access to it.

    For general development, even fixes, I'd argue that they need to know the issue, but not the data to development something. To verify their fix worked, you need another environment, but in a company of more than I'd guess 100 people, you need a person (not a developer) that can test and determine if the fix worked.

  • Steve Jones - Editor (5/13/2009)you need a person (not a developer)

    you made my day 😀

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • :), whoops.

    not implying developers are not people, but you need a person that is not a developer.

  • Steve Jones - Editor (5/13/2009)


    :), whoops.

    not implying developers are not people, but you need a person that is not a developer.

    Actually, I think most people would be fine with that 🙂

    You often see signs on the doors of developer sections at "geek friendly" companies such as "Don't feed the monkeys".

    --
    Andrew Hatfield

  • I'm going to disagree with a lot of this, and say "It Depends". If you have a database that is 200MB of data for a home grown app with data that isn't the least bit sensitive, why wouldn't you copy it to Dev or Test? It certainly makes it far easier for the developer to work with real data. I'd much rather have the developer find issues in their development environment during unit testing than have to move the code up to a Test or Quality Control environment and then find an issue because of existing prod data or performance due to data values or size of the Prod database. The earlier you can realistically unit test and performance tune an app, the less issues you are going to have in production.

    I don't know about you guys, but my goal is to minimize production issues.

    Certainly there are caveats if there are space constraints, or if there is sensitive data in the database. But if those can be dealth with, I'd rather have real data as low down the food chain as possible.

  • 1. Set up a Staging environment to restore the PROD backup;

    2. Develop a scrubbing procedure to erase those sensitive data/columns with some random data on the Staging DB

    3. Backup the Scrubbed DB

    4. Restore the scrubbed backup to the Dev Env.

    The Key is to identify WHAT DATA are sensitive and get resources for setting up the Staging area.

  • The issue isn't just the possible sensitivity of the production data, it's also the fact that you cannot do regression testing on data sets that are not fixed.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Steve,

    I agree with you. Privacy of sensitive data is forieted when it appears in test and dev. I think that's nothing but trouble and one day, if \when the auditors catch on to this, there will be hell to pay.

    Barkingdog

  • Good point:

    >>>>>

    The earlier you can realistically unit test and performance tune an app, the less issues you are going to have in production.

    >>>>>

    Of course while it may not match production data RedGat'es data generator (and others) is a good start in my opinion.

    Barkingdog

  • And the size of the databases in production keep growing so you have to have 5-10 times as much space in dev since you have multiple copies of everything. Then the times on the database refreshes are growing longer & longer as those larger databases get moved down. Also the times on clean-up scripts keep going up & up.... It doesn't end.

    I do agree. You have to have some places where you're working on real production data either for testing deployments such as a good staging environment, or for load & performance testing against what's already in production. However, these are the exceptions that should prove the rule. Most development just doesn't need to be done against production sized databases.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 37 total)

You must be logged in to reply to this topic. Login to reply