Untouched Data

  • Donald,

    In 80% of cases you have no say as to whether documents/images are stored in or outside of the database. Third-party applications handle things their own way.

    Also, storing in the database or outside the database does not resolve the SW problem: what SW will you open this file with after you take it out of the database? Do you hope that doc or jpg files from today will be readable in 20 years?

    Regards, Yelena Varsha

  • Forum Newbie: It does not matter if you are private; you still need to keep corporate financial data and transactional data for 7 years (5 years for the IRS, 7 for the legal statute of limitations for fraud in the US).

    If we handled the financial data I would have mentioned that, Grasshopper :doze:

  • Donald Burr (7/16/2008)


    Frankly, with the cost of storage as low as it is, in my opinion it makes little sense not to keep everything online, unless you are talking about petabytes, but even then chances are that if you have that much data you will have the infrastructure and budget to deal with it. ...

    Yep, absolute agreement. Frightening to think, but I've been working in government IT, at one level or another, off and on, for 20 years. I've read a lot of trade rags over that time, and HSM (along with the paperless office, which I've never personally worked in) is always 'right around the corner'. And I remember reading a fantastic article about solid state hard drives 15 years ago.

    The concept is great, but the dropping price of storage negates it. It is much cheaper to add disk and keep it all online than it is to go to an HSM system. And if you add an HSM system, then you've got another system to manage!

    But it's definitely a valid point: how valuable is a Word document from 10 years ago? (ignoring whether you can read it or not) If you're in government, it may be critical. I have an Oracle 7 box on NT 4 whirring merrily away about 15' away from me right now, and don't get me started on the AS/400 tucked back in the corner. At least the AS/400 is (on rare occasion) active; the Oracle box is strictly archive. But we have to keep the data on it (construction/building permits) for either 25 or 75 years, I don't remember. So at some indeterminate point in the future, I'll be sucking it into SQL Server and writing an Access query/report front end for it.

    Personally, if I ruled our network world, every December/January we'd set up a new structure for fresh documents labeled with that year. When a directory went to 5 years old, it'd be zipped. Anything older than 2 years old would go to read-only status. If it's fresher than 2 years old or an active document, it gets brought forward.

    But data in a database is more difficult. We have a policy in our ERP system of never deleting data; as a result it went from 2 gigabytes to 15 in the last 12-month period, and as we bring new functions online it will grow faster. The storage system that it is on currently has enough space for probably 5 years, but the drives will have to be replaced long before that, and they'll be replaced with higher-density drives resulting in more storage, so we shouldn't have a problem coping with growth.

    Our other databases? The data in our GIS system doesn't really grow stale. Police recruit training/testing? We probably need to keep that current for the duration of that officer's career for lawsuit purposes. Blackberry? Users should purge that regularly or we should force the purge. Server/network performance data? We should be able to summarize that and purge it every year or so.

    The answer to how long we should keep stuff, as is so often said here, is It Depends.

    And Steve, I managed two Plasmon optical jukeboxes. They were dreams, perhaps the most reliable equipment that I've dealt with. SQL Server handled the indexing so Document X could be pulled from Disc Y. Sorry you had a bad experience, mine was great!

    -----
    Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson
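The yearly-folder idea in the post above (read-only once a folder is older than 2 years, zipped once it reaches 5) can be sketched roughly as follows. This is only an illustration: the function names `folder_action` and `plan` are hypothetical, and the exact boundaries (age of exactly 2 years stays active, age of 5 or more gets zipped) are one reading of the post, not a stated rule.

```python
from datetime import date


def folder_action(folder_year: int, today: date) -> str:
    """Classify a year-labeled folder as 'active', 'read-only', or 'zip'.

    Assumption: age is measured in whole calendar years, since the post
    describes folders created once per year and labeled with that year.
    """
    age = today.year - folder_year
    if age >= 5:
        return "zip"        # "When a directory went to 5 years old, it'd be zipped"
    if age > 2:
        return "read-only"  # "Anything older than 2 years old would go to read-only"
    return "active"         # fresher documents stay writable


def plan(folder_years, today: date) -> dict:
    """Map each year-labeled folder to the action the policy would take."""
    return {year: folder_action(year, today) for year in folder_years}


if __name__ == "__main__":
    for year, action in sorted(plan([2001, 2003, 2005, 2006, 2008], date(2008, 7, 16)).items()):
        print(year, action)
```

The sketch only decides what should happen to each folder; actually zipping or changing permissions is left out, since that depends on the file server in use.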

  • Clearly if you are using SW that is 3rd party and you have no choice, then the point is moot; but I will submit that where image management in an enterprise is concerned your 80% number is WAY off. Firstly, most current 3rd party enterprise apps allow for BLOB storage; secondly, you can archive your images to a database so that you are not dealing with 1000's (and in our case millions) of files. Then where you have control, every modern language allows you to stream binary content and render in memory without ever touching the file system, which is ideal.

    Noob... You made a blanket statement "Luckily, I work for a private company - so we're not dictated by legislation on how long we can keep data." which on its face is incorrect. I merely responded to your comment about how long a private company should keep "DATA", which without qualification by default includes financial data, so "Grasshopper", say what you mean.

  • I think I had an HP jukebox, circa 95, and it was always flaky.

    I'm not sure the drop in storage cost makes up for the cost of keeping the data online. Data center power usage is starting to cost more than the equipment now, which means that adding an extra TB to keep drives online is making less and less sense. Add in cooling, and this starts to become a big issue.

  • Bingo.. you nailed it.. In most cases the falling cost and growing capability of storage will outpace the accumulation of data, and with SAS and FAS the ability to grow is nearly limitless. Also, keep in mind that you can store compressed data as BLOBs just as easily as anything else; and databases are FAR more capable at storing data than any file system.

    PS

    Rather than an Access front-end, can you make it web based?
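As a rough illustration of the "compressed data as BLOBs" point above, here is a minimal sketch that compresses a document, stores it in a BLOB column, and reads it back entirely in memory without touching the file system. It uses Python's built-in `sqlite3` and `zlib` modules purely as a stand-in; the thread is about SQL Server, and the table name `documents` and the helper names are hypothetical.

```python
import sqlite3
import zlib


def open_archive() -> sqlite3.Connection:
    """Create an in-memory database with a single BLOB-backed table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (name TEXT PRIMARY KEY, body BLOB NOT NULL)"
    )
    return conn


def store_document(conn: sqlite3.Connection, name: str, data: bytes) -> int:
    """Compress the bytes, store them as a BLOB, and return the stored size."""
    packed = zlib.compress(data)
    conn.execute(
        "INSERT INTO documents (name, body) VALUES (?, ?)", (name, packed)
    )
    conn.commit()
    return len(packed)


def load_document(conn: sqlite3.Connection, name: str) -> bytes:
    """Fetch the BLOB and decompress it in memory; no temp file is written."""
    row = conn.execute(
        "SELECT body FROM documents WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return zlib.decompress(row[0])


if __name__ == "__main__":
    conn = open_archive()
    original = b"building permit record\n" * 1000
    stored = store_document(conn, "permit-001.txt", original)
    print(f"original {len(original)} bytes, stored {stored} bytes")
    print("round trip ok:", load_document(conn, "permit-001.txt") == original)
```

How much compression helps depends heavily on the content: repetitive text shrinks a lot, while already-compressed formats such as JPEG usually gain very little.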

  • Donald, as far as going with a web interface, I don't really do ASP.NET right now. When the time comes, maybe I'll be able to revisit it. I did an Access system for our permits people to query against our ERP database, and it's a beauty. If they need a new report, I can do it in an hour or so if I don't have to do a lot of query work behind the scene and deploy it without touching any PCs. In what little work I've done in ASP, I can't get the level of layout precision that I need. I'll be studying reporting services in the not distant future, so that might be a possibility.

    Steve, I don't know that the power cost is that much higher. What we would need to do is compare amps drawn versus storage capacity to see how much the power requirements have changed over time. How much of the increased power consumption has been because of hotter and faster CPUs versus disk? I'd like to know the answer. I wonder if Adaptec and others have historical power data on their sites; I'll have to take a look. And an HP jukebox? You have my sympathies. 😀 I'm not much of a fan of HP equipment, in fact I don't own any and we only use them for laser printers here.


  • Noob... You made a blanket statement "Luckily, I work for a private company - so we're not dictated by legislation on how long we can keep data." which on its face is incorrect. I merely responded to your comment about how long a private company should keep "DATA", which without qualification by default includes financial data, so "Grasshopper", say what you mean.

    Check yourself friend.

    Surely if that was the only line I wrote, that would be incorrect - but I went on to explain what kind of data I was talking about - so quite obviously, I was saying exactly what I meant. You just chose to pick out that one sentence and nitpick it. Whatever makes you feel big and smart at the end of the day, I guess. It doesn't make the forum any more welcoming to new members willing to express their personal experiences and opinions, and it certainly goes to substantiate the widespread belief that IT personnel are full of themselves and pick apart the obvious to show everyone that they know a thing or two. If you wanted to let everyone know that little tidbit of information, that would have been easily executed with something like "I realize you were talking about behavioral targeting information (which obviously would never include financial information because that would be extremely illegal and flat out immoral) but for anyone out there who's interested in legislation for private corporations - yadda yadda yadda..." Not terribly difficult by any means when you're really trying to put good information out there rather than trying to tackle someone else to get your point across in a defiant way.

  • I would not normally dignify this with a response, but I never attacked you in any way; I merely pointed out that there was still a legal requirement for private companies to maintain financial "Data". It was your rather curt and snotty remark:

    "If we handled the financial data I would have mentioned that, Grasshopper "

    that I was responding to. As for self-editing, all YOU needed to say was, "When I referred to the legal requirement for storing data, I was only referring to the behavioral targeting information"; and PS it is not clear at all that the later explanation of your business in any way modified the more general and sweeping opening statement.

    This will be the last I have to say on this matter.

  • Power and cooling are a real concern. I have seen this as well, but I am unsure that storage is the culprit without real testing, as most of the newer drives consume less power and run cooler. I will say that server sprawl and density (e.g. blades) have ramped up the need for more power and cooling.

    There would be a lot of variables, like the size of each drive in the array (smaller means more drives and most likely more power/cooling), but it's a really interesting point. I wonder if anyone has looked at the cost in cooling and power per GB.
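For anyone who wants to try the power-and-cooling-per-GB comparison raised above, a back-of-the-envelope sketch might look like this. Every input is a placeholder to be replaced with measured values: the wattage per drive, the electricity price, and the cooling overhead (modeled here as a simple multiplier on drive power) are assumptions, not figures from this thread, and the function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365


def drives_needed(total_gb: float, gb_per_drive: float) -> int:
    """Smaller drives mean more spindles for the same capacity."""
    return math.ceil(total_gb / gb_per_drive)


def annual_power_cost(total_gb: float, gb_per_drive: float,
                      watts_per_drive: float, price_per_kwh: float,
                      cooling_overhead: float) -> float:
    """Yearly electricity cost for the drives plus cooling.

    cooling_overhead is an assumed fraction of drive power spent on
    cooling (0.5 means 50% extra on top of the drives themselves).
    """
    count = drives_needed(total_gb, gb_per_drive)
    kwh = count * watts_per_drive * HOURS_PER_YEAR / 1000
    return kwh * price_per_kwh * (1 + cooling_overhead)


def annual_cost_per_gb(total_gb: float, gb_per_drive: float,
                       watts_per_drive: float, price_per_kwh: float,
                       cooling_overhead: float) -> float:
    """Power and cooling cost per GB per year for the whole array."""
    cost = annual_power_cost(total_gb, gb_per_drive, watts_per_drive,
                             price_per_kwh, cooling_overhead)
    return cost / total_gb


if __name__ == "__main__":
    # Placeholder numbers only: 1 TB built from 100 GB vs 500 GB drives.
    for size in (100, 500):
        per_gb = annual_cost_per_gb(1000, size, 10, 0.10, 0.5)
        print(f"{size} GB drives: {per_gb:.4f} per GB per year")
```

This ignores controllers, enclosures, and redundancy overhead, so it only shows how drive count drives the result, which is the variable called out in the post.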

  • The thing that no one ever asks (or answers) when a system is being implemented is “How long do we keep the data?” If you do get an answer, it will likely be something unrealistic, like “forever”. The only time the question ever gets seriously addressed is when you are out of disk space and management doesn’t want to buy more. At that point, 7 years of retention can magically turn into 2 weeks.

    The issue of obsolete data formats is a serious one. 25 years ago, data might have been stored on a reel-to-reel 9-track tape, but it might be a little difficult to find a drive to read those old tapes now, and who knows if they are still usable. Could you access an 8-inch, 5.25-inch, or 3.5-inch floppy? How about 10 years from now? CDs are rapidly becoming obsolete, DVDs soon, and who knows how long Blu-ray will be around. Will computers have a USB port in 25 years to connect your thumb drive?

    Obsolete data formats are not a new problem, either:

    http://en.wikipedia.org/wiki/Linear_A

  • Donald Burr (7/16/2008)


    I would not normally dignify this with a response, but I never attacked you in any way; I merely pointed out that there was still a legal requirement for private companies to maintain financial "Data". It was your rather curt and snotty remark:

    "If we handled the financial data I would have mentioned that, Grasshopper "

    Wait, so you refer to me as "Forum Newbie" (obviously in reference to the forum title) and I'M curt/snotty for replying with yours? :Wow:

    Whatever dude. All I ask is that you learn how to utilize the spell check button if you're going to be blatantly hypocritical and trying to be all-knowing at others' expense. :Whistling:

  • So true... At least with SOX the minimums are well defined for some component of the data, but I have heard "forever" too many times.

    In fact we have it in our customer contracts that we will retain their data and images... "forever". We have been lucky that the company is willing to fund this, that management has not balked at expanding storage, and that storage continues to plummet in price.

    I do think that companies have gotten better with this, and more aware, when it comes to hard copy storage, because in real terms it's far more expensive to store boxes of paper; we can only hope that this continues to bleed over into data storage.

    We have run into this several times with the legacy issue, and what we have done is pick appropriate times to refactor the data and up-convert the legacy data. It is not without its issues, but it does work, and there are tools and/or built-in support for most legacy data formats.

    It's kinda like burning your 8mm home movies to VHS and now those VHS tapes to DVD. 🙂

  • Are there any 3rd party software data archiving solutions out there that work well with MSSQL Server?
