How do I add files to the server?

  • Mr. Brian Gale wrote:

    While I agree with Jeff in the sense that "if the system is designed in one way, why change it to work a different way", I think I would not take "database as a file storage system" as my first choice when designing a new thing.

    I've not worked with FILESTREAM so I'll have to take your and other's words on that.  It will definitely become a consideration for me in the future but...

    Mr. Brian Gale wrote:

    And jumping back to Jeff's example, if 10% of the data was missing and 10% was corrupt, was anybody reviewing the content? If I am creating a data graveyard or data landfill (whatever you want to call it where people go to drop data that nobody will ever look at), what's the point? If the data needs to reside long term, I'd be looking at some archival storage solutions like tape storage (good for decades) or M-DISC (claims to be good for 100's or 1000's of years). Short term, put it on HDD then migrate it to tape storage monthly. I wouldn't waste precious SQL resources or even SSD storage for landfill data...

    ... for the very reason you mention, I'd have to be really sure that some side-stepping-beach-creature wouldn't be able to cause the same problem as above.  I'd also have to find a way to guarantee that backups were always available because of incidents caused by others like the above so many times in my career, I have an inherent distrust of other people when it comes to the safety of data.  Heh... it's a terrible way to have to go through life with that bit of distrust but it's better than missing data that is supposed to be preserved.

    As they say "No one will protect the data the way a DBA will". 😀

    And, no... I can't reduce the data on disk to tape storage.  That was one of the first things I asked when I found the train-wreck to begin with.  The corporate lawyers said "No" to that and also wanted me to personally ensure that proper backups and the ability to restore at the drop of a hat... ANY HAT... would always be possible with and RPO approaching zero and an RTO of less than an hour.  Just getting a tape request in the right hands would take longer than that.  And, back then, I requested several tapes to do test restores from... the outcome of that was less than adequate.  I mean seriously less than adequate.  We no longer have that issue...  Heh... let's just say that I made it really important to the others whose task it was to preserve and protect tapes. 😀

    Effectively, though, I've done the near equivalent of "archiving" the data, as you suggest.  Each month in the database has it's own NDF file.  Once a final "closure" of the month has occurred and the File Group has been set to "Read Only" and the final backup has been done, the data going to tape consists of only that month one final time as well as the "current" month stuff.

    I'm looking forward to someday teaching myself to have enough knowledge about FILESTREAM to accomplish all of those same goals.  If I'm extremely lucky, maybe I can convince Grant to put together one of his incredible presentations on that particular subject to present to our local SQL Server group.

    What say yea, Mr. Fritchey?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just a post to expose page 2 because of the forum fault in that area.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have experience with two large systems - one with FILESTREAM blobs and the other using normal blobs.  Both ultimately got migrated out of the database into an object store.  The largest grew to 160TB before the object store migration.  I'd be dealing with over 500TB database today if we hadn't moved to an object store a few years ago.  The problems will come well before 160TB though.  Filegroups & partitioning can help with manageability and you can live with files in the DB if you really need to.  At smaller data volumes it's much less of an issue but I'd still prefer files elsewhere unless there is a good reason to store them in the DB.

    When FILESTREAM first came out I thought it was going to be the best of both worlds.  It doesn't really solve any of my main issues with storing files in the DB though.  It solves some problems like the 2GB file limit and files consuming space in the buffer pool.  It does nothing to help with manageability.

    DBA Dash - Free, open source monitoring for SQL Server

  • Something else worth noting.  If there is a chance you might want the DB to run in the cloud on Azure DB or managed instance - FILESTREAM is not supported in those environments and would be a deal breaker until you can refactor the app to move the files out of the DB.  A normal SQL VM in the cloud would work with FILESTREAM but storage costs are much more expensive (and less reliable) than object store.

    DBA Dash - Free, open source monitoring for SQL Server

  • So Jeff,

    this sounds like a dumb question, but how do I insert a blob (image/pdf/doc) into the table via the SSMS?  Do I need a CAST statement or an SP?

    I've managed to get FILESTREAM setup on my local machine to play around with, just not sure how to take the next step.

     

  • mjdemaris wrote:

    So Jeff,

    this sounds like a dumb question, but how do I insert a blob (image/pdf/doc) into the table via the SSMS?  Do I need a CAST statement or an SP?

    I've managed to get FILESTREAM setup on my local machine to play around with, just not sure how to take the next step.

    Here's the actual code I use.  The variable names contain what they say.  @Counter  and the WHERE RowNum will likely not be necessary in your case.  Also note that @CallPath is the FULL call path including everything from the UNC to and including the file extension.  IF the contents of @CallPath are coming from an external source, YOU MUST FIRST CHECK IT FOR SQL INJECTION BEFORE USING IT!!!

    --===== Create the dynamic SQL to load the WAV file and execute it
    SELECT @SQL = REPLACE(REPLACE(REPLACE('
    UPDATE #CallsStaging
    SET Call_Binary = BulkColumn
    FROM OPENROWSET(BULK "<<@CallPath>>", SINGLE_BLOB) AS SourceFile
    WHERE RowNum = <<@Counter>>;'
    -- Other end of the REPLACEs
    ,'"' ,'''')
    ,'<<@CallPath>>',@CallPath)
    ,'<<@Counter>>' ,CONVERT(VARCHAR(10),@Counter))
    ;
    EXEC (@SQL)
    ;

    Of course, you can direct it to a permanent table instead of a Temp Table.

    This also means that you'll need to all for ad hoc queries in the "Surface Area Configuration" due to the use of OPENROWSET.

    Last but not least, the SQL Server Service needs the privs to have access wherever the UNC source is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David Wiseman wrote:

    I have experience with two large systems - one with FILESTREAM blobs and the other using normal blobs.  Both ultimately got migrated out of the database into an object store.

    Interesting.

    How often do you compare the metadata in the database to the metadata in the object store?

    Do you do any additional corruption checks on the objects?

     

  • Ken McKelvey wrote:

    David Wiseman wrote:

    I have experience with two large systems - one with FILESTREAM blobs and the other using normal blobs.  Both ultimately got migrated out of the database into an object store.

    Interesting.

    How often do you compare the metadata in the database to the metadata in the object store?

    Do you do any additional corruption checks on the objects?

    The object store (S3, Azure blob etc) just contains the file data.  All the metadata is stored in the DB.  There are two scenarios that could occur:

    • Metadata in the DB for a file that doesn't exist in the object store.
    • A file in the object store that doesn't have associated metadata in the database.

    There are various ways to ensure reliable processing of files to reduce the chances of the above situations occurring (e.g. using a queue).  An orphan file is probably less of an issue so you could add the file before processing the metadata and have a process that runs periodically to clean up the orphan files.  The quantities of orphan files should be quite small unless you need to roll back the database to an earlier point in time.

    On upload, you might want to do some checks on files - like an AV scan.  Object stores like S3 provide very high durability so the likelihood of files becoming corrupted is very small.  Still, you need to consider things like accidental deletion and ransomware.  You might want to consider backups into a different account and options for immutable storage to protect from this.

    DBA Dash - Free, open source monitoring for SQL Server

  • Sorry for the late reply.

    Currently we have an onsite system where images/documents are stored in the database. We are considering if these should be placed on the file system, or in an object store, instead. While I can see the advantage in terms of database resources I have two main concerns:

    1. File/Object Drift

    This is where the file/object is deleted or moved outside of the application. Presumably a regular check of the database versus the file system/object store helps with this.

    2. File/Object Corruption

    Everything in the database goes through regular CHECKDB processing. If the images/documents are outside the database I was thinking of holding something like a MD5 hash in the database and regularly comparing it against the file system/object store.

    Do you have any views on these types of check?

    Also, given that log backups will occur more often than file system backups, I was maybe thinking of keeping recent images/documents in the database in order to avoid data loss in the event of a restore.

  • Might just be me, but is sounds like you are trying to reinvent the wheel. What you are asking for sounds a LOT like a document control tool to me. Something along the lines of SharePoint or Alfresco would be better than trying to build it into a database. There are a lot of document control tools  out there that have revision control, built-in consistency checks (usually in the form of some checksum verification), and access control and come across in multiple price ranges depending on features and support required. I think that you would probably benefit from looking at a document control system rather than trying to bake this into SQL which may bite you in the long run...

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks. That is certainly worth thinking about.

  • This was removed by the editor as SPAM

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply