August 17, 2010 at 4:39 pm
Hi all,
This is a question I've been researching and would like your input on:
Is it feasible to replace a regular file server with a system that has a SharePoint front-end and a SQL Server back-end that stores files as BLOBs?
August 17, 2010 at 9:50 pm
You could, though it depends if there is anything missing in Sharepoint that people need in a file share, like automated process copies, updates, etc. I'm not sure if things like Excel macros might fail as well.
In terms of blobs, if the files are large than 1MB, it's recommended they be served from the file system rather than SQL Server for performance reasons.
August 18, 2010 at 10:08 am
Thank you Steve. So what you are saying is this option is feasible but must be selective. For instance we could save space by converting a 50MB PDF to a BLOB but we would suffer greatly on performance, correct?
http://research.microsoft.com/pubs/64525/tr-2006-45.pdf
That whitepaper, though not dealing directly with BLOB vs file system as I was thinking, points to the break-even point even lower than 1MB to 256KB.
August 18, 2010 at 10:15 am
256kb, should be in SQL, above 1MB, should be in the file system. I think it's a gray area in between.
You might reduce some SQL resources/IO from a large file, but performance is relative. It might work fine for you, depending on the load.
The other thing is that you can't necessarily drag/drop/script stuff easily with Sharepoint.
August 18, 2010 at 3:42 pm
Steve, do you think there are any other issues in regards to storing large files about 1MB other then performance?
August 18, 2010 at 3:59 pm
Have you thought about restore? It's easier to restore a file on a file system than a BLOB field in a database.
August 18, 2010 at 9:28 pm
No, I don't think there are big issues. SQL Server is designed to handle files, even large ones. I think that you have some limitations around 2GB, but apart from performance, and the methods of access, I think it works fine.
We store a lot of images here at SSC in the db. Most are small, hundreds of KB, but we have had some larger ones (2-10MB) and they worked OK.
August 20, 2010 at 2:22 am
chrisph (8/18/2010)
Thank you Steve. So what you are saying is this option is feasible but must be selective. For instance we could save space by converting a 50MB PDF to a BLOB but we would suffer greatly on performance, correct?
Out of interest, why do you think you'd "save space" by doing that? The PDF will occupy the same amount of disk space whether it's part of the database or not, AFAIK--all you're doing is moving the space it occupies to a different area!
August 23, 2010 at 1:08 pm
paul.knibbs (8/20/2010)
chrisph (8/18/2010)
Thank you Steve. So what you are saying is this option is feasible but must be selective. For instance we could save space by converting a 50MB PDF to a BLOB but we would suffer greatly on performance, correct?Out of interest, why do you think you'd "save space" by doing that? The PDF will occupy the same amount of disk space whether it's part of the database or not, AFAIK--all you're doing is moving the space it occupies to a different area!
Really?
I was under the impression that a BLOB is stored as a string of binary numbers in a database and then the software converts it for use, therefore saving a lot of space.
Am I wrong?
August 23, 2010 at 1:52 pm
The only really big reason I have found to put the BLOBs in the database is that it makes synchronizing a whole lot easier. If the metadata and the files are in different technologies, each can be restored without the other. How will you know this has happened? How will you make sure all your files have metadata and all your metadata points to valid files? Are all these files going into the same directory? One place I worked ended up with millions of files in a single directory and any operation on that directory through a GUI was agonizingly slow.
August 24, 2010 at 1:44 am
Really?
I was under the impression that a BLOB is stored as a string of binary numbers in a database and then the software converts it for use, therefore saving a lot of space.
Am I wrong?
How is storing the data as a string of binary numbers in a disk file--which is what normally happens--or as binary numbers in a database different? Unless the data is actually compressed in some way it will take up the same space in the database as it does anywhere else.
August 24, 2010 at 8:20 am
There's no space saving. What there can be is a different way the data is pulled off the disk and sent to the client. When there are small files, SQL is more efficient than the file system for lots of accesses. When they're larger, the file system streams out better.
August 25, 2010 at 10:34 am
I'm going to add my two-cents worth here, in a previous job we had a constant running debate on this topic. The files were being stored as blobs in the database (SQL 2000).
The two main arguments my boss gave around storing the files in the database were #1) access control (security) and #2) backups of the database also automatically backed up the files. I argued for storing reference paths in the database and enforcing security at the file level and of course performing backups of those files. In our system it was essential that the files be both secure and available. The files being stored included word documents, pdfs, gifs and jpegs some were only a few KB some several MB (medical images).
What is important here is that the files were constantly inserted into the system and also frequently deleted. One of the issues we ran into was that the index on this table would very quickly become heavily fragmented and had a detrimental effect on the entire application. The original method they used for addressing this issue was to stop the production environment and then perform index maintenance on the entire database (often during the middle of the day). Once I determined where the issue was actually occurring I created a job that would reorganize the index every 15 minutes and rebuild it nightly. We never again had to communicate to customers that the application was going offline for an hour.
I typed all that to say (and Steve might have better insight and methodology for thought here) I would consider the objective. You asked if it was feasible, the famous answer is that given enough time, money and resources anything is feasible. Storing the files in the database gives you a few advantages but do you really need access control? Are the files absolutely required to be available? If not I favor storing them on the file system (with paths).
Make what you will of that rambling post, but hopefully something useful comes of it.
August 25, 2010 at 10:45 am
Good post, and you've hit the issues. There are issues either way. In the db I would tend to vertically partition the blob data from other metadata to avoid index issues and speed access to the important parts (meta data)
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply