Save pdf documents in database or on file server?

  • We have a quoting system where pdf drawing files will be generated for some of our quotes. I would estimate about 40 quotes per day would include a pdf drawing file.

    Should we store the pdf documents in the database OR on a file server and store the file path in the database?

    Our current database is about 35GB and has about 150 daily client server users.

    Most of what I find on the web regarding this topic seems to be outdated and wanted to get some current opinions.

    We are using SQL Server 2005 with no plans in the near future of upgrading to 2008.

    Thanks You!

    Scott

  • I'd vote not to store it in the DB if the following statements are true:

    a) it is possible to re-generate the pdf drawing files based on the original data if needed and/or

    b) there is a backup strategy for the file server and

    c) the size of the pdf doc can (and will) exceed 8kB and

    d) the file system has a reasonable security concept

    If the file size is small and there is a significant benefit from having the files included in the db backup strategy, I might consider storing the data in the db. But I haven't had the need to do so as per today.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • there is a a white paper that talks about performance of Filestream v in database storage. Not completely applicable, but it basically says files < 256kb perform better from the db than the file system (http://msdn.microsoft.com/en-us/library/cc949109.aspx)

    That being said. I have tended to store stuff outside the db if I can. I can distribute that stuff to another machine, and often if it's like a web app, I can scale out, allow people to return the data without increasing load on my SQL Server.

    However if this is required for DR, then I might lean towards keeping it in the db for simplification of the backup process. If you can re-generate this, or it isn't needed to verify something for a client "as it was that day", then leave it outside.

    Ultimately there is no right answer. Consider what you know and make the best decision for the company. It doesn't sound like performance will differ much here.

  • I recently read this whitepaper ( http://research.microsoft.com/apps/pubs/?id=64525 ) while researching the same topic, however it was strictly for academic purposes. I have not been pressed to decide whether to implement file storage in a database. The whitepaper was written in 2006 which is about where you are in terms of your product version. The main thing I took away from it was that if you're going to be writing lots of files over 256KB, or overwriting the same smaller document many times, then it pays to go with file system storage.

    The one system I helped build where file handling was relevant was done using a file server and storing the path to the file in the DB. It was for an image-heavy web site with user-provided content so an entity could have 1-n files "attached" to it ranging from a few KBs to several MBs.

    If you have a moment please post your thoughts back to the thread once you have made your implementation decision.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • LutzM (4/25/2011)


    I'd vote not to store it in the DB if the following statements are true:

    a) it is possible to re-generate the pdf drawing files based on the original data if needed and/or

    b) there is a backup strategy for the file server and

    c) the size of the pdf doc can (and will) exceed 8kB and

    d) the file system has a reasonable security concept

    If the file size is small and there is a significant benefit from having the files included in the db backup strategy, I might consider storing the data in the db. But I haven't had the need to do so as per today.

    Lutz, do you mean file system in your first statement?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • SQLRNNR (4/25/2011)


    LutzM (4/25/2011)


    I'd vote not to store it in the DB if the following statements are true:

    a) it is possible to re-generate the pdf drawing files based on the original data if needed and/or

    b) there is a backup strategy for the file server and

    c) the size of the pdf doc can (and will) exceed 8kB and

    d) the file system has a reasonable security concept

    If the file size is small and there is a significant benefit from having the files included in the db backup strategy, I might consider storing the data in the db. But I haven't had the need to do so as per today.

    Lutz, do you mean file system in your first statement?

    Yes and no.

    Yes: By "not to store it in the DB" I meant using the file system under the conditions stated in a) to d)

    and no: I didn't intent to suggest not to use the file system under the given circumstances. 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (4/25/2011)


    SQLRNNR (4/25/2011)


    LutzM (4/25/2011)


    I'd vote not to store it in the DB if the following statements are true:

    a) it is possible to re-generate the pdf drawing files based on the original data if needed and/or

    b) there is a backup strategy for the file server and

    c) the size of the pdf doc can (and will) exceed 8kB and

    d) the file system has a reasonable security concept

    If the file size is small and there is a significant benefit from having the files included in the db backup strategy, I might consider storing the data in the db. But I haven't had the need to do so as per today.

    Lutz, do you mean file system in your first statement?

    Yes and no.

    Yes: By "not to store it in the DB" I meant using the file system under the conditions stated in a) to d)

    and no: I didn't intent to suggest not to use the file system under the given circumstances. 😉

    IC. I misread your post.

    Thanks for clarifying:-D

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • A thread on this topic pops up about every year.

  • We do see this topic regularly here. The authors of the white paper did some fairly extensive testing. It is theoretical in that it wasn't on a live system being used for production work, but you do need to isolate the performance because SQL Server and computer applications are do dynamic in nature. A transient spike in activity from some other query or process could dramatically impact the results.

    The upper end of best performance shows that below 256kb, you should keep documents in the database. Above 1MB, the file system performs better. In between it's a guess.

    Those are guidelines, not rules, and take them as such. If your documents are small, I might leave them in the database, in a separate table, and query them. That way you have some separation if you need to refactor things later, you will have an easier time than if you had the PDFs in tables with other data.

  • I've worked with both styles - a POS system that stored the images for cash register buttons and pictures of products, and a traffic management system that stored both the engineers' drawings and PDF's of the related permits. The POS system stored them in the database as image data type (now deprecated). The traffic management system store both an internal UNC path and a URL for the PDF files.

    Storing UNC path/URL was easier to manage and made it very easy for the 2 different front ends to handle - they just displayed the UNC path or URL and the user clicked on the link to view the PDF.

    Todd Fifield

  • Funny, I just read that whitepaper, researching the same issue for PDF storage.

    I ultimately decided on file storage, not DB, for a reason that has nothing to do with speed of retrieval, disk capacity, I/O, or anything else like that:

    Users will occasionally need access to groups of files with human-readable filenames. So it's better for me to name the PDFs with cumbersome-but-obvious names (like "2011 Q1 Report.pdf" instead of "1252313.pdf") -- users can browse the file store (read-only), and the database can retrieve them with file pointers.

    Point is: there may be issues to consider beyond the scope of "which is faster?" or similar strictly IT-related issues.

    Good luck,

    Rich

  • @rich-2: agreed.

    Actually, I missed one point on my list for not storing the files in a DB: if there are user with access to the file system but no login to SQL Server having the need to search for a specific file.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • tfifield (4/26/2011)


    I've worked with both styles - a POS system[/url] that stored the images for cash register buttons and pictures of products, and a traffic management system that stored both the engineers' drawings and PDF's of the related permits. The POS system stored them in the database as image data type (now deprecated). The traffic management system store both an internal UNC path and a URL for the PDF files.

    Storing UNC path/URL was easier to manage and made it very easy for the 2 different front ends to handle - they just displayed the UNC path or URL and the user clicked on the link to view the PDF.

    Todd Fifield

    Is this a touch-screen system?

  • Another option is to use Sharepoint to store the documents in a Sharepoint database via an application.

    We have a system with more than 600,000 PDF files stored in Sharepoint (SQL 2005) with an average size of about 250KB per document.

    Users access files via a custom interface that sits on top of Sharepoint services; they don't use the native Sharepoint web interface.

    I'm not saying this is the way you should go, but you might want to look into this alternative.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply