Best way to strore a huge number of images

  • What is in your opinion the best way from performance point of view to store a huge number of image files?

    The easiest would be to store them within the filesystem and in the database I only would store the path.

    BUT, there is a but.... I have to store at least 2.5 M files of 10KB monthly and I have to keep the data for 10 years. it means: 2.5 M x 12 x 10 = 300 M files. ANd if the customer decides to store other documents as well the it could be 10 times more docs to store!

    I don't know if there is any filesystem (NTFS, UNIX...) being able to handle this amount of files.

    The problem is not with the total size but with the number of those files.

    In a database there would be no such limitation. But I do'nt know how to insert whithout bii.exe and then select and show those pdf files (the client app would be SAP or something called by SAP.



    Bye
    Gabor

  • I think that any limit on the number of physical files that can be stored will be at the folder level - so can you just create some sort of folder structure that effectively limits the number of files per folder?

    Had a bit of a hunt around but could not find an answer to the "How many files ... ?" question though.

    I would expect overall performance and system maintenance (eg backups) to be better if you are able to go down this path.

    Regards

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • http://www.microsoft.com/resources/documentation/windows/2000/professional/reskit/en-us/part3/proch17.mspx

    Shows that NTFS can hold 4,294,967,295 (~4.3 Billion) per volume.  I have a image archive database that currently has well over 200,000 images stored on an NTFS file system with the relative path from a root folder stored in a table.  The files are organized in folders by year/createdate (ie. \[Client Root Path]\2005\0101\Filename.ext).  The database is currently growing at about 18,000 files per month and so far performs quite well.  If I out grow the volume I can easily segrate storage by changing the client's root folder to another volume, and possibly making some minor changes to some stored procedures.

    Hope this helps.

    Tony

  • By seeing this limitation I will clearly shoot for the file system storage!

    At worst case I will create files within some containers (like zip files) which hold several images.

    Thanks guys!



    Bye
    Gabor

  • Using a file system for images may look like the easiest to design, but it is also by far the hardest to maintain.  There have been a lot of posts on this subject in a lot of forums.  Most people who have had experience of storing images in filesystems say they always without fail experience coruption of the data store - that is certainly my experience. 

    The problem is normally that files get moved away from the location pointed to in the database, either accidentally or for some maintenance purpose.  The next most common problem is the lack of transactional integrity betwen the file store and the database.  In a failure situation, you can get an image file without a database pointer, or a pointer without a file.

    If it is important to ensure integrity of your application, then you need to keep the images in the database.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • We have a doc management application that has ~700k files in the single folder accessed via FTP on NTFS... works a treat

    One backup product we were using though had troubles with >40k files in a folder!

  • Ed already mentioned some aspects to consider. other aspects might be the separate backup process and the separate security that needs to be maintained.

    Are those BLOBs "only" stored and viewd sometimes. Or do the users need to work with them, e.g. manipulate them?

    Anyway, maybe this helps a bit:

    http://www.microsoft.com/resources/documentation/sql/2000/all/reskit/en-us/part3/c1161.mspx

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • The users want only to see those images.

    My customer is a utility company having 2.3+M customers. They want to store the image of the invoices they are generating in a PDF format.

    Therefore regarding the amount of the files stored only a small number of images will be consulted and only in a read only way and only from SAP.

    I would say that the users will never consult those images more then 1000 times per day.

    I was thinking to have a dedicated fileserver with a 2 TB SATA disk system for this purpose. So no accidental file moving is expected.

    I could also store those files into a database, but I'm afraid of the performence. I have to store 100K documents at once into the database.

    It must be performent! The question is not the selection speed of the image but the insertion speed.



    Bye
    Gabor

  • You can use SQL Server 2005 database for your application, because of support for XML in SQL Server 2005 is humangous, you can insert images, query for it and even you can create indexes for faster retrival of data. 

  • But my issue is how to insert over 100 000 images at once into the database with a decent performance?

    And what I don't know is how to show a PDF file stored a BLOB in the database within an Acrobat reader (or any other reader being able show the PDF file)



    Bye
    Gabor

  • Why don't you just write the file to temporary folder on client machine and call an associated (by extension) PDF reader program?

  • Yeah, we do the temp file thing for our doc management app...

    But for some apps that accept a http:// URL, you could always shellExec the URL and have the app start and let it deal with temp files (which it may not even need).  I haven't tried this - not sure if the user will get a "Open or Save" window, or any of the other XP SP2 windows appear, but might be a solution, particularly if you send the URL straight to Adobe Reader.

  • Thanks guys. I think this temp file solution is the way to go.

    But do you have any idea how to store about 100 000 files at once into the database with an acceptable (i.e. super fast) performance?



    Bye
    Gabor

  • I have to store at least 2.5 M files of 10KB monthly and I have to keep the data for 10 years

    It seems to me that performance is not the main factor here. You have one choice to make: Are the files necessary to maintain the integrity of the system? If not, then go ahead and store them in some file system and just store a path in the database. But if the files are important, i.e. they must be saved (which I assume is really the case here) then I cannot see why you would not save them in the database. How can you guarantee the integrity otherwise?

    Regarding reading the files from database to client: How are the clients accessing the files? If it is over Internet then you can simply stream the contents through the response stream, and the client browser will react just as if it had specified the URL for a file that exists on the web server.

  • It seems to me that performance is not the main factor here
    True for the retrieval, but false for the bulk insert (over 100,000 at once)
     
    How are the clients accessing the files?

    Mainly out of SAP



    Bye
    Gabor

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply