Data population on a SQL FILESTREAM-enabled table

  • Hi team,

We wanted to do a load test using a SQL FILESTREAM-enabled table. It is an audit table which captures API requests and responses.

We have enabled FILESTREAM on the request and response fields.

I understand that SQL FILESTREAM creates a corresponding file for each record behind the scenes.
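    For reference, a FILESTREAM audit table of this kind typically looks something like the sketch below (table and column names are illustrative only, not our exact schema; FILESTREAM needs a uniqueidentifier ROWGUIDCOL column and varbinary(max) FILESTREAM columns, and the database must already have a FILESTREAM filegroup).

    ```sql
    -- Illustrative sketch only (names are hypothetical, not the actual audit table).
    -- Assumes the database already has a FILESTREAM filegroup.
    CREATE TABLE dbo.ApiAudit
    (
        AuditId      BIGINT IDENTITY(1,1) PRIMARY KEY,
        RowGuid      UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE
                     DEFAULT NEWSEQUENTIALID(),           -- required for FILESTREAM tables
        RequestBody  VARBINARY(MAX) FILESTREAM NULL,      -- stored as a file in the FILESTREAM container
        ResponseBody VARBINARY(MAX) FILESTREAM NULL,
        CreatedAt    DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
    );
    ```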

    Now I want to pre-populate the table with around 5 million records in one stretch, although this will not be the case in prod, where the data gets loaded over a period of time.

    Assume the request and response are about 5 MB each.

    Constraints I can see:

    Time taken to perform the inserts will be high.

    Will there be any problem with the backup file that gets created for that day?

    Any other problems?

    How can we do this with minimal impact?

  • I'm not quite sure what you mean, but to take your questions in turn:

    1. The backup file will contain the data inserted into a FILESTREAM table, so it will be larger.
    2. Not sure what you mean by problems. FILESTREAM allows the data to be accessed through a file system API, which means access and transfer can be faster than through T-SQL. I believe that with files above 1 MB, FILESTREAM is definitely faster; below 256 KB it is not, and in between "it depends". (See the sketch after this list.)
    3. Do what with minimal impact? It's unclear what you mean here.
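
    To illustrate what access through a file system API means in practice: the usual pattern is to fetch the file path and a transaction token in T-SQL and then stream the bytes with the Win32/SqlFileStream API from client code. A minimal sketch, using the hypothetical dbo.ApiAudit table from the first post:

    ```sql
    -- Sketch: obtain the logical path and transaction context that a client
    -- (e.g. SqlFileStream in .NET) needs to stream FILESTREAM data directly.
    BEGIN TRANSACTION;

    SELECT  RequestBody.PathName()               AS FilePath,   -- logical UNC path to the FILESTREAM file
            GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext   -- token required by the streaming API
    FROM    dbo.ApiAudit
    WHERE   AuditId = 42;   -- hypothetical row

    -- ...client opens the file via SqlFileStream/OpenSqlFilestream using FilePath + TxContext...

    COMMIT TRANSACTION;
    ```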
  • OK, to make it clear: if I insert around 5 million records in one stretch for load testing, will that be a problem? This might not be the case in prod, where the data gets inserted over a period of time.

    AGL is enabled for the database.

    Any suggestions for inserting data in batches into a FILESTREAM-enabled table?

  • You haven't really explained what a "problem" is to you. Do you mean the server will run slower? That could happen with any bulk insert.

    FILESTREAM has trade-offs, which is what you are load testing. Test the load as you need to with a normal table, inserting large volumes of data, and then test with FILESTREAM. Most of the FILESTREAM benefits come when reading. When writing, it really depends on how your file system is structured and on the underlying hardware.

    If you run a batch of 5 million records, you will get load from writing to the file system outside the MDF, as well as from the log records.

    As with any load, if you spread this out over time, the impact on the workload is less than if you try to run a large batch at once.
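
    If it helps, here is a rough sketch of batching the load instead of one giant insert (the batch size, the dummy payload, and the dbo.ApiAudit table are all placeholders carried over from the earlier sketch):

    ```sql
    -- Sketch: insert the test data in modest batches so the transaction log and the
    -- FILESTREAM container are not hit by one huge transaction.
    DECLARE @TotalRows INT = 5000000;
    DECLARE @BatchSize INT = 1000;
    DECLARE @Inserted  INT = 0;
    -- ~5 MB dummy payload; REPLICATE needs a varchar(max) input to exceed 8000 bytes.
    DECLARE @Payload   VARBINARY(MAX) =
        CAST(REPLICATE(CAST('x' AS VARCHAR(MAX)), 5 * 1024 * 1024) AS VARBINARY(MAX));

    WHILE @Inserted < @TotalRows
    BEGIN
        INSERT INTO dbo.ApiAudit (RequestBody, ResponseBody)
        SELECT @Payload, @Payload
        FROM   (SELECT TOP (@BatchSize) 1 AS n
                FROM sys.all_objects a CROSS JOIN sys.all_objects b) AS batch;

        SET @Inserted += @BatchSize;

        -- Optional: take a log backup / CHECKPOINT here if the log grows too fast,
        -- and add a short WAITFOR DELAY to spread the load over time.
    END;
    ```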

  • Thanks.

    Yes, we are inserting records in batches, not as one bulk insert.

    What I understood so far is that we need to focus more on the normal table rather than on the FILESTREAM-enabled table for load testing.

    By load testing I mean simulating a production-like scenario of large volume. In order to do that, I need to insert records in large volume.

    AGL is enabled for the database.

  • I'd watch both. FILESTREAM has performed well for many customers. The downside is usually the code changes needed to take advantage of the feature. If you are storing documents that are, on average, >1 MB, it should work well. If they are small, it might not be better and may not be worth the effort.

    Run your load test a few different ways to see how things might perform. I'd try to get average times for queries here, so you better understand what is changing, not just the total workload time. I'd be sure to test some INSERT times, not just SELECTs.
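
    One simple way to capture per-statement times rather than only the total run (a sketch; wrap whichever INSERT or SELECT you are testing):

    ```sql
    -- Sketch: record elapsed time for each test statement so averages can be
    -- compared between the normal table and the FILESTREAM table.
    DECLARE @t0 DATETIME2 = SYSUTCDATETIME();

    -- ...the INSERT or SELECT under test goes here...

    SELECT DATEDIFF(MILLISECOND, @t0, SYSUTCDATETIME()) AS ElapsedMs;

    -- SET STATISTICS TIME ON or sys.dm_exec_query_stats are alternatives that
    -- give CPU and elapsed times without instrumenting every statement.
    ```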
