July 2, 2009 at 12:54 am
Hi All,
I'm hoping someone might be able to help me. I'm new to these forums, so my appologies if this has been answered previously. I haven't been able to find anything on google.
I've just started experimenting with FileStream in a development instance of SQL 2008 Express. The problem I've encountered though is that when I delete rows containing FileStream data, the space allocated for the files is not being released.
I've tried doing a shrink database, as well as running checkpoint, but so far no luck.
I thought perhaps it would just re-use the space when I added new files, but it doesn't appear to work that way. The space allocated just keeps increasing as I add and remove files.
I was hoping to use FileStream to support large video files, but I would need to be able to free up the allocated space, or I'll very quickly fill up the disk.
Can someone tell me how to release the FileStream allocated space?
Thanks,
Eugene
July 3, 2009 at 3:48 am
Once deleted, the FILESTREAM data is under OS control. It can take some time before OS removes the file in the filestream catalogs.
N 56°04'39.16"
E 12°55'05.25"
July 3, 2009 at 10:47 pm
Sorry to press the issue, but what does "under OS control" actually mean? How does the OS know that it is supposed to delete the files, and what service does it use to actually do this?
I deleted some filestream rows containing large video files over a week ago on my Vista Ultimate laptop, and have restarted numerous times since, and the files are still there.
Is there anything that can be done to speed up this process
July 3, 2009 at 11:55 pm
No, the OS handles this.
See here for a starter about FILESTREAM
http://technet.microsoft.com/en-us/library/bb933993.aspx
The database and OS communicated over the FILESTREAM API.
When you delete a record with FILESTREAM attribute enabled, the operating system Garbage Collecter is responsible for deleting the OS file.
See here
http://technet.microsoft.com/en-us/library/cc645962.aspx
N 56°04'39.16"
E 12°55'05.25"
July 4, 2009 at 1:42 pm
I deleted some filestream rows containing large video files over a week ago on my Vista Ultimate laptop, and have restarted numerous times since, and the files are still there.
Is there anything that can be done to speed up this process.
Yes there is because this problem is related to recovery model, so checkpoint will release it to if this is development you could change to simple recovery model.
Kind regards,
Gift Peddie
July 4, 2009 at 2:04 pm
The files will be retained until SQL's garbage collector runs. The files are eligible for garbage collection once they are no longer needed for database recovery. If you're in simple recovery, that means after a checkpoint runs. If you're in full recovery, it's after the log has been backed up and a checkpoint has run after the log backup.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 4, 2009 at 2:15 pm
Eugene (7/3/2009)
Sorry to press the issue, but what does "under OS control" actually mean? How does the OS know that it is supposed to delete the files, and what service does it use to actually do this?I deleted some filestream rows containing large video files over a week ago on my Vista Ultimate laptop, and have restarted numerous times since, and the files are still there.
Is there anything that can be done to speed up this process
The files will be retained until SQL's garbage collector runs.
If SQL Server garbage collector is the same as .NET then OS restart should have cleaned it up so then it is retained if there is no log backup to remove it. The operating system restart eliminates the SQL Server garbage collector, there may be a bug in the implementation and I think Microsoft needs to document something this relevant to development.
In an Asp.net application server can run out of hard disk space using File Stream.
Kind regards,
Gift Peddie
July 5, 2009 at 5:12 am
Gift Peddie (7/4/2009)
If SQL Server garbage collector is the same as .NET then OS restart should have cleaned it up so then it is retained if there is no log backup to remove it.
I doubt it's exactly the same as .net's one.
If the DB is in full/bulk logged recovery, the old filestream files have to be retained until after a log backup. If no log backups are been run, they will not be removed as they are needed for database recovery.
Paul Randal's written a very good whitepaper on filestream. Well worth searching for and reading
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 5, 2009 at 5:29 am
Paul Randal's written a very good whitepaper on filestream. Well worth searching for and reading
I have seen it did not read I am still saying no garbage collection is going on because if only backup release deletes that is almost like shadow volumes introduced in Win2003 attached to SQL Server. It is sloppy implementation because Asp.net and SilverLight applications can run out of disk space because somebody did not add to clean up file stream deletes daily scheduled backup is required.
What makes this so sad is in .NET the streams children classes to the abstract class Textwriter are some of the best things in the base class so how did SQL Server end up with something not scalable and almost crude.
Kind regards,
Gift Peddie
July 5, 2009 at 5:52 am
Gift Peddie (7/5/2009)
I have seen it did not read I am still saying no garbage collection is going on because if only backup release deletes that is almost like shadow volumes introduced in Win2003 attached to SQL Server. It is sloppy implementation because Asp.net and SilverLight applications can run out of disk space because somebody did not add to clean up file stream deletes daily scheduled backup is required.
It's not a sloppy implementation. The deleted filestream data has to be kept around until it's not longer needed for database recovery. If it could be dropped before that it could lead to transactionally inconsistent databases. Remember, the filestream is part of the database and has to be transactionally consistent.
If the garbage collection could clean up a file while it was still necessary then any one of these cases could occur:
A rollback fails because the old version of the filestream is gone.
SQL crashed before a transaction involving filestream has committed. Restart recovery fails because the old version of the filestream is gone.
A log shipping job fails because when it goes to back the log up, the filestream data is not there. Or the log backup backs up only part of the changes.
If the first two could happen, the database would be considered suspect. If the last happens then the log shipping standby wouldn't be a copy of the primary. Any of those would be considered a major failing point for file system implementation.
You mentioned 'daily scheduled backups'. Ignoring the fact that a production system should have daily backups anyway, you missed the point. It's not full backups that are required. It's log backups and only if the DB is in full or bulk-logged recovery.
Don't want to back the log up and want the filestream stuff to disappear quickly? Simple solution - set the database into simple recovery. Of course, if point in time recovery's necessary, there should be log backups running already.
The major difference between SQL's 'garbage collector for filestream' and .net's garbage collector is that the latter doesn't have to worry past the ending of the application. It doesn't have to worry about database consistency or transaction rollbacks. SQL's one does.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 5, 2009 at 7:59 am
If the garbage collection could clean up a file while it was still necessary then any one of these cases could occur:
A rollback fails because the old version of the filestream is gone.
SQL crashed before a transaction involving filestream has committed. Restart recovery fails because the old version of the filestream is gone.
A log shipping job fails because when it goes to back the log up, the filestream data is not there. Or the log backup backs up only part of the changes.
All these are rudimentary RDBMS facts of the Microsoft implementation so why Microsoft did not add these to the documentation so developers knows what they are using. I am still saying this is backup and ACID attached component the docs needs to reflect that.
You mentioned 'daily scheduled backups'. Ignoring the fact that a production system should have daily backups anyway, you missed the point. It's not full backups that are required. It's log backups and only if the DB is in full or bulk-logged recovery.
Don't want to back the log up and want the filestream stuff to disappear quickly? Simple solution - set the database into simple recovery. Of course, if point in time recovery's necessary, there should be log backups running already.
No I did not because until Microsoft runs test of an application sitting with 500gig to 1TB File Stream deletes in an Asp.net and SilverLight application it will remain a suspect implementation because we don't know sustainable levels in a busy stateless application. That each Files Stream deletes must run log backup at the end of the task in full recovery and bulk logged recovery model because it is ACID attached is something developers need to know.
The major difference between SQL's 'garbage collector for filestream' and .net's garbage collector is that the latter doesn't have to worry past the ending of the application. It doesn't have to worry about database consistency or transaction rollbacks. SQL's one does.
In the current File Stream there is no garbage collection going one because when something exist after several restart of the relational engine nothing is collected because when things are collected Log Backup should not be required in any recovery model.
Kind regards,
Gift Peddie
July 5, 2009 at 9:24 pm
Thanks for all the responses. After a bit more hunting around I found some info about the recovery mode. I had set it to "Full" and didn't realise that this would prevent the files being deleted, although that does make sense.
Anyway I've tried doing a full-backup, as well as differential, and log backup, and still the files persist. However I have only just now tried running "checkpoint", so hopefully that will do the trick.
July 6, 2009 at 1:22 am
Check out that Whitepaper I mentioned. I don't know it's location offhand, but a search for "Paul Randal" and filestream should turn it up.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 6, 2009 at 1:25 am
Check out that Whitepaper I mentioned. I don't know it's location offhand, but a search for "Paul Randal" and filestream should turn it up.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 6, 2009 at 1:30 am
Check out that Whitepaper I mentioned. I don't know it's location offhand, but a search for "Paul Randal" and filestream should turn it up.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply