April 1, 2009 at 3:05 pm
Certainly that works if the file does not show up within the allotted number of intervals, but what if the file is not the expected size?
Frequently we get files that are not correct.
Our telltale sign that something has gone wrong at the processor is that the file is 1 KB in size, nowhere near the 15 MB the file usually contains.
So in addition to checking that the file exists, I check the file size; I set the lowest acceptable value to the smallest file we have received in the last year (less a few MB). Sure, it may send a false notification once in a while, but it's good to know when files are out of bounds, if you will.
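For what it's worth, that size-floor idea can be sketched outside of SSIS too. Here is a minimal Python illustration; the 12 MB floor and the function name are hypothetical, standing in for "smallest file received in the last year, less a few MB":

```python
import os

# Hypothetical floor: smallest file received in the last year, less a margin.
MIN_BYTES = 12 * 1024 * 1024  # ~12 MB floor for a feed that is usually ~15 MB

def file_looks_complete(path, min_bytes=MIN_BYTES):
    """True only if the file exists and meets the size floor; a 1 KB stub
    left behind by a failed processor run fails this check."""
    return os.path.exists(path) and os.path.getsize(path) >= min_bytes
```

As noted, the floor has to be re-tuned as the feed grows, and a legitimately small file will trip a false alarm.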
But for simplicity that surely works.
Ultimately I wouldn't want to log "Failure" for each retry simply because the file hasn't gotten there yet. I'd rather handle the exception so it exits gracefully in an expected situation.
April 7, 2009 at 9:41 am
Interesting view on building a "file watch" process. I have built similar solutions, but I tend not to use VB to control the delay, as I recall the CPU usage shoots up.
Instead, I use the VB purely to check for the file and return true or false back to the package / populate a variable.
I also use this inside a For Loop Container and include a SQL task to process the delay (which does not hit the CPU). The properties on the For Loop Container are then configured to increment up to a given point, i.e. n times the delay in the SQL task, and to include checks in the EvalExpression to drop out when the file is found (previously populated variable = true) or when the total delay has been exceeded.
The workflows continuing from the container can then also check the previously populated variable and act accordingly.
With simple modifications to the VB to use a variable holding the file to watch, this package can be called from a parent package; collecting parent variables then allows the same child package to be called multiple times, e.g. waiting for n files to be available.
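Transliterated out of the For Loop Container, the pattern described above amounts to roughly this sketch (Python purely for illustration; the sleep stands in for the SQL task delay, and all names are made up):

```python
import os
import time

def wait_for_file(path, delay_seconds=30, max_attempts=10, sleep=time.sleep):
    """Poll for a file up to max_attempts times, sleeping between checks.
    Mirrors a For Loop Container whose EvalExpression drops out when the
    file-found variable flips to true or the total delay is exceeded."""
    for _ in range(max_attempts):
        if os.path.exists(path):
            return True          # the downstream workflow checks this flag
        sleep(delay_seconds)     # in SSIS this is the SQL task's delay
    return False
```

The returned flag plays the role of the "previously populated variable" the continuing workflows inspect.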
I am sure there are another dozen ways to achieve the same goal.
April 9, 2009 at 2:05 pm
Yes, you are right. The Retry Attempts and Retry Interval options on an Agent job rely on the failure of the package execution, and every failed retry is recorded in the log. So if the source file exists but is still in the process of being written, the upload package will fail anyway; the job step just waits for the next retry, and that's it.
I think file size is quite unreliable as a judgement. 1 KB could mean an empty file (which is fine by me; it means no new records today), but how big does the file have to be to count as good, and will that be accurate? Actually, I developed a custom task component to check for file existence (similar code, but in C#). It can be used in any package where I want to check that a file exists. All I need to do is pass the full path of the file as a parameter to that component. That works pretty well for me.
April 15, 2009 at 2:57 pm
Great article! This is very useful including the additional tweaks other users added. These steps are working great for a package I created.
June 18, 2009 at 10:09 am
I believe you can use the FileInfo.OpenWrite() method to check whether the file is available for opening to write (assuming that you will not be able to if something else is writing to it):
Dim oLoadFile As FileInfo = New FileInfo(cFilePath)
Dim lComplete As Boolean = False
Dim i As Integer = 0
...
'Loop until the file is openable (e.g. FTP or copy complete) or iTimeOut is reached
While lComplete = False
    Try
        'Close the stream immediately so we don't hold the lock ourselves
        oLoadFile.OpenWrite().Close()
        lComplete = True
    Catch
        System.Threading.Thread.Sleep(iCycleTime)
        i += iCycleTime
        If i > iTimeOut Then
            Exit While
        End If
    End Try
End While
...etc
If it ain't broke, don't fix it...
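The try-to-open probe above translates readily to other runtimes. As a hedged sketch, a Python version might look like this (illustrative only; on Windows, opening a file another process holds open for writing raises an error, while on POSIX the check is much weaker because writes are not exclusive by default):

```python
def file_is_free(path):
    """Try to open the file for read/write. If another process (FTP, copy)
    still holds it for writing, Windows raises PermissionError and we report
    the file as busy. A missing file also counts as not ready."""
    try:
        with open(path, "r+b"):
            return True
    except OSError:
        return False
```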
May 28, 2010 at 6:41 am
File.IsReadWrite() is a better approach, IMO
June 18, 2010 at 8:11 pm
I have a similar situation, except the files we receive (approx 30 files) are sent more frequently. The source system is scheduled to send the files every :15 throughout the day at :00, :15, :30 and :45. The problem is, depending on the file size and load on the source server, the files can arrive anywhere in that 15 minute window.
The expectation is that we try to stay as current as possible on this data, so we're checking for and processing updates every 5 minutes. Unfortunately, about 30% of the time, we end up grabbing a file as it's being written and end up with a partial file. We copy it from our FTP/file server to our SQL server for processing. I wish the senders could rename the file after it's been written, but that's not currently an option. Since FTP doesn't lock the file, the Konesans File Watcher task doesn't do us much good, as it's able to open the file at any point while the file is being written.
I'm contemplating a solution that checks the file sizes after we've robocopied the files to the SQL server to see if the sizes match. If they do, great; if not, re-run the robocopy script and keep checking until they do.
Curious what others might do in this situation?
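One heuristic that comes up in this situation (offered as a sketch, not a guarantee; the names, poll interval, and check count are all invented) is a stable-size check: only treat the file as complete once its size stops changing across several consecutive polls.

```python
import os
import time

def wait_until_size_stable(path, checks=3, interval=5.0,
                           sleep=time.sleep, getsize=os.path.getsize):
    """Poll the file's size and return it once it has been unchanged for
    `checks` consecutive polls -- a heuristic that a non-locking FTP
    transfer has finished. No timeout here; add a cap in production."""
    last, stable = -1, 0
    while stable < checks:
        size = getsize(path)
        if size == last:
            stable += 1
        else:
            last, stable = size, 0
        if stable < checks:
            sleep(interval)
    return last
```

A slow sender that pauses longer than `checks * interval` mid-transfer can still fool this, so the numbers need tuning against how the source actually writes.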
July 22, 2015 at 1:20 pm
michael.groh (3/31/2009)
It always concerns me when I see an endless loop used as a timing device. It'd seem that having a number of packages running on a server, each using an endless loop as a file watcher, might be detrimental to performance. It's very easy to add a "Sleep" statement in the body of the loop to put the thread to sleep for a few seconds or minutes:
System.Threading.Thread.Sleep(milliseconds)
I've done the file watcher thing by putting the thread to sleep for 60000 milliseconds (one minute), and counting how many minutes have transpired before failing the task:
Dim i As Integer = 0
While True
    If System.IO.File.Exists(Dts.Variables("FlatFileConnection").Value.ToString) Then
        Exit While
    Else
        System.Threading.Thread.Sleep(60000)
        i += 1
        'Exit after 240 minutes (4 hours):
        If i > 240 Then
            Exit While
        End If
    End If
End While
This is much simpler and more efficient code than using a DateTime variable and comparing its value to Now(). The way the code in this article is written, Now() is called millions and millions of times in the body of the loop, while in the revised loop the comparison is performed only once a minute and, at most, 240 times. The revised loop achieves the same objective while requiring far fewer CPU cycles.
I created a DTS package and an SSIS package to check for the existence of a backup file, rename and move the file to an archive folder, and a task to delete the files after a given date. It needed a timer, and when it found the file it loaded it into a table. That was years ago.
Does anyone have sample code to do this?
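Nothing official, but the description above (watch a folder, move matching backups to an archive with a timestamped name, purge after a retention window) can be sketched roughly like this in Python; the directory names, glob pattern, and retention window are all placeholders:

```python
import shutil
import time
from datetime import datetime
from pathlib import Path

def archive_backup(watch_dir, pattern, archive_dir, keep_days=30):
    """Move files matching `pattern` from watch_dir into archive_dir with a
    timestamp appended, then delete archived files older than keep_days."""
    watch, archive = Path(watch_dir), Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in watch.glob(pattern):
        stamped = archive / f"{f.stem}_{datetime.now():%Y%m%d%H%M%S}{f.suffix}"
        shutil.move(str(f), str(stamped))
        moved.append(stamped)
    # Purge archived files whose modification time is past the window.
    cutoff = time.time() - keep_days * 86400
    for old in archive.iterdir():
        if old.is_file() and old.stat().st_mtime < cutoff:
            old.unlink()
    return moved
```

The file-watch timer and the load-into-table step would wrap around this, along the lines of the polling loops earlier in the thread.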