I AM HAVING THIS UNIQUE AND STUPID SITUATION IN SSIS FROM MAINFRAME

  • OK here we go--

    I have a SSIS packages. the package loads data into the stage table from flat file and from stage to table for users.

    I have dataflow task inside the ForeachLoop COntainer to load into Stage. From Stage to table I used proc.

    The flat file I load is FTP by mainframe guys in certain folder at my local machine.

    Now the problem --- Today they are saying that there will be 2 files in the folder-- the first one with data on it and the second one which is empty file. the empty file indicates that their FTP is over ( meaning that they have successfully loaded the file in my local folder).

    Now they are presenting this situtaion and I have to overcome this.

    Let's say the mainframe people start their job, and they are FTP ing the file to my local folder. The File size is GB. their copying process is still in progress. ( I still don't have that empty file in the folder, which indicates their FTP is over).

    Now I start my ssis job load. When I start I see a file in the folder which is still in the process of being copied by mainframe. Now my loop returns the file name to my connection strings and starts loading the file to my stage table. In this case now WHAT HAPPENS? they are still coyping more records to the file in the folder, and I am loading records in the stage. Does my Load will fail?.

    What they wants me to do is . run my ssis package and if i see any empty file in the folder then do the rest of the logic in the package , but if there is no empty file in the folder then just stop the package right there. DO not proceed further.

    They are saying if in the middle OF FTP process their job fails, so I only have incomplete set of records in the file. SO make sure I check the empty file and then proceed for rest of the logic in the package.

    The logic I was thinking ---

    Check the file size in the Folder. Put in for eack LOOP container ( there is two files always- data file and empty file)

    If file size is Zero then proceed to one way , if not log to a table .

    SO now I need a script component to check the file size in a Folder Put the value to variable true or flase . Put in the flow both constraint and suceess. BUT HOW?

    DOES THIS MAKE SENSE TO ANYONE??????

  • Check out this site: http://www.sqlis.com/

    I recall seeing there a File Watcher Task, and this may be what you need for your SSIS package.

    😎

  • Lynn,

    I am a great fan of this site. I used to have this task before but when I schedule the job to run ssis packages , it gaev me error becos task couldn't be loaded ,a nd also I am fearing that once I move to WTC we will not have this task registered on teh box, so might give error. Another File watcher task looks for file , not file size( thats what i think)

    what I need is Look for any 0 size file in teh folder if present then delete it and proceed further for more logoc in ssis. If not then log into table and stop package their as success.

  • Another question then. You say that the main frame guys FTP the file to your server. Does this mean that you have the FTP Service from IIS running on the database server?

    😎

  • I am not sure exactly if i got your question or not?

    The mainframe guys have their own server, and they FTP the file they create to my database server.

    I don't have FTP task on my ssis.

    Or do u know how to cehck for a particular file in the folder and set a boolean value to a variable. The empty file they will provide will be named as EMPTYFILE.

    So what i was thinking the logic

    Chcek for EMPTYFILE in folder A

    If present set variable to true

    Then use constraints and expressions in the flow to move to different direction.

    Any help will be highly appreciated

  • The solution sounds simple and good to me. If i were you i would just ask them to place the empty file and name it Lock.lck (lock file) before they start copying, this will indicate that the MainData file is still copying data and when they are finised copying they can remove this Lock.lck file. In a way its just same as the EMPTYFILE, its just the file name makes more sense. So if there is no Lock.lck you are good to go.

    Also if you are sharing this common location then you can also place the LOCK.lck file when you are processing the file so that they cannont start copying when the SSIS is processing the file. And you can also delete this Lock.lck file when you are done processing

    HTH

    Mukti

  • What is the naming convention for the pairs of files?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Mukti I believe I understood ur solution. I thought I was able to fix the problem, but I think I created another one.

    Mukti I hope things were like just u said. But, the main frame does this because they want to make sure that FTP is complete and then they will create this EMPTYFILE. But Eithet way the problem is that have one SSIS packages to load 3 dff files of same format . The mainframe has three job to generate 3 files ; let's say FileA, FileB, and FileC. They FTP all three files in the same Folder ; for example D:\test\Folder. all their jobs are schedule at 4 am . SO let's say when jobA on mainframe is complete I will have a FileA and EMPTYFILE in the Folder. And if job B completes then fileB and EMPTYFILE ( this will override the previous EMPTYFILE. similaraly for Job C...........

    Let's look at my SSIS packages first.

    I check in the Folder for EMPTYFILE. If present, I delete it and then move to load into my table. I have a dataflow inside ForEachLoop container. So when I find fileA, I load it and then move it to Archive folder. Also, since I am using foreach loop there is another file FileB, so i load it and then archieve it, and if present FileC the same way.

    I DON"T KNOW WHAT HAPPENS, BUT MY CONCERNS IS----------

    Let's say the mainframe three jobs kicks off 4 am. Then FileA is created in the folder with EMPTYFILE in 3o minutues. The other two jobs from mainframe are still running. At 5 am my ssis job kicks off

    SO i check EMPTYFILE and it is there so i start loading the flieA data into table. My loading is running( it is still in the foreachloopcontainer). then during this, mainfrmae job B start FTP job and start copying the FileB on the same folder. let's assume FileB is large, and takes a lot of time. Now I finish loading fileA, but since there is fileB in the folder , DOES my foreachloop will loop again to load fileB? or just it will bypass the loading because FileB is not completely loaded in my folder?. I am thinking my foreack will start another loop for FileB. If so the FTP is not complete yet, it is still copying from the mainframe. So how does my load gets affected?????? Now another problem is if in the middle of FTP mainframe jobs fails. So i have this incomplete data in the file which is being processed.

    How can I overcome this situation.??? I can asked the mainframe guys to change the names of EMPTYFILE into 3 diff types. But does this will solve my problem.?.. If i ask mainframe guys to put diff files with corresponding EMPTYFILE in diff folders, does this will solve my problem? Remeber I have only one SSIS packges to load all three files. AT present all three files are created in the same Folder.

    Any idea or help and suggestions will be highly appreciated.

    thanks

  • I'm still trying to figure out where the FTP server is in all this. You say the files are FTPed from the mainframe, so where does the FTP server reside, and how does it ensure that the files end up on your server.

    At my previous employer we FTPed files from our minicomputer system to an FTP server (a Windows server running IIS and SQL Server), and using virtual directories, the files actually were loaded on our database server (a seperate server running SQL Server). We setup the first server to log all FTP data transfers to a SQL Server database table, and used triggers on completion of successful data transfers to start the appropriate data import processes on the other server. We had, when I left, about 70 seperate transfers occuring.

    Doing things this way eliminated the problem we were having with extracts sometimes running longer than anticipated, and missing scheduled import windows. It worked like a charm. The FTP server was dedicated to just this purpose, so logging to a SQL Server table was not an issue. That was the only hing the SQL Server was really used for on that system.

    😎

  • Mukti Here the same logical problem for ur solution. Let's say my ssis job starts. I look for lock.lc file. I don't see it . So i start processing FileA. Now say during my loading , mainframe sends me anothe lock.lc file and FTP is starting copying file for FileB, now since I am already in the loop , when i finshed loading FileA I see fileB in the folder which is still in the process of copying from mainframe. i start loading fileB ( although there is lock.lc file, as a matter of fact lock.lc file will also be processed). so WHAT HAPPENS TO MY LOAD???? it will load as data gets copied to fileB from mainframe?? now if so how does that affects my speed of load? and also if during the load FTP fails on mainframe so now i have the same issue of incompltete data file in the process.

    thanks

  • Alright to make it clearer. When one side sees the Lock.lck they cannot start their operation. that means when mainfram team starts copying they need to first check for lock.lck file if its present they cannot start copying. Let me try to explain my logic:

    1. Mainframe team check for Lock.lck file in the shared folder, they don't see the file they start the copy process.

    2. When they are finised copying they delete the lock.lck file

    3. SSIS process starts, it does not see the lock.lck file so it can start its processing, but before that SSIS package should place the Lock.lck file that means all the files in the folder are locked for processing. SSIS starts processing, when finished they should delete this lock file.

    So, both teams need to agree on this protocal that if they see the lock file they cannot start processing and while they are doing the processin/coying for that duration they need to place this lock.lck file in the folder and when they are done processing they should delete the lock.lck file.

    HTH

    ~Mukti

  • Mukti I got it. U r good. BUt, Do you think mainframe guys will agree this. Lots of politics here in the environment.

    You know I did my little testing here One of the issue is solved automatically. Ok lets say I kick my SSIS package and I see EMPTYFILE ( this was for FileA) and then i Move down to load it. Once I enter in my "ForEachLoop" and start loading my FIleA. During the process If I receive FileB from mainframe which is still in the process of Copying from mainframe, it automatically ignores the FileB and moves further without loading FileB. SO I can load this file in my second Run ( my packages run every hour for 3 hours)

    The only problem now is that I kickoff my package and i See EMPTYFILE for FileA and also FIleB is present in the folder but not complete yet ( it is still being processed by mainframe) so I move down and enter the ForEacHloop Now this time i see two files so it will pick both the files FileA is OK but FileB is not complete yet.

    LYNN!!, Sorry got late response. I don't know still what you are trying to do or trying to say . We will have our own box. The mainframe guys use their own and they are responsible to FTP the file in my server to the folder. SO ate the end of each successful FTP they will create a emptyFile as an indication of success.

    Mukti Idea looks good. But the problem is mainframe guys have to agree on it. Also According to them they cannot run like SSIS. ( meaning like they cannot keep looking for lock.file) they are schedule at one time. if they see file then they proceess or not. thats it.

    Taking mukti idea what if I copy from one folder ( the folder they use to FTP file) to another folder ( the folder where I copy ) SO I copy from one folder to another then once copy is done I load.

    I know I am doing diuble work..

    Is there any other ways within SSIS to accomplsih this like using forloop or script task or any other keeping bunch of loop or expressions and constratints........

    Remeber guys I can rename each EMPTY file to FIleAEmpty , FileBempty and thus for C and I can tell mainframe guys to even FTP to different folder........

    Thanks to all out there.........

  • Just for grins, I'd still like to know the naming convention of the files that aren't empty../. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • MedClmPROF

    MedClmINST

    MedClmDENT

    all empty files will be named EMPTYFILE.................

    Rt now i have left as it is ..........becoz the scenerio I presented might not happen never, its just one of those WHAT IF??

    My solution will be later.........

    I will tell mainframe guys to PUT three Files in Diff folders.

    Before the kick off the pacakge. I will chacek if empty file exists in the folder. If so I will copy this data file to the folder from where i do loading. I know its one more step but hey....... i read few days back ,,,,focus to the solution, not to the problem. THis things are hard to put in the practical.

    so it will be like......

    Check empty file in FOlderA FOlderB and FolderC

    If none, Log to table

    If present any one Delete that empty file

    Copy that datafile to DataFolder and do the loading.

    I hope this will solve the problem at least for now...........

    and then enetr in the for each loop.......to load data from

  • No extensions on the file names?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply