August 19, 2008 at 4:59 pm
Hello Everyone,
Having an issue with a data flow task in which the destination is a flat file. It looks to be related to the number of columns, and possibly the amount of data per column. Some specifications for my flat file:
Columns = 42 (columns are of various data types)
Code Page = 1252
Format = Delimited
Header Row Delimiter = {CR}{LF}
Column names in the first data row = true
Column delimiter = Tab {t}
Running my process with only one column succeeds with no problems and creates a file with ~23k records, so it is not a large file by any means. Does anyone know of any limitations or bugs in SSIS and the data flow task with regard to writing to a flat file? Any help would be much appreciated.
Thanks,
Anthony
August 19, 2008 at 7:19 pm
There are a few bugs in SSIS, and a few other frustrating issues with it, such as how it handles metadata. However, I don't know of any bugs related to your specific issue.
Are you running this interactively? If so, do you see the records making it to the destination object in the flow? Does the file get created, but with zero bytes? Does SSIS report any warnings or errors?
August 19, 2008 at 8:07 pm
The file does get created, and the process stops at around 10k - 11k records - actually always on a specific record. I haven't found any problems with that specific record, though; it doesn't give me an error, it just sits there in yellow status.
August 20, 2008 at 1:24 am
Can you try using a Union All as a destination, or the Konesans Trash Destination?
Is it in fact the destination that has the problem, or something in your source?
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 7:17 am
If you are running this interactively, you should be able to see how many records leave the source and how many records flow to the destination. It does sound strange that no error is thrown and it just hangs, even in the case where there is a bad record. One thing you could try is replacing your source's table with a SQL query that selects the suspect record from the table, and sending just that one record. If you haven't identified it yet, you could use a WHERE clause to start bringing back the 10,000th record and above, assuming you have a clustered identity column defined. If not, you can probably still do something roughly equivalent. Then you can see if the process hangs on the specific record, or if it still goes to 11k or wherever before it hangs.
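A sketch of that isolation approach, assuming a hypothetical source table `dbo.SourceTable` with a clustered identity column `RowID` (both names are placeholders, not from the thread):

```sql
-- Placeholder table/column names; substitute your own.
-- Step 1: skip past the rows that already load cleanly and
-- bring back only the 10,000th record onward.
SELECT Col1, Col2 /* ...all 42 columns... */
FROM dbo.SourceTable
WHERE RowID >= 10000
ORDER BY RowID;

-- Step 2: once the suspect row is known, send just that one record
-- through the data flow and see whether it hangs on its own.
SELECT Col1, Col2
FROM dbo.SourceTable
WHERE RowID = 10910;
```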
August 20, 2008 at 7:25 am
Situations like this are normally due to "back pressure", where the destination is telling SSIS to "hold on a sec while I catch up". This is often seen with the SQL destination, but I cannot remember seeing it with text files.
Likewise, a dodgy row in the source would not cause a delay; it should fail.
How sure are you that the source can dish out all the rows? Some kind of blocking on the source? It would be weird halfway through a query, though (could be a really rubbish execution plan that spools out data as it finds it).
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 8:08 am
So this is the smallest file that I am creating in my SSIS package, with a total of ~23k records. In the same package I have files of over 11M records being created with no problem. The source is an OLE DB Source, and it is the simplest of queries: a single table with no joins and a simple WHERE clause.

Here is the strangest part: in a task before this one, I use the exact same query to populate 3 other tables in different DBs (other OLE DB Destinations) using the Multicast transformation. I used to have my flat file creation in there too, with an added Conditional Split, but experienced this issue. Once I removed the flat file destination from that task, the task finished with no issues.

So the write to the flat file has now been moved into its own data flow task using the same query as the previous task (essentially duplicating what I did in the task before it). If I tried pulling directly from the table, it would hang; if I pulled using a different query, omitting the Conditional Split, it would hang; at one point, using the exact same query with a Conditional Split, everything worked, so I left it that way. This is why I have posted here: now this looks to hang as well, where it was working last week. Very, very strange behaviour!!

As I mentioned in a previous post, if I remove all the columns in my flat file destination except 1, it runs fine. My next step is to delete every column except one, moving down the column list, to see if a specific column is causing the problem.
August 20, 2008 at 8:24 am
Is rather weird...
Some comments though:
Don't select a table as a source; rather, use a query (and not SELECT * either).
Two reasons:
1: Good practice.
2: Are you going to use all the columns? If not, you are wasting memory. Even if you are, see point 1.
3: (I know I know) When your source is a view, you can get strange behaviour.
When selecting a view or table, SSIS sends SET ROWCOUNT 1; SELECT * FROM bla.
SQL complies. The problem is, SQL has just cached a query plan for that query based on 1 row returning. THIS IS A DOG. It will hurt you. Badly.
Give it a query instead and it does an sp_prepare and just gets the metadata. Far, far better.
Could you not be suffering from this?
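A minimal sketch of the difference described above, with placeholder table and column names (the behaviour claim is the poster's; the names here are purely illustrative):

```sql
-- What SSIS effectively issues when you pick a table or view from
-- the dropdown - the plan gets cached for a 1-row result,
-- which is the "dog" described above:
SET ROWCOUNT 1;
SELECT * FROM dbo.MyTable;
SET ROWCOUNT 0;

-- What to do instead: give SSIS an explicit query, so it can use
-- sp_prepare to fetch only the metadata without caching a bad plan.
SELECT Col1, Col2, Col3
FROM dbo.MyTable
WHERE LoadDate >= '2008-08-01';
```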
Come to think of it, in your previous setup with the Conditional Split, could the SQL destination not have been forcing some backpressure, with the source waiting for it to clear? Or had it already gone green?
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 8:35 am
Crispin thanks for the reply.
I am not selecting directly from a table or view; I am using a query explicitly specifying all 42 columns. 🙂 I never use SELECT *, as it is bad practice, as you mention. 🙂

I tried putting the Conditional Split's condition in my query's WHERE clause, but it would just hang. I got it to work once by keeping the query the same and adding the Conditional Split, but again, that was last week; now it is just hanging. Also, during all my testing this is the only task I run - I do not run the entire SSIS package - so there are no other tasks or processes affecting it.

My SSIS package creates 9 files ranging in record counts from ~11k to ~21M. All the other files are created fine, with the exception of this one file. I will let you know what comes of my testing.
Thanks,
Anthony
August 20, 2008 at 8:41 am
There's the gem: "When I put a where clause in, it hangs."
No it's not - it's SQL sh1tting itself trying to find the data, because you do not have an index.
Here is my suspicion; verify it by looking at the execution plan.
SQL has gone mad with parallelism, and the reason you are getting back bits and pieces is that as each portion of your query (each parallel process) completes and returns some data, SQL spits it out at SSIS. SSIS then waits for more.
On a table that is 21 million rows big, tell me your query is indexed and doing a seek - or, at worst, a contiguous range within a clustered index.
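One way to check for the scan-versus-seek problem, sketched with placeholder names (the table, column, and index names here are illustrative, not from the thread):

```sql
-- Show the estimated plan for the extract query; look for an
-- Index Seek (good) rather than a Table Scan or Clustered
-- Index Scan (the "rubbish plan" scenario described above).
SET SHOWPLAN_TEXT ON;
GO
SELECT Col1, Col2
FROM dbo.BigSourceTable
WHERE LoadDate >= '2008-08-01';
GO
SET SHOWPLAN_TEXT OFF;
GO

-- If it scans, an index supporting the WHERE clause may help:
CREATE INDEX IX_BigSourceTable_LoadDate
    ON dbo.BigSourceTable (LoadDate);
```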
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 9:11 am
Sorry, I don't think I stated it clearly. I have 9 files being created in different data flow tasks; the files' record counts range from 11k to 21M, and each task pulls data from a different table (all using SQL queries, not direct table access). The file I am having problems with is only a 23k-record file, so the source is really small.

As I mentioned before, I broke this task out into its own data flow task because it was causing the step before it - which runs the exact same query with the exact same WHERE clause - to hang. With the file creation removed, that task runs fine, but now the task that only creates the file hangs. Please let me know if I am not being clear and I will post screenshots of what is going on. 🙂
Thanks,
Anthony
August 20, 2008 at 9:15 am
Yup, makes sense. It is weird, though, that the other processes can create files.
I think the one-column-at-a-time approach is the best (last?) option now.
Curious to see what the problem is.
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 11:28 am
Ok, I tested each column from the source individually to see which specific column was giving me a problem. It looks as though there are 4 columns that cause the data flow task to hang. The source columns all have the same properties:
CodePage = 1252
DataType = text stream [DT_TEXT]
The destination column properties are all the same as well:
DataType = string [DT_STR]
I've tried changing the destination column's data type to match the source, and it still hangs. Here are the flow details during testing:
Total records to file 19846 after conditional split (21634 from source)
Column1 - hangs at row 10910 after conditional split (11897 from source)
Column2 - stops at row 16374 after conditional split (17891 from source)
Column3 - stops at row 10910 after conditional split (11897 from source)
Column4 - stops at row 10910 after conditional split (11897 from source)
Going to look at the records in the query to see if there is anything fishy about the column data at those specific records.
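A sketch of that inspection, with placeholder table and column names; checking DATALENGTH and looking for embedded delimiters are common first checks on DT_TEXT columns feeding a delimited flat file (this is my assumption of the check, not something stated in the thread):

```sql
-- Placeholder names throughout. First, look at the size of the
-- text columns around the rows where the flow stops.
SELECT RowID,
       DATALENGTH(TextCol1) AS Len1,
       DATALENGTH(TextCol2) AS Len2
FROM dbo.SourceTable
WHERE RowID BETWEEN 11890 AND 11905;

-- Then look for embedded delimiters or control characters that
-- could upset a tab-delimited, CR/LF-terminated flat file.
SELECT RowID, TextCol1
FROM dbo.SourceTable
WHERE TextCol1 LIKE '%' + CHAR(9)  + '%'   -- tab (the column delimiter)
   OR TextCol1 LIKE '%' + CHAR(13) + '%'   -- carriage return
   OR TextCol1 LIKE '%' + CHAR(10) + '%';  -- line feed
```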
August 20, 2008 at 1:20 pm
Grasping at straws here - can you change the destination to a SQL table (an empty copy of the first) and see if that runs? Then we can confirm whether or not it's a file issue.
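An empty structural copy for that test can be made like this (table names are placeholders):

```sql
-- SELECT INTO with a predicate that is never true copies the
-- column structure only, giving an empty table to point a test
-- OLE DB destination at. Indexes/constraints are not copied,
-- which is fine for this experiment.
SELECT *
INTO dbo.SourceTable_Test
FROM dbo.SourceTable
WHERE 1 = 0;
```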
Cheers, Crispin
I can't die, there are too many people who still have to meet me!
It's not a bug, SQL just misunderstood me!
August 20, 2008 at 1:32 pm
I already have a task that pulls the exact same data into an OLE DB destination, and it runs fine... the problem is entirely around the flat file destination data flow task. I'm going to open a ticket with MS, as I have not been able to find any reason why this is happening.
Thanks for your help!!
Anthony