January 11, 2016 at 10:16 am
bpowers (1/11/2016)
Well after submitting the SSIS file for testing I was notified that the Header and Detail must be grouped together. Ughhh... If you still have a bcp option you'd be willing to work with me I would greatly appreciate it. Also, xp_cmdshell can be used as I will simply turn it on and off during the execution. Thanks
Jeff Moden, you have centre stage :-P
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 11, 2016 at 10:25 am
I can see a way of doing this in SSIS, if the output is a text file with one or two columns.
What I'm visualizing is rows in the data flow that have both the header and detail data in each DF row. Toward the end of the data flow, there would be something to detect whether the current row has a new header or not. If it's a new header, follow the path that has the header output logic. If it's not a new header, follow the path that has the detail output logic.
Each row written to file would be all the required data fields, concatenated with the desired delimiters.
Sounds simple? Implementation might be a bit harder than you might expect.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help
January 11, 2016 at 10:54 am
Alvin Ramard (1/11/2016)
I can see a way of doing this in SSIS, if the output is a text file with one or two columns.
What I'm visualizing is rows in the data flow that have both the header and detail data in each DF row. Toward the end of the data flow, there would be something to detect if the current row has a new header or not. If it's a new header, follow the path that has the header output logic. If it's not a new header, follow the path that has the details output logic.
Each row written to file would be all the required data fields, concatenated with the desired delimiters.
Sounds simple? Implementation might be a bit harder than you might expect.
Yes, I can see how this would work. Hours of debugging fun await.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 11, 2016 at 11:02 am
For SSIS I was thinking more along the lines of this:
1. Execute SQL Task gets all IDs needing to go into the file and stores in SSIS variable of type Object (an ADO.NET DataSet under the covers).
2. Precedence Constraint to Foreach Loop Container iterates over SSIS Variable just populated. In Foreach:
2.1. Data Flow Task writes header data to file utilizing the ID from the outer loop.
2.2. Precedence Constraint from last Data Flow Task to new Data Flow Task that writes details rows to file utilizing the ID from the outer loop. The destination will use a different File Connection Manager pointed to same physical file.
Done.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 11:07 am
Orlando Colamatteo (1/11/2016)
For SSIS I was thinking more along the lines of this:
1. Execute SQL Task gets all IDs needing to go into the file and stores in SSIS variable of type Object (an ADO.NET DataSet under the covers).
2. Precedence Constraint to Foreach Loop Container iterates over SSIS Variable just populated. In Foreach:
2.1. Data Flow Task writes header data to file utilizing the ID from the outer loop.
2.2. Precedence Constraint from last Data Flow Task to new Data Flow Task that writes details rows to file utilizing the ID from the outer loop. The destination will use a different File Connection Manager pointed to same physical file.
Done.
This is an elegant idea. There is a chance, I suspect, that the destination file will get locked by DF (header) such that DF2 (detail) fails. But if not, this is tidy.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 11, 2016 at 11:27 am
If using SSIS, I think it would be easier to first dump the headers and details into separate files then using a script task to merge the two files together.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help
January 11, 2016 at 11:32 am
Phil Parkin (1/11/2016)
Orlando Colamatteo (1/11/2016)
For SSIS I was thinking more along the lines of this:
1. Execute SQL Task gets all IDs needing to go into the file and stores in SSIS variable of type Object (an ADO.NET DataSet under the covers).
2. Precedence Constraint to Foreach Loop Container iterates over SSIS Variable just populated. In Foreach:
2.1. Data Flow Task writes header data to file utilizing the ID from the outer loop.
2.2. Precedence Constraint from last Data Flow Task to new Data Flow Task that writes details rows to file utilizing the ID from the outer loop. The destination will use a different File Connection Manager pointed to same physical file.
Done.
This is an elegant idea. There is a chance, I suspect, that the destination file will get locked by DF (header) such that DF2 (detail) fails. But if not, this is tidy.
Thanks Phil. My design makes two assumptions:
1. The two DFTs* will not execute in parallel.
2. The Flat File Connection Managers will close their respective file handles once they are done writing to the file.
Re: #1 The DFTs should run in sequence because of the Precedence Constraint I have placed between them.
Re: #2 In the first DFT the file will be opened for writing, the header line will be written and then the file will be closed. Immediately after that the second DFT will open the file, write the detail lines associated with the header line and then the file will be closed again. That cycle would repeat until all IDs were processed.
* Just in case: DFT = Data Flow Task; DF = Data Flow, and is sometimes used as shorthand for Data Flow Task
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 11:41 am
Alvin Ramard (1/11/2016)
If using SSIS, I think it would be easier to first dump the headers and details into separate files then using a script task to merge the two files together.
That could definitely work but extends the scope of required knowledge into C# development.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 11:44 am
Orlando Colamatteo (1/11/2016)
Alvin Ramard (1/11/2016)
If using SSIS, I think it would be easier to first dump the headers and details into separate files then using a script task to merge the two files together.
That could definitely work but extends the scope of required knowledge into C# development.
Gained knowledge is not a bad thing. It could easily be the quickest way to get this done.
For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help
January 11, 2016 at 11:48 am
Orlando Colamatteo (1/11/2016)
Phil Parkin (1/11/2016)
Orlando Colamatteo (1/11/2016)
For SSIS I was thinking more along the lines of this:
1. Execute SQL Task gets all IDs needing to go into the file and stores in SSIS variable of type Object (an ADO.NET DataSet under the covers).
2. Precedence Constraint to Foreach Loop Container iterates over SSIS Variable just populated. In Foreach:
2.1. Data Flow Task writes header data to file utilizing the ID from the outer loop.
2.2. Precedence Constraint from last Data Flow Task to new Data Flow Task that writes details rows to file utilizing the ID from the outer loop. The destination will use a different File Connection Manager pointed to same physical file.
Done.
This is an elegant idea. There is a chance, I suspect, that the destination file will get locked by DF (header) such that DF2 (detail) fails. But if not, this is tidy.
Thanks Phil. My design makes two assumptions:
1. The two DFTs* will not execute in parallel.
2. The Flat File Connection Managers will close their respective file handles once they are done writing to the file.
Re: #1 The DFTs should run in sequence because of the Precedence Constraint I have placed between them.
Re: #2 In the first DFT the file will be opened for writing, the header line will be written and then the file will be closed. Immediately after that the second DFT will open the file, write the detail lines associated with the header line and then the file will be closed again. That cycle would repeat until all IDs were processed.
* Just in case: DFT = Data Flow Task; DF = Data Flow, and is sometimes used as shorthand for Data Flow Task
I agree that it should work. But even if it does, repeatedly opening, writing and saving the target file for a lot of IDs must take its toll on performance.
I sense Jeff waiting in the wings, munching his popcorn.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
January 11, 2016 at 11:50 am
Yet another option that still uses SSIS, but puts more of the focus on the T-SQL selection logic, is to generate a resultset in which the header and detail rows are already interleaved, with a single wide column that looks exactly how each line of the file should look, except for maybe some right-space-padding. Similar to what Phil said in his initial post on this thread. This way the SSIS Package would simply be:
1. Data Flow Task selects data where the header and detail lines are already interleaved and each row contains a single wide column holding one line of the file. The data would then be laid down in the file as it comes out of the database. If the header and detail lines were different total widths, e.g. headers are 200 characters wide and details are 300 characters wide, then the Flat File Connection Manager would need to be set up with Ragged Right formatting.
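As a rough illustration of that interleaved resultset, a T-SQL sketch along these lines could feed the single-column Data Flow. All table and column names here are invented for the example, and the widths are arbitrary; the real query would use the actual schema and layout spec:

```sql
-- Hypothetical OrderHeader/OrderDetail tables; adjust names and widths.
-- Each branch builds one fixed-width output line; the ORDER BY makes
-- every header sort immediately before its own detail lines.
SELECT s.OutputLine
FROM (
    SELECT h.OrderID,
           0 AS LineSort, -- headers sort first within each OrderID
           CAST('H'
                + RIGHT(REPLICATE('0', 10) + CAST(h.OrderID AS varchar(10)), 10)
                + LEFT(h.CustomerName + REPLICATE(' ', 189), 189)
                AS varchar(300)) AS OutputLine
    FROM dbo.OrderHeader AS h
    UNION ALL
    SELECT d.OrderID,
           d.LineNumber,  -- details follow their header in line order
           CAST('D'
                + RIGHT(REPLICATE('0', 10) + CAST(d.OrderID AS varchar(10)), 10)
                + LEFT(d.ItemCode + REPLICATE(' ', 20), 20)
                + RIGHT(REPLICATE('0', 9) + CAST(d.Quantity AS varchar(9)), 9)
                AS varchar(300)) AS OutputLine
    FROM dbo.OrderDetail AS d
) AS s
ORDER BY s.OrderID, s.LineSort;
```

If the header and detail lines end up different widths, only the Ragged Right setting on the Flat File Connection Manager changes; the query stays the same.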
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 11:57 am
No matter which output route you take, why not simply create a set with
key (order id?)
line (order line number/row_number())
data (nvarchar(max) fixed width data)
Insert the headers, all with [line] = 0
Insert the details, all with [line] > 0
Output [data] to text file, order by [key], [line]
If you need file headers or order trailers/file trailer, they are simple enough to include in the set.
Oh, and archive this set with the date/time stamp so that you can recreate the files they accidentally lose without having to re-query the source data.
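That set could be sketched in T-SQL something like this. All object names are invented for illustration, and the line-building expressions stand in for the real fixed-width formatting:

```sql
-- One archived row per output line; LineNo 0 = header, > 0 = detail.
CREATE TABLE dbo.FileExportLine (
    ExportKey int           NOT NULL,  -- e.g. order id
    LineNo    int           NOT NULL,  -- 0 = header, > 0 = detail
    LineData  nvarchar(max) NOT NULL,  -- fixed-width output line
    CreatedAt datetime2(0)  NOT NULL DEFAULT SYSDATETIME()  -- for re-creating lost files
);

-- Insert the headers, all with LineNo = 0
INSERT dbo.FileExportLine (ExportKey, LineNo, LineData)
SELECT h.OrderID, 0,
       LEFT(CAST(h.OrderID AS nvarchar(10)) + h.CustomerName
            + REPLICATE(N' ', 200), 200)
FROM dbo.OrderHeader AS h;

-- Insert the details, all with LineNo > 0
INSERT dbo.FileExportLine (ExportKey, LineNo, LineData)
SELECT d.OrderID,
       ROW_NUMBER() OVER (PARTITION BY d.OrderID ORDER BY d.LineNumber),
       LEFT(d.ItemCode + CAST(d.Quantity AS nvarchar(10))
            + REPLICATE(N' ', 200), 200)
FROM dbo.OrderDetail AS d;

-- Output to the text file in file order
SELECT LineData
FROM dbo.FileExportLine
ORDER BY ExportKey, LineNo;
```

File headers or trailers would just be extra rows with LineNo values that sort before or after everything else.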
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 11, 2016 at 11:59 am
Phil Parkin (1/11/2016)
Orlando Colamatteo (1/11/2016)
Phil Parkin (1/11/2016)
Orlando Colamatteo (1/11/2016)
For SSIS I was thinking more along the lines of this:
1. Execute SQL Task gets all IDs needing to go into the file and stores in SSIS variable of type Object (an ADO.NET DataSet under the covers).
2. Precedence Constraint to Foreach Loop Container iterates over SSIS Variable just populated. In Foreach:
2.1. Data Flow Task writes header data to file utilizing the ID from the outer loop.
2.2. Precedence Constraint from last Data Flow Task to new Data Flow Task that writes details rows to file utilizing the ID from the outer loop. The destination will use a different File Connection Manager pointed to same physical file.
Done.
This is an elegant idea. There is a chance, I suspect, that the destination file will get locked by DF (header) such that DF2 (detail) fails. But if not, this is tidy.
Thanks Phil. My design makes two assumptions:
1. The two DFTs* will not execute in parallel.
2. The Flat File Connection Managers will close their respective file handles once they are done writing to the file.
Re: #1 The DFTs should run in sequence because of the Precedence Constraint I have placed between them.
Re: #2 In the first DFT the file will be opened for writing, the header line will be written and then the file will be closed. Immediately after that the second DFT will open the file, write the detail lines associated with the header line and then the file will be closed again. That cycle would repeat until all IDs were processed.
* Just in case: DFT = Data Flow Task; DF = Data Flow, and is sometimes used as shorthand for Data Flow Task
I agree that it should work. But even if it does, repeatedly opening, writing and saving the target file for a lot of IDs must take its toll on performance.
It could but it may not matter much in the grand schema of things. The good news is that the solution should scale linearly.
I sense Jeff waiting in the wings, munching his popcorn.
Quite possibly 🙂 This is partially why I threw out the option of shaping the resultset entirely in T-SQL and just having SSIS write that to a flat file as-is. That solution requires a lot more effort in the T-SQL area but should perform admirably.
I'm happy to take SSIS, C# and T-SQL to compete with anyone using xp_cmdshell, T-SQL and bcp, any day 🙂
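For comparison, the xp_cmdshell/bcp route being referred to would look roughly like this, assuming the interleaved, pre-ordered lines already exist in a table or view. The server, database, table and path names are all made up, and xp_cmdshell must be enabled first:

```sql
-- bcp writes the pre-ordered lines straight to the target file.
-- -c = character mode, -T = trusted connection, -S = server name.
DECLARE @cmd varchar(4000) =
      'bcp "SELECT LineData FROM MyDb.dbo.FileExportLine '
    + 'ORDER BY ExportKey, LineNo" '
    + 'queryout "C:\Exports\HeaderDetail.txt" -c -T -S MyServer';
EXEC master.dbo.xp_cmdshell @cmd;
```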
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 12:03 pm
Oh, and archive this set with the date/time stamp so that you can recreate the files they accidentally lose without having to re-query the source data.
Why not just archive the generated file? I am thinking filer space would be much cheaper than database space for archiving and if a replacement file were needed it's right there. If there was a data correction you'd likely have to re-pull from the transactional source anyway.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
January 11, 2016 at 12:05 pm
bpowers (1/11/2016)
Well after submitting the SSIS file for testing I was notified that the Header and Detail must be grouped together. Ughhh... If you still have a bcp option you'd be willing to work with me I would greatly appreciate it. Also, xp_cmdshell can be used as I will simply turn it on and off during the execution. Thanks
Whether using SSIS or something else, now would be a great time to post some example header and detail rows as you'd like to see them. Also, when you say "flat file", do you really mean a CSV/TSV or fixed field format (with the understanding that header and detail rows will be different)?
--Jeff Moden
Change is inevitable... Change for the better is not.