February 24, 2011 at 10:43 am
Example:
declare @test2 table
(
col2 varchar(15), col3 numeric(12,2)
)
insert into @test2
select 'File1', 15 union all
select 'File1', 16 union all
select 'File1', 17 union all
select 'File2', 18 union all
select 'File2', 19 union all
select 'File2', 10 union all
select 'File3', 11 union all
select 'File3', 12 union all
select 'File3', 13 union all
select 'File3', 14 union all
select 'File4', 17
select * from @test2
Given this example (in reality it will end up being around 30k files), is there a simple (not ridiculously complicated) way to direct this single select into several files?
My first thought was cascading foreach loops, but that would take a huge amount of time to run, and as I'm new to the SSIS world, I'm kinda clueless on this.
Any thoughts appreciated.
--
Thiago Dantas
@DantHimself
February 24, 2011 at 10:46 am
Strange requirement!
Can you do it just once and then do a file copy to the others - rather than running the data transfer over and over again?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
February 24, 2011 at 11:01 am
If you're doing this in the Data Flow (which you should), then you can use the Multicast component to make "copies" of the data flow - one in, two (or more) out - and send those to independent destinations.
Todd McDermid - SQL Server MVP, MCTS (SQL 08 BI), MCSD.Net
My Blog - Dimension Merge SCD Component for SSIS - SSIS Community Tasks and Components
February 24, 2011 at 11:07 am
tmcdermid (2/24/2011)
If you're doing this in the Data Flow (which you should), then you can use the Multicast component to make "copies" of the data flow - one in, two (or more) out - and send those to independent destinations.
Sounds good, except I will need *many* destinations!
The way I see this, and please correct me if I'm wrong, it would mean hardcoding the destinations, which is not possible.
@Phil
You mean extracting the dataset to a single file and iterating over it to generate the other files? Not a bad idea, really.
Any more ideas? Anyone?
--
Thiago Dantas
@DantHimself
February 24, 2011 at 11:32 am
Pretty much what I meant.
Where are you getting the destinations from?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
February 24, 2011 at 12:10 pm
I apologize, I thought it was clear. The destinations come from the query.
Take a look at the example. What I need (with this example) is:
filename: file1.csv
contents:
File1,15.00
File1,16.00
File1,17.00
filename: file2.csv
contents:
File2,18.00
File2,19.00
File2,10.00
filename: file3.csv
contents:
File3,11.00
File3,12.00
File3,13.00
File3,14.00
filename: file4.csv
contents:
File4,17.00
--
Thiago Dantas
@DantHimself
February 24, 2011 at 12:14 pm
Ah, in that case you'll need a conditional split to split out the data flow to the right destinations.
However, if you have a lot of destinations, it can be a PITA to configure it all.
So maybe you can create some script task in .NET to handle it dynamically?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
February 24, 2011 at 12:23 pm
Hmmm, wasn't clear to me! But now it is and my suggestion was not valid.
My first idea would be to use an Execute SQL task to populate an object variable with the results of
Select distinct filename
from ...
Now you have something to work with.
Next task is a foreach loop which iterates around ('shreds') the object variable.
Inside the foreach loop is a dataflow task which creates the files one at a time.
Your source query within the dataflow will be an expression
select * from ... where filename = object variable (can't remember the syntax, sorry)
and the destination path will also be an expression using the object variable.
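Something along these lines - syntax not guaranteed, and the table and variable names are only placeholders:

-- Execute SQL task (Full result set) loading the list of file names into an object variable:
select distinct col2 as filename
from dbo.YourSourceTable

-- Expression for the source query inside the data flow (e.g. on the source's SqlCommand
-- property or on a string variable), using a variable filled by the foreach loop:
"select col2, col3 from dbo.YourSourceTable where col2 = '" + @[User::FileName] + "'"

-- Expression on the flat file connection manager's ConnectionString:
@[User::OutputFolder] + "\\" + @[User::FileName] + ".csv"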
This is quite advanced for an SSIS newbie 🙂
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
February 24, 2011 at 12:29 pm
Phil Parkin (2/24/2011)
Hmmm, wasn't clear to me! But now it is and my suggestion was not valid.
My first idea would be to use an Execute SQL task to populate an object variable with the results of
Select distinct filename
from ...
Now you have something to work with.
Next task is a foreach loop which iterates around ('shreds') the object variable.
Inside the foreach loop is a dataflow task which creates the files one at a time.
Your source query within the dataflow will be an expression
select * from ... where filename = object variable (can't remember the syntax, sorry)
and the destination path will also be an expression using the object variable.
This is quite advanced for an SSIS newbie 🙂
That was kinda how I thought of it at first, but isn't a foreach loop like this pretty slow?
Speed is not really required here, but I don't want it running for hours either.
--
Thiago Dantas
@DantHimself
February 24, 2011 at 12:36 pm
One way or another, you've got to create all of those files. Given that you need the destination to be dynamic, I can't think of an easy way to make it run in parallel - I guess you could break the files into ranges and have multiple foreach loops - one for each range - to introduce some parallelism. Not nice though.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
February 25, 2011 at 2:04 pm
The way I'm doing it right now is a foreach loop, only to loop through the 5 databases I need to query.
The query also returns parts of the destination folder, to ease up my work.
ADO.NET source with the source query and a Script Component as the destination.
Inside the Script Component, in the ProcessInputRow method, I create a new IO.StreamWriter with the assembled destination, build the row with a Text.StringBuilder, call WriteLine on the StreamWriter, and close it.
Anything else I can do to speed this up? (I'm even newer to this kind of programming, and the company develops in Java :-P)
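For reference, here is the general shape of the pattern as a simplified, stand-alone C# sketch (class name, folder path and column names are made up). The one change most likely to help is caching one StreamWriter per destination file instead of creating and closing a writer on every row:

// Sketch only: the writer-caching pattern, outside of the SSIS plumbing.
// In the Script Component you would call WriteRow from Input0_ProcessInputRow
// and CloseAll from PostExecute.
using System;
using System.Collections.Generic;
using System.IO;

public class CsvWriterCache
{
    private readonly string outputFolder;
    private readonly Dictionary<string, StreamWriter> writers =
        new Dictionary<string, StreamWriter>(StringComparer.OrdinalIgnoreCase);

    public CsvWriterCache(string outputFolder)
    {
        this.outputFolder = outputFolder;
    }

    // Reuse one StreamWriter per destination file instead of
    // opening and closing a writer for every incoming row.
    public void WriteRow(string fileName, string col2, decimal col3)
    {
        StreamWriter writer;
        if (!writers.TryGetValue(fileName, out writer))
        {
            writer = new StreamWriter(Path.Combine(outputFolder, fileName + ".csv"));
            writers.Add(fileName, writer);
        }
        writer.WriteLine("{0},{1:0.00}", col2, col3);
    }

    // Call once, after the last row, to flush and release the file handles.
    public void CloseAll()
    {
        foreach (StreamWriter writer in writers.Values)
        {
            writer.Close();
        }
        writers.Clear();
    }
}

If the source query were ordered by the filename column, you could instead close each writer as soon as the filename changes, which keeps only one file handle open at a time; the cache approach works either way.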
--
Thiago Dantas
@DantHimself
February 25, 2011 at 2:40 pm
Just got error 0xC0047020 (unable to process the data) on the ADO.NET source, followed by error 0xC02090F5 (something about the PrimeOutput method in the message).
I have no idea what these mean and Google hasn't been very useful =|
Does anyone know what these are?
--
Thiago Dantas
@DantHimself