Filter data without using Conditional Split

  • Hi,

    I am writing an SSIS package that imports data from a text file into SQL Server.

    I have a data flow for that with a flat file source and an oledb destination.

    This text file has data for multiple countries and the dataflow needs to pick up the data only for a particular country. I have a variable for specifying the country which will be passed as a parameter at run time.

    If I use conditional split to pick up just the data for that particular country (looking up the variable value), it will anyways read the whole text file first and then apply the conditional split. Is this correct or I am misunderstanding the conditional split transformation? Also, is there anyway I can actually just filter the data in my text file the way we can filter from an oledb source?

    Thanks,

  • PGV (4/12/2012)


    Hi,

    I am writing an SSIS package that imports data from a text file into SQL Server.

    I have a data flow for that with a flat file source and an oledb destination.

    This text file has data for multiple countries and the dataflow needs to pick up the data only for a particular country. I have a variable for specifying the country which will be passed as a parameter at run time.

    If I use conditional split to pick up just the data for that particular country (looking up the variable value), it will anyways read the whole text file first and then apply the conditional split. Is this correct or I am misunderstanding the conditional split transformation? Also, is there anyway I can actually just filter the data in my text file the way we can filter from an oledb source?

    Thanks,

    It will read the whole text file, it has to in order to visit all records, inspecting the country field along the way, however the whole file does not need to be read into memory at one time before the first "split" evaluation occurs.

    As SSIS reads the file it will pass each record down the pipeline as soon as it knows it qualifies for one of the outputs on the Conditional Split. Unless there is a Sort Task later in your pipeline (or some other task requiring a "volume of data" operation) the record will continue down the pipeline while the reading of the file is continuing.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • You can write a SQL statement against a text file using the JET OLE DB provider, but it needs some clever scripting.

    An example:

    Importing CSV file into Database with Schema.ini

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thank you both for your replies.

    I ended up using the conditional split itself, because I realised that its not that bad after all 🙂

    Yes, I also noticed that it will read the file and will pass the record down the pipeline as soon as it qualifies for the split condition, which is good.

    And my main file has around 800,000 records out of which the split results into 70210 records. I also have a further conditional split transformation which redirects these 70210 records to 3 different destinations and all this finished executing in 22 seconds which I feel is not bad.

    I know conditional split much better now 🙂 Thanks!

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply