Update Lookup Table with missing entries and continue with lookup

  • Hi there.

    I have data coming from flat text files which have the parent object as text. In the database I normalize this and use the ParentID in the table to point to the parent object.

    What I want to do now is reading the text file and look up the ParentID and add that to the output. If the parent does not yet exist add it to the table and add the new ParentID to the output.

    That works partially. I have a lookup and have the failed output (those with no lookup entry found) flow out to an aggregate flow, then the a OLE DB destination. New entries are added (only once per object even if there are multiple entries in the source text file).

    On the other hand the ones where a lookup was found the ParentID gets added. But those that weren't found are only ending up in the database but I should have them continue in the normal flow afterwards. How would I do this?

    Regards,

    Marc

  • Marc

    It's very difficult to visualise this. Please will you show us what the structure of your databases table(s) and text file(s) looks like?

    Thanks

    John

  • Textfile looks like this:

    "Filername","AggregateName","SizeAvailable","SizeTotal","SizeUsed","SizePercentageUsed"

    "ch-sto101","aggrfc01","4781005754368","12833602928640","8052597174272","63"

    "ch-sto101","aggrfc02","3251355275264","5133441171456","1882085896192","37"

    "ch-sto101","aggrfc03","2363434860544","6673473544192","4310038683648","65"

    "ch-sto101","aggrfc04","5104901455872","6673473544192","1568572088320","24"

    "ch-sto101","aggrsa01","2988296765440","14374022676480","11385725911040","79"

    "ch-sto101","aggrsa02","1429209382912","9582681784320","8153472401408","85"

    Table Filer

    FilerID PK, int, not null (identity column)

    FilerName nvarchar(255)

    LocationID int,null

    TypeID int,null

    Table Aggregate:

    AggregateID PK, int, not null (identity column)

    AggregateName nvarchar(255)

    ParentID FK, int

    So when a new filer is found it should be added into the Filer Table. Then it should add aggregates into the Aggregate Table with the lookup of the filer put into the ParentID. There is also another file which has volume information. Same principle but the parent is then an AggregateID and not a FilerID.

    When I use a lookup I only get the FilerID of those who already have an entry. When I redirect errors to the output, do an aggregate task (group by) and then a OLE DB destination I get the new filers into the Filer table correctly but those entries that did were not able to lookup correctly will not get added into the Aggregate Table. How should I do this best? Any recommendations?

    REgards,

    Marc

  • Marc

    I've never used a Lookup transformation, but if you look at the editor, you'll see at the top that it "enables simple equi-joins". My guess is that this means it only does inner joins, and this is why you can only see the IDs that are already matched. My recommendation would be to try a Merge Join transformation instead.

    John

  • I'm probably missing something but would a merge join not just give me existing data? I would still have the issue to populate the parent data into the corresponding table first.

  • Not if you use a left (or right) join.

    You can either import from the text file into a staging table, or you can join use the direct input from the text file in your join.

    John

  • Marc,

    If I understand the question, you have the no match add it to the database and you want to get it back into the normal pipeline? If that is right then you can use the UNION ALL component in the data-flow.

    You can use this graphic that I created for a different thread to illustrate:

    CEWII

  • Well, I would say it's a more logical problem. I can think of one approach..

    DFT 1 - Infer All you parentID which are not there in table say dbo.DimParent.

    you need to read same text file with distinct values, Lookup on DimParent and add only those values which are not present.

    after this,

    DFT 2 - Will be your simple load process with lookup on dbo.DimParent table. you will get all your referenceIDs

    Make sense..??

    Thanks,

    Nimesh

  • Hi Elliot.

    Excellent. That was what I was looking for. This allows me to load the file only once. Not several times if I want to populate the Parent Tables first then the real data.

    Thanks a lot for your help.

    Regards,

    Marc

  • You are welcome.

    CEWII

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply