Any Idea How to Parse a File Like This?

  • I am getting a single file with the following contents and I need to parse it into different tables. Each table is introduced by a line that contains =====. The first line below that contains the field headers, and the lines below that will eventually contain the data. Then the whole thing repeats.

    ===== Domain Name

    domain name~stage of life~stage of life ends~expiry date~ROID~Auth Info

    ===== Domain Contacts

    domain name~contact type~contact id

    ===== Domain Status

    domain name~status flag~status value

    ===== Domain Hosts

    domain name~host sequence~host name

    ===== Contact Details

    contact id~name~organization~email~phone~language~ROID~CPR category~whois privacy

    ===== Contact Postal Address

    contact id~name~organization~street1~street2~street3~city~province(or state)~postal code~country_code

    ===== Domain Name

    domain name2~stage of life2~stage of life ends2~expiry date2~ROID2~Auth Info2

    ===== Domain Contacts

    domain name2~contact type2~contact id2

    ===== Domain Status

    domain name2~status flag2~status value2

    ===== Domain Hosts

    domain name2~host sequence2~host name2

    ===== Contact Details

    contact id2~name2~organization2~email2~phone2~language2~ROID2~CPR category2~whois privacy2

    ===== Contact Postal Address

    contact id2~name2~organization2~street1-2~street2-2~street3-2~city2~province(or state)2~postal code2~country_code2

  • You have a file with many tables and non-uniform delimiters and no target schema generated yet?

    There is no graceful way to handle that.

  • I believe a script component is the only way to go. I hope you know VB or C#.



    Alvin Ramard
    Memphis PASS Chapter

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help

  • I'm with Alvin (as usual) - Script Component with multiple outputs to your various different tables.
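For anyone who wants to see the logic before writing the Script Component, here is a minimal sketch in Python rather than VB or C#. It is only an illustration under some loud assumptions: every ===== section starts with its own header line, the rows after it are data, and a repeated table name means "append to the same table". The function name `parse_sections` and the sample rows are made up for the example; the poster has not shown real data yet.

```python
from collections import defaultdict

SECTION_MARKER = "====="   # marker described in the original post
FIELD_DELIMITER = "~"      # delimiter used in the posted sample


def parse_sections(lines):
    """Group rows by table name: {name: {"header": [...], "rows": [[...]]}}.

    Assumption (not confirmed by the poster): the first non-blank line
    after each marker is a header, and everything after it is data.
    """
    tables = defaultdict(lambda: {"header": None, "rows": []})
    current = None          # table that the current section feeds
    expect_header = False   # True until this section's header is consumed
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # blank lines carry no data
        if line.startswith(SECTION_MARKER):
            current = tables[line[len(SECTION_MARKER):].strip()]
            expect_header = True
        elif current is None:
            continue                      # ignore text before the first marker
        elif expect_header:
            if current["header"] is None:
                current["header"] = line.split(FIELD_DELIMITER)
            expect_header = False         # repeated headers are skipped
        else:
            current["rows"].append(line.split(FIELD_DELIMITER))
    return dict(tables)


# Hypothetical sample data, invented for this sketch.
sample = """\
===== Domain Name
domain name~stage of life~stage of life ends~expiry date~ROID~Auth Info
example.test~active~~2011-10-07~R1~secret

===== Domain Hosts
domain name~host sequence~host name
example.test~1~ns1.example.test
example.test~2~ns2.example.test

===== Domain Name
domain name~stage of life~stage of life ends~expiry date~ROID~Auth Info
example2.test~active~~2012-01-01~R2~secret2
"""
tables = parse_sections(sample.splitlines())
```

In a Script Component each table would map to its own output, and the `rows.append` step would become an `AddRow` on the matching output buffer. If the repeated sections turn out not to repeat the header, the `expect_header` handling is the part that has to change.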

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil, I think you and I might make an interesting team. 🙂




  • Seems like we've both spent enough time doing this stuff to know what works, Mr One Million 😀


  • Thanks very much, that's what I suspected but wanted to confirm before moving ahead with it.

  • Phil Parkin (10/7/2010)


    Seems like we've both spent enough time doing this stuff to know what works, Mr One Million 😀

    😀

    You understand that I didn't post a million all by myself, right? 😛




  • The table names repeat (Domain Name etc.) as you scroll down the file, as per the description provided.

    Raunak J

  • Raunak Jhawar (10/7/2010)


    The table names repeat (Domain Name etc.) as you scroll down the file, as per the description provided.

    And?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ramses2nd (10/7/2010)


    I am getting a single file with the following contents and I need to parse it into different tables. Each table is introduced by a line that contains =====. The first line below that contains the field headers, and the lines below that will eventually contain the data. Then the whole thing repeats.


    So post an example that has data in it so we can try stuff out. 😉

    Also, should "Domain Name" be a table name of [Domain Name] or is it a schema.tablename combination as in Domain.Name?

    Last but not least, your life would be a whole lot easier if these were all separate files. Can you not change them?
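If the sender can't change the format, the file could be split on your side before loading. The following is a rough sketch in Python, not a tested solution for the real feed: it assumes each ===== section repeats its header line, keeps the header only once per table, and writes one file per table. The function name `split_by_table` and the underscore file naming are invented for the example.

```python
from pathlib import Path

SECTION_MARKER = "====="   # marker described in the original post


def split_by_table(lines, out_dir):
    """Write one text file per table name and return {name: Path}.

    Assumption (not confirmed by the poster): the first non-blank line
    of every section is a header, so it is dropped on repeat sections.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    collected = {}        # table name -> lines kept for that table
    current = None
    skip_header = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(SECTION_MARKER):
            name = line[len(SECTION_MARKER):].strip()
            skip_header = name in collected   # header was kept earlier
            current = collected.setdefault(name, [])
        elif current is not None:
            if skip_header:
                skip_header = False           # drop the repeated header
            else:
                current.append(line)
    paths = {}
    for name, content in collected.items():
        # Hypothetical naming scheme: "Domain Name" -> Domain_Name.txt
        path = out_dir / (name.replace(" ", "_") + ".txt")
        path.write_text("\n".join(content) + "\n", encoding="utf-8")
        paths[name] = path
    return paths
```

Each resulting file then has a single header row and a single delimiter, so an ordinary flat file source or bulk load can handle it.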

