Parse txt file

  • I need some help trying to parse a .txt file and capturing certain fields to be able to dump that information into a SQL table for analysis and reporting.

    This is what the .txt looks like and I have defined the fields I want captured, so basically each time it finds this line dump out everything to the right of the field catcher.

    1. This is how each record read (capture) would start looking for this line:

      /* ----------------- ????????? ----------------- */

    2. Then it would try to find these tags:  Notice on the first lookup insert_job: and job_type: are on the same line. All other field catchers are on a single line.

    Capture tags in BOLD:

    insert_job: XXXXXXXXX job_type: XXX

    command: XXXXXXX

    days_of_week: xx,xx,xx

    start_times: "xx:xx"

    run_calendar:

    start_mins:

    machine:

    watch_file:

    box_name:

    3. When it finds any of these tags take everything right of the colon(:) load to variable then insert those values to SQL table.

    4. The lookup ends when it finds and blank line during lookup(extraction,) or the next:

    /* ----------------- ????????? ----------------- */

    5. I tried to show how records are parsed using 1st record -- 2nd record

    6. Not all records will contain the fields to capture, so those values can be filled with null value

    7. SQL table and field mapping to text file:

    SQL               Lookup

    jobname -- Insert_Job

    job_type -- job_type

    dayofweek  -- days_of_week

    starttime -- start_times

    runcalendar -- run_calendar

    startmins -- statr_mins

    machine -- machine

    watchfile -- watch_file

    boxname -- box_name

    Many thanks for any help you can provide.

    /* ----------------- 1st record ----------------- */ 

    insert_job: XXXXXXXXX job_type: XXX
    command: XXXXXXX
    days_of_week: xx,xx,xx
    start_times: "xx:xx"
    run_calendar:
    start_mins:
    machine:
    watch_file:
    box_name:

    /* ----------------- 2nd record ----------------- */
    ...
    ...
    ...
    ...
  • Some real or better-simulated data for the file is absolutely necessary for us to help you here or the answer will be either to create a BCP format file or import the whole row and do a  computational "Hail Mary".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well this should get you an array with all the records in it in a PSObject format, from there you would need to go through it build your insert statements.

     

    $output = @()
    $currentCapture = $null
    $startCapture = $null

    $content = (Get-Content "C:\Temp\testy.txt")
    $content = $content -replace 'job_type:', "rnjob_type:"

    ForEach($line in ($content -split "rn"))
    {
    If($line -match '/\* -----------------.*----------------- \*/')
    {
    $startCapture = $true
    if($currentCapture -ne $null)
    {
    $output += (New-Object PSObject -property $currentCapture)
    }
    $currentCapture = $null
    }
    ElseIf($line.Trim().length -eq 0 -and $startCapture -eq $true -and $currentCapture -ne $null)
    {
    $startCapture = $false
    }
    ElseIf($line.Trim().length -gt 0 -and $startCapture -eq $true)
    {
    If($currentCapture -eq $null)
    {
    $currentCapture = @{}
    }
    $index = $line.IndexOf(":")
    $key = $line.Substring(0, $index)
    $value = $line.Substring($index + 1, $line.Length - ($index + 1))
    $currentCapture.Add($key.ToString(), $value.ToString())
    }
    }

    if($currentCapture -ne $null)
    {
    $output += (New-Object PSObject -property $currentCapture)
    }

    ForEach($record in $output)
    {
    $record | Format-Table *
    }
  • cool thanks... I'll give it a test against my file..

  • Exception calling "Substring" with "2" argument(s): "Length cannot be less than zero.

    Parameter name: length"

    At C:\PowerShell\autosys_parse_jobs.ps1:30 char:9

    + $key = $line.Substring(0, $index)

    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException

    + FullyQualifiedErrorId : ArgumentOutOfRangeException

    Exception calling "Add" with "2" argument(s): "Item has already been added. Key in dictionary: 'command' Key being added: 'command'"

    At C:\PowerShell\autosys_parse_jobs.ps1:32 char:9

    + $currentCapture.Add($key.ToString(), $value.ToString())

    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException

    + FullyQualifiedErrorId : ArgumentExceptionIt threw this error:

     

     

  • Ok... no actual data.  Moving on.  Good luck. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry Jeff,    --- Thanks ZZartin for sample... I did include error messages from run

    To you both sorry for not including a sample file to work from.. This sample has some of the tags in my example and others missing which is a good example because to all sections will contain ALL tags.

     

     

    Attachments:
    You must be logged in to view attached files.
  • ZZartin, after I created my sample file that I attached, and ran your sample script against it worked, and dumped the tags out like expected.

    The file contains CRLF for each line in the file.. would that be the issue?

    Thanks again..

  • Can't seem to get original file to process gets errors listed above.. sample file will process.

     

    Any thoughts or suggestions?

     

    Thanks again

  • Bruin wrote:

    Can't seem to get original file to process gets errors listed above.. sample file will process.

    Any thoughts or suggestions?

    Thanks again

     

    At this point without seeing the real file and given that the sample file does process......

  • Bruin wrote:

    Can't seem to get original file to process gets errors listed above.. sample file will process.

    Any thoughts or suggestions?

    Thanks again

    I have some suggestions:

    1. Compare the files, understand the differences and determine the reason for the failure based on those differences.
    2. Ask people here what the problem might be, without providing the error message or the file which caused the error. We like to undertake blind-as-a-bat challenges.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Bruin wrote:

    Can't seem to get original file to process gets errors listed above.. sample file will process.

    Any thoughts or suggestions?

    Thanks again

     

    Instead of making us go back an unknown number of posts to find them, why not just post them again.  You're looking right at them. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How could original script posted by zzartin, be modified to just pickup these tags during extraction? Currently it's finding ALL which do some testing with the file might be the issue.

    Thanks.

    Insert_Job

    job_type

    days_of_week

    start_times

    run_calendar

    start_mins

    machine

    watch_file

    box_name

  • When you're pulling the records out of the array to do the inserts you should be able to just pull the fields you need.

  • Sorry maybe what I'm asking is if your original script could only find tags listed rather than reading entire record set.

    When I did some samples it maybe running into an issue getting things other than these:

    Insert_Job

    job_type

    days_of_week

    start_times

    run_calendar

    start_mins

    machine

    watch_file

    box_name

     

Viewing 15 posts - 1 through 15 (of 55 total)

You must be logged in to reply to this topic. Login to reply