March 20, 2021 at 5:19 pm
As I pointed out in my other posts, that check for 'unicode' is not the best option. The best option is to check for the specific BOM bytes of each encoding, and then GetString the data from the retrieved bytes using the matching encoding.
To compare the bytes, use Compare-Object as I outlined above, and specifically look for those encodings that have a defined BOM. At a minimum, check for Unicode and UTF-8 and default to ASCII - that will be much safer in the long run.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
March 22, 2021 at 7:44 pm
The records I have in Inbound won't change, and the script made it all the way through the sub-folders without hitting any errors. I'm going to try running it with a search against the Outbound folder now and see if it has any trouble. The X12 standard files seem okay with the current script, but maybe the EDIFACT files are another story... How can I add a file counter and maybe a "Searching" message to let the user know something is happening and the script hasn't stalled?
Thanks for your posts and suggestions..
March 22, 2021 at 8:02 pm
I provided examples of how to work with the BOM - I strongly recommend you use that approach instead of the earlier hack I provided that just checks for Unicode.
You can add anything you need; it is simply a matter of writing output to the host or to the currently defined output. I already showed how to output to the host using Write-Host in this block:
# Check for at least one parameter selected
if ($Sender -eq "" -and $FileDate -eq "" -and $RecordType -eq "") {
    Write-Host -ForegroundColor Yellow "At least one parameter must be selected. Please try again.";
    Exit;
}
You can also use Write-Output instead, which writes to the currently defined output (e.g. stdout). To build the message you can use concatenation, or my preference: "Searching Message: $($fileName)"
The $(...) syntax tells PowerShell to evaluate the expression and substitute its value into the string.
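As a quick illustration, here is a minimal sketch of both styles side by side (the $fileName value is just a made-up example):

```powershell
# Both styles produce the same message; $fileName here is an example value
$fileName = 'or1998873csh.int';

# Concatenation
Write-Output ('Searching Message: ' + $fileName);

# Subexpression inside a double-quoted string - PowerShell evaluates $($fileName)
Write-Output "Searching Message: $($fileName)";
```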
Jeffrey Williams
March 23, 2021 at 5:14 pm
You were correct about the encoding of the files - it hit other files in the folders and gave error messages. I know you gave examples of how to check, but I'm not sure how to insert that logic into the code. Will checking each file really slow down the process?
Thanks...
March 23, 2021 at 5:24 pm
If it hits a file that doesn't match the criteria check, instead of dropping a message out to ISE, could it just log that file and continue searching without throwing errors to the screen?
I really appreciate your responses and comments...
March 23, 2021 at 6:34 pm
I am not sure what you expect - I have provided more than enough examples to get you what you need.
From my previous post:
Here are some hints:
# Encoding check arrays
[byte[]]$utf7 = 43,45;
[byte[]]$unicode = 255,254;
[byte[]]$utf8 = 239,187,191;

if (-not (Compare-Object $bytes[0..1] $unicode)) {
    $offset = 1;
    Write-Host 'Unicode encoded file identified';
    $fileData = [System.Text.Encoding]::Unicode.GetString($bytes);
}
This can be extended using:
if (condition) {
    <code here>
}
elseif (condition) {
    <code here>
}
elseif (condition) {
    <code here>
}
else {
    <default code here>
}
If you want to write to the host running the process use Write-Host. If you want to write to an output file - there are several methods available: Out-File, Export-Csv, redirect stdout (then use Write-Output)
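A minimal sketch of a couple of those options (the message text and log file location are assumptions for illustration):

```powershell
$message = 'Searching: or1998873csh.int';   # example message text

Write-Host $message;       # writes to the host console only
Write-Output $message;     # writes to the current output stream (can be redirected)

# Append the message to a log file - the path here is just an example
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'search.log';
$message | Out-File -FilePath $logPath -Append;
```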
Note: to check for UTF-7 files we need to assume the file is an EDI file and that the first 3 characters are ISA. If we assume that, then we know what the 4th and 8th characters will be if the file is encoded with UTF-7 (see previous posts): the UTF-7 escape sequence for the element separator starts with + and ends with -. To perform the check you need two Compare-Object statements - one for the + check and one for the - check.
Try to put this together - and if you run into problems, post the code you are running and where you are having issues.
Jeffrey Williams
March 25, 2021 at 1:21 am
Okay, so I dumped my search folders to a file using Out-File and found that I have 3 different types of encoding: most are ASCII, some are UTF-8, and some are blank (corrupted files that can be skipped).
Now I need some help to call this function and, based on the returned encoding, search the file for the parameters entered:
function Get-FileEncodingv2($Path) {
    $bytes = [byte[]](Get-Content $Path -Encoding byte -ReadCount 300 -TotalCount 300)
    if (!$bytes) { return 'utf8' }
    switch -regex ('{0:x2}{1:x2}{2:x2}{3:x2}' -f $bytes[0],$bytes[1],$bytes[2],$bytes[3]) {
        '^efbbbf'   { return 'utf8' }
        '^2b2f76'   { return 'utf7' }
        '^fffe'     { return 'unicode' }
        '^feff'     { return 'bigendianunicode' }
        '^0000feff' { return 'utf32' }
        default     { return 'ascii' }
    }
}
dir -Path 'C:\temp\EDI\Temp' -File |
    select Name,@{Name='Encoding';Expression={Get-FileEncodingv2 $_.FullName}} |
    ft -AutoSize
Thanks.
March 25, 2021 at 2:59 pm
In PowerShell you can create a function at the top of the script and call it later, put the function in a separate file and dot-source it into your script, or create it as a module and import the module... among other methods. However, that isn't needed here - you just need to check the BOM after reading the first 500 bytes of data.
First - define the byte arrays for the BOMs:
# Encoding check arrays
[byte[]]$utf7 = 43,45;
[byte[]]$unicode = 255,254;
[byte[]]$utf8 = 239,187,191;
Then - read the first 500 bytes, determine the encoding and decode based on that encoding:
# Get the first 500 bytes from the file
$bytes = Get-Content $fileName -Encoding byte -TotalCount 500 -ReadCount 500;

# Check the encoding of the file - get $fileData based on encoding
if (-not (Compare-Object $bytes[0..1] $unicode)) {
    $offset = 1;
    Write-Host 'Unicode encoded file identified';
    $fileData = [System.Text.Encoding]::Unicode.GetString($bytes);
}
elseif (-not (Compare-Object $bytes[0..2] $utf8)) {
    $offset = 1;
    Write-Host 'UTF-8 encoded file identified.';
    $fileData = [System.Text.Encoding]::UTF8.GetString($bytes);
}
# Note: this assumes the file is an EDI file where the first 3 characters are ISA,
# the 4th character is a + and the 8th is a - (+ACo- = *, +AHw- = |, +AH4- = ~, ...)
elseif (-not (Compare-Object $bytes[3] $utf7[0]) -and -not (Compare-Object $bytes[7] $utf7[1])) {
    $offset = 0;
    Write-Host 'UTF-7 encoded file identified.';
    $fileData = [System.Text.Encoding]::UTF7.GetString($bytes);
}
else {
    Write-Host 'No encoding identified - using default Ascii';
    $fileData = [System.Text.Encoding]::ASCII.GetString($bytes);
}
And finally - parse the records:
# Validate this is an EDI file
if ($fileData.Substring($offset,3) -eq "ISA") {
    # Get the data element separator and segment element separator
    $dataElement = $fileData.Substring(3+$offset,1);
    $segmentElement = $fileData.Substring(105+$offset,1);

    # Split first row based on segment and data element separators - Index = 0
    $firstRow = $fileData.Split($segmentElement)[0].Split($dataElement);

    # If we match the sender and the date - get the second row and check the record type
    if (($firstRow[6].Trim() -eq $Sender -or $Sender -eq "") -and ($firstRow[9] -eq $FileDate -or $FileDate -eq "")) {
        # Get the second row based on the segment and data element separators - Index = 1
        $secondRow = $fileData.Split($segmentElement)[1].Split($dataElement);
        if ($secondRow[1] -eq $RecordType -or $RecordType -eq "") {
            # Copy the file to the new location
            Copy-Item -Path $fileName -Destination "C:\Temp\Archive\$($filename)" -WhatIf;
        }
    }
}
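For context, the snippets above assume $fileName is set one file at a time; a hypothetical driver loop (the search root is an example path, not from the thread) might look like:

```powershell
# Hypothetical driver loop - substitute your own search root
$searchRoot = 'C:\Temp\EDI';

Get-ChildItem -Path $searchRoot -File -Recurse -ErrorAction SilentlyContinue | ForEach-Object {
    $fileName = $_.FullName;
    Write-Host "Searching: $($fileName)";
    # ... run the encoding check and ISA parsing from the snippets above here ...
}
```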
Jeffrey Williams
March 26, 2021 at 12:00 am
Very cool!!! I'm going to run it through the folders and see how it works.
Thanks again - I will report back with results...
March 28, 2021 at 1:27 pm
I just needed to include an exclusion for the corrupt files, but other than that it worked great...
I'm going to add a progress bar to keep the user informed of the search, and maybe a counter for files found.
Thanks for ALL the help and suggestions - it's great when someone extends their scripting skills to someone trying to learn.
March 29, 2021 at 5:06 pm
Happy to help - glad to see you have something that is working now.
I know this won't be extremely fast but it is workable. To get something much faster you would need to change the approach - but that wouldn't be too difficult. You could use this script as a starting point and instead of copying the files at this point, update a table in a database with the key elements - schedule the script to run once a day (for example) - and make sure you have indexes on the key columns.
A second script could then be created to execute a query based on the user parameters - which returns a list of matching values from the table and that script would copy the files.
A possible third script would be a cleanup script - something that runs (as needed or scheduled) that validates all entries in the database. If the entry in the database no longer exists in the file system - delete from the database.
The first script would then search the folders - filtered by last write time - and just add new entries. Or - the first script could rebuild the table each time it runs (eliminating the requirement for a third script).
Many options - but at least you now have something that meets the requirements.
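As a rough sketch of that second script, assuming the SqlServer module's Invoke-Sqlcmd is available - the server, database, table, and column names below are all placeholders made up for illustration, not from the thread:

```powershell
# Hypothetical second script: query the index table, then copy the matching files.
# MyServer, EdiArchive, dbo.EdiFileIndex, Doc_TP, and FullName are made-up names.
$Sender = 'ACME';   # example parameter value
$query = "SELECT FullName FROM dbo.EdiFileIndex WHERE Doc_TP = '$($Sender)'";

if (Get-Command Invoke-Sqlcmd -ErrorAction SilentlyContinue) {
    $results = Invoke-Sqlcmd -ServerInstance 'MyServer' -Database 'EdiArchive' -Query $query;
    foreach ($row in $results) {
        # -WhatIf previews the copy without actually performing it
        Copy-Item -Path $row.FullName -Destination 'C:\Temp\Archive\' -WhatIf;
    }
}
```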
Jeffrey Williams
March 30, 2021 at 7:09 pm
So I created a table in SQL that has many of the key fields:
doc_bu
doc_tp
doc_filename
doc_date
doc_type
The doc_filename column has the file name that the PS script found, plus the pointer on disk. How can I strip out just the file name and do a folder search for that specific file?
Example: restored\or1998873csh.int
I just want to pass or1998873csh.int to the search. It will always be .int as the second part, and I need to search backwards until I find the slash (\).
How could I use the result of that to find the real folder on disk? The example I'm using is a pointer, not the true directory\drive where the file resides.
I was thinking that might save opening up each file for read.
Thanks.
March 30, 2021 at 7:49 pm
How are you updating the table? In PowerShell you can get just the file name using $_.BaseName (returns the file name without extension). Combine that with $_.Extension to get the file name and extension - note that $_.Extension already includes the leading period, so: "$($_.BaseName)$($_.Extension)"
You can get the folder using $_.DirectoryName or $_.Directory - store that in the table also...or, store $_.FullName to get the full path and name where the file exists.
Once you have everything in a table - use Invoke-SqlCmd to execute a query and return the results into a variable. As long as you have the path and file name as a column being returned you can reference that value in a foreach using $_.ColumnNameFromSql
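For the specific question above about stripping the name out of a stored value like restored\or1998873csh.int, a sketch (searching backwards to the last backslash, as described; the search root is an example path):

```powershell
# Value taken from the example in the question above
$docFilename = 'restored\or1998873csh.int';

# Search backwards for the last backslash and take everything after it
$justName = $docFilename.Substring($docFilename.LastIndexOf('\') + 1);   # or1998873csh.int

# (On a FileInfo object you could use $_.Name instead; on a Windows path string,
#  Split-Path -Leaf does the same job.)

# Then search the real folders for that specific file
Get-ChildItem -Path 'C:\Temp\EDI' -Filter $justName -File -Recurse -ErrorAction SilentlyContinue |
    ForEach-Object { $_.FullName };
```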
Jeffrey Williams
April 1, 2021 at 6:47 pm
The table was updated from a SQL script that read the interchange table to pull the information. Now I want to strip the filename from the table and pass that to the PS script to go search the folders and copy the file to an archive.
Thanks
April 6, 2021 at 5:43 pm
How can I take the parms from the script, connect to SQL, and run a query using those parms?
# Check for at least one parameter selected
if ($Sender -eq "" -and $FileDate -eq "" -and $RecordType -eq "") {
    Write-Host -ForegroundColor Yellow "At least one parameter must be selected. Please try again.";
    Exit;
}
Doc_TP = $Sender, Doc_Date = $FileDate, and Doc_Type = $RecordType... and in the query, make sure at least 1 of the 3 parms is populated.
Thanks..
SQL Query..
Select distinct
    Document_Archive.Doc_BU,
    FileLocation,
    tblFileLocations.file_name
from
    tblFileLocations,
    dbo.Document_Archive
where
    tblFileLocations.Bu = Document_Archive.Doc_BU and
    rtrim(tblFileLocations.file_name) = rtrim(Document_Archive.doc_parsed_filename) and
    Doc_TP = Parm and Doc_Date = Parm and Doc_Type = Parm