Multithreading to upload .bak files in parallel to S3 storage using PowerShell

  • Greetings -

    I wrote the PowerShell script below to upload my backup files from a local machine to AWS S3 storage, and it works. However, it takes about 3 hours to upload 430 GB of backup files. The code loops through the files one at a time: it reads a file, uploads it, then moves on to the next. Can you please guide me on how to upload files in parallel in PowerShell? I did some searching but could not find a good answer. Thank you.

    $s3Bucket = 'TESTBUCKET'
    $backupPath = 'E:\BKUP\FULL\'
    $region = 'us-east-1'
    $accessKey = '123'
    $secretKey = '123'
    $results = Get-ChildItem -Path $backupPath -Recurse -Include "*.bak" -file
    foreach ($path in $results)
    {
        Write-Host $path
        $filename = [System.IO.Path]::GetFileName($path)
        Write-S3Object -BucketName $s3Bucket -File $path -Key /DestPath/subfolder/$filename -Region $region -AccessKey $accessKey -SecretKey $secretKey
    }

  • In reply to lsalih - Wednesday, February 1, 2017 12:36 PM:

    Presumably you did a Google search for 'multithreading in powershell' or similar? There are plenty of results ... what did you try?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Phil -
    I did search before asking the question, since most of the time there are examples or articles out there. I only found others mentioning the use of Invoke-Command, and I couldn't find a clear example showing how to implement it.

    I tested with the CloudBerry tool, which saved 2 hours of processing time. However, I would rather use a PowerShell script, which is why I posted the question hoping to get an answer. Thank you.

  • In reply to lsalih - Wednesday, February 1, 2017 1:05 PM:

    Maybe this is of interest?

  • Phil - Thank you for the link, I will look into it. In the meantime, please let me know if there are any other simple ways to do it.

  • In reply to lsalih - Wednesday, February 1, 2017 12:36 PM:

    Maybe I'm misreading that, but it looks like the proverbial keys to the city are in that code. I just love it when people store keys and passwords in clear text. It keeps hackers from damaging other things trying to hack out what they are and saves a whole lot of CPU in the process. :sick: :Whistling:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


  • In reply to lsalih - Wednesday, February 1, 2017 2:33 PM:

    Use SSIS? Parallelism is easy to implement there ... if you know SSIS 😉

  • In reply to Phil Parkin - Wednesday, February 1, 2017 2:59 PM:

    I have been wanting to update this post with the script I wrote after posting the question, so here it is.
    First, Jeff - thank you for your comment. I now use a credential profile to supply the access key and secret key, see below. Phil - I appreciate your guidance. I use a script block and Start-ThreadJob to run the Write-S3Object command that copies the files. With this script, I am able to upload 12 files, a little over 500 GB in total, in less than 2 hours.
    Script:
    # Load the stored credential profile instead of hard-coding keys in the script.
    Set-AWSCredentials -ProfileName S3_BKUPUpload
    $logPath = "C:\S3_UploadLog\UploadFull_ScriptBlockMain.log"
    if (!(Test-Path $logPath))
    {
        Write-Verbose "Creating $logPath."
        $NewLogFile = New-Item $logPath -Force -ItemType File
    }
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: Script Started" | Out-File -FilePath $logPath -Append
    # Script block run by each thread job: $a is the job number, $b is the list of file names to upload.
    $cmd = {
        param($a, $b)
        [string] $datestamp = (Get-Date).ToString("yyyyMMddHHmmss_fff")

        # Each thread writes to its own timestamped log file.
        $logPath = "C:\S3_UploadLog\UploadFULL_Thread_" + $datestamp + ".log"

        try
        {
            if (!(Test-Path $logPath))
            {
                Write-Verbose "Creating $logPath."
                $NewLogFile = New-Item $logPath -Force -ItemType File
            }
            if (Test-Path "C:\Program Files (x86)")
            {
                Add-Type -Path "C:\Program Files (x86)\AWS SDK for .NET\bin\Net35\AWSSDK.Core.dll"
                Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: successfully added awssdk.core" | Out-File -FilePath $logPath -Append
            }
            Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: a $a " | Out-File -FilePath $logPath -Append
            Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: b $b " | Out-File -FilePath $logPath -Append
            $bucket = 'SQLBUCKETNAME'
            $region = 'us-east-1'
            $localpath = 'U:\LOCALBACKUPS\FULL\'
            foreach ($LastBKUP in $b)
            {
                # $b holds file names only, so prepend the local folder.
                $LastBKUP = $localpath + $LastBKUP
                $filename = [System.IO.Path]::GetFileName($LastBKUP)
                Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: filename: $filename " | Out-File -FilePath $logPath -Append
                Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: backupfile: $LastBKUP " | Out-File -FilePath $logPath -Append
                # -ErrorAction Stop makes sure a failed upload reaches the catch block instead of being logged as a success.
                Write-S3Object -BucketName $bucket -File $LastBKUP -Key /Full/$filename -Region $region -ErrorAction Stop #-ServerSideEncryption AES256
                Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: successfully copied file: $LastBKUP " | Out-File -FilePath $logPath -Append
            }
        }
        catch
        {
            Write-Output "Caught an exception: " | Out-File -FilePath $logPath -Append
            $_.Exception | Format-List -Force | Out-File -FilePath $logPath -Append
            $ErrorMessage = $_.Exception.Message
        }
        finally
        {
            Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: Task Ended. **********" | Out-File -FilePath $logPath -Append
        }
    } #end of cmd

    $sourcepath = 'U:\LOCALBACKUPS\FULL\'

    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: source path : $sourcepath " | Out-File -FilePath $logPath -Append
    # Take the 12 most recently written files.
    $files = Get-ChildItem -Path $sourcepath -File | Sort-Object LastWriteTime -Descending | Select-Object -First 12
    # Pass file names only (the script block prepends the folder). Unlike -split " ", this is safe for names that contain spaces.
    $a = @($files.Name)
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: a: $a " | Out-File -FilePath $logPath -Append
    Write-Output $files.Count
    # Split the 12 names into six pairs, one pair per thread job.
    $b = $a[0..1]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: b: $b " | Out-File -FilePath $logPath -Append
    $c = $a[2..3]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: c: $c " | Out-File -FilePath $logPath -Append
    $d = $a[4..5]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: d: $d " | Out-File -FilePath $logPath -Append
    $e = $a[6..7]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: e: $e " | Out-File -FilePath $logPath -Append
    $g = $a[8..9]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: g: $g " | Out-File -FilePath $logPath -Append
    # Indexes 10 and 11 are the last two of the 12 files (index 12 does not exist).
    $h = $a[10..11]
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: h: $h " | Out-File -FilePath $logPath -Append

    # Start one thread job per pair of files; $_ is the job number passed to $cmd as $a.
    # Start-ThreadJob runs at most 5 jobs at a time by default, so the sixth job
    # waits for a free slot unless -ThrottleLimit is raised.
    $batches = @($b, $c, $d, $e, $g, $h)
    try
    {
        1..6 | ForEach-Object {
            $batch = $batches[$_ - 1]
            Start-ThreadJob -ScriptBlock $cmd -ArgumentList $_, $batch
        }
    }
    catch
    {
        Write-Output "Caught an exception: " | Out-File -FilePath $logPath -Append
        $_.Exception | Format-List -Force | Out-File -FilePath $logPath -Append
        $ErrorMessage = $_.Exception.Message
    }
    finally
    {
        # Block until every job has finished, including any that were still queued.
        Get-Job | Wait-Job | Out-Null
    }
    Get-Job | Receive-Job
    Write-Output "$(Get-Date -Format "yyyy-MM-dd HH:mm:ss") INFO: Script Ended. **********" | Out-File -FilePath $logPath -Append
    Get-Job | Remove-Job
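
The fixed pairs in the script above ($b through $h) could also be built with a small batching helper, so the same approach works for any number of files. The following is only a sketch, not the poster's method. Split-IntoBatches and the backup_N.bak names are hypothetical, the upload is replaced by a stand-in line so the example runs without AWS, and it assumes Start-ThreadJob is available (it ships with PowerShell 7; on Windows PowerShell 5.1 it comes from the ThreadJob module).

```powershell
# Hypothetical helper: split a list into batches of a fixed size.
function Split-IntoBatches {
    param(
        [object[]] $Items,
        [int] $BatchSize
    )
    for ($i = 0; $i -lt $Items.Count; $i += $BatchSize) {
        $last = [Math]::Min($i + $BatchSize, $Items.Count) - 1
        # The leading comma emits each batch as one array instead of unrolling it.
        , @($Items[$i..$last])
    }
}

# Stand-in file names for illustration; these are not real files.
$fileNames = 1..12 | ForEach-Object { "backup_$_.bak" }
$batches   = @(Split-IntoBatches -Items $fileNames -BatchSize 2)

# One thread job per batch; each job works through its own files in order.
$jobs = foreach ($batch in $batches) {
    Start-ThreadJob -ArgumentList (, $batch) -ScriptBlock {
        param([string[]] $Files)
        foreach ($file in $Files) {
            # Stand-in for the real upload, e.g. the Write-S3Object call from the script above.
            "uploaded $file"
        }
    }
}

# Wait for all jobs, collect their output, then clean up.
$results = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
```

With 12 names and a batch size of 2 this starts six jobs, matching the script above. Changing -BatchSize trades the number of concurrent uploads against the number of files each job handles.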
