In this article, I want to discuss the PowerShell (PoSh) pipeline, which confused me at first and took some time to get used to. I had used a pipeline in Linux before; the PoSh one is more powerful, but also slightly different. Hopefully I can demystify some of the concepts for you in this basic article.
A DOS Pipeline
Many of you might have opened a command prompt before and done something like this:
In this case, despite a false start, I am redirecting the output of the echo command into a file with >. I can also use >> to append to a file, or < to read a file and feed its contents into a command, such as this:
I can also use the pipe character (|) to take the output from one command and pipe it into another. One of the more common ways to do this is to pipe the result of the dir command into the "more" command, as in:
dir /s | more
In all of these cases, I am redirecting text around from the shell (Command Prompt). That's how much of Unix works, but PowerShell builds on this.
The PowerShell Pipeline
In PoSh, the pipeline doesn't pass text; it passes objects. That makes a difference because each object retains all of its properties and methods, and you can access them without knowing how the text output is formatted. Here's a simple example. I can run the Get-Process cmdlet, which returns a lot of information. When I run this at the console, I get the process name, handle, and some resource usage, as shown here:
Now, what if I want to do something with this data? I can pipe the resulting objects to another cmdlet that takes them as input and works with them. One of the common cmdlets I use here is Get-Member, as in:
Get-Process | Get-Member
This lets me know what methods and properties are available from Get-Process. If I don't want the complete list, and it's a long list, I can further pipe this to filter to just properties:
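A minimal sketch of that filter, assuming the MemberType comparison described in the walkthrough that follows:

```powershell
# Keep only the members whose MemberType is Property.
# $_ is the current Get-Member result being tested by Where-Object.
Get-Process | Get-Member | Where-Object { $_.MemberType -eq "Property" }
```

Swapping "Property" for "Method" in the same script block should list the methods instead.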
In this case, the output of Get-Process is a set of objects, each with a bunch of data, items like Handle, Id, MachineName, etc. Of these, the default display is for just the 7 columns shown a few images back. I can send those entire objects through the pipeline (|) to the Get-Member cmdlet. This is the input to that cmdlet. Get-Member now gives me the metadata about the object, letting me know the Name of a property or method, the type, the definition, and more. Again, only some of this is output.
In the last code I showed, I also then piped the output of Get-Member, which is another set of objects, into the Where-Object cmdlet. Here I can filter things. I do this by picking some property, in this case MemberType, and checking whether it is equal to "Property". If so, the object is output. If not, it is dropped. The $_ syntax refers to the current object in the pipeline, the one being checked by the cmdlet at that moment.
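To see $_ in isolation, here is a small sketch that filters plain numbers instead of process metadata. The variable names are only for illustration.

```powershell
# Where-Object runs the script block once per incoming object;
# $_ holds whichever object is being tested at that moment.
$large = 1..10 | Where-Object { $_ -gt 7 }
$large          # 8, 9, 10

# $PSItem is a longer, more readable name for the same automatic variable.
$alsoLarge = 1..10 | Where-Object { $PSItem -gt 7 }
$alsoLarge      # 8, 9, 10
```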
Note that the results from the first cmdlet (Get-Process) are gone at the end. Get-Member consumes them and emits its own objects; the process objects do not pass through it.
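One way to confirm this is to ask for the .NET type of the first object coming out of each stage. This is a sketch; the variable names are placeholders.

```powershell
# Type of the objects Get-Process emits.
$processType = (Get-Process | Select-Object -First 1).GetType().FullName

# Type of the objects that come out of Get-Member instead.
$memberType = (Get-Process | Get-Member | Select-Object -First 1).GetType().FullName

$processType   # System.Diagnostics.Process
$memberType    # Microsoft.PowerShell.Commands.MemberDefinition
```

The second type is a description of a member, not a process, which is why nothing about the running processes themselves is left at the end of that pipeline.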
Avoiding Loops
The pipeline is incredibly useful for processing lots of data in a terse fashion, without a lot of code. Suppose I had a lot of .sql files and I wanted to find out which ones had been edited recently. I know I could look in Explorer and sort the files by the modified date, but perhaps I want to do this programmatically, perhaps generate some sort of report. I could do this:
$a = Get-ChildItem *.sql
foreach ($file in $a) {
    if ($file.LastAccessTime -gt "2020/12/25") {
        Write-Host $file.Name
    }
}
That works, but it's cumbersome code, and unnecessary. Instead, I could use the pipeline, passing through the results of Get-ChildItem into Where-Object and easily filtering things.
Get-ChildItem *.sql | where {$_.LastAccessTime -gt "2020/12/25"} | Format-Table Name, LastAccessTime
Maybe a more interesting example is this:
Get-ChildItem *.bak, *.trn | where {$_.LastAccessTime -gt "2020/12/25"} | Format-Table Name, LastAccessTime
Here I get all the files that are .bak or .trn and filter down to those that have been accessed since a particular time. That is useful for double checking that backup jobs actually ran and produced files. I could further pipe this into a count for a quick check of the number of backups made.
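As a sketch of that count, Measure-Object can sit at the end of the same pipeline. It assumes the backup files are in the current directory; $cutoff and $backupCount are just illustrative names.

```powershell
# Count the .bak and .trn files accessed after the cutoff date.
# With no matching files, Count is simply 0.
$cutoff = [datetime]"2020-12-25"
$backupCount = (Get-ChildItem *.bak, *.trn |
    Where-Object { $_.LastAccessTime -gt $cutoff } |
    Measure-Object).Count
$backupCount
```

Comparing $backupCount against the number of backups a job is supposed to produce would turn this into a quick pass or fail check.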
I can use the pipeline to string together multiple cmdlets in any way I wish.
Tracking What Is In the Pipeline
When you start to string together lots of cmdlets, it is easy to get confused about what is being processed, passed through, and available. As I work with the pipeline, I often find that I have a lot of items strung together. For example, I could get a set of cmdlets like this:
$a = Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length |Format-Table
If I don't get the results I expect, I find myself wondering what happened. Usually I do one of two things: I work either forwards or backwards through the pipeline, depending on the items I am working with. Usually backwards, because I know some ways to use PoSh, but then I add something extra in a new script and things stop working.
Working Forwards
If I work forwards, I want to start at the beginning and check each part of my pipeline. So I'll do this:
$a = Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length |Format-Table
Get-ChildItem *.bak
I'll add the first part of my pipeline after the variable assignment and see what is output. If this looks correct, I'll then move forward and add the next part of my pipeline.
$a = Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length |Format-Table
Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"}
I'll continue doing this until I find what I did wrong.
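A variation on working forwards is to capture each stage in its own variable and print how many objects survived it. This is only a sketch: the $files, $filtered, and $selected names are made up for illustration, and it assumes the .bak files are in the current directory.

```powershell
# Stage 1: the raw file listing. @() forces an array so .Count is always valid.
$files = @(Get-ChildItem *.bak)
"Stage 1 (Get-ChildItem): $($files.Count) object(s)"

# Stage 2: add the name filter.
$filtered = @($files | Where-Object { $_.Name -like "Adventure*" })
"Stage 2 (Where-Object): $($filtered.Count) object(s)"

# Stage 3: keep only the properties of interest.
$selected = @($filtered | Select-Object Name, LastWriteTime, Length)
"Stage 3 (Select-Object): $($selected.Count) object(s)"
```

A count that unexpectedly drops to zero points at the stage to look at more closely.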
Working Backwards
Working backwards is the reverse. I'll start here.
$a = Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length |Format-Table
Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length
In this case, I've removed the last part of the pipeline. If this is still wrong, I might then do this:
$a = Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"} |Select-Object Name, LastWriteTime, Length |Format-Table
Get-ChildItem *.bak | Where-Object {$_.Name -like "Adventure*"}
I'll continue back until I find the issue.
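When peeling stages off the end, it can help to check what type of object the last remaining stage emits. The sketch below uses a made-up in-memory object in place of real backup files. The point it illustrates is that Format-Table emits formatting records rather than the original objects, so a variable assigned from it no longer exposes the file properties.

```powershell
# A stand-in for one file; the property values are invented for illustration.
$sample = [pscustomobject]@{ Name = "AdventureWorks.bak"; Length = 1024 }

# With Format-Table as the last stage, the variable holds formatting records.
$formatted = $sample | Format-Table
$formattedTypes = $formatted | ForEach-Object { $_.GetType().Name } | Select-Object -Unique
$formattedTypes

# With Format-Table removed, the variable holds the object itself.
$plain = $sample | Select-Object Name, Length
$plain.Name
```

If a later step needs the data itself, leaving the Format-* cmdlet off the assignment and formatting only at display time is usually the safer arrangement.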
Summary
The pipeline is a powerful way of stringing together cmdlets, shrinking your code into fewer lines, potentially preventing some bugs. The pipeline allows objects to pass wholly from one cmdlet to another, with very simple syntax.
There is more to the pipeline, including things like the -PipelineVariable common parameter, though these are advanced items. I'm also not sure I completely understand them, but I'm working to improve my PoSh skills, and I hope you are as well. My aim with this article is to help you keep learning and growing.
Reference
- Get-Member - https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/get-member?view=powershell-7.1
- About_Pipelines - https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_pipelines?view=powershell-7.1
- Where-Object - https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/where-object?view=powershell-7.1
- Select-Object - https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/select-object?view=powershell-7.1