Disclaimer: don't get the wrong idea about what you've found here

What appears below are my personal notes, things I wish were part of my long-term memory but that don't always seem to fit. I strive for accuracy and clarity and appreciate feedback. If applying any of this information anywhere, confirm for yourself the correctness of your work, as what you see below might very well be, albeit unintentionally, incorrect or misleading. These notes are here as an easy reference for myself.

Information worthy of a more formal presentation will appear elsewhere than this "Scratch" area. - ksb


KSB's sh redirection and pipeline notes

Redirections

Think of sh redirection syntax in terms of file descriptors and the two ways to modify them:

x> filename Change the output of file descriptor x to go to the file filename.
x>&y Make file descriptor x another name for file descriptor y.

Where x defaults to 1 in both cases.

That 2nd one is tricky until you realize that the shell is calling the dup system call (typically its dup2 variant), making file descriptor x point to the same place that file descriptor y does. This has the effect of redirecting output destined for file descriptor x to go instead to wherever output destined for file descriptor y is going, without disturbing what was already going to y.

(The man page phrases it "duplicates x to y", which confuses me into thinking that duplication of data is happening, when it is the duplication of the file descriptor itself that is actually happening - aka redirection.)
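
A small sketch of the 'another name' idea (the scratch directory and the file name shared.txt are made up for this example): a file is opened on fd3, fd1 is made another name for fd3, and then both descriptors are written to. Since both names refer to the same open file, the second line should land after the first rather than on top of it.

```shell
#!/bin/sh
dir=$(mktemp -d)   # scratch directory, made up for this sketch

# 3> opens shared.txt on fd3; 1>&3 then makes fd1 another name for fd3.
{
    echo "written via fd1"
    echo "written via fd3" >&3
} 3> "$dir/shared.txt" 1>&3

dup_result=$(cat "$dir/shared.txt")
rm -rf "$dir"
echo "$dup_result"
```

No data is copied anywhere; there is one file and two names for writing to it.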


Use this cmd script to experiment with (its second echo is itself an example of the 2nd modification rule):

#!/bin/sh
echo "This is Standard Out"
echo "This is Standard Error" >&2

Charting the default file descriptor setup of a command with no redirection:

$ ./cmd
  0 <-- tty  
  1 --> tty  
  2 --> tty  

File descriptors 0, 1 & 2 are conventionally referred to as stdin, stdout & stderr, respectively, though those notions become a bit confused when you start redirecting extensively.


Simple redirection:

  ./cmd         < infile        > outfile        2> errfile
  0 <-- tty     0 <-- infile    0 <-- infile     0 <-- infile
  1 --> tty     1 --> tty       1 --> outfile    1 --> outfile
  2 --> tty     2 --> tty       2 --> tty        2 --> errfile
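
To try this without a terminal in the picture, here is a minimal sketch. It assumes a shell function named cmd as a stand-in for the ./cmd script, and the scratch directory is made up; cmd never reads its stdin, so infile is only there to mirror the command line above.

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch): same two lines of output.
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)                      # scratch directory, made up for this sketch
echo "unused input" > "$dir/infile"   # cmd never reads it

# Three independent redirections, one per file descriptor.
cmd < "$dir/infile" > "$dir/outfile" 2> "$dir/errfile"

simple_out=$(cat "$dir/outfile")
simple_err=$(cat "$dir/errfile")
rm -rf "$dir"

echo "outfile: $simple_out"
echo "errfile: $simple_err"
```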

Sending both stdout and stderr to the same file:

  ./cmd         > allfile        2>&1
  0 <-- tty     0 <-- tty        0    <-- tty
  1 --> tty     1 --> allfile    1    --> allfile
  2 --> tty     2 --> tty        2(1) --> allfile

Here the 2>&1 makes fd2 a dup of fd1. Conceptually this accomplishes two things: losing the original 'stderr' pipe by overwriting it with a new fd2, and making this new fd2 go to the same place that fd1 is going. This second 'historical' part is noted by placing a '(1)' next to the modified fd2. Note that from cmd's perspective, nothing has changed - it is still writing to fd1 and fd2.
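
A quick way to check this, again as a sketch only (the function cmd stands in for ./cmd and the scratch directory is made up):

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# fd1 goes to allfile first; fd2 is then made another name for fd1.
cmd > "$dir/allfile" 2>&1

all_lines=$(cat "$dir/allfile")
rm -rf "$dir"
echo "$all_lines"
```

Both lines should end up in allfile, in the order cmd wrote them.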


Redirection is done left-to-right so order is important:

  ./cmd         2>&1            > outfile
  0 <-- tty     0    <-- tty    0    <-- tty
  1 --> tty     1    --> tty    1    --> outfile
  2 --> tty     2(1) --> tty    2(1) --> tty

Again 2>&1 makes fd2 a dup of fd1, but at that point fd1 is still pointing to the tty, the same place fd2 was already pointing, so nothing visibly changes. The > outfile then comes along and changes fd1 to point to a file, in effect 'decoupling' fd1 and fd2. Since stderr was already going to the tty, this doesn't do anything more than:

  ./cmd         > outfile
  0 <-- tty     0 <-- tty
  1 --> tty     1 --> outfile
  2 --> tty     2 --> tty
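
The ordering rule can be checked with a sketch like the following. The function cmd stands in for ./cmd, the scratch directory is made up, and a command substitution plays the role of the tty so that whatever would have reached the terminal can be captured.

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# 2>&1 comes first, so fd2 follows fd1 to wherever fd1 points at that moment:
# here, the pipe that $( ) reads from. Only then is fd1 moved to outfile.
captured=$(cmd 2>&1 > "$dir/outfile")
in_file=$(cat "$dir/outfile")
rm -rf "$dir"

echo "captured: $captured"
echo "outfile:  $in_file"
```

stderr stays where fd1 used to point, and only stdout ends up in the file.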

Now try this one on for size:

$ (./cmd 3>&2 2>&1 1>&3) > errfile 2> outfile

This introduces two new twists:

  1. A sub-shell by enclosing a command in parentheses, and
  2. A new file descriptor #3.

The new file descriptor is easily dealt with by adding a new row in the chart. The sub-shell can be handled by using two charts: working from the inside out.

  ./cmd         3>&2            2>&1            1>&3
  0 <-- tty     0    <-- tty    0    <-- tty    0    <-- tty
  1 --> tty     1    --> tty    1    --> tty    1(3) --> tty
  2 --> tty     2    --> tty    2(1) --> tty    2(1) --> tty
                3(2) --> tty    3(2) --> tty    3(2) --> tty

This effectively switches stdout and stderr. Specifically, when cmd prints to its fd1 (the one inside the parentheses), it will go to what we are now referring to as fd2. This distinction between which file descriptors the command is using and which ones are being used outside of the command is essential - this is what the 2nd modification rule is doing for us - allowing us to 'change the name' of the pipe, outside of the command.

When cmd prints to its fd2, it will go to what we are now referring to as fd1. This is a bit trickier, because we needed to 'hold' the command's internal fd2 in an external fd3 duplicate file descriptor. This is because in the second step we lose the internal fd2, but we've saved a copy of it in fd3. Then in the last step we make fd1 a dup of fd3.

Now you might think that redirecting fd3 (by adding a '3> otherfile' at the end of the command) might get you something, but it doesn't. ./cmd doesn't print anything to fd3, and making a dup of a file descriptor doesn't copy the output that is sent to the original fd; it only makes 'another name' for writing to the same place as the original fd.
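
A sketch to confirm that otherfile stays empty (the function cmd stands in for ./cmd, the scratch directory is made up, and the surrounding braces only exist to keep cmd's output off the terminal):

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# After the swap, fd3 is re-pointed at otherfile, but cmd only writes to fd1 and fd2.
{ cmd 3>&2 2>&1 1>&3 3> "$dir/otherfile"; } > /dev/null 2>&1

if [ -s "$dir/otherfile" ]; then
    other_state="has data"
else
    other_state="empty"
fi
rm -rf "$dir"
echo "otherfile is $other_state"
```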

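(Note that the sub-shell matters here: the command './cmd 3>&2 2>&1 1>&3 > err 2> out' doesn't write stdout to out and stderr to err, as I first thought it would. Without the parentheses all five redirections act on the same file descriptors, left to right, so the final '> err' and '2> out' simply re-point fd1 and fd2 at the files and throw the swap away: stdout ends up in err and stderr in out. The command './cmd 1>&- > out' shows the same thing: fd1 is closed and then opened again on out, so the output still arrives in out.)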
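
Both of those commands can be tried in a sketch like this one (the function cmd stands in for ./cmd and the scratch directory is made up). What I would expect to see is what strict left-to-right processing of the redirections predicts.

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# No sub-shell: the swap happens first, then "> err" and "2> out"
# re-point fd1 and fd2, discarding the swap.
cmd 3>&2 2>&1 1>&3 > "$dir/err" 2> "$dir/out"
err_file_content=$(cat "$dir/err")
out_file_content=$(cat "$dir/out")

# Close fd1, then open it again on a file: the later redirection wins.
cmd 1>&- > "$dir/reopened" 2> /dev/null
reopened_content=$(cat "$dir/reopened")
rm -rf "$dir"

echo "err:      $err_file_content"
echo "out:      $out_file_content"
echo "reopened: $reopened_content"
```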

So now when 'popping' out of the sub-shell you have another external level, where fd0, fd1 and fd2 are re-assigned. So what we have in the outer shell is:

  ( )             > errfile           2> outfile
  0    <-- tty    0    <-- tty        0    <-- tty
  1(2) --> tty    1(2) --> errfile    1(2) --> errfile
  2(1) --> tty    2(1) --> tty        2(1) --> outfile

What we are left with is being able to capture stderr (fd2) by redirecting stdout (fd1).
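
The whole sub-shell example can be run as a sketch like this (the function cmd stands in for ./cmd and the scratch directory is made up):

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# Inside the parentheses fd1 and fd2 are swapped; outside, the sub-shell's
# fd1 (carrying cmd's stderr) goes to errfile and its fd2 to outfile.
(cmd 3>&2 2>&1 1>&3) > "$dir/errfile" 2> "$dir/outfile"

swapped_err=$(cat "$dir/errfile")
swapped_out=$(cat "$dir/outfile")
rm -rf "$dir"

echo "errfile: $swapped_err"
echo "outfile: $swapped_out"
```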

This may seem like a contrived example, and it is, but this trick of switching stdout and stderr is helpful when used with pipelines - which can only pick up output from stdout.



Pipelines

A pipeline takes the stdout of one command and makes that the stdin of another command. It is somewhat similar to redirecting the stdin, except that rather than having it come from a file, it comes from the stdout of another command. What is important to remember is that:

Bourne sh pipes will only operate on stdout as the stdin to the piped command.

So you need to get tricky with redirection to pipe something other than stdout.
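
As a sketch of both halves of that statement (the function cmd stands in for ./cmd, the scratch directory is made up, and tr is used only as a convenient filter that visibly marks whatever came through the pipe):

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# A plain pipe carries only stdout; stderr is set aside in a file here.
piped_plain=$(cmd 2> "$dir/err.txt" | tr '[:lower:]' '[:upper:]')

# 2>&1 first points fd2 at the pipe, then > moves fd1 away to a file,
# so only stderr travels through the pipe.
piped_err=$(cmd 2>&1 > "$dir/out.txt" | tr '[:lower:]' '[:upper:]')
rm -rf "$dir"

echo "$piped_plain"
echo "$piped_err"
```

The second form relies on the left-to-right rule from the Redirections section: the pipe is already in place as fd1 when 2>&1 is processed.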


Here is an example that makes stderr (and only stderr) go to the tty and saves both stdout and stderr in a log file.

((./cmd 3>&1 1>&2 2>&3) | tee /dev/tty) > all.log 2>&1

Two sub-shells, a pipe and a handful of redirections. Starting with the innermost sub-shell, this is equivalent to the above example where stdout and stderr are swapped with respect to fd1 and fd2.

  ./cmd         3>&1            1>&2            2>&3
  0 <-- tty     0    <-- tty    0    <-- tty    0    <-- tty
  1 --> tty     1    --> tty    1(2) --> tty    1(2) --> tty
  2 --> tty     2    --> tty    2    --> tty    2(3) --> tty
                3(1) --> tty    3(1) --> tty    3(1) --> tty

This was done so that stderr of the original command can be captured via a pipe to become the stdin of the tee command.

Now popping out of the first sub-shell, the tee command runs with this set-up:

| tee /dev/tty
  0(2) <-- tty  
  1(2) --> tty  
  2(1) --> tty  

Note that there is no shell redirection happening here, so stdout and stderr (of the tee command) are unchanged, but the tee command itself copies its stdin to both its stdout and, in this call, to the 'file' /dev/tty, which at this point is the same place as stdout. This means that the tee command is similar to the fd dup redirection, except that it duplicates the data flowing through the pipe rather than the file descriptor. So running the command at this point gives us:

$ (./cmd 3>&1 1>&2 2>&3) | tee /dev/tty
This is Standard Out
This is Standard Error
This is Standard Error

Everything is going to tty and the stderr has been duplicated (by the tee command, not by any of the redirects).
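
To see which line travels where without a terminal, here is a sketch with files standing in for each destination. The function cmd stands in for ./cmd, and fake_tty.txt, tee_stdout.txt and left_stderr.txt are made-up names for /dev/tty, tee's stdout and the first sub-shell's stderr.

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# Left side: the swap sends cmd's stderr into the pipe; the sub-shell's own
# stderr (now carrying cmd's stdout) is parked in left_stderr.txt.
# Right side: tee copies its stdin to fake_tty.txt and to its stdout.
(cmd 3>&1 1>&2 2>&3) 2> "$dir/left_stderr.txt" |
    tee "$dir/fake_tty.txt" > "$dir/tee_stdout.txt"

fake_tty=$(cat "$dir/fake_tty.txt")
tee_stdout=$(cat "$dir/tee_stdout.txt")
left_stderr=$(cat "$dir/left_stderr.txt")
rm -rf "$dir"

echo "tee file argument: $fake_tty"
echo "tee stdout:        $tee_stdout"
echo "left side stderr:  $left_stderr"
```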

Because the tee command does not print anything to stderr, redirecting only tee's file descriptors would lose what is currently going out on the first sub-shell's stderr (namely the original stdout of cmd). The trick is to wrap the whole pipeline in another sub-shell, so that the final redirects apply to both sides of the pipe. So we now have:
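
A sketch of what goes wrong without that outer sub-shell (the function cmd stands in for ./cmd; fake_tty.txt stands in for /dev/tty and escaped.txt catches whatever would otherwise have reached the terminal, both names made up):

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# Here the final redirects belong to tee alone, so cmd's stdout (travelling on
# the left side's fd2) never reaches all.log. The { } 2> wrapper only exists
# to record what escapes.
{
    (cmd 3>&1 1>&2 2>&3) | tee "$dir/fake_tty.txt" > "$dir/all.log" 2>&1
} 2> "$dir/escaped.txt"

partial_log=$(cat "$dir/all.log")
escaped=$(cat "$dir/escaped.txt")
rm -rf "$dir"

echo "all.log: $partial_log"
echo "escaped: $escaped"
```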

  ( )             > all.log           2>&1
  0(2) <-- tty    0(2) <-- tty        0(2)    <-- tty
  1(2) --> tty    1(2) --> all.log    1(2)    --> all.log
  2(1) --> tty    2(1) --> tty        2(1(2)) --> all.log

This last fd dup redirect is now operating on two levels of sub-shells, so there are nested parentheses representing the original file descriptors.

Finally, we have:

$ ((./cmd 3>&1 1>&2 2>&3) | tee /dev/tty) > all.log 2>&1
This is Standard Error
$ cat all.log
This is Standard Out
This is Standard Error
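
The same command can be tried without a terminal, as a sketch: the function cmd stands in for ./cmd, and the made-up file fake_tty.txt stands in for /dev/tty. A space is put between the two opening parentheses so that no shell can mistake them for the start of an arithmetic expression.

```shell
#!/bin/sh
# Stand-in for ./cmd (an assumption of this sketch).
cmd() {
    echo "This is Standard Out"
    echo "This is Standard Error" >&2
}

dir=$(mktemp -d)   # scratch directory, made up for this sketch

# Outer sub-shell: fd1 and fd2 both go to all.log.
# Inner sub-shell: swap, so cmd's stderr enters the pipe and cmd's stdout
# goes straight to all.log. tee then copies stderr to fake_tty.txt and all.log.
( (cmd 3>&1 1>&2 2>&3) | tee "$dir/fake_tty.txt" ) > "$dir/all.log" 2>&1

on_tty=$(cat "$dir/fake_tty.txt")
full_log=$(cat "$dir/all.log")
rm -rf "$dir"

echo "tty stand-in: $on_tty"
echo "all.log:"
echo "$full_log"
```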

Keith S. Beattie is responsible for this document, located at http://dst.lbl.gov/~ksb/Scratch/sh_redir_pipe.html, which is subject to LBNL's Privacy & Security Notice, Copyright Status and Disclaimers.

Last Modified: Monday, 25-Feb-2013 16:57:57 PST