Prefer tee -a, not >>, in CI
Shell scripts sometimes have to append data to a file. Redirecting output with >>
is the conventional way and works fine, but using tee -a
instead is a usually better default, especially in continuous integration. It’s just as easy and gives automatic introspection: the same value is printed to stdout and so appears in normal logs too.
1
2
3
4
# conventional approach (worse!):
echo "some_variable=some_value" >> "$GITHUB_ENV"
# preferable approach (better!):
echo "some_variable=some_value" | tee -a "$GITHUB_ENV"
This occurs for me most often in continuous integration (CI), and especially with GitHub Actions. GHA currently exposes four different files to the CI scripts, and appending to them controls features of the runner itself:
$GITHUB_ENV
to set environment variables for future steps$GITHUB_OUTPUT
to set “outputs” of the step$GITHUB_STEP_SUMMARY
to add a step summary$GITHUB_PATH
to set new$PATH
entries for future steps
(This may apply to other CI providers too, but I am not as familiar with them.)
GitHub’s docs give examples like the following, to create a environment variable called JSON_RESPONSE
containing the result of a curl
invocation. This variable will be available to later steps in the same job.
1
2
3
4
5
6
7
8
9
steps:
- name: Set the value in bash
id: step_one
run: |
{
echo 'JSON_RESPONSE<<EOF'
curl https://example.com
echo EOF
} >> "$GITHUB_ENV"
This works fine! But the $JSON_RESPONSE
environment variable will have a value that isn’t obvious from the step definition, and might change from run to run. If a human needs to know what the actual value is for a specific run, they’re out of luck. This applies equally well if the workflow has lots of if
conditions in it, where knowing exactly which decisions were made is very handy.
In either case, maybe a workflow does something unexpected and I have to debug: if the CI UI shows any computed/non-obvious values, it’s easier to track down the problem, and not get distracted by red herrings.
Teelling what’s going on
There’s a few ways to get more insight into what exactly is happening with these appended command files. Focusing on the $GITHUB_ENV
example above:
- When setting up the variable, save into a temporary, and both print it out manually and append it to the file:
1
2
3
4
5
6
7
response=$(curl https://example.com)
echo "response=$response"
{
echo 'JSON_RESPONSE<<EOF'
echo "$response"
echo EOF
} >> "$GITHUB_ENV"
- After the step that sets the variable, add extra steps—temporary or permanent—that print the resulting environment variable directly:
1
2
3
4
5
6
steps:
- name: Set the value in bash
...
- name: temporary debugging
run: |
echo $JSON_RESPONSE
- When setting up the variable, use
tee -a
instead of>>
:
1
2
3
4
5
{
echo 'JSON_RESPONSE<<EOF'
curl https://example.com
echo EOF
} | tee -a "$GITHUB_ENV"
The last one is easy to do, is only a tiny change small change from the original code, and can be used habitually with little effort: any time data is being appended to a “command” file, write the characters | tee -a
instead of >>
.
Using tee -a
by default means the exact value is visible when it inevitably turns out to be important, potentially years after it was first added!
Having the values available easily also helps with making changes: either having to decipher the current behaviour of an existing workflow, or just iterating on a new one.
Teeaching what it does
“The tee
utility copies standard input to standard output, making a copy in zero or more files [passed as CLI arguments]”. By the default, it overwrites those files, while the -a
flag switches it to appending instead.
Here’s a demo, piping some text into tee
’s stdin:
1
2
echo 'hello' | tee -a file.txt
echo 'world' | tee -a file.txt
Running that prints:
1
2
hello
world
And file.txt
contains the same contents.
Thus, when we run ... | tee -a $GITHUB_ENV
in our CI environment, we’ll both:
- append the relevant content to the file at the path stored in the
$GITHUB_ENV
environment variable, same as>> $GITHUB_ENV
- see it in our CI logs, same as
echo
ing it directly
With the JSON_RESPONSE
example, the GitHub Actions UI will show0:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
JSON_RESPONSE<<EOF
<!doctype html>
<html>
<head>
...
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
EOF
Maybe all that cool example.com content is particularly popular some day, the site gets crushed under the load, and starts returning 5xx errors. The workflow relying on it then goes off the rails and stops working: knowing the root cause (example.com
misbehaving leading to bad content in $JSON_RESPONSE
) will be crucial for any fix.
Using tee -a
, the root cause will likely be easy to confirm, because the output is already right there, in the UI.
Teempering inappropriate use
There’s definitely cases where the “conventional” >>
might be more sensible, such as:
- If the output is overwhelmingly verbose. This can be managed in other ways, like grouping log lines or keeping the steps that set such variables focused on just setting those variables and split other work into separate steps.
- If the output contains secrets and so shouldn’t be shown in logs. In this case, it’s probably best to register such values for masking anyway, in case a value is printed accidentally somewhere else.
- If the decisions and outputs are already communicated in other, better ways:
tee -a
is just a quick/good-enough approach especially handy for single-location commands. A reusable script or “action” is more likely to have explicit user-friendly logs. - If adding
|
piping interferes with the shell’s job-control/exit-code-handling in a way that is problematic for the task: piping simple commands likeecho
andcat
is almost certainly always fine, more complicated commands might need to avoid it1.
TeeL;DR
In scripts that append to files, especially in CI, consider using | tee -a $FILE
by default instead of >> $FILE
.
1
2
3
4
# conventional approach (worse!):
echo "some_variable=some_value" >> "$GITHUB_ENV"
# preferable approach (better!):
echo "some_variable=some_value" | tee -a "$GITHUB_ENV"
This applies to sending CI commands via files, like GitHub Action’s $GITHUB_ENV
, $GITHUB_OUTPUT
and $GITHUB_PATH
.
-
This
JSON_RESPONSE
variable is not JSON with the example as written! The use oftee -a
makes this clear, if that was a cause of a bug. ↩ -
I include this case because have an impression that there’s some subtleties around piping commands beyond what
set -o pipefail
can handle, but am not sure. This is merely rumour and heresay based on only a vague memory that I haven’t been able to confirm or falsify. ↩