How to program safely in bash

Why bash?


Bash has arrays and a safe mode, which, when used correctly, makes it just about acceptable under safe coding practices.

Fish is harder to make mistakes in, but it has no safe mode. A good idea, therefore, is to prototype in fish and then translate from fish to bash, if you know how to do that correctly.

Foreword


This guide accompanies ShellHarden, but the author also recommends ShellCheck, so that ShellHarden's rules do not diverge from ShellCheck's.

Bash is not a language where the most correct way to solve a problem is also the simplest. If there were an exam in safe bash programming, its first rule would be rule zero of BashPitfalls: always use quotes.

The main thing you need to know about programming in bash


Quote like a maniac! An unquoted variable should be treated as an armed bomb: it explodes on contact with whitespace. Yes, it “explodes” in the sense that the string splits into an array. Specifically, variable expansions like $var and command substitutions like $(cmd) undergo word splitting, whereby the contained string expands into an array by splitting on the characters of the special $IFS variable, which defaults to whitespace. This is usually invisible, because most of the time the result is a 1-element array, indistinguishable from the string you expected.
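A minimal illustration (the variable name var is made up):

    var='two words'
    printf '<%s>\n' $var     # unquoted: word splitting yields two arguments
    printf '<%s>\n' "$var"   # quoted: one argument, whitespace intact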

Not only that, but wildcards (*?) are expanded as well. This happens after word splitting, so if a resulting word contains at least one wildcard character, that word becomes a glob pattern matching any fitting file paths. In other words, this feature actively reaches into your file system!

Quoting suppresses both word splitting and wildcard expansion, for variables and command substitutions alike.

Variable Expansion:
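For example (with any variable my_var):

    Good: "$my_var"
    Bad: $my_var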


Command substitution:
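For example (with any command cmd):

    Good: "$(cmd)"
    Bad: $(cmd)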


There are exceptions where quoting is unnecessary, but since quoting never hurts, and the general rule is to be scared of unquoted variables, we will not go hunting for borderline exceptions, for your own good. An unquoted variable looks wrong, and the wrong practice is common enough to arouse suspicion: plenty of scripts have been written with broken handling of file names containing spaces...

The only exceptions ShellHarden makes are variables with guaranteed numeric content, such as $?, $# and ${#array[@]}.

Should I use backticks?


Command substitutions also come in this form:
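That is, with backticks instead of the dollar-parenthesis form (cmd again standing in for any command):

    Good: "`cmd`"
    Bad: `cmd`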


While this style can be used correctly, it looks even more awkward when quoted and is less readable when nested. The consensus is pretty clear here: avoid it.

ShellHarden rewrites such backticks into the dollar-parenthesis form.

Should I use curly braces?


Braces are for string interpolation, so they are usually redundant:
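For example (with any variable var):

    Bad: "${var}"
    Good: "$var"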


In theory, always using braces is not a problem, but in your author's experience there is a strong negative correlation between unnecessary use of braces and correct use of quotes: nearly everyone chooses the “bad plus verbose” form over the “good plus verbose” one!

Your author's theories:


Hence the decision to ban unnecessary braces: ShellHarden rewrites all these variants into the simplest good form.

Now for string interpolation, where braces really are useful:
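For example (with any variables var1 and var2; these are the names the discussion below refers to):

    Bad: "$var1bar${var2}"
    Good (interpolation): "${var1}bar$var2"
    Good (concatenation): "$var1"bar"$var2"

In the bad example, "$var1bar" refers to a variable named var1bar, not var1.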


In bash, concatenation and interpolation are equivalent, even for arrays (which is absurd).

Since ShellHarden is not a style formatter, it is not supposed to change correct code. This holds for the “good (interpolation)” variant: from ShellHarden's point of view, that is the canonically correct form.

ShellHarden now adds and removes braces as needed: in the bad example, var1 gets braces, while var2 loses them even in the “good (interpolation)” case, since braces are never needed at the end of a string. The latter requirement may well be lifted.

Gotcha: numbered arguments


Unlike variable names that are normal identifiers (in regex: [_a-zA-Z][_a-zA-Z0-9]*), numbered arguments above $9 require braces, string interpolation or not. ShellCheck says:

 echo "$10" ^-- SC1037: Braces are required for positionals over 9, eg ${10}. 

ShellHarden refuses to fix this (the difference is deemed too subtle).

And since braces are required above $9, ShellHarden permits them for all numbered arguments.
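A quick sketch of what is going on (the argument value A is made up):

    set -- A          # the positional arguments are now just $1 = "A"
    echo "$10"        # prints "A0": parsed as ${1} followed by a literal 0
    echo "${10}"      # prints an empty line: there is no tenth argument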

Use arrays


To be able to quote all variables, you must use real arrays rather than space-delimited pseudo-array strings.

The syntax is verbose, but get over it. This bashism alone is reason enough to give up POSIX compatibility for most shell scripts.

Good:

    array=( a b )
    array+=(c)
    if [ ${#array[@]} -gt 0 ]; then
        rm -- "${array[@]}"
    fi

Bad:

 pseudoarray=" \ a \ b \ " pseudoarray="$pseudoarray c" if ! [ "$pseudoarray" = '' ]; then rm -- $pseudoarray fi 

This is why arrays are such a basic feature for a shell: command arguments are fundamentally arrays (and shell scripts are all about commands and arguments). One could say that a shell which makes it artificially impossible to pass multiple arguments cleanly is comically unfit for purpose. Well-known shells in this category include Dash and Busybox Ash: minimal POSIX-compatible shells. But what good is POSIX compatibility if the most important stuff is not in POSIX?

Exceptional cases where you actually intend to split a string


An example with \v as the data delimiter (note its second occurrence):

 IFS=$'\v' read -d '' -ra a < <(printf '%s\v' "$s") || true 

This way we avoid glob expansion, and the approach works even if the delimiter is \n . The second occurrence of the delimiter preserves the last element in case it is empty. For some reason, the -d option must come first, so it is tempting to combine the flags into -rad '' , but that doesn't work. Since read returns nonzero in this case, it must be guarded against errexit ( || true ) if that is enabled. Tested in bash 4.0, 4.1, 4.2, 4.3 and 4.4.

Alternative for bash 4.4:

 readarray -td $'\v' a < <(printf '%s\v' "$s") 
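A quick sanity check of either variant (the sample string is made up):

    s=$'a b\vc'                                # intended elements: "a b" and "c"
    readarray -td $'\v' a < <(printf '%s\v' "$s")
    declare -p a    # prints: declare -a a=([0]="a b" [1]="c")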

How to start a bash script


Start with something like this:

    #!/usr/bin/env bash
    if test "$BASH" = "" || "$BASH" -uc "a=();true \"\${a[@]}\"" 2>/dev/null; then
        # Bash 4.4, Zsh
        set -euo pipefail
    else
        # Bash 4.3 and older chokes on empty arrays with set -u.
        set -eo pipefail
    fi
    shopt -s nullglob globstar

It includes:
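- errexit (set -e), pipefail, and, where the shell tolerates empty arrays, nounset (set -u)
- nullglob: a glob with no matches expands to nothing instead of to itself
- globstar: enables recursive globbing with **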


But not:

    IFS=''
    set -f
    shopt -s failglob


How to end a bash script


The exit status of the script is the status of the last command executed. Make sure it represents real success or failure.

The worst offender is leaving a conditional and-list as the last command of the script: if the condition is false, the last command executed is the condition itself.

With errexit, and-list conditions have no business being used in the first place. Without errexit, consider error handling even for the last command, so that its exit status is not masked if more code is later appended to the script.

Bad:

 condition && extra_stuff 

Good (errexit option):

    if condition; then
        extra_stuff
    fi

Good (error handling):

    if condition; then
        extra_stuff || exit
    fi
    exit 0

How to use errexit


Also known as set -e .
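In its simplest form, errexit makes the script exit as soon as a command fails; a minimal illustration:

    set -e
    false                  # any command with a nonzero exit status
    echo "never reached"   # errexit has already aborted the script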

Program-level deferred cleanup


Insofar as errexit is actually working, use this pattern to set up any necessary cleanup to happen at exit:

 tmpfile="$(mktemp -t myprogram-XXXXXX)" cleanup() { rm -f "$tmpfile" } trap cleanup EXIT 

Gotcha: errexit is ignored in command arguments


Here is a very sneaky fork bomb, and understanding it cost me dearly: my build script worked fine on various developer machines, but brought the build server to its knees:

    set -e # Fail if nproc is not installed
    make -j"$(nproc)"

The intent is that the script should fail if nproc is not installed. What actually happens is that the command substitution fails silently (errexit is ignored in command arguments), so make receives a bare -j, which means unlimited parallelism. Good (command substitution in an assignment):

    set -e # Fail if nproc is not installed
    jobs="$(nproc)"
    make -j"$jobs"

Watch out: the local and export builtins are still commands, so this is still wrong:

    set -e # Fail if nproc is not installed
    local jobs="$(nproc)"
    make -j"$jobs"

In this case, ShellCheck warns only about special commands like local .

To use local , separate the declaration from the assignment:

    set -e # Fail if nproc is not installed
    local jobs
    jobs="$(nproc)"
    make -j"$jobs"

Gotcha: errexit is ignored depending on the caller's context


Sometimes, POSIX is cruel. Errexit is ignored in functions, group commands, and even subshells, if the caller is checking their success. All of the following examples print Unreachable and Great success , strange as that may seem.

Subshell:

    (
        set -e
        false
        echo Unreachable
    ) && echo Great success

Group command:

    {
        set -e
        false
        echo Unreachable
    } && echo Great success

Function:

    f() {
        set -e
        false
        echo Unreachable
    }
    f && echo Great success

Because of this, bash with errexit is practically incomposable: yes, it is possible to wrap your errexit functions so that they work, but it is doubtful that the effort saved (on explicit error handling) is worth it. Consider splitting into fully standalone scripts instead.

How to avoid invoking the shell with improper quoting


When invoking a command from another programming language, the wrong way is the easiest: implicitly invoking a shell. If that shell command is static, fine: it either works or it doesn't. But if your program does any kind of string processing to assemble the command, understand that you are generating a shell script! That is rarely what you want, and it is tedious to get right.


Whatever programming language you are working in, there are at least three ways to construct the command correctly. In order of preference:

Plan A: Do without a shell


If it is just a command with arguments (that is, no shell features like pipes or redirection), choose the array representation.


Bad (C++):

    std::string cmd = "rm -rf ";
    cmd += path;
    system(cmd.c_str());

Good (C/POSIX), minus error handling:

    #include <spawn.h>
    #include <sys/wait.h>

    char* const args[] = {"rm", "-rf", path, NULL};
    pid_t child;
    posix_spawnp(&child, args[0], NULL, NULL, args, NULL);
    int status;
    waitpid(child, &status, 0);

Plan B: static shell script


If a shell wrapper really is needed, let arguments be arguments. You might think it cumbersome to write a dedicated shell script in its own file and invoke it, until you see this trick:

Bad (python3): subprocess.check_call('docker exec {} bash -ec "printf %s {} > {}"'.format(instance, content, path))
Good (python3): subprocess.check_call(['docker', 'exec', instance, 'bash', '-ec', 'printf %s "$0" > "$1"', content, path])

Can you spot the shell script?

That's right: the printf command with the redirection. Note the correctly quoted numbered arguments. Embedding a static shell script this way is fine.

These examples run in Docker because they would be less useful otherwise, but Docker is also a fine example of a command that takes another command as its arguments. Unlike ssh, as we shall see.

Last resort: string processing


If it has to be a string (for example, because it has to go over ssh ), there is no way around it: you must quote each argument and escape whatever characters are needed within those quotes. The simplest choice is single quotes, since they have the simplest escaping rules. There is only one: ' becomes '\'' .

Typical filename in single quotes:

 echo 'Don'\''t stop (12" dub mix).mp3' 

So how can we use this trick to run ssh commands safely? It's impossible! Well, here is the “often correct” way:
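A minimal sketch of such merging (user@host and the variable path are made up; printf %q quotes in bash style, so this additionally assumes that the remote shell is bash):

    # Quote each argument ourselves and join them into one string,
    # so that ssh cannot join them unquoted on our behalf.
    cmd=$(printf '%q ' rm -rf "$path")
    ssh user@host -- "$cmd"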


We have to merge all the arguments into a single string ourselves, so that ssh does not do it wrong: given multiple arguments, ssh treacherously joins them with spaces, without quoting.

The reason this is only “often correct” is that the right solution depends on the preferences of the user at the other end, namely the remote login shell, which can be anything. In principle, it could even be your mom. Assuming the remote shell is bash or another POSIX-compatible shell is “often correct”, but fish is incompatible on this point.

Source: https://habr.com/ru/post/413117/

