summaryrefslogtreecommitdiff
path: root/HACKING/code-quality.md
blob: f296621dcdb4a0763cb996834488744cb08ef545 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
Code content
============

Be aware of the `librelib(7)` library suite, which lives in `src/lib`.
It is a suite of Bash libraries that will help you out.  Most of the
people looking at the libretools code are familiar with the `messages`
part of it, which actually contains a bunch of utility routines, not
just message printing.  There is also a library for dealing with
`blacklist.txt`, and one for loading configuration files and
PKGBUILDs.  These are common tasks, but are tricky to handle
consistently--the libraries are there to make things easier.  Take a
look at the man pages.

Message printing: All of the message printing routines, except for
`term_title` and `flag`,  take printf-type arguments.  Take advantage
of that; don't use string interpolation (don't do `"foo ${var}
bar"`).  The reason for this is that if you don't do string
interpolation, messages can be automatically internationalized.

Message printing: The in `--help`/`-h` text, use `print` to print
lines that should not wrap, `echo` to print blank lines, `prose` to
print paragraphs, `bullet` to print bullet points, and `flag` to print
option flags.  The text should follow this general format:

	print "Usage: %s [OPTIONS] VARS_ARE_UNDERSCORE_AND_CAPITAL" "${program_name}"
	print "One line description of program, no period"
	echo
	prose "More details.  This is a paragraph."
	echo
	print "Options:"
	flag  "-h"         "Show this message"

In the "Usage:" line, use printf `%s` and the value `"${0##*/}"` to
determine the program name at runtime.

When using `set -u`, `set -e`, or `trap`, you should also use `set -E`
to have the error handling be passed down to subshells.

Feel free to use `set -e` (fail on error), but be careful of the
caveats (there are a bunch of them); don't assume all errors are
checked because of it.

Use `set -u` if you can; it makes using an unset variable an error.
 - If a variable not being set is valid (perhaps a configuration
   option), use `${var:-}` when accessing it to suppress the error.
 - An empty array counts as unset, so if you have an array that may be
   empty, use `${var+"${var[@]}"}` (don't put quotes around the outer
   pair of braces) to only access it if it's non-empty.
   - The reason for this is that a normal string variable is basically
     an array with length=1; an unset variable looks like an array
     with length=0.  Weird stuff.
   - Actually, that was only true in Bash <4.4; but since the change
     wasn't mentioned in the NEWS, I suspect that a subsequent version
     will revert to the pre-4.4 behavior.

In the shebang, use `#!/usr/bin/env bash`.  This allows us to not
hardcode the location of bash (I'm not sure why this is useful for
something distro-dependent like libretools, but fauno seems to have a
use-case for it).

In the shebang, don't pass flags to bash, besides breaking `env`
(above), it means people will make mistakes when debugging, and
running things with `bash -x FILENAME`.  Instead, use `set` to adjust
the flags inside of the program.

Obey `$TMPDIR`.  It's usually as easy as passing `--tmpdir` to
`mktemp`.

Use `trap` to clean up your temporary files.  This way, even if your
program terminates early, things will be cleaned up.

Bash best practices
===================

Basically, know what you are doing, and be safe with it.  The problem
is that most people don't know about safe bash scripting.

A lot of people look at the "Advanced Bash Scripting" ebook--DO NOT do
that, it is trash... though it contains a "reference card" page that
may be useful and isn't trash.

Take a look at Gentoo's Bash
guidelines
<http://devmanual.gentoo.org/tools-reference/bash/index.html>.
They're pretty good, and cover most of the "gotcha's" about Bash
syntax.  Though, it mentions but discourages the use of Bash 3
features... why?  Who still uses Bash 2?  For libretools, you should
feel free to use Bash 4 features!

I wrote an article on Bash arrays
<https://lukeshu.com/blog/bash-arrays.html>.  A lot of people think
they're tricky, but they're simple once you know how they work.  It's
short enough that you should read the whole thing.  Know the
difference between `"${array[@]}"` and `"${array[*]}"`.  And I'll say
it again here, ALWAYS wrap those in double quotes; there is no reason
I can think of that the unquoted behavior would ever be the correct
thing.

My brief rules of thumb:

 - Quote every variable.
   - That includes arrays: `"${array[@]}"` and `"${array[*]}"`.
   - In most (but not all!) cases inside of `[[ ... ]]` conditions,
     variables don't need to be quoted.  When in doubt, quote them.
   - When assigning one variable to another, you don't need quotes;
     you don't need quotes for `foo=$bar`
 - Try to avoid global variables; declare all variables in functions
   with `local`.
   - Or `declare`; inside of a function, unless you pass the `-g`
     flag, `declare` makes the variable local.
 - Use `local VAR` before a `for VAR in LIST` loop--the variable is created in the
   current scope, not the scope of the loop.
 - Feeding input to `while` loops is weird because of how subshells
   work:

	# Input from a file
	# BAD
	cat file | while read line; do
		...
	done
	# GOOD
	while read line; do
		...
	done <file

	# Input from a program
	# BAD
	prog | while read line; do
		...
	done
	# GOOD
	while read line; do
		...
	done < <(prog)