Diffstat (limited to 'man/systemd.exec.xml')
-rw-r--r-- | man/systemd.exec.xml | 91 |
1 files changed, 66 insertions, 25 deletions
diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml index ec702d515d..c1867b4ed2 100644 --- a/man/systemd.exec.xml +++ b/man/systemd.exec.xml @@ -175,7 +175,9 @@ source path, destination path and option string, where the latter two are optional. If only a source path is specified the source and destination is taken to be the same. The option string may be either <literal>rbind</literal> or <literal>norbind</literal> for configuring a recursive or non-recursive bind - mount. If the destination path is omitted, the option string must be omitted too.</para> + mount. If the destination path is omitted, the option string must be omitted too. + Each bind mount definition may be prefixed with <literal>-</literal>, in which case it will be ignored + when its source path does not exist.</para> <para><varname>BindPaths=</varname> creates regular writable bind mounts (unless the source file system mount is already marked read-only), while <varname>BindReadOnlyPaths=</varname> creates read-only bind mounts. These @@ -631,8 +633,8 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting> processes. In this modes multiple units running processes under the same user ID may share key material. Unless <option>inherit</option> is selected the unique invocation ID for the unit (see below) is added as a protected key by the name <literal>invocation_id</literal> to the newly created session keyring. Defaults to - <option>private</option> for the system service manager and to <option>inherit</option> for the user service - manager.</para></listitem> + <option>private</option> for services of the system service manager and to <option>inherit</option> for + non-service units and for services of the user service manager.</para></listitem> </varlistentry> <varlistentry> @@ -786,14 +788,24 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting> <varlistentry> <term><varname>ProtectHome=</varname></term> - <listitem><para>Takes a boolean argument or <literal>read-only</literal>. 
If true, the directories - <filename>/home</filename>, <filename>/root</filename> and <filename>/run/user</filename> are made inaccessible - and empty for processes invoked by this unit. If set to <literal>read-only</literal>, the three directories are - made read-only instead. It is recommended to enable this setting for all long-running services (in particular - network-facing ones), to ensure they cannot get access to private user data, unless the services actually - require access to the user's private data. This setting is implied if <varname>DynamicUser=</varname> is - set. For this setting the same restrictions regarding mount propagation and privileges apply as for - <varname>ReadOnlyPaths=</varname> and related calls, see below.</para></listitem> + <listitem><para>Takes a boolean argument or the special values <literal>read-only</literal> or + <literal>tmpfs</literal>. If true, the directories <filename>/home</filename>, <filename>/root</filename> and + <filename>/run/user</filename> are made inaccessible and empty for processes invoked by this unit. If set to + <literal>read-only</literal>, the three directories are made read-only instead. If set to <literal>tmpfs</literal>, + temporary file systems are mounted on the three directories in read-only mode. The value <literal>tmpfs</literal> + is useful to hide home directories not relevant to the processes invoked by the unit, while necessary directories + are still visible by combining with <varname>BindPaths=</varname> or <varname>BindReadOnlyPaths=</varname>.</para> + + <para>Setting this to <literal>yes</literal> is mostly equivalent to set the three directories in + <varname>InaccessiblePaths=</varname>. 
Similarly, <literal>read-only</literal> is mostly equivalent to + <varname>ReadOnlyPaths=</varname>, and <literal>tmpfs</literal> is mostly equivalent to + <varname>TemporaryFileSystem=</varname>.</para> + + <para> It is recommended to enable this setting for all long-running services (in particular network-facing ones), + to ensure they cannot get access to private user data, unless the services actually require access to the user's + private data. This setting is implied if <varname>DynamicUser=</varname> is set. For this setting the same + restrictions regarding mount propagation and privileges apply as for <varname>ReadOnlyPaths=</varname> and related + calls, see below.</para></listitem> </varlistentry> <varlistentry> @@ -904,9 +916,13 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting> reading only, writing will be refused even if the usual file access controls would permit this. Nest <varname>ReadWritePaths=</varname> inside of <varname>ReadOnlyPaths=</varname> in order to provide writable subdirectories within read-only directories. Use <varname>ReadWritePaths=</varname> in order to whitelist - specific paths for write access if <varname>ProtectSystem=strict</varname> is used. Paths listed in - <varname>InaccessiblePaths=</varname> will be made inaccessible for processes inside the namespace (along with - everything below them in the file system hierarchy).</para> + specific paths for write access if <varname>ProtectSystem=strict</varname> is used.</para> + + <para>Paths listed in <varname>InaccessiblePaths=</varname> will be made inaccessible for processes inside + the namespace along with everything below them in the file system hierarchy. This may be more restrictive than + desired, because it is not possible to nest <varname>ReadWritePaths=</varname>, <varname>ReadOnlyPaths=</varname>, + <varname>BindPaths=</varname>, or <varname>BindReadOnlyPaths=</varname> inside it. 
For a more flexible option, + see <varname>TemporaryFileSystem=</varname>.</para> <para>Note that restricting access with these options does not extend to submounts of a directory that are created later on. Non-directory paths may be specified as well. These options may be specified more than once, @@ -931,6 +947,29 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting> </varlistentry> <varlistentry> + <term><varname>TemporaryFileSystem=</varname></term> + + <listitem><para>Takes a space-separated list of mount points for temporary file systems (tmpfs). If set, a new file + system namespace is set up for executed processes, and a temporary file system is mounted on each mount point. + This option may be specified more than once, in which case temporary file systems are mounted on all listed mount + points. If the empty string is assigned to this option, the list is reset, and all prior assignments have no effect. + Each mount point may optionally be suffixed with a colon (<literal>:</literal>) and mount options such as + <literal>size=10%</literal> or <literal>ro</literal>. By default, each temporary file system is mounted + with <literal>nodev,strictatime,mode=0755</literal>. These can be disabled by explicitly specifying the corresponding + mount options, e.g., <literal>dev</literal> or <literal>nostrictatime</literal>.</para> + + <para>This is useful to hide files or directories not relevant to the processes invoked by the unit, while necessary + files or directories can be still accessed by combining with <varname>BindPaths=</varname> or + <varname>BindReadOnlyPaths=</varname>. 
See the example below.</para> + + <para>Example: if a unit has the following, + <programlisting>TemporaryFileSystem=/var:ro +BindReadOnlyPaths=/var/lib/systemd</programlisting> + then the invoked processes by the unit cannot see any files or directories under <filename>/var</filename> except for + <filename>/var/lib/systemd</filename> or its contents.</para></listitem> + </varlistentry> + + <varlistentry> <term><varname>PrivateTmp=</varname></term> <listitem><para>Takes a boolean argument. If true, sets up a new file system namespace for the executed @@ -1429,17 +1468,19 @@ CapabilityBoundingSet=~CAP_B CAP_C</programlisting> filter. The known architecture identifiers are the same as for <varname>ConditionArchitecture=</varname> described in <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry>, as well as <constant>x32</constant>, <constant>mips64-n32</constant>, <constant>mips64-le-n32</constant>, and - the special identifier <constant>native</constant>. Only system calls of the specified architectures will be - permitted to processes of this unit. This is an effective way to disable compatibility with non-native - architectures for processes, for example to prohibit execution of 32-bit x86 binaries on 64-bit x86-64 - systems. The special <constant>native</constant> identifier implicitly maps to the native architecture of the - system (or more strictly: to the architecture the system manager is compiled for). If running in user mode, or - in system mode, but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting - <varname>User=nobody</varname>), <varname>NoNewPrivileges=yes</varname> is implied. Note that setting this - option to a non-empty list implies that <constant>native</constant> is included too. By default, this option is - set to the empty list, i.e. no system call architecture filtering is applied.</para> - - <para>Note that system call filtering is not equally effective on all architectures. 
For example, on x86 + the special identifier <constant>native</constant>. The special identifier <constant>native</constant> + implicitly maps to the native architecture of the system (or more precisely: to the architecture the system + manager is compiled for). If running in user mode, or in system mode, but without the + <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=nobody</varname>), + <varname>NoNewPrivileges=yes</varname> is implied. By default, this option is set to the empty list, i.e. no + system call architecture filtering is applied.</para> + + <para>If this setting is used, processes of this unit will only be permitted to call native system calls, and + system calls of the specified architectures. For the purposes of this option, the x32 architecture is treated + as including x86-64 system calls. However, this setting still fulfills its purpose, as explained below, on + x32.</para> + + <para>System call filtering is not equally effective on all architectures. For example, on x86 filtering of network socket-related calls is not possible, due to ABI limitations — a limitation that x86-64 does not have, however. On systems supporting multiple ABIs at the same time — such as x86/x86-64 — it is hence recommended to limit the set of permitted system call architectures so that secondary ABIs may not be used to |