.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2024

The goal of Landlock is to enable restriction of ambient rights (e.g. global
filesystem or network access) for a set of processes. Because Landlock
is a stackable LSM, it makes it possible to create safe security sandboxes as
new security layers in addition to the existing system-wide access-controls.
This kind of sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user space applications. Landlock empowers
any process, including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root):
``dmesg | grep landlock || journalctl -kb -g landlock``.
Developers can also easily check for Landlock support with a
:ref:`related system call <landlock_abi_versions>`.
If Landlock is not currently supported, we need to
:ref:`configure the kernel appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object which the process intends to
perform. A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules
    For these rules, the object is a file hierarchy,
    and the related filesystem actions are defined with
    `filesystem access rights`.

Network rules (since ABI v4)
    For these rules, the object is a TCP port,
    and the related actions are defined with `network access rights`.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem
read actions and establish a specific TCP connection. Filesystem write
actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions. This is
required for backward and forward compatibility (i.e. the kernel and user
space may not know each other's supported restrictions), hence the need
to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE |
            LANDLOCK_ACCESS_FS_IOCTL_DEV,
        .handled_access_net =
            LANDLOCK_ACCESS_NET_BIND_TCP |
            LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .scoped =
            LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
            LANDLOCK_SCOPE_SIGNAL,
    };

Because we may not know which kernel version an application will be executed
on, it is safer to follow a best-effort security approach. Indeed, we
should try to protect users as much as possible whatever the kernel they are
using.

To be compatible with older Linux versions, we detect the available Landlock ABI
version, and only use the available subset of access rights:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not available. */
        perror("Landlock is not available on the running kernel");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
        __attribute__((fallthrough));
    case 3:
        /* Removes network support for ABI < 4 */
        ruleset_attr.handled_access_net &=
            ~(LANDLOCK_ACCESS_NET_BIND_TCP |
              LANDLOCK_ACCESS_NET_CONNECT_TCP);
        __attribute__((fallthrough));
    case 4:
        /* Removes LANDLOCK_ACCESS_FS_IOCTL_DEV for ABI < 5 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_IOCTL_DEV;
        __attribute__((fallthrough));
    case 5:
        /* Removes LANDLOCK_SCOPE_* for ABI < 6 */
        ruleset_attr.scoped &= ~(LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET |
                                 LANDLOCK_SCOPE_SIGNAL);
    }

This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset. The rule will only allow reading the
file hierarchy ``/usr``. Without another rule, write actions would then be
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version. In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.
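
When a rule does request rights newer than ABI 1, one way to apply the same
best-effort logic is to mask the rule's ``allowed_access`` with the ruleset's
already filtered ``handled_access_fs``, since sys_landlock_add_rule() rejects
a rule whose accesses are not a subset of the handled ones. The sketch below
only illustrates that masking step; ``clamp_allowed_access()`` is a
hypothetical helper name and not part of the Landlock API.

```c
#include <stdint.h>

/*
 * Hypothetical helper: keeps only the requested rights that the ruleset
 * actually handles.  Both arguments are plain bitmasks built from
 * LANDLOCK_ACCESS_FS_* values.
 */
static inline uint64_t clamp_allowed_access(uint64_t requested,
                                            uint64_t handled)
{
    return requested & handled;
}
```

With the variables of this example, it could be used as
``path_beneath.allowed_access = clamp_allowed_access(wanted, ruleset_attr.handled_access_fs);``
where ``wanted`` is the full set of rights the application would like to grant.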

For network access-control, we can add a set of rules that allow using a port
number for a specific action, here HTTPS connections.

.. code-block:: c

    struct landlock_net_port_attr net_port = {
        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .port = 443,
    };

    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);

The next step is to restrict the current thread from gaining more privileges
(e.g. through a SUID binary). We now have a ruleset with the first rule
allowing read access to ``/usr`` while denying all other handled accesses for
the filesystem, and a second rule allowing HTTPS connections.

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well. Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed. These threads are
now in a new Landlock domain, which is a merger of their parent one (if any)
with the new ruleset.
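
The snippets above call ``landlock_create_ruleset()``, ``landlock_add_rule()``
and ``landlock_restrict_self()`` as if they were ordinary functions. The C
library may not provide such wrappers. A minimal sketch, assuming kernel
headers that ship ``linux/landlock.h`` and define the ``__NR_landlock_*``
syscall numbers (Linux 5.13 or later), is to go through :manpage:`syscall(2)`:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/landlock.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrappers around the three raw Landlock system calls. */
static inline int landlock_create_ruleset(
        const struct landlock_ruleset_attr *attr, size_t size, __u32 flags)
{
    return (int)syscall(__NR_landlock_create_ruleset, attr, size, flags);
}

static inline int landlock_add_rule(int ruleset_fd,
                                    enum landlock_rule_type rule_type,
                                    const void *rule_attr, __u32 flags)
{
    return (int)syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                        rule_attr, flags);
}

static inline int landlock_restrict_self(int ruleset_fd, __u32 flags)
{
    return (int)syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
}
```

Each wrapper returns -1 and sets ``errno`` on failure, like the system call it
forwards to, so the error handling shown in this section applies unchanged.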

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights to file hierarchy leaves as much as
possible. For instance, it is better to have ``~/doc/`` as a read-only
hierarchy and ``~/tmp/`` as a read-write hierarchy than to have ``~/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories). This is particularly
relevant when we want to allow linking or renaming. Indeed, having consistent
access rights per directory enables changing the location of such directories
without relying on the destination directory access rights (except those that
are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).

Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data. This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from. However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
and still keep the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy. This complementary policy is stacked with any
other rulesets potentially already restricting this thread. A sandboxed thread
can then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access. A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).
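
The two composition rules above (rules are combined inside one layer, and all
layers must agree) can be illustrated with a small user space model. This is
only a sketch: the ``toy_*`` names are hypothetical, paths are compared as
plain strings, and every layer is assumed to handle all the requested rights.
None of this reflects how the kernel actually walks a path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy rule: grants the `allowed` rights on `path` and everything beneath. */
struct toy_rule {
    const char *path;
    uint64_t allowed;
};

/* Toy layer: the set of rules of one enforced ruleset. */
struct toy_layer {
    const struct toy_rule *rules;
    size_t num_rules;
};

/* True if `dir` is `path` itself or one of its parent directories. */
static bool toy_is_beneath(const char *path, const char *dir)
{
    size_t n = strlen(dir);

    if (strncmp(path, dir, n) != 0)
        return false;
    return path[n] == '\0' || path[n] == '/' || (n > 0 && dir[n - 1] == '/');
}

/* One layer grants a request if rules met on the path cover every right. */
static bool toy_layer_grants(const struct toy_layer *layer, const char *path,
                             uint64_t request)
{
    uint64_t granted = 0;

    for (size_t i = 0; i < layer->num_rules; i++)
        if (toy_is_beneath(path, layer->rules[i].path))
            granted |= layer->rules[i].allowed;
    return (request & ~granted) == 0;
}

/* The whole domain grants a request only if every layer grants it. */
static bool toy_domain_grants(const struct toy_layer *layers,
                              size_t num_layers, const char *path,
                              uint64_t request)
{
    for (size_t i = 0; i < num_layers; i++)
        if (!toy_layer_grants(&layers[i], path, request))
            return false;
    return true;
}
```

In this model, stacking a second layer can only shrink the set of granted
requests, which matches the statement that a sandboxed thread can only add
constraints to itself.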

Bind mounts and OverlayFS
-------------------------

Landlock enables restricting access to file hierarchies, which means that these
access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination. The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path. These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are
combined in a merge directory, and that merged directory becomes available at
the mount point. This merge hierarchy may include files from the upper and
lower layers, but modifications performed on the merge hierarchy only reflect
on the upper layer. From a Landlock policy point of view, all OverlayFS layers
and merge hierarchies are standalone and each contains its own set of files
and directories, which is different from bind mounts. A policy restricting an
OverlayFS layer will not restrict the resulting merged hierarchy, and vice versa.
Landlock users should then only think about file hierarchies they want to allow
access to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent. This is similar to seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`. For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants. This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a superset of the target process's
access rights, which means the tracee must be in a sub-domain of the tracer.
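
The sub-domain condition can be pictured as a walk up a chain of domains. The
model below is a sketch under stated assumptions: ``struct toy_domain`` and the
``toy_*`` functions are hypothetical names, a ``NULL`` pointer stands for a
process that is not sandboxed, and a tracee in the tracer's own domain is
assumed to count as being in a sub-domain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy domain: each enforced ruleset creates a child of the previous one. */
struct toy_domain {
    const struct toy_domain *parent;
};

/*
 * True if `ancestor` is `descendant` itself or one of its parents.
 * A NULL ancestor (no sandbox) is above every domain.
 */
static bool toy_domain_is_ancestor(const struct toy_domain *ancestor,
                                   const struct toy_domain *descendant)
{
    if (!ancestor)
        return true;
    for (; descendant; descendant = descendant->parent)
        if (descendant == ancestor)
            return true;
    return false;
}

/* The tracee must live in the tracer's domain or in a domain nested in it. */
static bool toy_may_ptrace(const struct toy_domain *tracer,
                           const struct toy_domain *tracee)
{
    return toy_domain_is_ancestor(tracer, tracee);
}
```

In this model a sandboxed process can never trace a non-sandboxed one, nor a
process sandboxed in a sibling domain, while a non-sandboxed process is not
limited by Landlock at all.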

IPC scoping
-----------

Similar to the implicit `Ptrace restrictions`_, we may want to further restrict
interactions between sandboxes. Each Landlock domain can be explicitly scoped
for a set of actions by specifying it on a ruleset. For example, if a
sandboxed process should not be able to :manpage:`connect(2)` to a
non-sandboxed process through abstract :manpage:`unix(7)` sockets, we can
specify such a restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``.
Moreover, if a sandboxed process should not be able to send a signal to a
non-sandboxed process, we can specify this restriction with
``LANDLOCK_SCOPE_SIGNAL``.

A sandboxed process can connect to a non-sandboxed process when its domain is
not scoped. If a process's domain is scoped, it can only connect to sockets
created by processes in the same scope.
Moreover, if a process's domain is scoped for signals, it can only send signals
to processes in the same scope.

A connected datagram socket behaves like a stream socket when its domain is
scoped, meaning if the domain is scoped after the socket is connected, it can
still :manpage:`send(2)` data just like a stream socket. However, in the same
scenario, a non-connected datagram socket cannot send data (with
:manpage:`sendto(2)`) outside its scope.

A process with a scoped domain can inherit a socket created by a non-scoped
process. The process cannot connect to this socket since it has a scoped
domain.

IPC scoping does not support exceptions, so if a domain is scoped, no rules can
be added to allow access to resources or processes outside of the scope.

Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways. It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`. The name suggests
that this system call requires the rights to create and write files. However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right. Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

The truncate right is associated with the opened file (see below).

Rights associated with file descriptors
---------------------------------------

When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE`` and
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` rights is associated with the newly created
file descriptor and will be used for subsequent truncation and ioctl attempts
using :manpage:`ftruncate(2)` and :manpage:`ioctl(2)`. The behavior is similar
to opening a file for reading or writing, where permissions are checked during
:manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.

As a consequence, it is possible that a process has multiple open file
descriptors referring to the same file, but Landlock enforces different things
when operating with these file descriptors. This can happen when a Landlock
ruleset gets enforced and the process keeps file descriptors which were opened
both before and after the enforcement. It is also possible to pass such file
descriptors between processes, keeping their Landlock properties, even when some
of the involved processes do not have an enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel. This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``. Making
handled access rights explicit enables the kernel and user space to have a clear
contract with each other. This is required to make sure sandboxing will not
get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features. In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version. Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access net_access scope

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr
                  landlock_net_port_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

Threads sandboxed with filesystem restrictions cannot modify filesystem
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset. However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted. Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted. However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies. Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets. This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG. It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

Memory usage
------------

Kernel memory allocated to create rulesets is accounted for and can be
restricted with the memory cgroup controller (cf.
Documentation/admin-guide/cgroup-v1/memory.rst).

IOCTL support
-------------

The ``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right restricts the use of
:manpage:`ioctl(2)`, but it only applies to *newly opened* device files. This
means specifically that pre-existing file descriptors like stdin, stdout and
stderr are unaffected.

Users should be aware that TTY devices have traditionally permitted controlling
other processes on the same TTY through the ``TIOCSTI`` and ``TIOCLINUX`` IOCTL
commands. Both of these require ``CAP_SYS_ADMIN`` on modern Linux systems, but
the behavior is configurable for ``TIOCSTI``.

On older systems, it is therefore recommended to close inherited TTY file
descriptors, or to reopen them from ``/proc/self/fd/*`` without the
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right, if possible.

Landlock's IOCTL support is coarse-grained at the moment, but may become more
fine-grained in the future. Until then, users are advised to establish the
guarantees that they need through the file hierarchy, by only allowing the
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right on files where it is really required.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules. This property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy. Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints. To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.

TCP bind and connect (ABI < 4)
------------------------------

Starting with the Landlock ABI version 4, it is now possible to restrict TCP
bind and connect actions to only a set of allowed ports thanks to the new
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
access rights.

Device IOCTL (ABI < 5)
----------------------

IOCTL operations could not be denied before the fifth Landlock ABI, so
:manpage:`ioctl(2)` is always allowed when using a kernel that only supports an
earlier ABI.

Starting with the Landlock ABI version 5, it is possible to restrict the use of
:manpage:`ioctl(2)` on character and block devices using the new
``LANDLOCK_ACCESS_FS_IOCTL_DEV`` right.

Abstract UNIX socket (ABI < 6)
------------------------------

Starting with the Landlock ABI version 6, it is possible to restrict
connections to an abstract :manpage:`unix(7)` socket by setting
``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET`` in the ``scoped`` ruleset attribute.

Signal (ABI < 6)
----------------

Starting with the Landlock ABI version 6, it is possible to restrict
:manpage:`signal(7)` sending by setting ``LANDLOCK_SCOPE_SIGNAL`` in the
``scoped`` ruleset attribute.

.. _kernel_support:

Kernel support
==============

Build time configuration
------------------------

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
time like other security modules. The list of security modules enabled by
default is set with ``CONFIG_LSM``. The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

Boot time configuration
-----------------------

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
enable Landlock by adding ``lsm=landlock,[...]`` to the kernel command line
(cf. Documentation/admin-guide/kernel-parameters.rst) in the boot loader
configuration.

For example, if the current built-in configuration is:

.. code-block:: console

    $ zgrep -h "^CONFIG_LSM=" "/boot/config-$(uname -r)" /proc/config.gz 2>/dev/null
    CONFIG_LSM="lockdown,yama,integrity,apparmor"

...and if the cmdline doesn't contain ``landlock`` either:

.. code-block:: console

    $ sed -n 's/.*\(\<lsm=\S\+\).*/\1/p' /proc/cmdline
    lsm=lockdown,yama,integrity,apparmor

...we should configure the boot loader to set a cmdline extending the ``lsm``
list with the ``landlock,`` prefix::

    lsm=landlock,lockdown,yama,integrity,apparmor

After a reboot, we can check that Landlock is up and running by looking at
kernel logs:

.. code-block:: console

    # dmesg | grep landlock || journalctl -kb -g landlock
    [    0.000000] Command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] Kernel command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] LSM: initializing lsm=lockdown,capability,landlock,yama,integrity,apparmor
    [    0.000000] landlock: Up and running.

The kernel may be configured at build time to always load the ``lockdown`` and
``capability`` LSMs. In that case, these LSMs will appear at the beginning of
the ``LSM: initializing`` log line as well, even if they are not configured in
the boot loader.
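
The membership check that the ``zgrep`` and ``sed`` commands above leave to the
reader can also be sketched in C. The helper below is a hypothetical
illustration, not an existing interface: it takes a comma-separated LSM list,
such as the value of ``CONFIG_LSM`` or of the ``lsm=`` parameter, and reports
whether a given name is one of its entries.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` is one entry of the comma-separated `list`. */
static bool lsm_list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);

    while (*list) {
        /* Length of the current entry, up to the next comma or the end. */
        size_t entry_len = strcspn(list, ",");

        if (entry_len == name_len && strncmp(list, name, name_len) == 0)
            return true;
        list += entry_len;
        if (*list == ',')
            list++;
    }
    return false;
}
```

Entries are compared whole, so a name that merely starts with ``landlock``
does not match.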

Network support
---------------

To be able to explicitly allow TCP operations (e.g., adding a network rule with
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
(``CONFIG_INET=y``). Otherwise, sys_landlock_add_rule() returns an
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
operation is already not possible.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using user space processes to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and therefore lack useful features for such a use case (e.g. no
fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
1.. SPDX-License-Identifier: GPL-2.0
2.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
3.. Copyright © 2019-2020 ANSSI
4.. Copyright © 2021 Microsoft Corporation
5
6=====================================
7Landlock: unprivileged access control
8=====================================
9
10:Author: Mickaël Salaün
11:Date: March 2021
12
13The goal of Landlock is to enable to restrict ambient rights (e.g. global
14filesystem access) for a set of processes. Because Landlock is a stackable
15LSM, it makes possible to create safe security sandboxes as new security layers
16in addition to the existing system-wide access-controls. This kind of sandbox
17is expected to help mitigate the security impact of bugs or
18unexpected/malicious behaviors in user space applications. Landlock empowers
19any process, including unprivileged ones, to securely restrict themselves.
20
21Landlock rules
22==============
23
24A Landlock rule describes an action on an object. An object is currently a
25file hierarchy, and the related filesystem actions are defined with `access
26rights`_. A set of rules is aggregated in a ruleset, which can then restrict
27the thread enforcing it, and its future children.
28
29Defining and enforcing a security policy
30----------------------------------------
31
32We first need to create the ruleset that will contain our rules. For this
33example, the ruleset will contain rules that only allow read actions, but write
34actions will be denied. The ruleset then needs to handle both of these kind of
35actions.
36
37.. code-block:: c
38
39 int ruleset_fd;
40 struct landlock_ruleset_attr ruleset_attr = {
41 .handled_access_fs =
42 LANDLOCK_ACCESS_FS_EXECUTE |
43 LANDLOCK_ACCESS_FS_WRITE_FILE |
44 LANDLOCK_ACCESS_FS_READ_FILE |
45 LANDLOCK_ACCESS_FS_READ_DIR |
46 LANDLOCK_ACCESS_FS_REMOVE_DIR |
47 LANDLOCK_ACCESS_FS_REMOVE_FILE |
48 LANDLOCK_ACCESS_FS_MAKE_CHAR |
49 LANDLOCK_ACCESS_FS_MAKE_DIR |
50 LANDLOCK_ACCESS_FS_MAKE_REG |
51 LANDLOCK_ACCESS_FS_MAKE_SOCK |
52 LANDLOCK_ACCESS_FS_MAKE_FIFO |
53 LANDLOCK_ACCESS_FS_MAKE_BLOCK |
54 LANDLOCK_ACCESS_FS_MAKE_SYM,
55 };
56
57 ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
58 if (ruleset_fd < 0) {
59 perror("Failed to create a ruleset");
60 return 1;
61 }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset. The rule will only allow reading the
file hierarchy ``/usr``. Without another rule, write actions would then be
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    /* Rights granted beneath the directory referred to by parent_fd. */
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    /* The rule is now part of the ruleset, so this descriptor can be closed. */
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem. The next step is to
restrict the current thread from gaining more privileges (e.g. through a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the `landlock_restrict_self` system call succeeds, the current thread is now
restricted and this policy will be enforced on all its subsequently created
children as well. Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed. These threads are
now in a new Landlock domain, which is the merge of their parent domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy. This complementary policy is stacked on top of any
other rulesets that already restrict this thread. A sandboxed thread can then
safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access. A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).
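
The evaluation described above can be sketched with a small user space model.
This is only an illustration, not the kernel implementation: ``struct layer``,
``layer_grants()`` and ``domain_grants()`` are hypothetical names, each access
right is assumed to be a single bit, and a layer is assumed not to restrict
rights it does not handle.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical model of one enforced ruleset (one policy layer). */
    struct layer {
        uint64_t handled;           /* rights this layer restricts */
        const uint64_t *rule_masks; /* rights allowed by each rule on the path */
        size_t num_rules;
    };

    /* A layer grants @access if at least one rule on the path allows it. */
    static bool layer_grants(const struct layer *layer, uint64_t access)
    {
        if (!(layer->handled & access))
            return true; /* not handled: this layer does not restrict it */
        for (size_t i = 0; i < layer->num_rules; i++) {
            if (layer->rule_masks[i] & access)
                return true;
        }
        return false;
    }

    /* The domain grants @access only if every stacked layer grants it. */
    static bool domain_grants(const struct layer *layers, size_t num_layers,
                              uint64_t access)
    {
        for (size_t i = 0; i < num_layers; i++) {
            if (!layer_grants(&layers[i], access))
                return false;
        }
        return true;
    }

In this model, stacking a read-only layer on top of a read-write layer leaves
only read access: a new layer can narrow what earlier layers allow, but never
widen it. Other system access controls still apply on top of the result.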

Bind mounts and OverlayFS
-------------------------

Landlock enables restricting access to file hierarchies, which means that these
access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination. The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path. These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are
combined in a merge directory, which is the result of the mount point. This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected in the upper
layer. From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts. A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa. Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent. This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
a task's :manpage:`credentials(7)`. For instance, one process's thread may
apply Landlock rules to itself, but they will not be automatically applied to
other sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants. This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.
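
The sub-domain condition can be pictured with a small user space model. This
is a sketch under stated assumptions, not the kernel code: ``struct domain``
and ``may_ptrace()`` are hypothetical names, a ``NULL`` domain stands for a
non-sandboxed process, and a tracee in the very same domain as the tracer is
assumed to be allowed.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model: each domain points to the one it was stacked on. */
    struct domain {
        const struct domain *parent; /* NULL for a first-level domain */
    };

    /*
     * Returns true if @tracee_dom is @tracer_dom or one of its descendants,
     * i.e. if the tracee is at least as restricted as the tracer.
     */
    static bool may_ptrace(const struct domain *tracer_dom,
                           const struct domain *tracee_dom)
    {
        if (!tracer_dom)
            return true; /* non-sandboxed tracer: no Landlock restriction */
        for (const struct domain *d = tracee_dom; d; d = d->parent) {
            if (d == tracer_dom)
                return true;
        }
        return false;
    }

With this model, a process can trace its more restricted descendants, but a
sandboxed process can trace neither a less restricted process nor a process
from an unrelated domain.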

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

File renaming and linking
-------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules. Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheriting the ruleset restrictions from
a parent to its hierarchy. Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagating the hierarchy constraints. To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock currently limits linking and renaming to the same directory. Future
Landlock evolutions will enable more flexibility for renaming and linking, with
dedicated ruleset flags.

Filesystem topology modification
--------------------------------

As for file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`. However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset. However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted. Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted. However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies. Future Landlock evolutions could still enable explicitly
restricting such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 64 layers of stacked rulesets. This can be an issue for a
task willing to enforce a new ruleset in complement to its 64 inherited
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG. It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).
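
An application that may run with many inherited layers can tell this case
apart from other failures. The following is a minimal sketch; the
``restrict_outcome`` enumeration and the ``classify_restrict_self()`` helper
are hypothetical names introduced for illustration only.

.. code-block:: c

    #include <errno.h>

    /* Hypothetical outcomes an application may want to tell apart. */
    enum restrict_outcome {
        RESTRICT_OK,
        RESTRICT_LAYER_LIMIT, /* E2BIG: no further ruleset can be stacked */
        RESTRICT_OTHER_ERROR,
    };

    /*
     * Classifies the result of landlock_restrict_self(): @ret is its return
     * value and @err is the value of errno read right after the call.
     */
    static enum restrict_outcome classify_restrict_self(int ret, int err)
    {
        if (ret == 0)
            return RESTRICT_OK;
        if (err == E2BIG)
            return RESTRICT_LAYER_LIMIT;
        return RESTRICT_OTHER_ERROR;
    }

A launcher could, for instance, log the ``RESTRICT_LAYER_LIMIT`` case and then
decide whether to keep running under the inherited layers or to abort. Which
choice is appropriate depends on the application.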

Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c