.. SPDX-License-Identifier: GPL-2.0
.. Copyright (C) 2020, Google LLC.
.. Kernel version: v6.13.7

Kernel Electric-Fence (KFENCE)
==============================

Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety
error detector. KFENCE detects heap out-of-bounds access, use-after-free, and
invalid-free errors.

KFENCE is designed to be enabled in production kernels, and has near zero
performance overhead. Compared to KASAN, KFENCE trades precision for
performance. The main motivation behind KFENCE's design is that with enough
total uptime KFENCE will detect bugs in code paths not typically exercised by
non-production test workloads. One way to quickly achieve a large enough total
uptime is to deploy the tool across a large fleet of machines.

Usage
-----

To enable KFENCE, configure the kernel with::

    CONFIG_KFENCE=y

To build a kernel with KFENCE support, but disabled by default (to enable, set
``kfence.sample_interval`` to a non-zero value), configure the kernel with::

    CONFIG_KFENCE=y
    CONFIG_KFENCE_SAMPLE_INTERVAL=0

KFENCE provides several other configuration options to customize behaviour (see
the respective help text in ``lib/Kconfig.kfence`` for more info).

Tuning performance
~~~~~~~~~~~~~~~~~~

The most important parameter is KFENCE's sample interval, which can be set via
the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The
sample interval determines the frequency with which heap allocations will be
guarded by KFENCE. The default is configurable via the Kconfig option
``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0``
disables KFENCE.

The sample interval controls a timer that sets up KFENCE allocations. By
default, to keep the real sample interval predictable, the normal timer also
causes CPU wake-ups when the system is completely idle. This may be undesirable
on power-constrained systems. The boot parameter ``kfence.deferrable=1``
instead switches to a "deferrable" timer which does not force CPU wake-ups on
idle systems, at the risk of unpredictable sample intervals. The default is
configurable via the Kconfig option ``CONFIG_KFENCE_DEFERRABLE``.

.. warning::
   The KUnit test suite is very likely to fail when using a deferrable timer
   since it currently causes very unpredictable sample intervals.
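
As a rough illustration of how these boot parameters appear on a kernel command
line, the following sketch extracts ``kfence.*`` options from a command-line
string. The helper name ``parse_kfence_params`` and the example command line
are hypothetical and not part of KFENCE; values are kept as strings because the
sketch does not validate them.

.. code-block:: python

    def parse_kfence_params(cmdline: str) -> dict[str, str]:
        """Collect ``kfence.<name>=<value>`` options from a command-line string."""
        params = {}
        for token in cmdline.split():
            key, sep, value = token.partition("=")
            if sep and key.startswith("kfence."):
                # In this sketch, a later occurrence overrides an earlier one.
                params[key[len("kfence."):]] = value
        return params

    # Hypothetical command line, for illustration only.
    example = "root=/dev/sda1 kfence.sample_interval=100 kfence.deferrable=1 quiet"
    params = parse_kfence_params(example)

    # A sample interval of 0 disables KFENCE.
    disabled = params.get("sample_interval") == "0"

On a running system, the same helper could be applied to the contents of
``/proc/cmdline``.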

By default KFENCE will only sample 1 heap allocation within each sample
interval. *Burst mode* allows sampling successive heap allocations: the
kernel boot parameter ``kfence.burst`` can be set to a non-zero value which
denotes the *additional* successive allocations within a sample interval;
setting ``kfence.burst=N`` means that ``1 + N`` successive allocations are
attempted through KFENCE for each sample interval.
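
The effect of burst mode can be pictured with a toy model of the sampling gate.
This is only a sketch of the behaviour described above, not the kernel's data
structure; the class ``AllocationGate`` and its methods are hypothetical names.

.. code-block:: python

    class AllocationGate:
        """Toy model: each timer expiry admits 1 + burst allocations."""

        def __init__(self, burst: int = 0):
            self.burst = burst
            self.remaining = 0  # guarded allocations still allowed in this interval

        def timer_expired(self) -> None:
            # A new sample interval opens the gate for 1 + burst allocations.
            self.remaining = 1 + self.burst

        def allocate(self) -> bool:
            """Return True if this allocation would be attempted through KFENCE."""
            if self.remaining > 0:
                self.remaining -= 1
                return True
            return False

    gate = AllocationGate(burst=2)
    gate.timer_expired()
    sampled = [gate.allocate() for _ in range(5)]
    # With burst=2, only the first three allocations of the interval are sampled.

All later allocations in the same interval go to the regular allocator until
the timer expires again.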

The KFENCE memory pool is of fixed size, and if the pool is exhausted, no
further KFENCE allocations occur. With ``CONFIG_KFENCE_NUM_OBJECTS`` (default
255), the number of available guarded objects can be controlled. Each object
requires 2 pages, one for the object itself and the other one used as a guard
page; object pages are interleaved with guard pages, and every object page is
therefore surrounded by two guard pages.

The total memory dedicated to the KFENCE memory pool can be computed as::

    ( #objects + 1 ) * 2 * PAGE_SIZE

Using the default config, and assuming a page size of 4 KiB, results in
dedicating 2 MiB to the KFENCE memory pool.
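
The formula can be checked with a few lines of arithmetic. The function name
``kfence_pool_bytes`` is a hypothetical helper; the defaults mirror the values
quoted above (255 objects, 4 KiB pages).

.. code-block:: python

    def kfence_pool_bytes(num_objects: int = 255, page_size: int = 4096) -> int:
        """Pool size in bytes: ( #objects + 1 ) * 2 * PAGE_SIZE."""
        return (num_objects + 1) * 2 * page_size

    default_pool = kfence_pool_bytes()
    # (255 + 1) * 2 * 4096 bytes = 2 MiB
    default_pool_mib = default_pool / (1024 * 1024)

    # A hypothetical larger configuration with 1023 objects needs 8 MiB.
    larger_pool_mib = kfence_pool_bytes(num_objects=1023) / (1024 * 1024)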

Note: On architectures that support huge pages, KFENCE will ensure that the
pool is using pages of size ``PAGE_SIZE``. This will result in additional page
tables being allocated.

Error reports
~~~~~~~~~~~~~

A typical out-of-bounds access looks like this::

    ==================================================================
    BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa6/0x234

    Out-of-bounds read at 0xffff8c3f2e291fff (1B left of kfence-#72):
     test_out_of_bounds_read+0xa6/0x234
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    kfence-#72: 0xffff8c3f2e292000-0xffff8c3f2e29201f, size=32, cache=kmalloc-32

    allocated by task 484 on cpu 0 at 32.919330s:
     test_alloc+0xfe/0x738
     test_out_of_bounds_read+0x9b/0x234
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    CPU: 0 PID: 484 Comm: kunit_try_catch Not tainted 5.13.0-rc3+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

The header of the report provides a short summary of the function involved in
the access. It is followed by more detailed information about the access and
its origin. Note that real kernel addresses are only shown when using the
kernel command line option ``no_hash_pointers``.
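
For log triage, the header line can be split into its parts with a regular
expression. The sketch below is based only on the sample reports in this
document; the names ``HEADER_RE`` and ``parse_report_header`` are hypothetical,
and the pattern is not guaranteed to cover every report variant.

.. code-block:: python

    import re

    # Matches e.g. "BUG: KFENCE: out-of-bounds read in func+0xa6/0x234".
    HEADER_RE = re.compile(
        r"BUG: KFENCE: (?P<error>.+?) in "
        r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    )

    def parse_report_header(line: str):
        """Return the error type and faulting function, or None if no match."""
        match = HEADER_RE.search(line)
        if match is None:
            return None
        return {
            "error": match["error"],
            "function": match["func"],
            "offset": int(match["offset"], 16),         # offset into the function
            "function_size": int(match["size"], 16),    # total size of the function
        }

    header = parse_report_header(
        "BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa6/0x234"
    )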

Use-after-free accesses are reported as::

    ==================================================================
    BUG: KFENCE: use-after-free read in test_use_after_free_read+0xb3/0x143

    Use-after-free read at 0xffff8c3f2e2a0000 (in kfence-#79):
     test_use_after_free_read+0xb3/0x143
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    kfence-#79: 0xffff8c3f2e2a0000-0xffff8c3f2e2a001f, size=32, cache=kmalloc-32

    allocated by task 488 on cpu 2 at 33.871326s:
     test_alloc+0xfe/0x738
     test_use_after_free_read+0x76/0x143
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    freed by task 488 on cpu 2 at 33.871358s:
     test_use_after_free_read+0xa8/0x143
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    CPU: 2 PID: 488 Comm: kunit_try_catch Tainted: G    B             5.13.0-rc3+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

KFENCE also reports on invalid frees, such as double-frees::

    ==================================================================
    BUG: KFENCE: invalid free in test_double_free+0xdc/0x171

    Invalid free of 0xffff8c3f2e2a4000 (in kfence-#81):
     test_double_free+0xdc/0x171
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    kfence-#81: 0xffff8c3f2e2a4000-0xffff8c3f2e2a401f, size=32, cache=kmalloc-32

    allocated by task 490 on cpu 1 at 34.175321s:
     test_alloc+0xfe/0x738
     test_double_free+0x76/0x171
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    freed by task 490 on cpu 1 at 34.175348s:
     test_double_free+0xa8/0x171
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    CPU: 1 PID: 490 Comm: kunit_try_catch Tainted: G    B             5.13.0-rc3+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

KFENCE also uses pattern-based redzones on the other side of an object's guard
page, to detect out-of-bounds writes on the unprotected side of the object.
These are reported on frees::

    ==================================================================
    BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184

    Corrupted memory at 0xffff8c3f2e33aff9 [ 0xac . . . . . . ] (in kfence-#156):
     test_kmalloc_aligned_oob_write+0xef/0x184
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    kfence-#156: 0xffff8c3f2e33afb0-0xffff8c3f2e33aff8, size=73, cache=kmalloc-96

    allocated by task 502 on cpu 7 at 42.159302s:
     test_alloc+0xfe/0x738
     test_kmalloc_aligned_oob_write+0x57/0x184
     kunit_try_run_case+0x61/0xa0
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x176/0x1b0
     ret_from_fork+0x22/0x30

    CPU: 7 PID: 502 Comm: kunit_try_catch Tainted: G    B             5.13.0-rc3+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    ==================================================================

For such errors, the address where the corruption occurred as well as the
invalidly written bytes (offset from the address) are shown; in this
representation, '.' denotes an untouched byte. In the example above ``0xac`` is
the value written to the invalid address at offset 0, and the remaining '.'
denote that no following bytes have been touched. Note that real values are
only shown if the kernel was booted with ``no_hash_pointers``; to avoid
information disclosure otherwise, '!' is used instead to denote invalidly
written bytes.
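
The bracketed byte list can be decoded mechanically. The following sketch
assumes the whitespace-separated layout seen in the sample report above; the
function name ``parse_corrupted_bytes`` is hypothetical.

.. code-block:: python

    def parse_corrupted_bytes(field: str):
        """Decode a byte list such as ``[ 0xac . . . . . . ]``.

        Each entry is an int for a shown value, None for an untouched byte
        ('.'), or the string '!' for a corrupted byte whose value is hidden.
        """
        inner = field.strip()
        if not (inner.startswith("[") and inner.endswith("]")):
            raise ValueError("expected a bracketed byte list")
        result = []
        for token in inner[1:-1].split():
            if token == ".":
                result.append(None)
            elif token == "!":
                result.append("!")
            else:
                result.append(int(token, 16))
        return result

    shown = parse_corrupted_bytes("[ 0xac . . . . . . ]")
    # Offsets (relative to the reported address) of the bytes that were written.
    touched_offsets = [i for i, b in enumerate(shown) if b is not None]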

And finally, KFENCE may also report on invalid accesses to any protected page
where it was not possible to determine an associated object, e.g. if adjacent
object pages had not yet been allocated::

    ==================================================================
    BUG: KFENCE: invalid read in test_invalid_access+0x26/0xe0

    Invalid read at 0xffffffffb670b00a:
     test_invalid_access+0x26/0xe0
     kunit_try_run_case+0x51/0x85
     kunit_generic_run_threadfn_adapter+0x16/0x30
     kthread+0x137/0x160
     ret_from_fork+0x22/0x30

    CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G        W         5.8.0-rc6+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
    ==================================================================

DebugFS interface
~~~~~~~~~~~~~~~~~

Some debugging information is exposed via debugfs:

* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics.

* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects
  allocated via KFENCE, including those already freed but protected.
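
A monitoring script might read the statistics file as sketched below. The
``name: value`` line layout and the counter names in ``sample`` are assumptions
made for illustration; check the actual file contents on the target kernel.
Both function names are hypothetical, and reading debugfs normally requires
root privileges with debugfs mounted.

.. code-block:: python

    from pathlib import Path

    def parse_kfence_stats(text: str) -> dict[str, int]:
        """Parse lines of the assumed form ``name: value`` into a dict."""
        stats = {}
        for line in text.splitlines():
            name, sep, value = line.rpartition(":")
            value = value.strip()
            if sep and value.isdigit():
                stats[name.strip()] = int(value)
        return stats

    def read_kfence_stats(path: str = "/sys/kernel/debug/kfence/stats") -> dict[str, int]:
        """Read and parse the stats file from debugfs."""
        return parse_kfence_stats(Path(path).read_text())

    # Hypothetical file contents, for illustration only.
    sample = "enabled: 1\ncurrently allocated: 19\ntotal bugs: 0\n"
    stats = parse_kfence_stats(sample)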

Implementation Details
----------------------

Guarded allocations are set up based on the sample interval. After expiration
of the sample interval, the next allocation through the main allocator (SLAB or
SLUB) returns a guarded allocation from the KFENCE object pool (allocation
sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
the next allocation is set up after the expiration of the interval.

When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
through the main allocator's fast-path by relying on static branches via the
static keys infrastructure. The static branch is toggled to redirect the
allocation to KFENCE. Depending on sample interval, target workloads, and
system architecture, this may perform better than the simple dynamic branch.
Careful benchmarking is recommended.

KFENCE objects each reside on a dedicated page, at either the left or right
page boundaries selected at random. The pages to the left and right of the
object page are "guard pages", whose attributes are changed to a protected
state, and cause page faults on any attempted access. Such page faults are then
intercepted by KFENCE, which handles the fault gracefully by reporting an
out-of-bounds access, and marking the page as accessible so that the faulting
code can (wrongly) continue executing (set ``panic_on_warn`` to panic instead).

To detect out-of-bounds writes to memory within the object's page itself,
KFENCE also uses pattern-based redzones. For each object page, a redzone is set
up for all non-object memory. For typical alignments, the redzone is only
required on the unguarded side of an object. Because KFENCE must honor the
cache's requested alignment, special alignments may result in unprotected gaps
on either side of an object, all of which are redzoned.

The following figure illustrates the page layout::

    ---+-----------+-----------+-----------+-----------+-----------+---
       | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
       | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
       | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
       | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
       | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
       | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
    ---+-----------+-----------+-----------+-----------+-----------+---
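
The placement within an object page, and the resulting redzones, can be
sketched as follows. This is a simplified model rather than the kernel code:
the function name ``place_object`` is hypothetical, and the 8-byte alignment
used in the example is an assumption that happens to reproduce the offsets of
``kfence-#156`` in the memory corruption report above (size 73, object at page
offsets ``0xfb0`` to ``0xff8``).

.. code-block:: python

    def place_object(size: int, align: int, right: bool, page_size: int = 4096):
        """Return (start_offset, redzones) for an object within its page.

        Redzones are half-open (start, end) offset ranges covering all
        non-object memory of the page.
        """
        if not 0 < size <= page_size:
            raise ValueError("size must be between 1 and page_size")
        if right:
            # Push the object towards the right guard page, then round the
            # start down so that the requested alignment is honored.
            start = (page_size - size) // align * align
        else:
            start = 0
        end = start + size
        redzones = [(lo, hi) for lo, hi in ((0, start), (end, page_size)) if lo < hi]
        return start, redzones

    start, redzones = place_object(size=73, align=8, right=True)
    # The alignment leaves a small gap between the object and the right guard
    # page; that gap is redzoned, which is how the write at offset 0xff9 in
    # the report was detected on free.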

Upon deallocation of a KFENCE object, the object's page is again protected and
the object is marked as freed. Any further access to the object causes a fault
and KFENCE reports a use-after-free access. Freed objects are inserted at the
tail of KFENCE's freelist, so that the least recently freed objects are reused
first, and the chances of detecting use-after-frees of recently freed objects
are increased.
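
The reuse order can be modelled with a simple FIFO queue. This is a sketch of
the policy only; the class ``Freelist`` is a hypothetical stand-in for KFENCE's
internal list.

.. code-block:: python

    from collections import deque

    class Freelist:
        """FIFO freelist: take from the head, return freed objects to the tail."""

        def __init__(self, num_objects: int):
            self.free = deque(range(num_objects))

        def allocate(self):
            # The least recently freed object sits at the head.
            return self.free.popleft() if self.free else None

        def release(self, index: int) -> None:
            # A freed object goes to the tail, so it stays protected (and can
            # catch use-after-free accesses) for as long as possible.
            self.free.append(index)

    pool = Freelist(3)
    a = pool.allocate()
    b = pool.allocate()
    pool.release(a)
    c = pool.allocate()  # takes the never-used object 2, not the just-freed 0

Object ``a`` is handed out again only after every other free object has been
used.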

If pool utilization reaches 75% (default) or above, to reduce the risk of the
pool eventually being fully occupied by allocated objects yet ensure diverse
coverage of allocations, KFENCE limits currently covered allocations of the
same source from further filling up the pool. The "source" of an allocation is
based on its partial allocation stack trace. A side-effect is that this also
limits frequent long-lived allocations (e.g. pagecache) of the same source
filling up the pool permanently, which is the most common risk for the pool
becoming full and the sampled allocation rate dropping to zero. The threshold
at which to start limiting currently covered allocations can be configured via
the boot parameter ``kfence.skip_covered_thresh`` (pool usage%).
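
The limiting policy can be approximated with the model below. It is a sketch
under simplifying assumptions: sources are plain strings instead of stack-trace
hashes, coverage is tracked exactly with a counter, and the class
``CoverageLimiter`` is a hypothetical name. The kernel's bookkeeping differs in
detail.

.. code-block:: python

    from collections import Counter

    class CoverageLimiter:
        """Skip already-covered sources once pool usage reaches the threshold."""

        def __init__(self, num_objects: int = 255, skip_covered_thresh: int = 75):
            self.num_objects = num_objects
            self.thresh = skip_covered_thresh  # pool usage in percent
            self.allocated = 0
            self.covered = Counter()           # live allocations per source

        def try_allocate(self, source: str) -> bool:
            if self.allocated >= self.num_objects:
                return False  # pool exhausted
            usage_pct = 100 * self.allocated / self.num_objects
            if usage_pct >= self.thresh and self.covered[source] > 0:
                return False  # source is already covered; keep room for others
            self.allocated += 1
            self.covered[source] += 1
            return True

        def free(self, source: str) -> None:
            self.allocated -= 1
            self.covered[source] -= 1

    limiter = CoverageLimiter(num_objects=4)
    from_a = [limiter.try_allocate("A") for _ in range(4)]
    from_b = limiter.try_allocate("B")

In this run the fourth allocation from source ``"A"`` is skipped because the
pool is 75% full and ``"A"`` is already covered, while the first allocation
from source ``"B"`` still succeeds.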

Interface
---------

The following describes the functions which are used by allocators as well as
page handling code to set up and deal with KFENCE allocations.

.. kernel-doc:: include/linux/kfence.h
   :functions: is_kfence_address
               kfence_shutdown_cache
               kfence_alloc kfence_free __kfence_free
               kfence_ksize kfence_object_start
               kfence_handle_page_fault

Related Tools
-------------

In userspace, a similar approach is taken by `GWP-ASan
<http://llvm.org/docs/GwpAsan.html>`_. GWP-ASan also relies on guard pages and
a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is
directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another
similar but non-sampling approach, that also inspired the name "KFENCE", can be
found in the userspace `Electric Fence Malloc Debugger
<https://linux.die.net/man/3/efence>`_.

In the kernel, several tools exist to debug memory access errors, and in
particular KASAN can detect all bug classes that KFENCE can detect. While KASAN
is more precise, relying on compiler instrumentation, this comes at a
performance cost.

It is worth highlighting that KASAN and KFENCE are complementary, with
different target environments. For instance, KASAN is the better debugging aid
where test cases or reproducers exist: due to the lower chance to detect the
error, it would require more effort using KFENCE to debug. Deployments at scale
that cannot afford to enable KASAN, however, would benefit from using KFENCE to
discover bugs due to code paths not exercised by test cases or fuzzers.