===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and the "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request.

- fail_function

  injects error return on specific functions, which are marked by the
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option supported.

- NVMe fault injection

  injects NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.


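Which of these capabilities are present on a given system depends on how the
kernel was built. As a minimal sketch, assuming debugfs is mounted at
/sys/kernel/debug, the following lists the top-level fail* directories that
exist. The helper name list_fault_caps and its optional root-directory
argument are hypothetical and used only for illustration. Capabilities that
live elsewhere (the MMC and NVMe entries) are not covered by it.

```shell
#!/bin/sh
# list_fault_caps is a hypothetical helper, not part of the kernel tree.
# It prints the name of every fail* directory under the given debugfs
# root (default: /sys/kernel/debug, assuming debugfs is mounted there).
list_fault_caps()
{
    root=${1:-/sys/kernel/debug}
    for d in "$root"/fail*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        basename "$d"
    done
}

list_fault_caps "$@"
```

Reading debugfs usually requires root privileges, so an empty result may only
mean that the directory is not readable.
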
Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

The fault-inject-debugfs kernel module provides some debugfs entries for
runtime configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configuring
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most. A value of -1
  means "no limit". Note, though, that this file only accepts unsigned
  values. So, if you want to specify -1, it is better to use 'printf'
  instead of 'echo', e.g.: $ printf %#x -1 > times

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when failure is
  injected. '0' means no messages; '1' will print only a single
  log line per failure; '2' will print a call trace too -- useful
  to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  A value of 'Y' limits failures to only those processes for which
  /proc/<pid>/make-it-fail is set to 1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specifies the range of virtual addresses tested during
  stacktrace walking. Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' suppresses failure injection
  for highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' injects failures
  only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order into which failures
  are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' disables failure injection
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list. If nothing is specified (''),
  the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows error injectable functions and what type of
  error values can be specified. The error type is one of the
  following:

  - NULL: retval must be 0.
  - ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject to the given function.
  This file is created when the user specifies a new injection entry.
  Note that this file only accepts unsigned values. So, if you want to
  use a negative errno, it is better to use 'printf' instead of 'echo', e.g.:
  $ printf %#x -12 > retval

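The 'printf' trick mentioned for the times and retval files works because
those files accept only unsigned values, so a negative number has to be
written as its unsigned bit pattern. A minimal sketch, assuming a shell with
64-bit integers such as bash on a 64-bit system; the helper name
to_unsigned_hex is hypothetical:

```shell
#!/bin/bash
# to_unsigned_hex is a hypothetical helper: it prints a (possibly negative)
# integer as the unsigned hexadecimal bit pattern that the shell's printf
# produces, which is the form the debugfs files accept.
to_unsigned_hex()
{
    printf '%#x\n' "$1"
}

to_unsigned_hex -1     # "no limit" for times
to_unsigned_hex -12    # -ENOMEM, usable for retval
```

Under the 64-bit assumption this prints 0xffffffffffffffff and
0xfffffffffffffff4.
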
Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the following boot options::

    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>

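Each of these options takes its value in the
<interval>,<probability>,<space>,<times> form shown on the last line. As a
hedged sketch, the following composes such a parameter string for the kernel
command line. The helper name fault_boot_param is hypothetical, and the values
are illustrative assumptions rather than recommendations.

```shell
#!/bin/sh
# fault_boot_param is a hypothetical helper that formats one boot option.
# Arguments: <name> <interval> <probability> <space> <times>
fault_boot_param()
{
    printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example: interval 10, probability 100, no space budget, no limit on times.
fault_boot_param failslab 10 100 0 -1
```

The resulting string (here failslab=10,100,0,-1) would be appended to the
kernel command line through the boot loader configuration.
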
proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up with a previous write to this file was injected.
  A positive integer N indicates that the fault wasn't yet injected.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.

How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you
    can provide a boot option to configure it. There is a helper function
    for it:

        setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this
    method. Helper function:

        fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

      should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in $*
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z $module ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject an open_ctree error during btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    # The mount is expected to fail because open_ctree returns -12 (-ENOMEM).
    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

To make the tasks mentioned above easier, we can use
tools/testing/fault-injection/failcmd.sh. Run
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of the default of at
most one::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
---------------------------------

The following code systematically injects a fault into the 1st, 2nd, and each
subsequent fault-injection point reached by the socketpair() system call, until
a call completes without any fault being injected::

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int i, err, res, fail_nth, fds[2];
        char buf[128];

        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        for (i = 1;; i++) {
            /* Arm the i-th fault, then run the system call under test. */
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            /* Reads back 0 if the armed fault was injected. */
            pread(fail_nth, buf, sizeof(buf), 0);
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                   res, err);
            if (atoi(buf))
                break;
        }
        return 0;
    }

An example output::

    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12
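
In this output the number after the slash is the errno value saved right after
the call: on Linux, 12 is ENOMEM and 23 is ENFILE. The last line reports 'N'
because the 16th fault was never reached, so the call succeeded and errno
still holds the value left over from the previous iteration. As a small
sketch, the following tallies the injected failures by errno. The function
name tally_faults and the shortened sample input are for illustration only.

```shell
#!/bin/sh
# tally_faults is a hypothetical helper: it reads lines in the format
# printed by the program above and counts injected faults per errno.
tally_faults()
{
    awk '$3 == "Y:" {
             split($4, f, "/")          # f[2] is the errno value
             count[f[2]]++
         }
         END {
             for (e in count)
                 printf "errno %s: %d\n", e, count[e]
         }' | LC_ALL=C sort
}

# Shortened sample; lines marked N are not counted.
tally_faults <<'EOF'
1-th fault Y: res=-1/23
3-th fault Y: res=-1/12
16-th fault N: res=0/12
EOF
```

Fed the full example output, it reports 10 failures with errno 12 and 5 with
errno 23.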