.. SPDX-License-Identifier: GPL-2.0

=================
Lockdep-RCU Splat
=================

Lockdep-RCU was added to the Linux kernel in early 2010
(http://lwn.net/Articles/371986/).  This facility checks for some common
misuses of the RCU API, most notably using one of the rcu_dereference()
family to access an RCU-protected pointer without the proper protection.
When such misuse is detected, a lockdep-RCU splat is emitted.

The usual cause of a lockdep-RCU splat is someone accessing an
RCU-protected data structure without either (1) being in the right kind of
RCU read-side critical section or (2) holding the right update-side lock.
This problem can therefore be serious: it might result in random memory
overwriting or worse.  There can of course be false positives, this
being the real world and all that.

So let's look at an example RCU lockdep splat from 3.0-rc5, one that
has long since been fixed::

    =============================
    WARNING: suspicious RCU usage
    -----------------------------
    block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage!

other info that might help us debug this::

    rcu_scheduler_active = 1, debug_locks = 0
    3 locks held by scsi_scan_6/1552:
    #0:  (&shost->scan_mutex){+.+.}, at: [<ffffffff8145efca>]
    scsi_scan_host_selected+0x5a/0x150
    #1:  (&eq->sysfs_lock){+.+.}, at: [<ffffffff812a5032>]
    elevator_exit+0x22/0x60
    #2:  (&(&q->__queue_lock)->rlock){-.-.}, at: [<ffffffff812b6233>]
    cfq_exit_queue+0x43/0x190

    stack backtrace:
    Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17
    Call Trace:
    [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0
    [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120
    [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190
    [<ffffffff812a5046>] elevator_exit+0x36/0x60
    [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60
    [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10
    [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0
    [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10
    [<ffffffff817da069>] ? error_exit+0x29/0xb0
    [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
    [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680
    [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
    [<ffffffff817da069>] ? error_exit+0x29/0xb0
    [<ffffffff812bcc60>] ? kobject_del+0x40/0x40
    [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0
    [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150
    [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90
    [<ffffffff8145f170>] do_scan_async+0x20/0x160
    [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90
    [<ffffffff810975b6>] kthread+0xa6/0xb0
    [<ffffffff817db154>] kernel_thread_helper+0x4/0x10
    [<ffffffff81066430>] ? finish_task_switch+0x80/0x110
    [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe
    [<ffffffff81097510>] ? __kthread_init_worker+0x70/0x70
    [<ffffffff817db150>] ? gs_change+0xb/0xb

Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows::

	if (rcu_dereference(ioc->ioc_data) == cic) {

This form says that the caller must be in a plain vanilla RCU read-side
critical section, but the "other info" list above shows that this is not
the case.  Instead, we hold three locks, one of which might be RCU related.
And maybe that lock really does protect this reference.  If so, the fix
is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to
take the struct request_queue "q" from cfq_exit_queue() as an argument,
which would permit us to invoke rcu_dereference_protected() as follows::

	if (rcu_dereference_protected(ioc->ioc_data,
				      lockdep_is_held(&q->queue_lock)) == cic) {

With this change, there would be no lockdep-RCU splat emitted as long as
this code is invoked with the ->queue_lock held.  In particular, this would
have suppressed the above lockdep-RCU splat because ->queue_lock is held
(see #2 in the list above).  Note that rcu_dereference_protected() checks
only the lockdep expression; if the code could also be invoked from within
an RCU read-side critical section without the lock, rcu_dereference_check()
accepts either form of protection.
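
The conditions involved can be modeled in userspace.  The following is a
minimal sketch, not kernel code: every ``toy_`` name is hypothetical, and
lockdep's per-task state is reduced to a nesting counter and a flag.  It only
mirrors the conditions that rcu_dereference_check() and
rcu_dereference_protected() evaluate: the former accepts either the supplied
condition or an RCU read-side critical section, the latter consults the
supplied condition alone.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for lockdep state (assumption: the kernel tracks this per task). */
static int toy_rcu_nesting;      /* depth of toy_rcu_read_lock() nesting */
static bool toy_queue_lock_held; /* models "->queue_lock is held" */
static int toy_splats;           /* number of warnings emitted so far */

static void toy_rcu_read_lock(void)      { toy_rcu_nesting++; }
static void toy_rcu_read_unlock(void)    { toy_rcu_nesting--; }
static bool toy_rcu_read_lock_held(void) { return toy_rcu_nesting > 0; }

/* Count and report a warning when the protection condition does not hold. */
static void toy_warn_unless(bool ok, const char *what)
{
	if (!ok) {
		toy_splats++;
		fprintf(stderr, "WARNING: suspicious %s usage\n", what);
	}
}

/* Models rcu_dereference_check(p, c): fine in a reader OR when c holds. */
static void *toy_dereference_check(void *p, bool c)
{
	toy_warn_unless(c || toy_rcu_read_lock_held(), "toy_dereference_check()");
	return p;
}

/* Models rcu_dereference_protected(p, c): only c is consulted. */
static void *toy_dereference_protected(void *p, bool c)
{
	toy_warn_unless(c, "toy_dereference_protected()");
	return p;
}
```

In this model, calling toy_dereference_protected() with a false condition
inside a toy_rcu_read_lock() section still counts a splat, whereas
toy_dereference_check() with the same arguments does not.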

On the other hand, perhaps we really do need an RCU read-side critical
section.  In this case, the critical section must span the use of the
return value from rcu_dereference(), or at least extend until some
reference count has been incremented or the like.  One way to handle this
is to add rcu_read_lock() and rcu_read_unlock() as follows::

	rcu_read_lock();
	if (rcu_dereference(ioc->ioc_data) == cic) {
		spin_lock(&ioc->lock);
		rcu_assign_pointer(ioc->ioc_data, NULL);
		spin_unlock(&ioc->lock);
	}
	rcu_read_unlock();

With this change, the rcu_dereference() is always within an RCU
read-side critical section, which again would have suppressed the
above lockdep-RCU splat.
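
The "until some reference count is incremented" case can be pictured with a
single-threaded userspace sketch.  Everything named ``toy_`` here is
hypothetical and provides no real concurrency protection; the point is only
the ordering: the reference is taken before the read-side critical section
ends, so the pointer may be used afterwards.

```c
#include <stddef.h>

struct toy_obj {
	int refcount;	/* plain int here; the kernel would use an atomic type */
	int value;
};

static int toy_rcu_nesting;		/* models read-side nesting depth */
static struct toy_obj *toy_global;	/* stand-in for an RCU-protected pointer */

static void toy_rcu_read_lock(void)   { toy_rcu_nesting++; }
static void toy_rcu_read_unlock(void) { toy_rcu_nesting--; }

/* Return the current object with a reference held, or NULL if there is none. */
static struct toy_obj *toy_get(void)
{
	struct toy_obj *p;

	toy_rcu_read_lock();
	p = toy_global;		/* the kernel would use rcu_dereference() here */
	if (p)
		p->refcount++;	/* pin the object before leaving the reader */
	toy_rcu_read_unlock();
	return p;		/* usable after unlock only because of the reference */
}

/* Drop a reference obtained from toy_get(). */
static void toy_put(struct toy_obj *p)
{
	p->refcount--;
}
```

A caller pairs each successful toy_get() with one toy_put(); between the two
calls the object stays pinned even though the read-side section has ended.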
106
107But in this particular case, we don't actually dereference the pointer
108returned from rcu_dereference().  Instead, that pointer is just compared
109to the cic pointer, which means that the rcu_dereference() can be replaced
110by rcu_access_pointer() as follows::
111
112	if (rcu_access_pointer(ioc->ioc_data) == cic) {
113
114Because it is legal to invoke rcu_access_pointer() without protection,
115this change would also suppress the above lockdep-RCU splat.
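
The contrast between the two accessors can be sketched in userspace as well.
This is an illustrative model under stated assumptions, not the kernel
implementation: the ``toy_`` names and ``struct toy_ioc`` are hypothetical,
and the checking is reduced to a counter.  The accessor that models
rcu_access_pointer() performs no check because its result is only compared,
never dereferenced.

```c
#include <stdbool.h>
#include <stddef.h>

static int toy_rcu_nesting;	/* models read-side nesting depth */
static int toy_splats;		/* number of warnings counted so far */

/* Models rcu_dereference(): counts a splat when called outside a reader. */
static void *toy_dereference(void *p)
{
	if (toy_rcu_nesting == 0)
		toy_splats++;
	return p;
}

/* Models rcu_access_pointer(): fetches the value with no protection check. */
static void *toy_access_pointer(void *p)
{
	return p;
}

struct toy_ioc {
	void *ioc_data;
};

/* Value-only comparison: the fetched pointer is never dereferenced. */
static bool toy_ioc_points_at(const struct toy_ioc *ioc, const void *cic)
{
	return toy_access_pointer(ioc->ioc_data) == cic;
}
```

Here toy_ioc_points_at() can be called with no reader and no lock without
counting a splat, while toy_dereference() in the same context counts one.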