v6.13.7

.. _memory_hotplug:

==============
Memory hotplug
==============

Memory hotplug event notifier
=============================

Hotplugging events are sent to a notification queue.

There are six types of notification defined in ``include/linux/memory.h``:

MEM_GOING_ONLINE
  Generated before new memory becomes available in order to be able to
  prepare subsystems to handle memory. The page allocator is still unable
  to allocate from the new memory.

MEM_CANCEL_ONLINE
  Generated if MEM_GOING_ONLINE fails.

MEM_ONLINE
  Generated when memory has been successfully brought online. The callback
  may allocate pages from the new memory.

MEM_GOING_OFFLINE
  Generated to begin the process of offlining memory. Allocations are no
  longer possible from the memory but some of the memory to be offlined
  is still in use. The callback can be used to free memory known to a
  subsystem from the indicated memory block.

MEM_CANCEL_OFFLINE
  Generated if MEM_GOING_OFFLINE fails. Memory is available again from
  the memory block that we attempted to offline.

MEM_OFFLINE
  Generated after offlining memory is complete.

A callback routine can be registered by calling::

  hotplug_memory_notifier(callback_func, priority)

Callback functions with higher values of priority are called before callback
functions with lower values.

A callback function must have the following prototype::

  int callback_func(
    struct notifier_block *self, unsigned long action, void *arg);

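The priority ordering can be illustrated with a small userspace model of a
notifier chain. This is only a sketch: the struct below is a local stand-in
that mimics the shape of the kernel's struct notifier_block, and
chain_register() is a hypothetical helper, not the kernel's registration code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *self,
			     unsigned long action, void *arg);
	struct notifier_block *next;
	int priority;
};

/*
 * Hypothetical helper: insert nb so that the list stays sorted by
 * descending priority. Blocks with equal priority keep their
 * registration order.
 */
static void chain_register(struct notifier_block **head,
			   struct notifier_block *nb)
{
	assert(nb != NULL);
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}
```

Walking the list from its head then visits a callback registered with
priority 10 before one registered with priority 0, regardless of the order in
which they were registered.
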
The first argument of the callback function (self) is a pointer to the block
of the notifier chain that points to the callback function itself.
The second argument (action) is one of the event types described above.
The third argument (arg) is a pointer to a struct memory_notify::

	struct memory_notify {
		unsigned long start_pfn;
		unsigned long nr_pages;
		int status_change_nid_normal;
		int status_change_nid;
	};

- start_pfn is the first page frame number of the memory being
  onlined/offlined.
- nr_pages is the number of pages of the memory being onlined/offlined.
- status_change_nid_normal is set to the node id when N_NORMAL_MEMORY of the
  nodemask is (will be) set/cleared. If this is -1, then the nodemask status
  is not changed.
- status_change_nid is set to the node id when N_MEMORY of the nodemask is
  (will be) set/cleared. This means that a new (memoryless) node gets new
  memory by onlining, or that a node loses all of its memory. If this is -1,
  then the nodemask status is not changed.

  If status_change_nid* >= 0, the callback should create/discard structures
  for the node if necessary.

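As a rough illustration of how a callback might act on status_change_nid,
the following userspace sketch creates and discards a per-node flag. All
definitions are local stand-ins so that the example builds outside the kernel:
the MEM_* values are placeholders rather than the values from
``include/linux/memory.h``, and MAX_NODES, node_state_ready and
example_mem_callback() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder event values; the real ones live in include/linux/memory.h. */
enum {
	MEM_GOING_ONLINE = 1, MEM_CANCEL_ONLINE, MEM_ONLINE,
	MEM_GOING_OFFLINE, MEM_CANCEL_OFFLINE, MEM_OFFLINE
};
#define NOTIFY_OK	0x0001
#define NOTIFY_BAD	0x8002

struct notifier_block;	/* opaque here; the callback ignores self */

struct memory_notify {
	unsigned long start_pfn;
	unsigned long nr_pages;
	int status_change_nid_normal;
	int status_change_nid;
};

#define MAX_NODES 4			/* hypothetical limit of this sketch */
static int node_state_ready[MAX_NODES];

static int example_mem_callback(struct notifier_block *self,
				unsigned long action, void *arg)
{
	struct memory_notify *mn = arg;
	int nid;

	(void)self;
	assert(mn != NULL);
	nid = mn->status_change_nid;
	if (nid < 0)
		return NOTIFY_OK;	/* no node gains or loses N_MEMORY */

	switch (action) {
	case MEM_GOING_ONLINE:
		if (nid >= MAX_NODES)
			return NOTIFY_BAD;	/* cannot prepare: veto */
		node_state_ready[nid] = 1;	/* create per-node state */
		break;
	case MEM_CANCEL_ONLINE:
	case MEM_OFFLINE:
		if (nid < MAX_NODES)
			node_state_ready[nid] = 0;	/* discard it again */
		break;
	default:
		break;
	}
	return NOTIFY_OK;
}
```
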
The callback routine shall return one of the values
NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
defined in ``include/linux/notifier.h``.

NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.

NOTIFY_BAD is used as a response to the MEM_GOING_ONLINE, MEM_GOING_OFFLINE,
MEM_ONLINE, or MEM_OFFLINE action to cancel hotplugging. It stops
further processing of the notification queue.

NOTIFY_STOP stops further processing of the notification queue.

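The effect of the return values on the notification queue can be modelled in
userspace as follows. This is a simplified sketch, not the kernel's
implementation: the NOTIFY_* constants and struct notifier_block are local
stand-ins meant to mirror ``include/linux/notifier.h``, and chain_call(),
chain_vetoed() and the cb_* callbacks are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001
#define NOTIFY_STOP_MASK	0x8000
#define NOTIFY_BAD		(NOTIFY_STOP_MASK | 0x0002)
#define NOTIFY_STOP		(NOTIFY_OK | NOTIFY_STOP_MASK)

struct notifier_block {
	int (*notifier_call)(struct notifier_block *self,
			     unsigned long action, void *arg);
	struct notifier_block *next;
	int priority;
};

static int calls;	/* counts how many callbacks actually ran */

static int cb_ok(struct notifier_block *self, unsigned long action, void *arg)
{
	(void)self; (void)action; (void)arg;
	calls++;
	return NOTIFY_OK;
}

static int cb_bad(struct notifier_block *self, unsigned long action, void *arg)
{
	(void)self; (void)action; (void)arg;
	calls++;
	return NOTIFY_BAD;
}

static int cb_stop(struct notifier_block *self, unsigned long action, void *arg)
{
	(void)self; (void)action; (void)arg;
	calls++;
	return NOTIFY_STOP;
}

/* Call each callback in list order until one sets NOTIFY_STOP_MASK. */
static int chain_call(struct notifier_block *head,
		      unsigned long action, void *arg)
{
	int ret = NOTIFY_DONE;

	for (; head; head = head->next) {
		assert(head->notifier_call != NULL);
		ret = head->notifier_call(head, action, arg);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* In this model only NOTIFY_BAD asks the caller to cancel the operation. */
static int chain_vetoed(int ret)
{
	return ret == NOTIFY_BAD;
}
```

In this model both NOTIFY_BAD and NOTIFY_STOP end the walk early, so later
callbacks never see the event, but only NOTIFY_BAD is reported as a veto.
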
Locking Internals
=================

When adding/removing memory that uses memory block devices (i.e. ordinary RAM),
the device_hotplug_lock should be held to:

- synchronize against online/offline requests (e.g. via sysfs). This way, memory
  block devices can only be accessed (.online/.state attributes) by user
  space once memory has been fully added. And when removing memory, we
  know nobody is in critical sections.
- synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC).

In particular, there is a possible lock inversion that is avoided using
device_hotplug_lock when adding memory and user space tries to online that
memory faster than expected:

- device_online() will first take the device_lock(), followed by
  mem_hotplug_lock
- add_memory_resource() will first take the mem_hotplug_lock, followed by
  the device_lock() (while creating the devices, during bus_add_device()).

As the device is visible to user space before taking the device_lock(), this
can result in a lock inversion.

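The role of the outer lock can be sketched with a userspace model built on
POSIX mutexes. The model is an assumption-laden simplification: plain
pthread mutexes stand in for all three kernel locks (the real locks have
different types), and every model_* name is hypothetical. The two paths take
the inner locks in opposite orders, which is only safe because both first
take the outer lock and therefore never run concurrently.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-ins; the model_ prefix marks them as hypothetical. */
static pthread_mutex_t model_device_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_device_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_blocks_added, model_blocks_onlined;

static void model_lock(pthread_mutex_t *m)
{
	int err = pthread_mutex_lock(m);

	assert(err == 0);
	(void)err;
}

static void model_unlock(pthread_mutex_t *m)
{
	int err = pthread_mutex_unlock(m);

	assert(err == 0);
	(void)err;
}

/* Online path: device lock first, then the memory hotplug lock. */
static void model_online_path(void)
{
	model_lock(&model_device_hotplug_lock);
	model_lock(&model_device_lock);
	model_lock(&model_mem_hotplug_lock);
	model_blocks_onlined++;
	model_unlock(&model_mem_hotplug_lock);
	model_unlock(&model_device_lock);
	model_unlock(&model_device_hotplug_lock);
}

/* Add path: the two inner locks are taken in the opposite order. */
static void model_add_path(void)
{
	model_lock(&model_device_hotplug_lock);
	model_lock(&model_mem_hotplug_lock);
	model_lock(&model_device_lock);
	model_blocks_added++;
	model_unlock(&model_device_lock);
	model_unlock(&model_mem_hotplug_lock);
	model_unlock(&model_device_hotplug_lock);
}
```

Without the outer lock, one thread in model_online_path() and another in
model_add_path() could each hold one inner lock while waiting for the other.
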
Onlining/offlining of memory should be done via device_online()/
device_offline() to make sure it is properly synchronized with actions
via sysfs. Holding device_hotplug_lock is advised (e.g. to protect
online_type).

When adding/removing/onlining/offlining memory or adding/removing
heterogeneous/device memory, we should always hold the mem_hotplug_lock in
write mode to serialise memory hotplug (e.g. access to global/zone
variables).

In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read
mode allows for a quite efficient get_online_mems/put_online_mems
implementation, so code accessing memory can protect from that memory
vanishing.

v5.4

In v5.4 this document reads the same as above, except that struct memory_notify
carries one additional member, status_change_nid_high::

	struct memory_notify {
		unsigned long start_pfn;
		unsigned long nr_pages;
		int status_change_nid_normal;
		int status_change_nid_high;
		int status_change_nid;
	};

- status_change_nid_high is set to the node id when N_HIGH_MEMORY of the
  nodemask is (will be) set/cleared. If this is -1, then the nodemask status
  is not changed.