v5.4
  1.. SPDX-License-Identifier: GPL-2.0+
  2
  3======
  4XArray
  5======
  6
  7:Author: Matthew Wilcox
  8
  9Overview
 10========
 11
 12The XArray is an abstract data type which behaves like a very large array
 13of pointers.  It meets many of the same needs as a hash or a conventional
 14resizable array.  Unlike a hash, it allows you to sensibly go to the
 15next or previous entry in a cache-efficient manner.  In contrast to a
 16resizable array, there is no need to copy data or change MMU mappings in
 17order to grow the array.  It is more memory-efficient, parallelisable
 18and cache friendly than a doubly-linked list.  It takes advantage of
 19RCU to perform lookups without locking.
 20
 21The XArray implementation is efficient when the indices used are densely
 22clustered; hashing the object and using the hash as the index will not
 23perform well.  The XArray is optimised for small indices, but still has
 24good performance with large indices.  If your index can be larger than
 25``ULONG_MAX`` then the XArray is not the data type for you.  The most
 26important user of the XArray is the page cache.
 27
 28Each non-``NULL`` entry in the array has three bits associated with
 29it called marks.  Each mark may be set or cleared independently of
 30the others.  You can iterate over entries which are marked.
 31
 32Normal pointers may be stored in the XArray directly.  They must be 4-byte
 33aligned, which is true for any pointer returned from kmalloc() and
 34alloc_page().  It isn't true for arbitrary user-space pointers,
 35nor for function pointers.  You can store pointers to statically allocated
 36objects, as long as those objects have an alignment of at least 4.
 37
You can also store integers between 0 and ``LONG_MAX`` in the XArray.
You must first convert the integer into an entry using xa_mk_value().
When you retrieve an entry from the XArray, you can check whether it is
a value entry by calling xa_is_value(), and convert it back to
an integer by calling xa_to_value().
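
The round trip through a value entry can be sketched in userspace.  This is
a model only: the ``model_*`` names are hypothetical, and the encoding (shift
left by one, set the low bit) is an assumption about how
``include/linux/xarray.h`` implements these helpers, not a copy of it.  It
also suggests why the upper limit is ``LONG_MAX``: one bit of the word is
spent on the marker.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel functions. */
static inline void *model_mk_value(unsigned long v)
{
    /* One bit is used as the marker, so only 0..LONG_MAX fits. */
    assert(v <= (unsigned long)LONG_MAX);
    return (void *)(uintptr_t)((v << 1) | 1);
}

/* A value entry is recognised by its low bit being set. */
static inline bool model_is_value(const void *entry)
{
    return ((uintptr_t)entry & 1) != 0;
}

/* Undo the shift to recover the integer. */
static inline unsigned long model_to_value(const void *entry)
{
    return (unsigned long)((uintptr_t)entry >> 1);
}
```

A pointer to a 4-byte aligned object always has its low bit clear, so in this
model it can never be mistaken for a value entry.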
 43
 44Some users want to store tagged pointers instead of using the marks
 45described above.  They can call xa_tag_pointer() to create an
 46entry with a tag, xa_untag_pointer() to turn a tagged entry
 47back into an untagged pointer and xa_pointer_tag() to retrieve
 48the tag of an entry.  Tagged pointers use the same bits that are used
 49to distinguish value entries from normal pointers, so each user must
 50decide whether they want to store value entries or tagged pointers in
 51any particular XArray.
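
The conflict between tagged pointers and value entries can be illustrated
with a small userspace model.  The ``model_*`` names are hypothetical and the
layout (tag kept in the two low bits of a 4-byte aligned pointer) is an
assumption about the kernel helpers, not their actual source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel functions. */
static inline void *model_tag_pointer(void *p, unsigned int tag)
{
    assert(tag < 4);                      /* only two low bits are free */
    assert(((uintptr_t)p & 3) == 0);      /* pointer must be 4-byte aligned */
    return (void *)((uintptr_t)p | tag);
}

/* Strip the tag bits to get the original pointer back. */
static inline void *model_untag_pointer(void *entry)
{
    return (void *)((uintptr_t)entry & ~(uintptr_t)3);
}

/* Read only the tag bits. */
static inline unsigned int model_pointer_tag(void *entry)
{
    return (unsigned int)((uintptr_t)entry & 3);
}
```

In this model a pointer tagged with 1 has its low bit set, which is the same
bit a value entry would use, so the two schemes cannot share one array.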
 52
 53The XArray does not support storing IS_ERR() pointers as some
 54conflict with value entries or internal entries.
 55
 56An unusual feature of the XArray is the ability to create entries which
 57occupy a range of indices.  Once stored to, looking up any index in
 58the range will return the same entry as looking up any other index in
 59the range.  Setting a mark on one index will set it on all of them.
 60Storing to any index will store to all of them.  Multi-index entries can
 61be explicitly split into smaller entries, or storing ``NULL`` into any
 62entry will cause the XArray to forget about the range.
 63
 64Normal API
 65==========
 66
 67Start by initialising an XArray, either with DEFINE_XARRAY()
 68for statically allocated XArrays or xa_init() for dynamically
 69allocated ones.  A freshly-initialised XArray contains a ``NULL``
 70pointer at every index.
 71
You can then set entries using xa_store() and get entries
using xa_load().  xa_store() will overwrite any entry with the
new entry and return the previous entry stored at that index.  You can
use xa_erase() instead of calling xa_store() with a
``NULL`` entry.  There is no difference between an entry that has never
been stored to, one that has been erased and one that has most recently
had ``NULL`` stored to it.
 79
 80You can conditionally replace an entry at an index by using
 81xa_cmpxchg().  Like cmpxchg(), it will only succeed if
 82the entry at that index has the 'old' value.  It also returns the entry
 83which was at that index; if it returns the same entry which was passed as
 84'old', then xa_cmpxchg() succeeded.
 85
 86If you want to only store a new entry to an index if the current entry
 87at that index is ``NULL``, you can use xa_insert() which
 88returns ``-EBUSY`` if the entry is not empty.
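
The return-value conventions described above can be modelled with a toy
fixed-size array.  Everything here (``toy_array``, ``toy_store()`` and so on)
is hypothetical and exists only to show the semantics; the real functions
also take a ``gfp_t`` argument where they may allocate, handle locking, and
support far larger indices, none of which this sketch attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 16

/* Hypothetical stand-in for an XArray; zero-initialised means all NULL. */
struct toy_array {
    void *slot[TOY_SLOTS];
};

static inline void *toy_load(const struct toy_array *a, unsigned long i)
{
    assert(i < TOY_SLOTS);
    return a->slot[i];
}

/* Overwrites unconditionally and returns the previous entry. */
static inline void *toy_store(struct toy_array *a, unsigned long i, void *entry)
{
    void *old;

    assert(i < TOY_SLOTS);
    old = a->slot[i];
    a->slot[i] = entry;
    return old;
}

/* Erasing is the same as storing NULL. */
static inline void *toy_erase(struct toy_array *a, unsigned long i)
{
    return toy_store(a, i, NULL);
}

/* Replaces only if the current entry equals 'old'; always returns the
 * entry that was found, so the caller compares it with 'old'. */
static inline void *toy_cmpxchg(struct toy_array *a, unsigned long i,
                                void *old, void *entry)
{
    void *cur;

    assert(i < TOY_SLOTS);
    cur = a->slot[i];
    if (cur == old)
        a->slot[i] = entry;
    return cur;
}

/* Stores only into an empty index; otherwise reports -EBUSY. */
static inline int toy_insert(struct toy_array *a, unsigned long i, void *entry)
{
    assert(i < TOY_SLOTS);
    if (a->slot[i])
        return -EBUSY;
    a->slot[i] = entry;
    return 0;
}
```

The toy does not model reserved entries, so it cannot show xa_insert()
failing on an index that merely looks ``NULL``.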
 89
 90You can enquire whether a mark is set on an entry by using
 91xa_get_mark().  If the entry is not ``NULL``, you can set a mark
 92on it by using xa_set_mark() and remove the mark from an entry by
 93calling xa_clear_mark().  You can ask whether any entry in the
 94XArray has a particular mark set by calling xa_marked().
 95
 96You can copy entries out of the XArray into a plain array by calling
 97xa_extract().  Or you can iterate over the present entries in
 98the XArray by calling xa_for_each().  You may prefer to use
 99xa_find() or xa_find_after() to move to the next present
100entry in the XArray.
101
102Calling xa_store_range() stores the same entry in a range
103of indices.  If you do this, some of the other operations will behave
104in a slightly odd way.  For example, marking the entry at one index
105may result in the entry being marked at some, but not all of the other
106indices.  Storing into one index may result in the entry retrieved by
107some, but not all of the other indices changing.
108
109Sometimes you need to ensure that a subsequent call to xa_store()
110will not need to allocate memory.  The xa_reserve() function
111will store a reserved entry at the indicated index.  Users of the
112normal API will see this entry as containing ``NULL``.  If you do
113not need to use the reserved entry, you can call xa_release()
114to remove the unused entry.  If another user has stored to the entry
115in the meantime, xa_release() will do nothing; if instead you
116want the entry to become ``NULL``, you should use xa_erase().
117Using xa_insert() on a reserved entry will fail.
118
119If all entries in the array are ``NULL``, the xa_empty() function
120will return ``true``.
121
122Finally, you can remove all entries from an XArray by calling
123xa_destroy().  If the XArray entries are pointers, you may wish
124to free the entries first.  You can do this by iterating over all present
125entries in the XArray using the xa_for_each() iterator.
126
127Allocating XArrays
128------------------
129
130If you use DEFINE_XARRAY_ALLOC() to define the XArray, or
131initialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
132the XArray changes to track whether entries are in use or not.
133
134You can call xa_alloc() to store the entry at an unused index
135in the XArray.  If you need to modify the array from interrupt context,
136you can use xa_alloc_bh() or xa_alloc_irq() to disable
137interrupts while allocating the ID.
138
139Using xa_store(), xa_cmpxchg() or xa_insert() will
140also mark the entry as being allocated.  Unlike a normal XArray, storing
141``NULL`` will mark the entry as being in use, like xa_reserve().
142To free an entry, use xa_erase() (or xa_release() if
143you only want to free the entry if it's ``NULL``).
144
145By default, the lowest free entry is allocated starting from 0.  If you
146want to allocate entries starting at 1, it is more efficient to use
147DEFINE_XARRAY_ALLOC1() or ``XA_FLAGS_ALLOC1``.  If you want to
148allocate IDs up to a maximum, then wrap back around to the lowest free
149ID, you can use xa_alloc_cyclic().
150
151You cannot use ``XA_MARK_0`` with an allocating XArray as this mark
152is used to track whether an entry is free or not.  The other marks are
153available for your use.
154
155Memory allocation
156-----------------
157
158The xa_store(), xa_cmpxchg(), xa_alloc(),
159xa_reserve() and xa_insert() functions take a gfp_t
160parameter in case the XArray needs to allocate memory to store this entry.
161If the entry is being deleted, no memory allocation needs to be performed,
162and the GFP flags specified will be ignored.
163
164It is possible for no memory to be allocatable, particularly if you pass
165a restrictive set of GFP flags.  In that case, the functions return a
166special value which can be turned into an errno using xa_err().
167If you don't need to know exactly which error occurred, using
168xa_is_err() is slightly more efficient.
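
How an errno can travel inside a returned entry is sketched below as a
userspace model.  The ``model_*`` names and ``MODEL_MAX_ERRNO`` are
hypothetical, and the encoding (a negative errno shifted left by two with the
low bits set to ``10``) is an assumption about the kernel's internal-entry
format rather than a quotation of it.  The sketch also relies on right shifts
of negative integers being arithmetic, which holds on the compilers the
kernel supports.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Hypothetical userspace model; these are not the kernel functions. */
static inline void *model_mk_internal(uintptr_t v)
{
    return (void *)((v << 2) | 2);
}

static inline bool model_is_internal(const void *entry)
{
    return ((uintptr_t)entry & 3) == 2;
}

/* Encode a negative errno such as -ENOMEM as an entry. */
static inline void *model_mk_err(int err)
{
    assert(err < 0 && err >= -MODEL_MAX_ERRNO);
    return model_mk_internal((uintptr_t)(intptr_t)err);
}

/* Cheap test: is this entry in the range reserved for errors? */
static inline bool model_is_err(const void *entry)
{
    uintptr_t lowest = (uintptr_t)model_mk_internal(
                           (uintptr_t)(intptr_t)-MODEL_MAX_ERRNO);

    return model_is_internal(entry) && (uintptr_t)entry >= lowest;
}

/* Decode to a negative errno, or 0 if the entry is not an error. */
static inline int model_err(const void *entry)
{
    if (model_is_err(entry))
        return (int)((intptr_t)entry >> 2);
    return 0;
}
```

In this model the yes/no test is a mask and a comparison, while decoding adds
a shift on top of the same test.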
169
170Locking
171-------
172
173When using the Normal API, you do not have to worry about locking.
174The XArray uses RCU and an internal spinlock to synchronise access:
175
No lock needed:
 * xa_empty()
 * xa_marked()

Takes RCU read lock:
 * xa_load()
 * xa_for_each()
 * xa_find()
 * xa_find_after()
 * xa_extract()
 * xa_get_mark()

Takes xa_lock internally:
 * xa_store()
 * xa_store_bh()
 * xa_store_irq()
 * xa_insert()
 * xa_insert_bh()
 * xa_insert_irq()
 * xa_erase()
 * xa_erase_bh()
 * xa_erase_irq()
 * xa_cmpxchg()
 * xa_cmpxchg_bh()
 * xa_cmpxchg_irq()
 * xa_store_range()
 * xa_alloc()
 * xa_alloc_bh()
 * xa_alloc_irq()
 * xa_reserve()
 * xa_reserve_bh()
 * xa_reserve_irq()
 * xa_destroy()
 * xa_set_mark()
 * xa_clear_mark()

Assumes xa_lock held on entry:
 * __xa_store()
 * __xa_insert()
 * __xa_erase()
 * __xa_cmpxchg()
 * __xa_alloc()
 * __xa_set_mark()
 * __xa_clear_mark()
220
221If you want to take advantage of the lock to protect the data structures
222that you are storing in the XArray, you can call xa_lock()
223before calling xa_load(), then take a reference count on the
224object you have found before calling xa_unlock().  This will
225prevent stores from removing the object from the array between looking
226up the object and incrementing the refcount.  You can also use RCU to
227avoid dereferencing freed memory, but an explanation of that is beyond
228the scope of this document.
229
230The XArray does not disable interrupts or softirqs while modifying
231the array.  It is safe to read the XArray from interrupt or softirq
232context as the RCU lock provides enough protection.
233
234If, for example, you want to store entries in the XArray in process
235context and then erase them in softirq context, you can do that this way::
236
    void foo_init(struct foo *foo)
    {
        xa_init_flags(&foo->array, XA_FLAGS_LOCK_BH);
    }

    int foo_store(struct foo *foo, unsigned long index, void *entry)
    {
        int err;

        xa_lock_bh(&foo->array);
        err = xa_err(__xa_store(&foo->array, index, entry, GFP_KERNEL));
        if (!err)
            foo->count++;
        xa_unlock_bh(&foo->array);
        return err;
    }

    /* foo_erase() is only called from softirq context */
    void foo_erase(struct foo *foo, unsigned long index)
    {
        xa_lock(&foo->array);
        __xa_erase(&foo->array, index);
        foo->count--;
        xa_unlock(&foo->array);
    }
262
263If you are going to modify the XArray from interrupt or softirq context,
264you need to initialise the array using xa_init_flags(), passing
265``XA_FLAGS_LOCK_IRQ`` or ``XA_FLAGS_LOCK_BH``.
266
267The above example also shows a common pattern of wanting to extend the
268coverage of the xa_lock on the store side to protect some statistics
269associated with the array.
270
271Sharing the XArray with interrupt context is also possible, either
272using xa_lock_irqsave() in both the interrupt handler and process
273context, or xa_lock_irq() in process context and xa_lock()
274in the interrupt handler.  Some of the more common patterns have helper
275functions such as xa_store_bh(), xa_store_irq(),
276xa_erase_bh(), xa_erase_irq(), xa_cmpxchg_bh()
277and xa_cmpxchg_irq().
278
279Sometimes you need to protect access to the XArray with a mutex because
280that lock sits above another mutex in the locking hierarchy.  That does
281not entitle you to use functions like __xa_erase() without taking
282the xa_lock; the xa_lock is used for lockdep validation and will be used
283for other purposes in the future.
284
285The __xa_set_mark() and __xa_clear_mark() functions are also
286available for situations where you look up an entry and want to atomically
287set or clear a mark.  It may be more efficient to use the advanced API
288in this case, as it will save you from walking the tree twice.
289
290Advanced API
291============
292
293The advanced API offers more flexibility and better performance at the
294cost of an interface which can be harder to use and has fewer safeguards.
295No locking is done for you by the advanced API, and you are required
296to use the xa_lock while modifying the array.  You can choose whether
297to use the xa_lock or the RCU lock while doing read-only operations on
298the array.  You can mix advanced and normal operations on the same array;
299indeed the normal API is implemented in terms of the advanced API.  The
300advanced API is only available to modules with a GPL-compatible license.
301
302The advanced API is based around the xa_state.  This is an opaque data
303structure which you declare on the stack using the XA_STATE()
304macro.  This macro initialises the xa_state ready to start walking
305around the XArray.  It is used as a cursor to maintain the position
306in the XArray and let you compose various operations together without
307having to restart from the top every time.
308
309The xa_state is also used to store errors.  You can call
310xas_error() to retrieve the error.  All operations check whether
311the xa_state is in an error state before proceeding, so there's no need
312for you to check for an error after each call; you can make multiple
313calls in succession and only check at a convenient point.  The only
314errors currently generated by the XArray code itself are ``ENOMEM`` and
315``EINVAL``, but it supports arbitrary errors in case you want to call
316xas_set_err() yourself.
317
318If the xa_state is holding an ``ENOMEM`` error, calling xas_nomem()
319will attempt to allocate more memory using the specified gfp flags and
320cache it in the xa_state for the next attempt.  The idea is that you take
321the xa_lock, attempt the operation and drop the lock.  The operation
322attempts to allocate memory while holding the lock, but it is more
323likely to fail.  Once you have dropped the lock, xas_nomem()
324can try harder to allocate more memory.  It will return ``true`` if it
325is worth retrying the operation (i.e. that there was a memory error *and*
326more memory was allocated).  If it has previously allocated memory, and
327that memory wasn't used, and there is no error (or some error that isn't
328``ENOMEM``), then it will free the memory previously allocated.
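
The shape of that retry loop can be modelled in userspace.  The sketch below
is a simplification under stated assumptions: ``toy_state``,
``toy_op_locked()`` and ``toy_nomem()`` are hypothetical stand-ins, the lock
is represented only by comments, and the locked operation never allocates at
all, whereas the real one may try and fail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parts of an xa_state used here. */
struct toy_state {
    int err;        /* 0, or a negative errno */
    void *spare;    /* memory cached for the next attempt */
};

/* Stand-in for an operation run under the lock.  It does nothing when
 * the state already holds an error, and needs a cached node to succeed. */
static inline void *toy_op_locked(struct toy_state *st)
{
    void *node;

    if (st->err)
        return NULL;
    if (!st->spare) {
        st->err = -ENOMEM;
        return NULL;
    }
    node = st->spare;
    st->spare = NULL;
    return node;
}

/* Stand-in for xas_nomem(): runs after the lock has been dropped.
 * Returns true only if there was a memory error and memory was obtained;
 * otherwise it frees any memory that was cached but never used. */
static inline bool toy_nomem(struct toy_state *st, size_t size)
{
    assert(size > 0);
    if (st->err != -ENOMEM) {
        free(st->spare);
        st->spare = NULL;
        return false;
    }
    st->spare = malloc(size);
    if (!st->spare)
        return false;
    st->err = 0;
    return true;
}

/* The retry loop: attempt under the lock, top up memory outside it. */
static inline void *toy_op_with_retry(struct toy_state *st, size_t size)
{
    void *node;

    do {
        /* the lock would be taken here */
        node = toy_op_locked(st);
        /* the lock would be dropped here */
    } while (toy_nomem(st, size));

    return node;
}
```

Starting from an empty state, the first attempt records ``-ENOMEM``, the
helper caches memory and clears the error, and the second attempt succeeds.
The caller owns the returned memory in this toy and should free() it.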
329
330Internal Entries
331----------------
332
333The XArray reserves some entries for its own purposes.  These are never
334exposed through the normal API, but when using the advanced API, it's
335possible to see them.  Usually the best way to handle them is to pass them
336to xas_retry(), and retry the operation if it returns ``true``.
337
.. flat-table::
   :widths: 1 1 6

   * - Name
     - Test
     - Usage

   * - Node
     - xa_is_node()
     - An XArray node.  May be visible when using a multi-index xa_state.

   * - Sibling
     - xa_is_sibling()
     - A non-canonical entry for a multi-index entry.  The value indicates
       which slot in this node has the canonical entry.

   * - Retry
     - xa_is_retry()
     - This entry is currently being modified by a thread which has the
       xa_lock.  The node containing this entry may be freed at the end
       of this RCU period.  You should restart the lookup from the head
       of the array.

   * - Zero
     - xa_is_zero()
     - Zero entries appear as ``NULL`` through the Normal API, but occupy
       an entry in the XArray which can be used to reserve the index for
       future use.  This is used by allocating XArrays for allocated entries
       which are ``NULL``.
367
368Other internal entries may be added in the future.  As far as possible, they
369will be handled by xas_retry().
370
371Additional functionality
372------------------------
373
The xas_create_range() function allocates all the necessary memory
to store every entry in a range.  It will set ``ENOMEM`` in the xa_state
if it cannot allocate memory.
377
378You can use xas_init_marks() to reset the marks on an entry
379to their default state.  This is usually all marks clear, unless the
380XArray is marked with ``XA_FLAGS_TRACK_FREE``, in which case mark 0 is set
381and all other marks are clear.  Replacing one entry with another using
382xas_store() will not reset the marks on that entry; if you want
383the marks reset, you should do that explicitly.
384
The xas_load() function will walk the xa_state as close to the entry
as it can.  If you know the xa_state has already been walked to the
entry and need to check that the entry hasn't changed, you can use
xas_reload() to save a function call.
389
390If you need to move to a different index in the XArray, call
391xas_set().  This resets the cursor to the top of the tree, which
392will generally make the next operation walk the cursor to the desired
393spot in the tree.  If you want to move to the next or previous index,
394call xas_next() or xas_prev().  Setting the index does
395not walk the cursor around the array so does not require a lock to be
396held, while moving to the next or previous index does.
397
398You can search for the next present entry using xas_find().  This
399is the equivalent of both xa_find() and xa_find_after();
400if the cursor has been walked to an entry, then it will find the next
401entry after the one currently referenced.  If not, it will return the
402entry at the index of the xa_state.  Using xas_next_entry() to
403move to the next present entry instead of xas_find() will save
404a function call in the majority of cases at the expense of emitting more
405inline code.
406
407The xas_find_marked() function is similar.  If the xa_state has
408not been walked, it will return the entry at the index of the xa_state,
409if it is marked.  Otherwise, it will return the first marked entry after
410the entry referenced by the xa_state.  The xas_next_marked()
411function is the equivalent of xas_next_entry().
412
413When iterating over a range of the XArray using xas_for_each()
414or xas_for_each_marked(), it may be necessary to temporarily stop
415the iteration.  The xas_pause() function exists for this purpose.
416After you have done the necessary work and wish to resume, the xa_state
417is in an appropriate state to continue the iteration after the entry
418you last processed.  If you have interrupts disabled while iterating,
419then it is good manners to pause the iteration and reenable interrupts
420every ``XA_CHECK_SCHED`` entries.
421
The xas_get_mark(), xas_set_mark() and
xas_clear_mark() functions require the xa_state cursor to have
been moved to the appropriate location in the XArray; they will do
nothing if you have called xas_pause() or xas_set()
immediately before.
427
428You can call xas_set_update() to have a callback function
429called each time the XArray updates a node.  This is used by the page
430cache workingset code to maintain its list of nodes which contain only
431shadow entries.
432
433Multi-Index Entries
434-------------------
435
The XArray has the ability to tie multiple indices together so that
operations on one index affect all indices.  For example, storing into
any index will change the value of the entry retrieved from any index.
Setting or clearing a mark on any index will set or clear the mark
on every index that is tied together.  The current implementation
only allows tying ranges which are aligned powers of two together;
e.g. indices 64-127 may be tied together, but 2-6 may not be.  This may
save substantial quantities of memory; for example tying 512 entries
together will save over 4kB.
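
The "aligned power of two" rule can be checked with a few lines of
arithmetic.  ``range_can_be_tied()`` is a hypothetical helper written for
this illustration, not part of the XArray API, and it treats a range covering
the entire index space as invalid purely to keep the arithmetic simple.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: can indices first..last (inclusive) form one
 * multi-index entry under the aligned-power-of-two rule? */
static inline bool range_can_be_tied(unsigned long first, unsigned long last)
{
    unsigned long count;

    assert(first <= last);
    count = last - first + 1;   /* wraps to 0 for the whole index space */

    /* The number of indices must be a power of two ... */
    if (count == 0 || (count & (count - 1)) != 0)
        return false;

    /* ... and the first index must be a multiple of that number. */
    return (first & (count - 1)) == 0;
}
```

With this check, 64-127 passes (64 indices starting at a multiple of 64),
2-6 fails because five is not a power of two, and 2-5 fails because a range
of four must start at a multiple of four.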
445
446You can create a multi-index entry by using XA_STATE_ORDER()
447or xas_set_order() followed by a call to xas_store().
448Calling xas_load() with a multi-index xa_state will walk the
449xa_state to the right location in the tree, but the return value is not
450meaningful, potentially being an internal entry or ``NULL`` even when there
451is an entry stored within the range.  Calling xas_find_conflict()
452will return the first entry within the range or ``NULL`` if there are no
453entries in the range.  The xas_for_each_conflict() iterator will
454iterate over every entry which overlaps the specified range.
455
456If xas_load() encounters a multi-index entry, the xa_index
457in the xa_state will not be changed.  When iterating over an XArray
458or calling xas_find(), if the initial index is in the middle
459of a multi-index entry, it will not be altered.  Subsequent calls
460or iterations will move the index to the first index in the range.
461Each entry will only be returned once, no matter how many indices it
462occupies.
463
464Using xas_next() or xas_prev() with a multi-index xa_state
465is not supported.  Using either of these functions on a multi-index entry
466will reveal sibling entries; these should be skipped over by the caller.
467
468Storing ``NULL`` into any index of a multi-index entry will set the entry
469at every index to ``NULL`` and dissolve the tie.  Splitting a multi-index
470entry into entries occupying smaller ranges is not yet supported.
471
472Functions and structures
473========================
474
475.. kernel-doc:: include/linux/xarray.h
476.. kernel-doc:: lib/xarray.c
v6.13.7
  1.. SPDX-License-Identifier: GPL-2.0+
  2
  3======
  4XArray
  5======
  6
  7:Author: Matthew Wilcox
  8
  9Overview
 10========
 11
 12The XArray is an abstract data type which behaves like a very large array
 13of pointers.  It meets many of the same needs as a hash or a conventional
 14resizable array.  Unlike a hash, it allows you to sensibly go to the
 15next or previous entry in a cache-efficient manner.  In contrast to a
 16resizable array, there is no need to copy data or change MMU mappings in
 17order to grow the array.  It is more memory-efficient, parallelisable
 18and cache friendly than a doubly-linked list.  It takes advantage of
 19RCU to perform lookups without locking.
 20
 21The XArray implementation is efficient when the indices used are densely
 22clustered; hashing the object and using the hash as the index will not
 23perform well.  The XArray is optimised for small indices, but still has
 24good performance with large indices.  If your index can be larger than
 25``ULONG_MAX`` then the XArray is not the data type for you.  The most
 26important user of the XArray is the page cache.
 27
 28Normal pointers may be stored in the XArray directly.  They must be 4-byte
 29aligned, which is true for any pointer returned from kmalloc() and
 30alloc_page().  It isn't true for arbitrary user-space pointers,
 31nor for function pointers.  You can store pointers to statically allocated
 32objects, as long as those objects have an alignment of at least 4.
 33
You can also store integers between 0 and ``LONG_MAX`` in the XArray.
You must first convert the integer into an entry using xa_mk_value().
When you retrieve an entry from the XArray, you can check whether it is
a value entry by calling xa_is_value(), and convert it back to
an integer by calling xa_to_value().
 39
Some users want to tag the pointers they store in the XArray.  You can
call xa_tag_pointer() to create an entry with a tag, xa_untag_pointer()
to turn a tagged entry back into an untagged pointer and xa_pointer_tag()
to retrieve the tag of an entry.  Tagged pointers use the same bits that
are used to distinguish value entries from normal pointers, so you must
decide whether you want to store value entries or tagged pointers in
any particular XArray.
 47
 48The XArray does not support storing IS_ERR() pointers as some
 49conflict with value entries or internal entries.
 50
 51An unusual feature of the XArray is the ability to create entries which
 52occupy a range of indices.  Once stored to, looking up any index in
 53the range will return the same entry as looking up any other index in
 54the range.  Storing to any index will store to all of them.  Multi-index
 55entries can be explicitly split into smaller entries, or storing ``NULL``
 56into any entry will cause the XArray to forget about the range.
 57
 58Normal API
 59==========
 60
 61Start by initialising an XArray, either with DEFINE_XARRAY()
 62for statically allocated XArrays or xa_init() for dynamically
 63allocated ones.  A freshly-initialised XArray contains a ``NULL``
 64pointer at every index.
 65
You can then set entries using xa_store() and get entries
using xa_load().  xa_store() will overwrite any entry with the
new entry and return the previous entry stored at that index.  You can
use xa_erase() instead of calling xa_store() with a
``NULL`` entry.  There is no difference between an entry that has never
been stored to, one that has been erased and one that has most recently
had ``NULL`` stored to it.
 73
 74You can conditionally replace an entry at an index by using
 75xa_cmpxchg().  Like cmpxchg(), it will only succeed if
 76the entry at that index has the 'old' value.  It also returns the entry
 77which was at that index; if it returns the same entry which was passed as
 78'old', then xa_cmpxchg() succeeded.
 79
 80If you want to only store a new entry to an index if the current entry
 81at that index is ``NULL``, you can use xa_insert() which
 82returns ``-EBUSY`` if the entry is not empty.
 83
 84You can copy entries out of the XArray into a plain array by calling
 85xa_extract().  Or you can iterate over the present entries in the XArray
 86by calling xa_for_each(), xa_for_each_start() or xa_for_each_range().
 87You may prefer to use xa_find() or xa_find_after() to move to the next
 88present entry in the XArray.
 89
 90Calling xa_store_range() stores the same entry in a range
 91of indices.  If you do this, some of the other operations will behave
 92in a slightly odd way.  For example, marking the entry at one index
 93may result in the entry being marked at some, but not all of the other
 94indices.  Storing into one index may result in the entry retrieved by
 95some, but not all of the other indices changing.
 96
 97Sometimes you need to ensure that a subsequent call to xa_store()
 98will not need to allocate memory.  The xa_reserve() function
 99will store a reserved entry at the indicated index.  Users of the
100normal API will see this entry as containing ``NULL``.  If you do
101not need to use the reserved entry, you can call xa_release()
102to remove the unused entry.  If another user has stored to the entry
103in the meantime, xa_release() will do nothing; if instead you
104want the entry to become ``NULL``, you should use xa_erase().
105Using xa_insert() on a reserved entry will fail.
106
107If all entries in the array are ``NULL``, the xa_empty() function
108will return ``true``.
109
110Finally, you can remove all entries from an XArray by calling
111xa_destroy().  If the XArray entries are pointers, you may wish
112to free the entries first.  You can do this by iterating over all present
113entries in the XArray using the xa_for_each() iterator.
114
115Search Marks
116------------
117
118Each entry in the array has three bits associated with it called marks.
119Each mark may be set or cleared independently of the others.  You can
120iterate over marked entries by using the xa_for_each_marked() iterator.
121
122You can enquire whether a mark is set on an entry by using
123xa_get_mark().  If the entry is not ``NULL``, you can set a mark on it
124by using xa_set_mark() and remove the mark from an entry by calling
125xa_clear_mark().  You can ask whether any entry in the XArray has a
126particular mark set by calling xa_marked().  Erasing an entry from the
127XArray causes all marks associated with that entry to be cleared.
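
The rules in this paragraph can be modelled with a toy array that keeps one
small bitmask per slot.  All names (``mark_demo``, ``demo_set_mark()`` and so
on) are hypothetical, and a flat bitmask per slot is only an illustration;
it says nothing about how the XArray actually stores marks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SLOTS 16
#define DEMO_MARKS 3

/* Hypothetical toy: one entry and one 3-bit mask per index. */
struct mark_demo {
    void *slot[DEMO_SLOTS];
    unsigned char marks[DEMO_SLOTS];    /* bit n holds mark n */
};

static inline bool demo_get_mark(const struct mark_demo *a,
                                 unsigned long i, unsigned int mark)
{
    assert(i < DEMO_SLOTS && mark < DEMO_MARKS);
    return ((a->marks[i] >> mark) & 1) != 0;
}

/* A mark can only be set on an index that holds a non-NULL entry. */
static inline void demo_set_mark(struct mark_demo *a,
                                 unsigned long i, unsigned int mark)
{
    assert(i < DEMO_SLOTS && mark < DEMO_MARKS);
    if (a->slot[i])
        a->marks[i] = (unsigned char)(a->marks[i] | (1u << mark));
}

static inline void demo_clear_mark(struct mark_demo *a,
                                   unsigned long i, unsigned int mark)
{
    assert(i < DEMO_SLOTS && mark < DEMO_MARKS);
    a->marks[i] = (unsigned char)(a->marks[i] & ~(1u << mark));
}

/* Erasing an entry also clears every mark on that index. */
static inline void *demo_erase(struct mark_demo *a, unsigned long i)
{
    void *old;

    assert(i < DEMO_SLOTS);
    old = a->slot[i];
    a->slot[i] = NULL;
    a->marks[i] = 0;
    return old;
}

/* Does any entry in the array carry this mark? */
static inline bool demo_marked(const struct mark_demo *a, unsigned int mark)
{
    unsigned long i;

    assert(mark < DEMO_MARKS);
    for (i = 0; i < DEMO_SLOTS; i++)
        if ((a->marks[i] >> mark) & 1)
            return true;
    return false;
}
```

The toy scans every slot to answer ``demo_marked()``; it is meant to show
the observable behaviour only.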
128
129Setting or clearing a mark on any index of a multi-index entry will
130affect all indices covered by that entry.  Querying the mark on any
131index will return the same result.
132
133There is no way to iterate over entries which are not marked; the data
134structure does not allow this to be implemented efficiently.  There are
135not currently iterators to search for logical combinations of bits (eg
136iterate over all entries which have both ``XA_MARK_1`` and ``XA_MARK_2``
137set, or iterate over all entries which have ``XA_MARK_0`` or ``XA_MARK_2``
138set).  It would be possible to add these if a user arises.
139
140Allocating XArrays
141------------------
142
143If you use DEFINE_XARRAY_ALLOC() to define the XArray, or
144initialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
145the XArray changes to track whether entries are in use or not.
146
147You can call xa_alloc() to store the entry at an unused index
148in the XArray.  If you need to modify the array from interrupt context,
149you can use xa_alloc_bh() or xa_alloc_irq() to disable
150interrupts while allocating the ID.
151
152Using xa_store(), xa_cmpxchg() or xa_insert() will
153also mark the entry as being allocated.  Unlike a normal XArray, storing
154``NULL`` will mark the entry as being in use, like xa_reserve().
155To free an entry, use xa_erase() (or xa_release() if
156you only want to free the entry if it's ``NULL``).
157
158By default, the lowest free entry is allocated starting from 0.  If you
159want to allocate entries starting at 1, it is more efficient to use
160DEFINE_XARRAY_ALLOC1() or ``XA_FLAGS_ALLOC1``.  If you want to
161allocate IDs up to a maximum, then wrap back around to the lowest free
162ID, you can use xa_alloc_cyclic().
163
164You cannot use ``XA_MARK_0`` with an allocating XArray as this mark
165is used to track whether an entry is free or not.  The other marks are
166available for your use.
167
168Memory allocation
169-----------------
170
171The xa_store(), xa_cmpxchg(), xa_alloc(),
172xa_reserve() and xa_insert() functions take a gfp_t
173parameter in case the XArray needs to allocate memory to store this entry.
174If the entry is being deleted, no memory allocation needs to be performed,
175and the GFP flags specified will be ignored.
176
177It is possible for no memory to be allocatable, particularly if you pass
178a restrictive set of GFP flags.  In that case, the functions return a
179special value which can be turned into an errno using xa_err().
180If you don't need to know exactly which error occurred, using
181xa_is_err() is slightly more efficient.
182
183Locking
184-------
185
186When using the Normal API, you do not have to worry about locking.
187The XArray uses RCU and an internal spinlock to synchronise access:
188
No lock needed:
 * xa_empty()
 * xa_marked()

Takes RCU read lock:
 * xa_load()
 * xa_for_each()
 * xa_for_each_start()
 * xa_for_each_range()
 * xa_find()
 * xa_find_after()
 * xa_extract()
 * xa_get_mark()

Takes xa_lock internally:
 * xa_store()
 * xa_store_bh()
 * xa_store_irq()
 * xa_insert()
 * xa_insert_bh()
 * xa_insert_irq()
 * xa_erase()
 * xa_erase_bh()
 * xa_erase_irq()
 * xa_cmpxchg()
 * xa_cmpxchg_bh()
 * xa_cmpxchg_irq()
 * xa_store_range()
 * xa_alloc()
 * xa_alloc_bh()
 * xa_alloc_irq()
 * xa_reserve()
 * xa_reserve_bh()
 * xa_reserve_irq()
 * xa_destroy()
 * xa_set_mark()
 * xa_clear_mark()

Assumes xa_lock held on entry:
 * __xa_store()
 * __xa_insert()
 * __xa_erase()
 * __xa_cmpxchg()
 * __xa_alloc()
 * __xa_set_mark()
 * __xa_clear_mark()
235
If you want to take advantage of the lock to protect the data structures
that you are storing in the XArray, you can call xa_lock()
before calling xa_load(), then take a reference count on the
object you have found before calling xa_unlock().  This will
prevent stores from removing the object from the array between looking
up the object and incrementing the refcount.  You can also use RCU to
avoid dereferencing freed memory, but an explanation of that is beyond
the scope of this document.

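The lookup-then-reference pattern described above might look like the
following sketch.  It assumes the stored objects are of a hypothetical
type ``struct example_obj`` which embeds a ``struct kref``, and that
whoever erases an object from the array drops the array's reference only
after the erase.  It only builds inside a kernel tree.

```c
#include <linux/kref.h>
#include <linux/xarray.h>

/* Hypothetical object type stored in the array. */
struct example_obj {
	struct kref ref;
};

/* Look up @index and return the object with an extra reference, or NULL. */
static struct example_obj *example_get(struct xarray *xa, unsigned long index)
{
	struct example_obj *obj;

	xa_lock(xa);
	obj = xa_load(xa, index);
	if (obj)
		kref_get(&obj->ref);	/* stores are excluded by the xa_lock */
	xa_unlock(xa);

	return obj;
}
```
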
The XArray does not disable interrupts or softirqs while modifying
the array.  It is safe to read the XArray from interrupt or softirq
context as the RCU lock provides enough protection.

If, for example, you want to store entries in the XArray in process
context and then erase them in softirq context, you can do that this way::

    void foo_init(struct foo *foo)
    {
        xa_init_flags(&foo->array, XA_FLAGS_LOCK_BH);
    }

    int foo_store(struct foo *foo, unsigned long index, void *entry)
    {
        int err;

        xa_lock_bh(&foo->array);
        err = xa_err(__xa_store(&foo->array, index, entry, GFP_KERNEL));
        if (!err)
            foo->count++;
        xa_unlock_bh(&foo->array);
        return err;
    }

    /* foo_erase() is only called from softirq context */
    void foo_erase(struct foo *foo, unsigned long index)
    {
        xa_lock(&foo->array);
        __xa_erase(&foo->array, index);
        foo->count--;
        xa_unlock(&foo->array);
    }

If you are going to modify the XArray from interrupt or softirq context,
you need to initialise the array using xa_init_flags(), passing
``XA_FLAGS_LOCK_IRQ`` or ``XA_FLAGS_LOCK_BH``.

The above example also shows a common pattern of wanting to extend the
coverage of the xa_lock on the store side to protect some statistics
associated with the array.

Sharing the XArray with interrupt context is also possible, either
using xa_lock_irqsave() in both the interrupt handler and process
context, or xa_lock_irq() in process context and xa_lock()
in the interrupt handler.  Some of the more common patterns have helper
functions such as xa_store_bh(), xa_store_irq(),
xa_erase_bh(), xa_erase_irq(), xa_cmpxchg_bh()
and xa_cmpxchg_irq().

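A sketch of the second arrangement, with xa_store_irq() in process context
and a plain xa_lock() in the interrupt handler, is shown here.  The array
name ``example_xa`` and both function names are hypothetical, and the
fragment only builds inside a kernel tree.

```c
#include <linux/gfp.h>
#include <linux/xarray.h>

/* Hypothetical array whose lock is declared as interrupt-safe. */
static DEFINE_XARRAY_FLAGS(example_xa, XA_FLAGS_LOCK_IRQ);

/* Process context: the helper disables interrupts around the store. */
static int example_add(unsigned long index, void *item)
{
	return xa_err(xa_store_irq(&example_xa, index, item, GFP_KERNEL));
}

/* Called from the interrupt handler, where interrupts are already off. */
static void example_remove_in_irq(unsigned long index)
{
	xa_lock(&example_xa);
	__xa_erase(&example_xa, index);
	xa_unlock(&example_xa);
}
```
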
Sometimes you need to protect access to the XArray with a mutex because
that lock sits above another mutex in the locking hierarchy.  That does
not entitle you to use functions like __xa_erase() without taking
the xa_lock; the xa_lock is used for lockdep validation and will be used
for other purposes in the future.

The __xa_set_mark() and __xa_clear_mark() functions are also
available for situations where you look up an entry and want to atomically
set or clear a mark.  It may be more efficient to use the advanced API
in this case, as it will save you from walking the tree twice.

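For illustration, a lookup followed by an atomic mark update through the
Normal API could be written as below.  The helper name is hypothetical and
``XA_MARK_1`` is an arbitrary choice; note that this walks the tree twice,
which is the cost the paragraph above refers to.  The fragment only builds
inside a kernel tree.

```c
#include <linux/xarray.h>

/* Hypothetical helper: set XA_MARK_1 on @index only if an entry is present. */
static bool example_mark_if_present(struct xarray *xa, unsigned long index)
{
	bool present;

	xa_lock(xa);
	present = xa_load(xa, index) != NULL;	/* first walk */
	if (present)
		__xa_set_mark(xa, index, XA_MARK_1);	/* second walk */
	xa_unlock(xa);

	return present;
}
```
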
Advanced API
============

The advanced API offers more flexibility and better performance at the
cost of an interface which can be harder to use and has fewer safeguards.
No locking is done for you by the advanced API, and you are required
to use the xa_lock while modifying the array.  You can choose whether
to use the xa_lock or the RCU lock while doing read-only operations on
the array.  You can mix advanced and normal operations on the same array;
indeed the normal API is implemented in terms of the advanced API.  The
advanced API is only available to modules with a GPL-compatible license.

The advanced API is based around the xa_state.  This is an opaque data
structure which you declare on the stack using the XA_STATE() macro.
This macro initialises the xa_state ready to start walking around the
XArray.  It is used as a cursor to maintain the position in the XArray
and let you compose various operations together without having to restart
from the top every time.  The contents of the xa_state are protected by
the rcu_read_lock() or the xas_lock().  If you need to drop whichever of
those locks is protecting your state and tree, you must call xas_pause()
so that future calls do not rely on the parts of the state which were
left unprotected.

The xa_state is also used to store errors.  You can call
xas_error() to retrieve the error.  All operations check whether
the xa_state is in an error state before proceeding, so there's no need
for you to check for an error after each call; you can make multiple
calls in succession and only check at a convenient point.  The only
errors currently generated by the XArray code itself are ``ENOMEM`` and
``EINVAL``, but it supports arbitrary errors in case you want to call
xas_set_err() yourself.

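The deferred error check might be used as in this sketch, where two
operations are issued back to back and xas_error() is consulted once at
the end.  The helper name is hypothetical and ``XA_MARK_1`` is an
arbitrary choice.  No retry on ``ENOMEM`` is attempted here, and the
fragment only builds inside a kernel tree.

```c
#include <linux/xarray.h>

/* Hypothetical helper: store @item at @index and mark it in one pass. */
static int example_store_and_mark(struct xarray *xa, unsigned long index,
				  void *item)
{
	XA_STATE(xas, xa, index);

	xas_lock(&xas);
	xas_store(&xas, item);
	/* Does nothing if the store left the xa_state in an error state. */
	xas_set_mark(&xas, XA_MARK_1);
	xas_unlock(&xas);

	/* A single check covers both calls: 0 or a negative errno. */
	return xas_error(&xas);
}
```
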
If the xa_state is holding an ``ENOMEM`` error, calling xas_nomem()
will attempt to allocate more memory using the specified gfp flags and
cache it in the xa_state for the next attempt.  The idea is that you take
the xa_lock, attempt the operation and drop the lock.  The operation
attempts to allocate memory while holding the lock, but that allocation
is more likely to fail.  Once you have dropped the lock, xas_nomem()
can try harder to allocate more memory.  It will return ``true`` if it
is worth retrying the operation (i.e. that there was a memory error *and*
more memory was allocated).  If it has previously allocated memory, and
that memory wasn't used, and there is no error (or some error that isn't
``ENOMEM``), then it will free the memory previously allocated.

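The lock, attempt, unlock and retry cycle described above is commonly
written as a ``do``/``while`` loop around xas_nomem().  The helper name
in this sketch is hypothetical, and the fragment only builds inside a
kernel tree.

```c
#include <linux/gfp.h>
#include <linux/xarray.h>

/* Hypothetical helper: store @item at @index, retrying after ENOMEM. */
static int example_xas_store(struct xarray *xa, unsigned long index,
			     void *item)
{
	XA_STATE(xas, xa, index);

	do {
		xas_lock(&xas);
		xas_store(&xas, item);	/* may record ENOMEM in the xa_state */
		xas_unlock(&xas);
		/* With the lock dropped, xas_nomem() may sleep to allocate. */
	} while (xas_nomem(&xas, GFP_KERNEL));

	return xas_error(&xas);
}
```

When the loop exits without an error, the final call to xas_nomem() has
also released any preallocated memory that went unused.
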
Internal Entries
----------------

The XArray reserves some entries for its own purposes.  These are never
exposed through the normal API, but when using the advanced API, it's
possible to see them.  Usually the best way to handle them is to pass them
to xas_retry(), and retry the operation if it returns ``true``.

.. flat-table::
   :widths: 1 1 6

   * - Name
     - Test
     - Usage

   * - Node
     - xa_is_node()
     - An XArray node.  May be visible when using a multi-index xa_state.

   * - Sibling
     - xa_is_sibling()
     - A non-canonical entry for a multi-index entry.  The value indicates
       which slot in this node has the canonical entry.

   * - Retry
     - xa_is_retry()
     - This entry is currently being modified by a thread which has the
       xa_lock.  The node containing this entry may be freed at the end
       of this RCU period.  You should restart the lookup from the head
       of the array.

   * - Zero
     - xa_is_zero()
     - Zero entries appear as ``NULL`` through the Normal API, but occupy
       an entry in the XArray which can be used to reserve the index for
       future use.  This is used by allocating XArrays for allocated entries
       which are ``NULL``.

Other internal entries may be added in the future.  As far as possible, they
will be handled by xas_retry().

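A lockless lookup which hides internal entries from its caller could be
sketched as follows.  The helper name is hypothetical, and the caller is
assumed to have its own way of keeping the returned object alive once the
RCU read lock is dropped.  The fragment only builds inside a kernel tree.

```c
#include <linux/rcupdate.h>
#include <linux/xarray.h>

/* Hypothetical helper: look up @index under RCU, hiding internal entries. */
static void *example_lookup(struct xarray *xa, unsigned long index)
{
	XA_STATE(xas, xa, index);
	void *entry;

	rcu_read_lock();
	do {
		entry = xas_load(&xas);
		/* Reserved (zero) entries read as NULL, as in the Normal API. */
		if (xa_is_zero(entry))
			entry = NULL;
	} while (xas_retry(&xas, entry));	/* restart on a retry entry */
	rcu_read_unlock();

	return entry;
}
```
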
Additional functionality
------------------------

The xas_create_range() function allocates all the necessary memory
to store every entry in a range.  It will set ``ENOMEM`` in the xa_state if
it cannot allocate memory.

You can use xas_init_marks() to reset the marks on an entry
to their default state.  This is usually all marks clear, unless the
XArray is marked with ``XA_FLAGS_TRACK_FREE``, in which case mark 0 is set
and all other marks are clear.  Replacing one entry with another using
xas_store() will not reset the marks on that entry; if you want
the marks reset, you should do that explicitly.

The xas_load() function will walk the xa_state as close to the entry
as it can.  If you know the xa_state has already been walked to the
entry and need to check that the entry hasn't changed, you can use
xas_reload() to save a function call.

If you need to move to a different index in the XArray, call
xas_set().  This resets the cursor to the top of the tree, which
will generally make the next operation walk the cursor to the desired
spot in the tree.  If you want to move to the next or previous index,
call xas_next() or xas_prev().  Setting the index does
not walk the cursor around the array so does not require a lock to be
held, while moving to the next or previous index does.

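As an illustration of stepping through adjacent indices, the sketch below
counts the entries in a run of consecutive indices beginning at a given
index and stops at the first index which holds ``NULL``.  The helper name
is hypothetical, wrap-around at ``ULONG_MAX`` is not handled, and the
fragment only builds inside a kernel tree.

```c
#include <linux/rcupdate.h>
#include <linux/xarray.h>

/* Hypothetical helper: count entries at consecutive indices from @start. */
static unsigned long example_count_run(struct xarray *xa, unsigned long start)
{
	XA_STATE(xas, xa, start);
	unsigned long count = 0;
	void *entry;

	rcu_read_lock();	/* xas_next() moves the cursor, so hold a lock */
	for (entry = xas_load(&xas); entry; entry = xas_next(&xas)) {
		/* Skips reserved (zero) entries and restarts on retry entries. */
		if (xas_retry(&xas, entry))
			continue;
		count++;
	}
	rcu_read_unlock();

	return count;
}
```
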
You can search for the next present entry using xas_find().  This
is the equivalent of both xa_find() and xa_find_after();
if the cursor has been walked to an entry, then it will find the next
entry after the one currently referenced.  If not, it will return the
entry at the index of the xa_state.  Using xas_next_entry() to
move to the next present entry instead of xas_find() will save
a function call in the majority of cases at the expense of emitting more
inline code.

The xas_find_marked() function is similar.  If the xa_state has
not been walked, it will return the entry at the index of the xa_state,
if it is marked.  Otherwise, it will return the first marked entry after
the entry referenced by the xa_state.  The xas_next_marked()
function is the equivalent of xas_next_entry().

When iterating over a range of the XArray using xas_for_each()
or xas_for_each_marked(), it may be necessary to temporarily stop
the iteration.  The xas_pause() function exists for this purpose.
After you have done the necessary work and wish to resume, the xa_state
is in an appropriate state to continue the iteration after the entry
you last processed.  If you have interrupts disabled while iterating,
then it is good manners to pause the iteration and re-enable interrupts
every ``XA_CHECK_SCHED`` entries.

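A paused iteration with interrupts disabled might look like the sketch
below, which sets ``XA_MARK_2`` on every entry that already carries
``XA_MARK_1``.  The helper name and the choice of marks are hypothetical,
and the array is assumed to have been initialised with
``XA_FLAGS_LOCK_IRQ``.  The fragment only builds inside a kernel tree.

```c
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/xarray.h>

/* Hypothetical helper: copy XA_MARK_1 to XA_MARK_2 across the array. */
static unsigned long example_propagate_mark(struct xarray *xa)
{
	XA_STATE(xas, xa, 0);
	unsigned long marked = 0;
	void *entry;

	xas_lock_irq(&xas);
	xas_for_each_marked(&xas, entry, ULONG_MAX, XA_MARK_1) {
		/* The cursor sits on @entry, so the mark call has an effect. */
		xas_set_mark(&xas, XA_MARK_2);
		if (++marked % XA_CHECK_SCHED)
			continue;

		/* Drop the lock briefly so interrupts can be serviced. */
		xas_pause(&xas);
		xas_unlock_irq(&xas);
		cond_resched();
		xas_lock_irq(&xas);
	}
	xas_unlock_irq(&xas);

	return marked;
}
```
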
The xas_get_mark(), xas_set_mark() and xas_clear_mark() functions require
the xa_state cursor to have been moved to the appropriate location in the
XArray; they will do nothing if you have called xas_pause() or xas_set()
immediately before.

You can call xas_set_update() to have a callback function
called each time the XArray updates a node.  This is used by the page
cache workingset code to maintain its list of nodes which contain only
shadow entries.

Multi-Index Entries
-------------------

The XArray has the ability to tie multiple indices together so that
operations on one index affect all indices.  For example, storing into
any index will change the value of the entry retrieved from any index.
Setting or clearing a mark on any index will set or clear the mark
on every index that is tied together.  The current implementation
only allows tying ranges which are aligned powers of two together;
e.g. indices 64-127 may be tied together, but 2-6 may not be.  This may
save substantial quantities of memory; for example tying 512 entries
together will save over 4kB.

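The "aligned power of two" rule can be checked with a little arithmetic.
The function below is a hypothetical, stand-alone helper and not part of
the XArray API; it simply tests whether an inclusive index range has a
power-of-two length and starts on a multiple of that length.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the XArray API: return true if the
 * inclusive range [first, last] is an aligned power-of-two range.
 */
bool range_is_tieable(unsigned long first, unsigned long last)
{
	unsigned long len;

	if (last < first)
		return false;

	len = last - first + 1;
	/* len wraps to 0 for the whole index space; not handled here. */
	if (len == 0)
		return false;

	/* The length must be a power of two ... */
	if (len & (len - 1))
		return false;

	/* ... and the first index must be a multiple of the length. */
	return (first & (len - 1)) == 0;
}
```

With this helper, the range 64-127 passes both tests, while 2-6 fails
because its length, 5, is not a power of two.
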
You can create a multi-index entry by using XA_STATE_ORDER()
or xas_set_order() followed by a call to xas_store().
Calling xas_load() with a multi-index xa_state will walk the
xa_state to the right location in the tree, but the return value is not
meaningful, potentially being an internal entry or ``NULL`` even when there
is an entry stored within the range.  Calling xas_find_conflict()
will return the first entry within the range or ``NULL`` if there are no
entries in the range.  The xas_for_each_conflict() iterator will
iterate over every entry which overlaps the specified range.

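Creating a multi-index entry might look like the following sketch, which
ties ``1 << order`` indices together.  The helper name is hypothetical,
the kernel is assumed to be built with ``CONFIG_XARRAY_MULTI``, and the
fragment only builds inside a kernel tree.

```c
#include <linux/gfp.h>
#include <linux/xarray.h>

/* Hypothetical helper: store @item across the aligned range around @index. */
static int example_store_order(struct xarray *xa, unsigned long index,
			       unsigned int order, void *item)
{
	/* The macro rounds @index down to a multiple of 1 << order. */
	XA_STATE_ORDER(xas, xa, index, order);

	do {
		xas_lock(&xas);
		xas_store(&xas, item);
		xas_unlock(&xas);
	} while (xas_nomem(&xas, GFP_KERNEL));

	return xas_error(&xas);
}
```

After a successful call with an order of 6 and an index of 64, a lookup
of any index from 64 to 127 returns the stored item.
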
If xas_load() encounters a multi-index entry, the xa_index
in the xa_state will not be changed.  When iterating over an XArray
or calling xas_find(), if the initial index is in the middle
of a multi-index entry, it will not be altered.  Subsequent calls
or iterations will move the index to the first index in the range.
Each entry will only be returned once, no matter how many indices it
occupies.

Using xas_next() or xas_prev() with a multi-index xa_state is not
supported.  Using either of these functions on a multi-index entry will
reveal sibling entries; these should be skipped over by the caller.

Storing ``NULL`` into any index of a multi-index entry will set the
entry at every index to ``NULL`` and dissolve the tie.  A multi-index
entry can be split into entries occupying smaller ranges by calling
xas_split_alloc() without the xa_lock held, followed by taking the lock
and calling xas_split().

Functions and structures
========================

.. kernel-doc:: include/linux/xarray.h
.. kernel-doc:: lib/xarray.c