			Subsystem Trace Points: kmem

The kmem tracing system captures events related to object and page allocation
within the kernel. Broadly speaking there are five major subheadings.

  o Slab allocation of small objects of unknown type (kmalloc)
  o Slab allocation of small objects of known type
  o Page allocation
  o Per-CPU Allocator Activity
  o External Fragmentation

This document describes what each of the tracepoints is and why they
might be useful.

1. Slab allocation of small objects of unknown type
===================================================
kmalloc		call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s
kmalloc_node	call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d
kfree		call_site=%lx ptr=%p

Heavy activity for these events may indicate that a dedicated cache is
justified, particularly if kmalloc slab pages are suffering significant
internal fragmentation as a result of the allocation pattern. By correlating
kmalloc with kfree events on the ptr field, it may be possible to identify
memory leaks and the call sites that made the leaked allocations.

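The correlation described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: it assumes each trace
line contains the event name followed by ": " and then the fields in the
formats listed above, and the sample lines, addresses and the summarise()
helper are hypothetical, not taken from a real trace or from any kernel tool.

```python
import re
from collections import defaultdict

# Assumed line layout: "<event>: key=value key=value ..."
EVENT_RE = re.compile(r"\b(kmalloc|kmalloc_node|kfree): (.*)$")

def parse_fields(text):
    """Split 'key=value key=value' into a dict of strings."""
    return dict(pair.split("=", 1) for pair in text.split() if "=" in pair)

def summarise(lines):
    """Return (live, waste).

    live  maps ptr -> call_site for allocations with no matching kfree.
    waste maps call_site -> total (bytes_alloc - bytes_req), i.e. the
          internal fragmentation attributable to that call site.
    """
    live = {}
    waste = defaultdict(int)
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue
        name, fields = m.group(1), parse_fields(m.group(2))
        if name == "kfree":
            live.pop(fields["ptr"], None)
        else:
            site = fields["call_site"]
            live[fields["ptr"]] = site
            waste[site] += int(fields["bytes_alloc"]) - int(fields["bytes_req"])
    return live, dict(waste)

# Hypothetical sample lines, for illustration only.
sample = [
    "kmalloc: call_site=ffffffff81000010 ptr=00000000aaaa0000 "
    "bytes_req=100 bytes_alloc=128 gfp_flags=GFP_KERNEL",
    "kmalloc_node: call_site=ffffffff81000020 ptr=00000000bbbb0000 "
    "bytes_req=192 bytes_alloc=192 gfp_flags=GFP_KERNEL node=0",
    "kfree: call_site=ffffffff81000030 ptr=00000000aaaa0000",
]
live, waste = summarise(sample)
```

Pointers left in live at the end of a trace are only leak candidates, since
the matching kfree may simply fall outside the traced window.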

2. Slab allocation of small objects of known type
=================================================
kmem_cache_alloc	call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s
kmem_cache_alloc_node	call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d
kmem_cache_free		call_site=%lx ptr=%p

These events are similar in usage to the kmalloc-related events, except that
it is likely easier to pin an event down to a specific cache. At the time
of writing, no information is available on which slab is being allocated from,
but the call_site can usually be used to infer it.

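One way to use call_site as a stand-in for the cache is to group allocation
events by call site and allocated object size. The sketch below assumes the
same "<event>: key=value ..." line layout as the format strings above; the
sites_by_object_size() helper and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Matches kmem_cache_alloc and kmem_cache_alloc_node, capturing the
# call_site and bytes_alloc fields in the order given by the format string.
ALLOC_RE = re.compile(
    r"\bkmem_cache_alloc(?:_node)?: .*?call_site=(\S+) .*?bytes_alloc=(\d+)"
)

def sites_by_object_size(lines):
    """Count allocations per (call_site, bytes_alloc) pair."""
    counts = Counter()
    for line in lines:
        m = ALLOC_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts

# Hypothetical sample lines, for illustration only.
cache_sample = [
    "kmem_cache_alloc: call_site=ffffffff81000100 ptr=00000000cccc0000 "
    "bytes_req=64 bytes_alloc=64 gfp_flags=GFP_KERNEL",
    "kmem_cache_alloc: call_site=ffffffff81000100 ptr=00000000cccc0040 "
    "bytes_req=64 bytes_alloc=64 gfp_flags=GFP_KERNEL",
    "kmem_cache_alloc_node: call_site=ffffffff81000200 ptr=00000000dddd0000 "
    "bytes_req=256 bytes_alloc=256 gfp_flags=GFP_KERNEL node=1",
    "kmem_cache_free: call_site=ffffffff81000300 ptr=00000000cccc0000",
]
cache_counts = sites_by_object_size(cache_sample)
```

A call site that always allocates the same object size most likely maps to a
single cache, which can then be confirmed by resolving the address to a
symbol.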
3. Page allocation
==================
mm_page_alloc		  page=%p pfn=%lu order=%d migratetype=%d gfp_flags=%s
mm_page_alloc_zone_locked page=%p pfn=%lu order=%u migratetype=%d cpu=%d percpu_refill=%d
mm_page_free_direct	  page=%p pfn=%lu order=%d
mm_pagevec_free		  page=%p pfn=%lu order=%d cold=%d

These four events deal with page allocation and freeing. mm_page_alloc is
a simple indicator of page allocator activity. Pages may be allocated from
the per-CPU allocator (high performance) or the buddy allocator.

If pages are allocated directly from the buddy allocator, the
mm_page_alloc_zone_locked event is triggered. This event is important because
a high rate of these events implies heavy use of zone->lock. Taking this lock
impairs performance by disabling interrupts, dirtying cache lines between
CPUs and serialising many CPUs.

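A rough measure of how often the page allocator goes to zone->lock is the
number of mm_page_alloc_zone_locked events relative to mm_page_alloc events.
The following sketch assumes the "<event>: key=value ..." line layout implied
by the format strings above; zone_lock_pressure() and the sample lines are
hypothetical.

```python
import re
from collections import Counter

PAGE_RE = re.compile(r"\b(mm_page_alloc|mm_page_alloc_zone_locked): (.*)$")

def zone_lock_pressure(lines):
    """Count page allocations, direct buddy allocations and per-CPU refills."""
    counts = Counter()
    for line in lines:
        m = PAGE_RE.search(line)
        if not m:
            continue
        name, rest = m.groups()
        if name == "mm_page_alloc":
            counts["alloc"] += 1
        elif "percpu_refill=1" in rest.split():
            counts["refill"] += 1      # page pulled in to refill a per-CPU list
        else:
            counts["direct"] += 1      # page taken straight from the buddy lists
    return counts

# Hypothetical sample: four allocations, one direct, two refill pages.
page_sample = [
    "mm_page_alloc: page=00000000e0000001 pfn=4096 order=0 migratetype=0 gfp_flags=GFP_KERNEL",
    "mm_page_alloc: page=00000000e0000002 pfn=4097 order=0 migratetype=0 gfp_flags=GFP_KERNEL",
    "mm_page_alloc: page=00000000e0000003 pfn=4098 order=0 migratetype=0 gfp_flags=GFP_KERNEL",
    "mm_page_alloc: page=00000000e0000004 pfn=8192 order=2 migratetype=0 gfp_flags=GFP_KERNEL",
    "mm_page_alloc_zone_locked: page=00000000e0000004 pfn=8192 order=2 migratetype=0 cpu=0 percpu_refill=0",
    "mm_page_alloc_zone_locked: page=00000000e0000005 pfn=4099 order=0 migratetype=0 cpu=1 percpu_refill=1",
    "mm_page_alloc_zone_locked: page=00000000e0000006 pfn=4100 order=0 migratetype=0 cpu=1 percpu_refill=1",
]
pressure = zone_lock_pressure(page_sample)
direct_ratio = pressure["direct"] / pressure["alloc"]
```

Refill pages are counted separately because several of them can be pulled in
under a single acquisition of the lock.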
When a page is freed directly by the caller, the mm_page_free_direct event
is triggered. Significant amounts of activity here could indicate that the
callers should be batching their frees.

When pages are freed using a pagevec, the mm_pagevec_free event is
triggered. Broadly speaking, pages are taken off the LRU list in bulk and
freed in batch with a pagevec. Significant amounts of activity here could
indicate that the system is under memory pressure and can also indicate
contention on the zone->lru_lock.

4. Per-CPU Allocator Activity
=============================
mm_page_alloc_zone_locked	page=%p pfn=%lu order=%u migratetype=%d cpu=%d percpu_refill=%d
mm_page_pcpu_drain		page=%p pfn=%lu order=%d cpu=%d migratetype=%d

In front of the page allocator is a per-CPU page allocator. It exists only
for order-0 pages, reduces contention on the zone->lock and reduces the
amount of writing on struct page.

When a per-CPU list is empty or pages of the wrong type are allocated,
the zone->lock will be taken once and the per-CPU list refilled. An
mm_page_alloc_zone_locked event is triggered for each page allocated, and
its percpu_refill field indicates whether the page was part of a refill.

When the per-CPU list is too full, a number of pages are freed, each of
which triggers an mm_page_pcpu_drain event.

Events are emitted per page so that pages can be tracked between allocation
and freeing. A run of consecutive drain or refill events implies that the
zone->lock was taken once. Large numbers of per-CPU refills and drains could
imply an imbalance between CPUs where too much work is being concentrated in
one place. It could also indicate that the per-CPU lists should be larger.
Finally, large numbers of refills on one CPU and drains on another could
contribute to cache line bouncing due to writes between CPUs; it is worth
investigating whether pages can be allocated and freed on the same CPU
through some algorithm change.

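The per-CPU balance and the number of lock acquisitions can be estimated from
a trace roughly as follows. This is a sketch, assuming the
"<event>: key=value ..." line layout implied by the format strings above and
treating each run of adjacent refill or drain lines for one CPU as a single
zone->lock acquisition; classify(), pcpu_activity() and the sample lines are
hypothetical.

```python
import re
from collections import Counter
from itertools import groupby

PCPU_RE = re.compile(r"\b(mm_page_alloc_zone_locked|mm_page_pcpu_drain): (.*)$")

def classify(line):
    """Return ('refill'|'drain', cpu) for per-CPU list traffic, else None."""
    m = PCPU_RE.search(line)
    if not m:
        return None
    fields = dict(p.split("=", 1) for p in m.group(2).split() if "=" in p)
    if m.group(1) == "mm_page_pcpu_drain":
        return ("drain", int(fields["cpu"]))
    if fields.get("percpu_refill") == "1":
        return ("refill", int(fields["cpu"]))
    return None  # direct buddy allocation, not per-CPU list traffic

def pcpu_activity(lines):
    """Return (per_cpu, runs).

    per_cpu counts refill and drain pages per CPU.
    runs    counts maximal runs of adjacent identical (kind, cpu) lines,
            an estimate of how many times zone->lock was taken.
    """
    keys = [classify(line) for line in lines]
    per_cpu = Counter(k for k in keys if k is not None)
    runs = sum(1 for k, _ in groupby(keys) if k is not None)
    return per_cpu, runs

# Hypothetical sample: refill on CPU 0, drain on CPU 1, refill on CPU 0 again.
pcpu_sample = [
    "mm_page_alloc_zone_locked: page=00000000f0000001 pfn=1001 order=0 migratetype=0 cpu=0 percpu_refill=1",
    "mm_page_alloc_zone_locked: page=00000000f0000002 pfn=1002 order=0 migratetype=0 cpu=0 percpu_refill=1",
    "mm_page_alloc_zone_locked: page=00000000f0000003 pfn=1003 order=0 migratetype=0 cpu=0 percpu_refill=1",
    "mm_page_pcpu_drain: page=00000000f0000010 pfn=2001 order=0 cpu=1 migratetype=0",
    "mm_page_pcpu_drain: page=00000000f0000011 pfn=2002 order=0 cpu=1 migratetype=0",
    "mm_page_alloc_zone_locked: page=00000000f0000004 pfn=1004 order=0 migratetype=0 cpu=0 percpu_refill=1",
]
per_cpu, runs = pcpu_activity(pcpu_sample)
```

Refills concentrated on one CPU with drains on another, as in the sample,
match the cross-CPU pattern described above.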
5. External Fragmentation
=========================
mm_page_alloc_extfrag		page=%p pfn=%lu alloc_order=%d fallback_order=%d pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d change_ownership=%d

External fragmentation affects whether a high-order allocation will be
successful or not. For some types of hardware, this is important although
it is avoided where possible. If the system is using huge pages and needs
to be able to resize the pool over the lifetime of the system, this event
is important.

A large number of these events implies that memory is fragmenting and
high-order allocations will start failing at some time in the future. One
means of reducing the occurrence of this event is to increase
min_free_kbytes in increments of 3*pageblock_size*nr_online_nodes, where
pageblock_size is usually the default hugepage size.
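
As a worked example of the increment, the sketch below assumes a pageblock
size of 2048 kB (a 2 MB default hugepage), two online nodes and a current
min_free_kbytes of 67584. All three numbers are hypothetical and should be
replaced with the values from the system under test; the function names are
illustrative only.

```python
def min_free_kbytes_step(pageblock_kbytes, nr_online_nodes):
    """One increment, in kB: 3 * pageblock_size * nr_online_nodes."""
    return 3 * pageblock_kbytes * nr_online_nodes

def candidate_values(current_kbytes, pageblock_kbytes, nr_online_nodes, steps=3):
    """Successive min_free_kbytes values to try, one increment apart."""
    step = min_free_kbytes_step(pageblock_kbytes, nr_online_nodes)
    return [current_kbytes + i * step for i in range(1, steps + 1)]

# Assumed values: 2048 kB pageblocks, 2 nodes, current setting of 67584 kB.
step = min_free_kbytes_step(2048, 2)             # 3 * 2048 * 2 = 12288
candidates = candidate_values(67584, 2048, 2)    # [79872, 92160, 104448]
```

Each candidate can be tried in turn while watching whether the rate of
mm_page_alloc_extfrag events drops.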