==============
OSNOISE Tracer
==============

In the context of high-performance computing (HPC), the Operating System
Noise (*osnoise*) refers to the interference experienced by an application
due to activities inside the operating system. In the context of Linux,
NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
system. Moreover, hardware-related jobs can also cause noise, for example,
via SMIs.

hwlat_detector is one of the tools used to identify the most complex
source of noise: *hardware noise*.

In a nutshell, the hwlat_detector creates a thread that runs
periodically for a given period. At the beginning of a period, the thread
disables interrupts and starts sampling. While running, the hwlatd
thread reads the time in a loop. As interrupts are disabled, threads,
IRQs, and SoftIRQs cannot interfere with the hwlatd thread. Hence, the
cause of any gap between two different reads of the time is either an
NMI or the hardware itself. At the end of the period, hwlatd enables
interrupts and reports the max observed gap between the reads. It also
prints an NMI occurrence counter. If the output does not report NMI
executions, the user can conclude that the hardware is the culprit for
the latency. The hwlat detector detects the NMI execution by observing
the entry and exit of an NMI.

The osnoise tracer builds on the hwlat_detector approach by running a
similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing
all the sources of *osnoise* during its execution. Using the same approach
as hwlat, osnoise takes note of the entry and exit point of any
source of interference, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source of
interference. The interference counter for NMIs, IRQs, SoftIRQs, and
threads is increased anytime the tool observes the entry event of one of
these interferences. When noise happens without any interference from the
operating system level, the hardware noise counter increases, pointing to
hardware-related noise. In this way, osnoise can account for any
source of interference. At the end of the period, the osnoise tracer
prints the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.

Usage
-----

Write the ASCII text "osnoise" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).

For example::

        [root@f32 ~]# cd /sys/kernel/tracing/
        [root@f32 tracing]# echo osnoise > current_tracer

It is possible to follow the trace by reading the trace file::

        [root@f32 tracing]# cat trace
        # tracer: osnoise
        #
        #                                _-----=> irqs-off
        #                               / _----=> need-resched
        #                              | / _---=> hardirq/softirq
        #                              || / _--=> preempt-depth                            MAX
        #                              || /                                             SINGLE     Interference counters:
        #                              ||||               RUNTIME      NOISE   % OF CPU  NOISE    +-----------------------------+
        #           TASK-PID      CPU# ||||   TIMESTAMP    IN US       IN US  AVAILABLE  IN US     HW    NMI    IRQ   SIRQ THREAD
        #              | |         |   ||||      |           |             |    |            |      |      |      |      |      |
                   <...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1
                   <...>-860     [001] ....    81.638154: 1000000        656  99.93440      74     23      0   1006     16      3
                   <...>-861     [002] ....    81.638193: 1000000       5675  99.43250     202      6      0   1013     25     21
                   <...>-862     [003] ....    81.638242: 1000000        125  99.98750      45      1      0   1011     23      0
                   <...>-863     [004] ....    81.638260: 1000000       1721  99.82790     168      7      0   1002     49     41
                   <...>-864     [005] ....    81.638286: 1000000        263  99.97370      57      6      0   1006     26      2
                   <...>-865     [006] ....    81.638302: 1000000        109  99.98910      21      3      0   1006     18      1
                   <...>-866     [007] ....    81.638326: 1000000       7816  99.21840     107      8      0   1016     39     19

In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
tracer prints a message at the end of each period for each CPU that is
running an osnoise/ thread. The osnoise-specific fields report:

 - The RUNTIME IN US reports the amount of time in microseconds that
   the osnoise thread kept looping reading the time.
 - The NOISE IN US reports the sum of noise in microseconds observed
   by the osnoise tracer during the associated runtime.
 - The % OF CPU AVAILABLE reports the percentage of CPU available for
   the osnoise thread during the runtime window.
 - The MAX SINGLE NOISE IN US reports the maximum single noise observed
   during the runtime window.
 - The Interference counters display how many times each of the
   respective interferences happened during the runtime window.

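The relation between these fields can be checked by hand. The sketch
below assumes that % OF CPU AVAILABLE equals (RUNTIME - NOISE) * 100 /
RUNTIME, which matches every row of the sample above. The helper name
``osnoise_cpu_available`` is hypothetical, and the parsing assumes the
column layout shown above, where RUNTIME and NOISE directly follow the
timestamp. Note that printf rounds, so the last digit may differ from
the tracer's own output in some cases.

```shell
# Recompute "% OF CPU AVAILABLE" from the RUNTIME and NOISE columns.
# Hypothetical helper: reads osnoise trace lines on stdin.
osnoise_cpu_available() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # The timestamp is the only field shaped like "123.456:".
            if ($i ~ /^[0-9]+\.[0-9]+:$/) {
                runtime = $(i + 1)   # RUNTIME IN US
                noise = $(i + 2)     # NOISE IN US
                printf "%.5f\n", (runtime - noise) * 100 / runtime
                break
            }
        }
    }'
}

# First data row of the sample trace: prints 99.98100
echo '<...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1' | osnoise_cpu_available
```

Header lines starting with ``#`` carry no timestamp field, so the helper
skips them and the whole trace file can be piped through it.
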
Note that the example above shows a high number of HW noise samples.
The reason is that this sample was taken on a virtual machine,
and the host interference is detected as hardware interference.

Tracer Configuration
--------------------

The tracer has a set of options inside the osnoise directory. They are:

 - osnoise/cpus: CPUs on which an osnoise thread will execute.
 - osnoise/period_us: the period of the osnoise thread.
 - osnoise/runtime_us: how long an osnoise thread will look for noise.
 - osnoise/stop_tracing_us: stop the system tracing if a single noise
   higher than the configured value happens. Writing 0 disables this
   option.
 - osnoise/stop_tracing_total_us: stop the system tracing if total noise
   higher than the configured value happens. Writing 0 disables this
   option.
 - tracing_threshold: the minimum delta between two time() reads to be
   considered as noise, in us. When set to 0, the default value will
   be used, which is currently 1 us.
 - osnoise/options: a set of on/off options that can be enabled by
   writing the option name to the file or disabled by writing the option
   name preceded with the 'NO\_' prefix. For example, writing
   NO_OSNOISE_WORKLOAD disables the OSNOISE_WORKLOAD option. The
   special DEFAULTS option resets all options to the default value.

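As an illustration, the files listed above can be set from a small
shell function. This is a minimal sketch: the function name
``osnoise_configure`` is hypothetical, the values are arbitrary examples
rather than recommended settings, and the CPU list format (``0-3``) is
assumed to be accepted by osnoise/cpus.

```shell
# Sketch: apply an example osnoise configuration.
# $1 is the tracefs mount point (defaults to /sys/kernel/tracing).
osnoise_configure() {
    tracing_dir=${1:-/sys/kernel/tracing}

    # Example values only; adjust them to the system under test.
    echo 0-3     > "$tracing_dir/osnoise/cpus"             # CPUs running osnoise threads
    echo 1000000 > "$tracing_dir/osnoise/period_us"        # period of 1 s
    echo 500000  > "$tracing_dir/osnoise/runtime_us"       # look for noise for 0.5 s per period
    echo 100     > "$tracing_dir/osnoise/stop_tracing_us"  # stop on a single noise above 100 us
    echo osnoise > "$tracing_dir/current_tracer"           # start the tracer
}
```

On a real system this would be run as root, for example
``osnoise_configure /sys/kernel/tracing``.
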
Tracer Options
--------------

The osnoise/options file exposes a set of on/off configuration options for
the osnoise tracer. These options are:

 - DEFAULTS: reset the options to the default value.
 - OSNOISE_WORKLOAD: dispatch the osnoise workload. Writing
   NO_OSNOISE_WORKLOAD prevents the workload from being dispatched (see
   dedicated section below).
 - PANIC_ON_STOP: call panic() if the tracer stops. This option serves to
   capture a vmcore.
 - OSNOISE_PREEMPT_DISABLE: disable preemption while running the osnoise
   workload, allowing only IRQ and hardware-related noise.
 - OSNOISE_IRQ_DISABLE: disable IRQs while running the osnoise workload,
   allowing only NMIs and hardware-related noise, like the hwlat tracer.

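The 'NO\_' prefix convention can be wrapped in a pair of helpers. The
sketch below is illustrative only: the function names
``osnoise_option_token`` and ``osnoise_set_option`` are hypothetical, and
only the option names and the osnoise/options file come from the text
above.

```shell
# Print the token to write into osnoise/options.
# $1 is the option name, $2 is "on" or "off".
osnoise_option_token() {
    case "$2" in
        on)  printf '%s\n' "$1" ;;
        off) printf 'NO_%s\n' "$1" ;;
        *)   echo "usage: osnoise_option_token OPTION on|off" >&2; return 1 ;;
    esac
}

# Write the token into the options file.
# $3 is the tracefs mount point (defaults to /sys/kernel/tracing).
osnoise_set_option() {
    tracing_dir=${3:-/sys/kernel/tracing}
    token=$(osnoise_option_token "$1" "$2") || return 1
    echo "$token" > "$tracing_dir/osnoise/options"
}
```

For example, ``osnoise_set_option OSNOISE_IRQ_DISABLE on`` writes
OSNOISE_IRQ_DISABLE, and ``osnoise_set_option OSNOISE_IRQ_DISABLE off``
writes NO_OSNOISE_IRQ_DISABLE.
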
Additional Tracing
------------------

In addition to the tracer, a set of tracepoints was added to
facilitate the identification of the osnoise source.

 - osnoise:sample_threshold: printed anytime a noise is higher than
   the configurable tolerance_ns.
 - osnoise:nmi_noise: noise from an NMI, including the duration.
 - osnoise:irq_noise: noise from an IRQ, including the duration.
 - osnoise:softirq_noise: noise from a SoftIRQ, including the
   duration.
 - osnoise:thread_noise: noise from a thread, including the duration.

Note that all the values are *net values*. For example, if another thread
preempts the osnoise thread while osnoise is running, a thread_noise
duration starts at that point. Then, an IRQ takes place, preempting
the thread_noise and starting an irq_noise. When the IRQ ends its
execution, it computes its duration, and this duration is subtracted from
the thread_noise to avoid double accounting of the
IRQ execution. This logic is valid for all sources of noise.

Here is one example of the usage of these tracepoints::

       osnoise/8-961     [008] d.h.  5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
       osnoise/8-961     [008] dNh.  5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
     migration/8-54      [008] d...  5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
       osnoise/8-961     [008] ....  5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2

In this example, a noise sample of 8 microseconds was reported in the last
line, pointing to two interferences. Looking backward in the trace, the
two previous entries were about the migration thread running after a
timer IRQ execution. The first event is not part of the noise because
it took place one millisecond before.

It is worth noticing that the sum of the durations reported in the
tracepoints is smaller than the eight us reported in the sample_threshold.
The reason lies in the overhead of the entry and exit code that runs
before and after any interference execution. This justifies the dual
approach: measuring thread and tracing.

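The gap between the sample and its interferences can be computed from
the trace itself. The following sketch uses a hypothetical helper,
``osnoise_sample_overhead``, which assumes the event format shown in the
example above: for each sample_threshold line it sums the durations of
the preceding noise events whose start is not earlier than the sample
start, and reports the remainder as entry and exit overhead.

```shell
# Hypothetical helper: reads osnoise tracepoint lines on stdin.
osnoise_sample_overhead() {
    awk '
    BEGIN { n = 0 }
    # Remember start and duration of each interference event.
    /(nmi|irq|softirq|thread)_noise:/ {
        for (i = 1; i < NF; i++) {
            if ($i == "start")    ev_start[n] = $(i + 1)
            if ($i == "duration") ev_dur[n] = $(i + 1)
        }
        n++
    }
    # On a sample, sum the events that started inside it.
    /sample_threshold:/ {
        for (i = 1; i < NF; i++) {
            if ($i == "start")    s_start = $(i + 1)
            if ($i == "duration") s_dur = $(i + 1)
        }
        sum = 0
        for (j = 0; j < n; j++)
            if (ev_start[j] + 0 >= s_start + 0)
                sum += ev_dur[j]
        printf "sample %d ns, interferences %d ns, overhead %d ns\n", s_dur, sum, s_dur - sum
        n = 0
    }'
}

# The four events from the example above.
# Prints: sample 8812 ns, interferences 5916 ns, overhead 2896 ns
osnoise_sample_overhead <<'EOF'
osnoise/8-961     [008] d.h.  5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
osnoise/8-961     [008] dNh.  5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
migration/8-54      [008] d...  5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
osnoise/8-961     [008] ....  5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2
EOF
```

The first irq_noise event is excluded because it started before the
sample, so only 2848 ns + 3068 ns = 5916 ns of the 8812 ns sample is
attributed to interference execution.
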
Running osnoise tracer without workload
---------------------------------------

By enabling the osnoise tracer with the NO_OSNOISE_WORKLOAD option set,
the osnoise: tracepoints serve to measure the execution time of
any type of Linux task, free from the interference of other tasks.
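
A possible setup for this mode is sketched below. The function name
``osnoise_trace_without_workload`` is hypothetical, and the
events/osnoise/enable path is an assumption based on the usual tracefs
layout for enabling all events of a subsystem; it is not described in
this document.

```shell
# Sketch: use the osnoise: tracepoints without the osnoise workload.
# $1 is the tracefs mount point (defaults to /sys/kernel/tracing).
osnoise_trace_without_workload() {
    tracing_dir=${1:-/sys/kernel/tracing}

    # Do not dispatch the per-CPU osnoise workload threads.
    echo NO_OSNOISE_WORKLOAD > "$tracing_dir/osnoise/options"
    # Assumed path: enable every event in the osnoise subsystem.
    echo 1 > "$tracing_dir/events/osnoise/enable"
    # Select the tracer.
    echo osnoise > "$tracing_dir/current_tracer"
}
```

After calling it as root, the thread_noise, irq_noise and related events
for the tasks of interest can be read from the trace file.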