perf-bench.txt - tools/perf/Documentation/perf-bench.txt - Linux diff v4.17

  1perf-bench(1)
  2=============
  3
  4NAME
  5----
  6perf-bench - General framework for benchmark suites
  7
  8SYNOPSIS
  9--------
 10[verse]
 11'perf bench' [<common options>] <subsystem> <suite> [<options>]
 12
 13DESCRIPTION
 14-----------
 15This 'perf bench' command is a general framework for benchmark suites.
 16
 17COMMON OPTIONS
 18--------------
 19-r::
 20--repeat=::
 21Specify amount of times to repeat the run (default 10).
 22
 23-f::
 24--format=::
 25Specify format style.
 26Currently available format styles are:
 27
 28'default'::
 29Default style. This is mainly for human reading.
 30---------------------
 31% perf bench sched pipe                      # with no style specified
 32(executing 1000000 pipe operations between two tasks)
 33        Total time:5.855 sec
 34                5.855061 usecs/op
 35		170792 ops/sec
 36---------------------
 37
 38'simple'::
 39This simple style is friendly for automated
 40processing by scripts.
 41---------------------
 42% perf bench --format=simple sched pipe      # specified simple
 435.988
 44---------------------
 45
 46SUBSYSTEM
 47---------
 48
 49'sched'::
 50	Scheduler and IPC mechanisms.
 51
 52'mem'::
 53	Memory access performance.
 54
 55'numa'::
 56	NUMA scheduling and MM benchmarks.
 57
 58'futex'::
 59	Futex stressing benchmarks.
 60
 61'all'::
 62	All benchmark subsystems.
 63
 64SUITES FOR 'sched'
 65~~~~~~~~~~~~~~~~~~
 66*messaging*::
 67Suite for evaluating performance of scheduler and IPC mechanisms.
 68Based on hackbench by Rusty Russell.
 69
 70Options of *messaging*
 71^^^^^^^^^^^^^^^^^^^^^^
 72-p::
 73--pipe::
 74Use pipe() instead of socketpair()
 75
 76-t::
 77--thread::
 78Use multiple threads instead of multiple processes
 79
 80-g::
 81--group=::
 82Specify number of groups
 83
 84-l::
 85--nr_loops=::
 86Specify number of loops
 87
 88Example of *messaging*
 89^^^^^^^^^^^^^^^^^^^^^^
 90
 91---------------------
 92% perf bench sched messaging                 # run with default
 93options (20 sender and receiver processes per group)
 94(10 groups == 400 processes run)
 95
 96      Total time:0.308 sec
 97
 98% perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
 99(20 sender and receiver threads per group)
100(20 groups == 800 threads run)
101
102      Total time:0.582 sec
103---------------------
104
105*pipe*::
106Suite for pipe() system call.
107Based on pipe-test-1m.c by Ingo Molnar.
108
109Options of *pipe*
110^^^^^^^^^^^^^^^^^
111-l::
112--loop=::
113Specify number of loops.
114
115Example of *pipe*
116^^^^^^^^^^^^^^^^^
117
118---------------------
119% perf bench sched pipe
120(executing 1000000 pipe operations between two tasks)
121
122        Total time:8.091 sec
123                8.091833 usecs/op
124                123581 ops/sec
125
126% perf bench sched pipe -l 1000              # loop 1000
127(executing 1000 pipe operations between two tasks)
128
129        Total time:0.016 sec
130                16.948000 usecs/op
131                59004 ops/sec
132---------------------
133
134SUITES FOR 'mem'
135~~~~~~~~~~~~~~~~
136*memcpy*::
137Suite for evaluating performance of simple memory copy in various ways.
138
139Options of *memcpy*
140^^^^^^^^^^^^^^^^^^^
141-s::
142--size::
143Specify size of memory to copy (default: 1MB).
144Available units are B, KB, MB, GB and TB (case insensitive).
145
146-f::
147--function::
148Specify function to copy (default: default).
149Available functions depend on the architecture.
150On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.
151
152-l::
153--nr_loops::
154Repeat memcpy invocation this number of times.
155
156-c::
157--cycles::
158Use perf's cpu-cycles event instead of gettimeofday syscall.
159
160*memset*::
161Suite for evaluating performance of simple memory set in various ways.
162
163Options of *memset*
164^^^^^^^^^^^^^^^^^^^
165-s::
166--size::
167Specify size of memory to set (default: 1MB).
168Available units are B, KB, MB, GB and TB (case insensitive).
169
170-f::
171--function::
172Specify function to set (default: default).
173Available functions depend on the architecture.
174On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.
175
176-l::
177--nr_loops::
178Repeat memset invocation this number of times.
179
180-c::
181--cycles::
182Use perf's cpu-cycles event instead of gettimeofday syscall.
183
184SUITES FOR 'numa'
185~~~~~~~~~~~~~~~~~
186*mem*::
187Suite for evaluating NUMA workloads.
188
189SUITES FOR 'futex'
190~~~~~~~~~~~~~~~~~~
191*hash*::
192Suite for evaluating hash tables.
193
194*wake*::
195Suite for evaluating wake calls.
196
197*wake-parallel*::
198Suite for evaluating parallel wake calls.
199
200*requeue*::
201Suite for evaluating requeue calls.
202
203*lock-pi*::
204Suite for evaluating futex lock_pi calls.
205
206
207SEE ALSO
208--------
209linkperf:perf[1]

  1perf-bench(1)
  2=============
  3
  4NAME
  5----
  6perf-bench - General framework for benchmark suites
  7
  8SYNOPSIS
  9--------
 10[verse]
 11'perf bench' [<common options>] <subsystem> <suite> [<options>]
 12
 13DESCRIPTION
 14-----------
 15This 'perf bench' command is a general framework for benchmark suites.
 16
 17COMMON OPTIONS
 18--------------
 19-r::
 20--repeat=::
 21Specify number of times to repeat the run (default 10).
 22
 23-f::
 24--format=::
 25Specify format style.
 26Currently available format styles are:
 27
 28'default'::
 29Default style. This is mainly for human reading.
 30---------------------
 31% perf bench sched pipe                      # with no style specified
 32(executing 1000000 pipe operations between two tasks)
 33        Total time:5.855 sec
 34                5.855061 usecs/op
 35		170792 ops/sec
 36---------------------
 37
 38'simple'::
 39This simple style is friendly for automated
 40processing by scripts.
 41---------------------
 42% perf bench --format=simple sched pipe      # specified simple
 435.988
 44---------------------
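
A script consuming the 'simple' style might parse it roughly as follows. This is a minimal Python sketch: it assumes the output is a single number giving the total time in seconds, as in the 'sched pipe' example above, and other suites may print different fields. The helper name `parse_simple_time` is hypothetical and not part of perf.

```python
def parse_simple_time(output: str) -> float:
    """Return the last non-empty line of `output` as a float.

    Assumption: the 'simple' style prints the total time in seconds as a
    bare number, as in the 'sched pipe' example.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no benchmark output to parse")
    return float(lines[-1])


# Sample text copied from the example above.
sample = "5.988\n"
print(parse_simple_time(sample))
```

In a real script the text could be captured with `subprocess.run(["perf", "bench", "--format=simple", "sched", "pipe"], capture_output=True, text=True).stdout`, provided perf is installed.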
 45
 46SUBSYSTEM
 47---------
 48
 49'sched'::
 50	Scheduler and IPC mechanisms.
 51
 52'syscall'::
 53	System call performance (throughput).
 54
 55'mem'::
 56	Memory access performance.
 57
 58'numa'::
 59	NUMA scheduling and MM benchmarks.
 60
 61'futex'::
 62	Futex stressing benchmarks.
 63
 64'epoll'::
 65	Eventpoll (epoll) stressing benchmarks.
 66
 67'internals'::
 68	Benchmark internal perf functionality.
 69
 70'uprobe'::
 71	Benchmark overhead of uprobe + BPF.
 72
 73'all'::
 74	All benchmark subsystems.
 75
 76SUITES FOR 'sched'
 77~~~~~~~~~~~~~~~~~~
 78*messaging*::
 79Suite for evaluating performance of scheduler and IPC mechanisms.
 80Based on hackbench by Rusty Russell.
 81
 82Options of *messaging*
 83^^^^^^^^^^^^^^^^^^^^^^
 84-p::
 85--pipe::
 86Use pipe() instead of socketpair()
 87
 88-t::
 89--thread::
 90Use multiple threads instead of multiple processes
 91
 92-g::
 93--group=::
 94Specify number of groups
 95
 96-l::
 97--nr_loops=::
 98Specify number of loops
 99
100Example of *messaging*
101^^^^^^^^^^^^^^^^^^^^^^
102
103---------------------
104% perf bench sched messaging                 # run with default
105options (20 sender and receiver processes per group)
106(10 groups == 400 processes run)
107
108      Total time:0.308 sec
109
110% perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
111(20 sender and receiver threads per group)
112(20 groups == 800 threads run)
113
114      Total time:0.582 sec
115---------------------
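
The task counts in the output above follow from the group count. The sketch below assumes, based only on the sample output (10 groups == 400 processes, 20 groups == 800 threads), that each group consists of 20 senders and 20 receivers; the constant and function names are illustrative.

```python
# Assumption inferred from the sample output: 20 senders plus 20 receivers
# per group, i.e. 40 tasks per group.
SENDERS_PER_GROUP = 20
RECEIVERS_PER_GROUP = 20


def total_tasks(groups: int) -> int:
    """Number of processes (or threads, with -t) started for `groups` groups."""
    return groups * (SENDERS_PER_GROUP + RECEIVERS_PER_GROUP)


print(total_tasks(10))  # default run in the example
print(total_tasks(20))  # the '-t -g 20' run in the example
```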
116
117*pipe*::
118Suite for pipe() system call.
119Based on pipe-test-1m.c by Ingo Molnar.
120
121Options of *pipe*
122^^^^^^^^^^^^^^^^^
123-l::
124--loop=::
125Specify number of loops.
126
127-G::
128--cgroups=::
129Names of cgroups for sender and receiver, separated by a comma.
130This is useful to check cgroup context switching overhead.
131Note that perf neither creates nor deletes the cgroups, so users should
132make sure that the cgroups exist and are accessible before use.
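
Because the cgroups must already exist, a wrapper script may want to check for them before starting the benchmark. The following Python sketch shows one way to do that. The helper `missing_cgroups` is hypothetical, and the cgroup root directory is an assumption that must be supplied by the caller, since the mount point and layout differ between systems and cgroup versions.

```python
import os
from typing import List


def missing_cgroups(spec: str, root: str) -> List[str]:
    """Return the cgroup names from `spec` that have no directory under `root`.

    `spec` uses the same comma-separated form as the -G option,
    e.g. "AAA,BBB". `root` is the directory where the cgroup hierarchy is
    mounted (an assumption; it is system dependent).
    """
    names = [name.strip() for name in spec.split(",") if name.strip()]
    return [name for name in names
            if not os.path.isdir(os.path.join(root, name))]
```

A caller could run `missing_cgroups("AAA,BBB", root)` with the root of its cgroup mount and refuse to start the benchmark if the returned list is not empty.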
133
134
135Example of *pipe*
136^^^^^^^^^^^^^^^^^
137
138---------------------
139% perf bench sched pipe
140(executing 1000000 pipe operations between two tasks)
141
142        Total time:8.091 sec
143                8.091833 usecs/op
144                123581 ops/sec
145
146% perf bench sched pipe -l 1000              # loop 1000
147(executing 1000 pipe operations between two tasks)
148
149        Total time:0.016 sec
150                16.948000 usecs/op
151                59004 ops/sec
152
153% perf bench sched pipe -G AAA,BBB
154(executing 1000000 pipe operations between cgroups)
155# Running 'sched/pipe' benchmark:
156# Executed 1000000 pipe operations between two processes
157
158     Total time: 6.886 [sec]
159
160       6.886208 usecs/op
161         145217 ops/sec
162
163---------------------
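
The three figures in each result are related by simple arithmetic. The sketch below derives the per-operation cost and the throughput from the total time and the loop count. It assumes that the throughput is truncated to an integer, which is consistent with the sample outputs above but is not confirmed by this document; the function name is illustrative.

```python
def derive_metrics(total_usecs: int, loops: int):
    """Return (usecs_per_op, ops_per_sec) for a run.

    total_usecs: total run time in microseconds
    loops:       number of pipe operations (the -l value)
    """
    usecs_per_op = total_usecs / loops
    # Integer division truncates, matching the sample outputs (assumption).
    ops_per_sec = loops * 1_000_000 // total_usecs
    return usecs_per_op, ops_per_sec


# Default run from the example: 1000000 operations in 8.091833 seconds.
print(derive_metrics(8_091_833, 1_000_000))
# '-l 1000' run from the example: 1000 operations in 0.016948 seconds.
print(derive_metrics(16_948, 1_000))
```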
164
165SUITES FOR 'syscall'
166~~~~~~~~~~~~~~~~~~~~
167*basic*::
168Suite for evaluating performance of core system call throughput (both usecs/op and ops/sec metrics).
169This uses a single thread simply doing getppid(2), which is a simple syscall where the result is not
170cached by glibc.
171
172
173SUITES FOR 'mem'
174~~~~~~~~~~~~~~~~
175*memcpy*::
176Suite for evaluating performance of simple memory copy in various ways.
177
178Options of *memcpy*
179^^^^^^^^^^^^^^^^^^^
180-s::
181--size::
182Specify size of memory to copy (default: 1MB).
183Available units are B, KB, MB, GB and TB (case insensitive).
184
185-f::
186--function::
187Specify function to copy (default: default).
188Available functions depend on the architecture.
189On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.
190
191-l::
192--nr_loops::
193Repeat memcpy invocation this number of times.
194
195-c::
196--cycles::
197Use perf's cpu-cycles event instead of gettimeofday syscall.
198
199*memset*::
200Suite for evaluating performance of simple memory set in various ways.
201
202Options of *memset*
203^^^^^^^^^^^^^^^^^^^
204-s::
205--size::
206Specify size of memory to set (default: 1MB).
207Available units are B, KB, MB, GB and TB (case insensitive).
208
209-f::
210--function::
211Specify function to set (default: default).
212Available functions depend on the architecture.
213On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.
214
215-l::
216--nr_loops::
217Repeat memset invocation this number of times.
218
219-c::
220--cycles::
221Use perf's cpu-cycles event instead of gettimeofday syscall.
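
Both *memcpy* and *memset* take a size with a unit suffix. The Python sketch below illustrates how such a string could be turned into a byte count. It is not perf's own parser: the function name is hypothetical, and the use of binary multiples (1KB = 1024 bytes) is an assumption that this document does not state.

```python
import re

# Assumption: units are binary multiples (1KB = 1024 B).
_UNITS = {"B": 1, "KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}


def parse_size(text: str) -> int:
    """Convert a string such as '1MB' or '4kb' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    factor = _UNITS.get(unit.upper())  # units are case insensitive
    if factor is None:
        raise ValueError(f"unknown unit: {unit!r}")
    return int(number) * factor


print(parse_size("1MB"))  # the documented default size
```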
222
223SUITES FOR 'numa'
224~~~~~~~~~~~~~~~~~
225*mem*::
226Suite for evaluating NUMA workloads.
227
228SUITES FOR 'futex'
229~~~~~~~~~~~~~~~~~~
230*hash*::
231Suite for evaluating hash tables.
232
233*wake*::
234Suite for evaluating wake calls.
235
236*wake-parallel*::
237Suite for evaluating parallel wake calls.
238
239*requeue*::
240Suite for evaluating requeue calls.
241
242*lock-pi*::
243Suite for evaluating futex lock_pi calls.
244
245SUITES FOR 'epoll'
246~~~~~~~~~~~~~~~~~~
247*wait*::
248Suite for evaluating concurrent epoll_wait calls.
249
250*ctl*::
251Suite for evaluating multiple epoll_ctl calls.
252
253SUITES FOR 'internals'
254~~~~~~~~~~~~~~~~~~~~~~
255*synthesize*::
256Suite for evaluating perf's event synthesis performance.
257
258SEE ALSO
259--------
260linkperf:perf[1]