Intel hybrid support
--------------------
Support for Intel hybrid events within perf tools.

Some Intel platforms, such as AlderLake, are hybrid platforms that
consist of atom cpus and core cpus. Each cpu type has a dedicated event
list. Some events are available only on core cpus, some only on atom
cpus, and some on both.

The kernel exports two new cpu pmus via sysfs:
/sys/devices/cpu_core
/sys/devices/cpu_atom

A 'cpus' file is created under each directory. For example,

cat /sys/devices/cpu_core/cpus
0-15

cat /sys/devices/cpu_atom/cpus
16-23

This indicates that cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.

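The 'cpus' files use a compact range notation. As a minimal sketch (not part of perf itself), the helper below expands such a string into individual cpu numbers. The function name parse_cpu_list is hypothetical, and the inputs are copied from the example above rather than read from sysfs.

```python
def parse_cpu_list(text):
    """Expand a cpu list string such as "0-15" or "0-3,8" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty string or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# Example strings taken from the sysfs output shown above.
core_cpus = parse_cpu_list("0-15\n")
atom_cpus = parse_cpu_list("16-23\n")
print(len(core_cpus), "core cpus,", len(atom_cpus), "atom cpus")
```

On a real hybrid system the strings would come from /sys/devices/cpu_core/cpus and /sys/devices/cpu_atom/cpus.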
Quickstart

List hybrid event
-----------------

As before, use perf-list to list the symbolic events.

perf list

inst_retired.any
	[Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
inst_retired.any
	[Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]

'Unit: xxx' is added to the brief description to indicate which pmu
the event belongs to. The same event name can be supported on
different pmus.

Enable hybrid event with a specific pmu
---------------------------------------

To enable a core-only or atom-only event, the following syntax is supported:

	cpu_core/<event name>/
or
	cpu_atom/<event name>/

For example, count the 'cycles' event on core cpus:

	perf stat -e cpu_core/cycles/

Create two events for one hardware event automatically
------------------------------------------------------

When an event is created and it is available on both atom and core,
two events are created automatically: one for atom and one for core.
Most hardware events and cache events are available on both cpu_core
and cpu_atom.

Hardware events have pre-defined configs (e.g. 0 for cycles). But on
a hybrid platform, the kernel needs to know where the event comes from
(atom or core). The original perf event type PERF_TYPE_HARDWARE can't
carry pmu information, so this type is extended to be a PMU-aware
type. The PMU type ID is stored in attr.config[63:32].

The PMU type ID is retrieved from sysfs:
/sys/devices/cpu_atom/type
/sys/devices/cpu_core/type

The new attr.config layout for PERF_TYPE_HARDWARE:

PERF_TYPE_HARDWARE:                 0xEEEEEEEE000000AA
                                    AA: hardware event ID
                                    EEEEEEEE: PMU type ID

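The layout above can be illustrated with a short sketch that packs a PMU type ID and a hardware event ID into one config value. The helper name hw_config is hypothetical, and the PMU type ID 4 is only an example value. On a real system it would be read from the sysfs 'type' file.

```python
PERF_COUNT_HW_CPU_CYCLES = 0  # hardware event ID for 'cycles'


def hw_config(pmu_type, event_id):
    """Pack a PMU type ID into bits 63:32 and a hardware event ID into the low byte."""
    if not 0 <= pmu_type <= 0xFFFFFFFF:
        raise ValueError("PMU type ID must fit in 32 bits")
    if not 0 <= event_id <= 0xFF:
        raise ValueError("hardware event ID must fit in the AA byte")
    return (pmu_type << 32) | event_id


# 4 is an assumed example PMU type ID, not a value read from sysfs here.
print(hex(hw_config(4, PERF_COUNT_HW_CPU_CYCLES)))  # prints 0x400000000
```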
Cache events are similar. The type PERF_TYPE_HW_CACHE is extended to be
a PMU-aware type. The PMU type ID is stored in attr.config[63:32].

The new attr.config layout for PERF_TYPE_HW_CACHE:

PERF_TYPE_HW_CACHE:                 0xEEEEEEEE00DDCCBB
                                    BB: hardware cache ID
                                    CC: hardware cache op ID
                                    DD: hardware cache op result ID
                                    EEEEEEEE: PMU type ID

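A sketch of the cache layout, in the same spirit: the helper below combines the three cache fields with a PMU type ID. The helper name hw_cache_config is hypothetical, the constants follow the perf_event ABI header, and the PMU type ID 8 is only an example value.

```python
# Values from the perf_event ABI (include/uapi/linux/perf_event.h).
PERF_COUNT_HW_CACHE_L1I = 1
PERF_COUNT_HW_CACHE_OP_READ = 0
PERF_COUNT_HW_CACHE_RESULT_ACCESS = 0
PERF_COUNT_HW_CACHE_RESULT_MISS = 1


def hw_cache_config(pmu_type, cache_id, op_id, result_id):
    """Build 0xEEEEEEEE00DDCCBB from PMU type (EE..), result (DD), op (CC), cache (BB)."""
    for name, value in (("cache", cache_id), ("op", op_id), ("result", result_id)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} ID must fit in one byte")
    if not 0 <= pmu_type <= 0xFFFFFFFF:
        raise ValueError("PMU type ID must fit in 32 bits")
    return (pmu_type << 32) | (result_id << 16) | (op_id << 8) | cache_id


# 8 is an assumed example PMU type ID for cpu_atom.
misses = hw_cache_config(8, PERF_COUNT_HW_CACHE_L1I,
                         PERF_COUNT_HW_CACHE_OP_READ,
                         PERF_COUNT_HW_CACHE_RESULT_MISS)
print(hex(misses))  # prints 0x800010001
```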
When enabling a hardware event without a specified pmu, such as
perf stat -e cycles -a (system-wide in this example), two events
are created automatically.

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

and

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

Type 0 is PERF_TYPE_HARDWARE.
0x4 in 0x400000000 indicates it's the cpu_core pmu.
0x8 in 0x800000000 indicates it's the cpu_atom pmu (the atom pmu type id is not fixed).

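To make the decoding concrete, here is a small sketch that splits the two config values shown in the dumps back into a PMU type ID and a hardware event ID. The helper name split_config is hypothetical, and the ID-to-name mapping only reflects this example, since the atom ID can differ on other systems.

```python
def split_config(config):
    """Return (pmu_type, event_id) from an extended PERF_TYPE_HARDWARE config."""
    pmu_type = (config >> 32) & 0xFFFFFFFF  # attr.config[63:32]
    event_id = config & 0xFF                # low byte, 0 means 'cycles'
    return pmu_type, event_id


# Mapping valid for this example only.
EXAMPLE_PMU_NAMES = {4: "cpu_core", 8: "cpu_atom"}

for config in (0x400000000, 0x800000000):
    pmu_type, event_id = split_config(config)
    print(f"{config:#x}: pmu={EXAMPLE_PMU_NAMES[pmu_type]} event_id={event_id}")
```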
The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
and creates 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).

The perf-stat result displays two events:

 Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

The first 'cycles' is the core event, the second 'cycles' is the atom event.

Thread mode example:
--------------------

perf-stat reports scaled counts for hybrid events, along with a
percentage. The percentage is the event's running time divided by its
enabled time.

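The relation between raw count, scaled count and percentage can be sketched as follows. This assumes the usual perf scaling rule (raw count multiplied by enabled time over running time). The helper name scale_count and the input numbers are hypothetical and are not taken from a real run.

```python
def scale_count(raw, time_enabled, time_running):
    """Return (scaled_count, percentage) for a counter that ran only part of the time."""
    if time_enabled == 0 or time_running == 0:
        return 0, 0.0  # the event never ran, so nothing can be extrapolated
    percentage = 100.0 * time_running / time_enabled
    scaled = round(raw * time_enabled / time_running)
    return scaled, percentage


# Hypothetical values: the event was scheduled for a quarter of the enabled time.
scaled, pct = scale_count(raw=1000, time_enabled=2_000_000, time_running=500_000)
print(f"{scaled:,} ({pct:.2f}%)")  # prints 4,000 (25.00%)
```

A very small percentage means the scaled count is extrapolated from a short running window, so it should be read with care.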
For example, 'triad_loop' runs on cpu16 (an atom cpu), and the
scaled value for core cycles is 233,066,666 with a percentage of 0.43%.

perf stat -e cycles -- taskset -c 16 ./triad_loop

As before, two events are created.

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x400000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

and

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x800000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

 Performance counter stats for 'taskset -c 16 ./triad_loop':

       233,066,666      cpu_core/cycles/                                              (0.43%)
       604,097,080      cpu_atom/cycles/                                              (99.57%)

perf-record:
------------

If no '-e' is specified in perf record on a hybrid platform,
it creates two default 'cycles' events and adds them to the event
list. One is for core, the other is for atom.

perf-stat:
----------

If no '-e' is specified in perf stat on a hybrid platform,
the following events are created, in addition to the software events,
and added to the event list in this order:

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/

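The ordering of the default list (each event first on cpu_core, then on cpu_atom) can be reproduced with a short sketch. This is an illustration only, not how perf builds the list internally, and the function name default_hybrid_events is hypothetical.

```python
def default_hybrid_events(pmus=("cpu_core", "cpu_atom"),
                          events=("cycles", "instructions",
                                  "branches", "branch-misses")):
    """List 'pmu/event/' strings, iterating events first and pmus second."""
    return [f"{pmu}/{event}/" for event in events for pmu in pmus]


print(",".join(default_hybrid_events()))
```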
Of course, both perf-stat and perf-record support enabling a
hybrid event with a specific pmu.

e.g.
perf stat -e cpu_core/cycles/
perf stat -e cpu_atom/cycles/
perf stat -e cpu_core/r1a/
perf stat -e cpu_atom/L1-icache-loads/
perf stat -e cpu_core/cycles/,cpu_atom/instructions/
perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'

But '{cpu_core/cycles/,cpu_atom/instructions/}' will return a
warning and disable grouping, because the pmus in the group do
not match (cpu_core vs. cpu_atom).
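
The grouping rule can be checked ahead of time with a small sketch that collects the pmu names used in a group string. This is not perf's own parser. The function names are hypothetical, and the sketch only handles simple 'pmu/event/' terms without commas between the slashes.

```python
def pmus_in_group(group):
    """Return the set of pmu names used in a '{pmu/event/,pmu/event/}' string."""
    inner = group.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    pmus = set()
    for term in inner.split(","):
        term = term.strip()
        if "/" in term:
            pmus.add(term.split("/", 1)[0])  # text before the first '/'
    return pmus


def group_is_consistent(group):
    """True when every pmu-qualified event in the group names the same pmu."""
    return len(pmus_in_group(group)) <= 1


for group in ("{cpu_core/cycles/,cpu_core/instructions/}",
              "{cpu_core/cycles/,cpu_atom/instructions/}"):
    print(group, "->", "ok" if group_is_consistent(group) else "mixed pmus")
```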