Intel hybrid support
--------------------
Support for Intel hybrid events within perf tools.

Some Intel platforms, such as AlderLake, are hybrid platforms that consist
of atom cpus and core cpus. Each cpu type has a dedicated event list. Some
events are available only on core cpus, some only on atom cpus, and some
on both.

The kernel exports two new cpu pmus via sysfs:
/sys/devices/cpu_core
/sys/devices/cpu_atom

A 'cpus' file is created under each directory. For example,

cat /sys/devices/cpu_core/cpus
0-15

cat /sys/devices/cpu_atom/cpus
16-23

This indicates that cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
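
The contents of a 'cpus' file can be expanded into individual cpu numbers
with a few lines of script. The following is a minimal sketch, not part of
perf; the function name parse_cpu_list is hypothetical, and it assumes the
file holds comma-separated numbers or ranges as in the output above.

```python
def parse_cpu_list(text):
    """Expand a cpu list such as '0-15' or '0-3,8' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range such as '16-23' is inclusive on both ends.
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# Example values copied from the 'cat' output above.
core_cpus = parse_cpu_list("0-15\n")
atom_cpus = parse_cpu_list("16-23\n")
print(len(core_cpus), len(atom_cpus))
```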

As before, use perf-list to list the symbolic events.

perf list

inst_retired.any
	[Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
inst_retired.any
	[Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]

'Unit: xxx' is added to the brief description to indicate which pmu
the event belongs to. The same event name can be supported on
different pmus.

Enable hybrid event with a specific pmu

To enable a core-only or atom-only event, the following syntax is supported:

	cpu_core/<event name>/
or
	cpu_atom/<event name>/

For example, to count the 'cycles' event on core cpus:

	perf stat -e cpu_core/cycles/

Create two events for one hardware event automatically

When an event is requested that is available on both atom and core,
two events are created automatically: one for atom and one for
core. Most hardware events and cache events are available on both
cpu_core and cpu_atom.

Hardware events have pre-defined configs (e.g. 0 for cycles).
But on a hybrid platform, the kernel needs to know where the event comes
from (atom or core). The original perf event type PERF_TYPE_HARDWARE
can't carry pmu information, so this type is now extended to be a PMU-aware
type. The PMU type ID is stored in attr.config[63:32].

The PMU type ID is retrieved from sysfs:
/sys/devices/cpu_atom/type
/sys/devices/cpu_core/type
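
As an illustration, the PMU type ID can be read from these files with a
short script. This is a sketch only: the helper name read_pmu_type and its
sysfs_root parameter are hypothetical, and it assumes each 'type' file
holds a single decimal integer.

```python
import os


def read_pmu_type(pmu_name, sysfs_root="/sys/devices"):
    """Return the PMU type ID from <sysfs_root>/<pmu_name>/type, or None."""
    path = os.path.join(sysfs_root, pmu_name, "type")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        # The pmu directory is absent, e.g. on a non-hybrid platform.
        return None


for name in ("cpu_core", "cpu_atom"):
    print(name, read_pmu_type(name))
```

On a system without these pmus the sketch prints None for both names
instead of failing.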

The new attr.config layout for PERF_TYPE_HARDWARE:

PERF_TYPE_HARDWARE:                 0xEEEEEEEE000000AA
                                    AA: hardware event ID
                                    EEEEEEEE: PMU type ID
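
The layout above can be sketched as simple bit arithmetic. The function
names below are hypothetical and only illustrate the encoding; the PMU type
IDs 4 and 8 are the example values that appear in the perf_event_attr dumps
in this document and may differ on other systems.

```python
PMU_TYPE_SHIFT = 32  # the PMU type ID occupies config[63:32]


def hw_config(pmu_type, hw_event_id):
    """Build attr.config for PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA."""
    return (pmu_type << PMU_TYPE_SHIFT) | (hw_event_id & 0xFF)


def split_hw_config(config):
    """Return (pmu_type, hw_event_id) from a PERF_TYPE_HARDWARE config."""
    return config >> PMU_TYPE_SHIFT, config & 0xFF


# Hardware event ID 0 is 'cycles'.
print(hex(hw_config(4, 0)))  # 0x400000000
print(hex(hw_config(8, 0)))  # 0x800000000
print(split_hw_config(0x800000000))  # (8, 0)
```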

Cache events are similar. The type PERF_TYPE_HW_CACHE is extended to be a
PMU-aware type. The PMU type ID is stored in attr.config[63:32].

The new attr.config layout for PERF_TYPE_HW_CACHE:

PERF_TYPE_HW_CACHE:                 0xEEEEEEEE00DDCCBB
                                    BB: hardware cache ID
                                    CC: hardware cache op ID
                                    DD: hardware cache op result ID
                                    EEEEEEEE: PMU type ID
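
The cache layout packs three one-byte IDs below the PMU type ID. A minimal
sketch of that packing follows; the function names and the example ID
values are illustrative assumptions, not taken from perf.

```python
def hw_cache_config(pmu_type, cache_id, op_id, result_id):
    """Build attr.config for PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB."""
    return (
        (pmu_type << 32)             # EEEEEEEE: PMU type ID
        | ((result_id & 0xFF) << 16)  # DD: cache op result ID
        | ((op_id & 0xFF) << 8)       # CC: cache op ID
        | (cache_id & 0xFF)           # BB: cache ID
    )


def split_hw_cache_config(config):
    """Split a PERF_TYPE_HW_CACHE config back into its four fields."""
    return {
        "pmu_type": config >> 32,
        "result_id": (config >> 16) & 0xFF,
        "op_id": (config >> 8) & 0xFF,
        "cache_id": config & 0xFF,
    }


# Arbitrary example IDs chosen to make each byte visible in the result.
print(hex(hw_cache_config(4, 2, 1, 1)))  # 0x400010102
```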

When a hardware event is enabled without a specified pmu, for example
perf stat -e cycles -a (system-wide in this example), two events
are created automatically.

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

and

  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------

The event type is 0 (PERF_TYPE_HARDWARE).
0x4 in 0x400000000 indicates the cpu_core pmu.
0x8 in 0x800000000 indicates the cpu_atom pmu (the atom pmu type ID is
assigned dynamically, so it may differ between systems).

The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
and creates 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).

In the perf-stat result, two events are displayed:

 Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

The first 'cycles' is the core event; the second 'cycles' is the atom event.

Thread mode example:

perf-stat reports scaled counts for hybrid events, with a percentage
displayed next to each. The percentage is the event's running time
divided by its enabled time.
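
The relation between the raw count, the scaled count and the percentage
can be sketched as follows. This is a simplified illustration, assuming the
usual scaling of raw count * enabled time / running time; the function name
and the sample numbers are hypothetical, and returning zero for an event
that never ran is a simplification of how perf reports such events.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Return (scaled_count, running_percent) for one counter reading."""
    if time_running == 0:
        # The event never ran, so there is nothing to extrapolate from.
        return 0, 0.0
    scaled = round(raw_count * time_enabled / time_running)
    percent = 100.0 * time_running / time_enabled
    return scaled, percent


# A counter that ran for half of its enabled time: the raw count is
# doubled and the percentage is 50%.
print(scale_count(1000, 200, 100))  # (2000, 50.0)
```

A very small percentage means the count is extrapolated from a short
running window, so the scaled value should be read with care.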

For example, 'triad_loop' runs on cpu16 (an atom cpu), and the scaled
value for core cycles is 233,066,666 with a percentage of 0.43%.

perf stat -e cycles \-- taskset -c 16 ./triad_loop

As before, two events are created.

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x400000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

and

------------------------------------------------------------
perf_event_attr:
  size                             120
  config                           0x800000000
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------

 Performance counter stats for 'taskset -c 16 ./triad_loop':

       233,066,666      cpu_core/cycles/                                              (0.43%)
       604,097,080      cpu_atom/cycles/                                              (99.57%)

perf-record:

If no '-e' is specified in perf record on a hybrid platform,
two default 'cycles' events are created and added to the event list: one
for core and one for atom.

perf-stat:

If no '-e' is specified in perf stat on a hybrid platform,
then besides the software events, the following events are created and
added to the event list in order:

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/

Both perf-stat and perf-record also support enabling a
hybrid event with a specific pmu.

e.g.
perf stat -e cpu_core/cycles/
perf stat -e cpu_atom/cycles/
perf stat -e cpu_core/r1a/
perf stat -e cpu_atom/L1-icache-loads/
perf stat -e cpu_core/cycles/,cpu_atom/instructions/
perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'

But '{cpu_core/cycles/,cpu_atom/instructions/}' returns a
warning and disables grouping, because the pmus in the group
do not match (cpu_core vs. cpu_atom).
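
The group restriction can be checked ahead of time with a small script.
The sketch below is not how perf implements the check; the function names
are hypothetical, and it only recognizes events written in the
pmu/<event name>/ syntax shown above.

```python
import re


def group_pmus(group):
    """Return the pmu name of every 'pmu/event/' term in a group string."""
    return re.findall(r"(\w+)/[^/]*/", group)


def group_pmus_match(group):
    """True if all pmu-qualified events in the group use the same pmu."""
    return len(set(group_pmus(group))) <= 1


print(group_pmus_match("{cpu_core/cycles/,cpu_core/instructions/}"))  # True
print(group_pmus_match("{cpu_core/cycles/,cpu_atom/instructions/}"))  # False
```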