1 perf-intel-pt(1)
5 ----
6 perf-intel-pt - Support for Intel Processor Trace within perf tools
9 --------
11 'perf record' -e intel_pt//
14 -----------
19 Technical details are documented in the Intel 64 and IA-32 Architectures
23 processors that are based on the Intel micro-architecture code name Broadwell.
33 Decoding is done on-the-fly. The decoder outputs samples in the same format as
43 builds, however the executed images are needed - which makes use in JIT-compiled
44 environments, or with self-modified code, a challenge. Also symbols need to be
51 vary depending on the use-case and architecture.
55 ----------
61 Data is captured with 'perf record' e.g. to trace 'ls' userspace-only:
63 perf record -e intel_pt//u ls
69 Also tracing kernel space presents a problem, namely kernel self-modifying
73 --kcore is used, but access to /proc/kcore is restricted e.g.
75 sudo perf record -o pt_ls --kcore -e intel_pt// -- ls
82 sudo perf report -i pt_ls
84 Because samples are synthesized after-the-fact, the sampling period can be
87 sudo perf report -i pt_ls --itrace=i1usge
89 See the sections below for more information about the --itrace option.
103 perf record -e intel_pt//u ls
104 perf script --itrace=iybxwpe
109 perf script --itrace=iybxwpe -F+flags
113 in transaction, VM-entry, VM-exit, interrupt disabled, and interrupt disable
118 perf script --insn-trace=disasm
123 perf script --insn-trace --xed
128 perf script --call-trace
132 perf script --call-ret-trace
137 perf script --time starttime,stoptime --insn-trace=disasm
140 the -C option
142 perf script --time starttime,stoptime --insn-trace=disasm -C 1
149 perf script --itrace=be -F+ipc
151 There are two ways that instructions-per-cycle (IPC) can be calculated depending
156 MTC packets are used - refer to the 'mtc' config term. When MTC is used, however,
174 useful to use the 'A' option in conjunction with dlfilter-show-cycles.so to
183 Another note, in the case of "branches" events, non-taken branches are not
185 TNT packet that starts with a non-taken branch. To see every possible IPC
186 value, "instructions" events can be used e.g. --itrace=i0ns
190 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
191 and to script exported-sql-viewer.py for an example of using the database.
193 There is also script intel-pt-events.py which provides an example of how to
197 - --insn-trace - instruction trace
198 - --src-trace - source trace
200 The intel-pt-events.py script also has options:
202 - --all-switch-events - display all switch events, not only the last consecutive.
203 - --interleave [<n>] - interleave sample output for the same timestamp so that
213 by inability to access the executed image, self-modified or JIT-ed code, or the
214 inability to match side-band information (such as context switches and mmaps)
223 -----------
232 -e intel_pt//
236 -e intel_pt/tsc,noretcomp=0/
240 -e intel_pt/tsc=1,noretcomp=0/
242 Note there are other config terms - see section <<_config_terms,config terms>> further below.
249 $ grep -H . /sys/bus/event_source/devices/intel_pt/format/*
251 /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22
253 /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17
255 /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27
260 -e intel_pt/noretcomp=0/
264 -e intel_pt/tsc=1,noretcomp=0/
268 -e intel_pt/tsc=0/
272 -e intel_pt/config=0x400/
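The relationship between named config terms and a raw `config` value can be sketched as follows. This is a minimal illustration, not perf's own parser: the three bit ranges are copied from the sysfs listing above with the directory prefix stripped, while placing `tsc` at bit 10 is an assumption inferred from the `config=0x400` example. The function names `parse_format` and `pack_config` are hypothetical.

```python
import re

# Bit ranges copied from the sysfs format listing (path prefix removed).
# "tsc:config:10" is an ASSUMPTION inferred from the config=0x400 example.
FORMAT_LINES = [
    "cyc_thresh:config:19-22",
    "mtc_period:config:14-17",
    "psb_period:config:24-27",
    "tsc:config:10",
]


def parse_format(lines):
    """Map each term name to its (low_bit, high_bit) range within 'config'."""
    fields = {}
    for line in lines:
        name, reg, bits = line.split(":")
        if reg != "config":
            continue  # this sketch only handles the 'config' word
        m = re.fullmatch(r"(\d+)(?:-(\d+))?", bits)
        if m is None:
            raise ValueError(f"unrecognised bit range: {bits!r}")
        lo = int(m.group(1))
        hi = int(m.group(2)) if m.group(2) is not None else lo
        fields[name] = (lo, hi)
    return fields


def pack_config(terms, fields):
    """Combine term values into a single config value."""
    config = 0
    for name, value in terms.items():
        lo, hi = fields[name]
        width = hi - lo + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        config |= value << lo
    return config


fields = parse_format(FORMAT_LINES)
print(hex(pack_config({"tsc": 1}, fields)))         # 0x400
print(hex(pack_config({"mtc_period": 9}, fields)))  # 0x24000
```

Under these assumptions, `tsc=1` packs to the same value as `config=0x400`.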
287 perf_event_attr is displayed if the -vv option is used e.g.
289 ------------------------------------------------------------
303 ------------------------------------------------------------
304 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
305 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
306 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
307 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
308 ------------------------------------------------------------
314 Config terms are parameters specified with the -e intel_pt// event option,
317 -e intel_pt/cyc/
322 -e intel_pt/cyc=1/
327 -e intel_pt/branch=0/
331 -e intel_pt/cyc,mtc_period=9/
333 There are also common config terms, see linkperf:perf-record[1] documentation.
340 without timing information, for example a per-thread context
382 $ perf record -e intel_pt/psb_period=15/u uname
383 Invalid psb_period for intel_pt. Valid values are: 0-5
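The "Valid values are: 0-5" message can be reproduced with a small sketch. It assumes the supported values are published as a hexadecimal bitmask in which bit n set means value n is valid; the mask `0x3f` is a hypothetical sample chosen to match the message, and both function names are illustrative only.

```python
def valid_values(mask):
    """Return the config term values whose bits are set in a capability mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def format_ranges(values):
    """Render a sorted list such as [0, 1, 2, 5] as '0-2,5'."""
    ranges = []
    start = prev = None
    for v in values:
        if start is None:
            start = prev = v
        elif v == prev + 1:
            prev = v
        else:
            ranges.append((start, prev))
            start = prev = v
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


# 0x3f is a HYPOTHETICAL mask value chosen to reproduce the "0-5" message.
print(format_ranges(valid_values(0x3f)))  # 0-5
```

A value such as 15 is rejected because bit 15 is not set in that mask.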
410 The frequency of MTC packets can also be specified - see
414 Specifies how frequently MTC packets are produced - see mtc
426 CTC-frequency / (2 ^ value)
428 e.g. value 3 means one eighth of CTC-frequency
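The formula above can be checked with a short sketch. The 24 MHz CTC frequency is a hypothetical figure used only for illustration, and `mtc_frequency` is not a perf function.

```python
def mtc_frequency(ctc_frequency_hz, mtc_period):
    """MTC packet frequency, computed as CTC-frequency / (2 ^ value)."""
    if mtc_period < 0:
        raise ValueError("mtc_period must be non-negative")
    return ctc_frequency_hz / (2 ** mtc_period)


# 24 MHz is a HYPOTHETICAL CTC frequency; value 3 gives one eighth of it.
print(mtc_frequency(24_000_000, 3))  # 3000000.0
```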
436 $ perf record -e intel_pt/mtc_period=15/u uname
458 a threshold - see cyc_thresh below.
461 Specifies how frequently CYC packets are produced - see cyc
475 2 ^ (value - 1)
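A minimal sketch of this formula and its inverse follows. It covers only values of 1 or more, since the formula as given does not describe value 0; the function names are hypothetical. Any value it suggests still has to be one the hardware reports as valid.

```python
def cyc_threshold_cycles(value):
    """Number of CPU cycles for a cyc_thresh value: 2 ^ (value - 1)."""
    if value < 1:
        raise ValueError("this sketch only covers values of 1 or more")
    return 2 ** (value - 1)


def smallest_value_for(cycles):
    """Smallest cyc_thresh value whose threshold is at least 'cycles'."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return (cycles - 1).bit_length() + 1


print(cyc_threshold_cycles(4))  # 8
print(smallest_value_for(8))    # 4
```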
484 $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
485 Invalid cyc_thresh for intel_pt. Valid values are: 0-12
490 Specifies pass-through which enables the 'branch' config term.
523 changes to the CPU C-state.
547 return compression is disabled - see noretcomp) return statements.
558 *aux-action=start-paused*::
566 and PEBS-via-PT. In those cases, the other events can have config terms below:
568 *aux-sample-size*::
572 *aux-output*::
573 Used to select PEBS-via-PT, refer to the
576 *aux-action*::
585 --aux-sample
589 --aux-sample=8192
593 -e intel_pt//u
596 following will create Intel PT samples on the branch-misses event, note the
599 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
601 An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
604 perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
608 perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
612 …perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u --…
642 -S
646 -S0x100000
655 The snapshot size is displayed if the option -vv is used e.g.
663 Intel PT buffer size is specified by an addition to the -m option e.g.
665 -m,16
669 Note that the existing functionality of -m is unchanged. The auxtrace mmap size
683 In full-trace mode, powers of two are allowed for buffer size, with a minimum
687 The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
697 full-trace mode
701 Full-trace mode traces continuously e.g.
703 perf record -e intel_pt//u uname
707 perf record --aux-sample -e intel_pt//u -e branch-misses:u
712 perf record -v -e intel_pt//u -S ./loopy 1000000000 &
714 kill -USR2 11435
718 Note that "Recording AUX area tracing snapshot" is displayed because the -v
728 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
730 $ ps -e | grep perf
732 $ kill -USR2 15244
733 bash: kill: (15244) - Operation not permitted
756 In full-trace mode, the driver waits for data to be copied out before allowing
757 the (logical) buffer to wrap-around. If data is not copied out quickly enough,
760 that happens, perf tools always re-enable the intel_pt event after copying out
767 By default "perf record" post-processes the event stream to find all build ids
775 perf buildid-list
779 perf buildid-list --with-hits
787 collection of side-band information. In order to prevent that, a dummy
790 there is complete side-band information to allow the decoding of subsequent
813 "per thread" mode is selected by -t or by --per-thread (with -p or -u or just a
815 "per cpu" is selected by -C or -a.
819 In per-thread mode an exact list of threads is traced. There is no inheritance.
822 In per-cpu mode all processes (or processes from the selected cgroup i.e. -G
823 option, or processes selected with -p or -u) are traced. Each cpu has its own
826 In workload-only mode, the workload is traced but with per-cpu buffers.
827 Inheritance is allowed. Note that you can now trace a workload in per-thread
828 mode by using the --per-thread option.
831 Privileged vs non-privileged users
834 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users
850 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users are
851 not permitted to use tracepoints which means there is insufficient side-band
852 information to decode Intel PT in per-cpu mode, and potentially workload-only
855 Note also, that to use tracepoints, read-access to debugfs is required. So if
856 debugfs is not mounted or the user does not have read-access, it will again not
857 be possible to decode Intel PT in per-cpu mode.
863 The sched_switch tracepoint is used to provide side-band data for Intel PT
870 $ perf record -vv -e intel_pt//u uname
871 ------------------------------------------------------------
885 ------------------------------------------------------------
886 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
887 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
888 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
889 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
890 ------------------------------------------------------------
901 ------------------------------------------------------------
902 sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
903 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
904 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
905 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
906 ------------------------------------------------------------
925 ------------------------------------------------------------
926 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
927 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
928 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
929 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
939 and only in per-cpu mode.
947 -----------
950 This can be further controlled by new option --itrace.
953 New --itrace option
958 --itrace
962 --itrace=cepwxy
974 o synthesize PEBS-via-PT events
985 Z prefer to ignore timestamps (so-called "timeless" decoding)
987 "Instructions" events look like they were recorded by "perf record -e
990 "Cycles" events look like they were recorded by "perf record -e cycles"
1000 "Branches" events look like they were recorded by "perf record -e branches". "c"
1018 "Power" events correspond to power event packets and CBR (core-to-bus ratio)
1022 C-state changes, whereas CBR is indicative of CPU frequency. perf script
1027 pwre: hw: 0 cstate: 2 sub-cstate: 0
1033 "cbr" includes the frequency and the percentage of maximum non-turbo
1035 "pwre" shows C-state transitions (to a C-state deeper than C0) and
1041 For more details refer to the Intel 64 and IA-32 Architectures Software
1044 PSB events show when a PSB+ occurred and also the byte-offset in the trace.
1045 Emitting a PSB+ can cause the CPU a slight delay. When doing timing analysis
1052 will or will not be reported. Each flag must be preceded by either '+' or '-'.
1055 -o Suppress overflow errors
1056 -l Suppress trace data lost errors
1060 --itrace=e-o-l
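How an option letter followed by '+' or '-' flags might be read can be sketched as below. This is an illustrative parser, not the one perf uses: it assumes every '+x' or '-x' pair belongs to the option letter immediately before it, and the `SUPPRESSED` table covers only the two flags listed above.

```python
# Descriptions taken from the flag list above; other flags are not covered.
SUPPRESSED = {
    "o": "overflow errors",
    "l": "trace data lost errors",
}


def parse_flags(itrace, option):
    """Return {flag: enabled} for the '+x' / '-x' pairs that follow 'option'."""
    flags = {}
    i = 0
    while i < len(itrace):
        ch = itrace[i]
        i += 1
        current = {}
        # Consume the sign/flag pairs attached to this character, if any.
        while i + 1 < len(itrace) and itrace[i] in "+-":
            current[itrace[i + 1]] = itrace[i] == "+"
            i += 2
        if ch == option:
            flags.update(current)
    return flags


for flag, enabled in parse_flags("e-o-l", "e").items():
    if not enabled:
        print("suppress", SUPPRESSED.get(flag, "unknown flag " + flag))
```

With `e-o-l` the sketch reports that both overflow errors and trace data lost errors are suppressed.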
1066 must be preceded by either '+' or '-'. The flags supported by Intel PT are:
1068 -a Suppress logging of perf events
1076 linkperf:perf-config[1] e.g. perf config itrace.debug-log-buffer-size=30000
1080 --itrace=i10us
1098 'instructions' (i.e. --itrace=i1i).
1103 --itrace=ig32
1104 --itrace=xg32
1109 --itrace=il10
1110 --itrace=xl10
1117 instead of synthesized events. For example, to record branch-misses events for
1120 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
1121 perf report --itrace=Ge
1133 - hardware supports it
1134 - PEBS is used
1135 - event period is specified, instead of frequency
1136 - the sample type is limited to the following flags:
1145 cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
1146 --count option, or 'period' config term.
1148 To disable trace decoding entirely, use the option --no-itrace.
1153 --itrace=i0nss1000000
1164 ranges that could then be decoded fully using the --time option.
1168 - direct calls and jmps
1169 - conditional branches
1170 - non-branch instructions
1174 - asynchronous branches such as interrupts
1175 - indirect branches
1176 - function return target address *if* the noretcomp config term (refer
1178 - start of (control-flow) tracing
1179 - end of (control-flow) tracing, if it is not out of context
1180 - power events, ptwrite, transaction start and abort
1181 - instruction pointer associated with PSB packets
1186 Repeating the q option (double-q i.e. qq) results in even faster decoding and even
1195 - everything except instruction pointer associated with PSB packets
1199 - instruction pointer associated with PSB packets
1206 dlfilter-show-cycles.so
1209 Cycles can be displayed using dlfilter-show-cycles.so in which case the itrace A
1212 perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so
1216 perf script -v --list-dlfilters
1218 See also linkperf:perf-dlfilters[1]
1224 perf script has an option (-D) to "dump" the events i.e. display the binary
1227 When -D is used, Intel PT packets are displayed. The packet decoder does not
1228 pay attention to PSB packets, but just decodes the bytes - so the packets seen
1230 One example of that would be when the buffer-switching interrupt has been too
1235 To disable the display of Intel PT packets, combine the -D option with
1236 --no-itrace.
1240 -----------
1243 This can be further controlled by new option --itrace exactly the same as
1244 perf script, with the exception that the default is --itrace=igxe.
1248 -----------
1250 perf inject also accepts the --itrace option in which case tracing data is
1253 perf inject --itrace -i perf.data -o perf.data.new
1260 $ gcc-5 -O3 sort.c -o sort_optimized
1266 [intel-pt]
1267 mispred-all = on
1269 $ perf record -e intel_pt//u ./sort 3000
1274 $ perf inject -i perf.data -o inj --itrace=i100usle --strip
1275 $ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
1276 $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
1286 -----------------
1289 Recording is selected by using the aux-output config term e.g.
1291 perf record -c 10000 -e '{intel_pt/branch=0/,cycles/aux-output/ppp}' uname
1295 kernels and perf tools add support for the PERF_RECORD_AUX_OUTPUT_HW_ID side-band event.
1296 To check for the presence of that event in a PEBS-via-PT trace:
1298 perf script -D --no-itrace | grep PERF_RECORD_AUX_OUTPUT_HW_ID
1302 perf script --itrace=oe
1305 ---
1307 include::build-xed.txt[]
1311 --------------------------------------
1314 (i.e. no TSC timestamps) or VM Time Correlation. VM Time Correlation is an extra step
1321 …Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding e…
1333 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1336 $ sshfs -o direct_io root@vm0:/ vm0
1340 $ perf buildid-cache -v --kcore vm0/proc/kcore
1341 …kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d…
1346 $ ps -eLl | grep 'KVM\|PID'
1348 3 S 64055 1430 1 1440 1 80 0 - 1921718 - ? 00:02:47 CPU 0/KVM
1349 3 S 64055 1430 1 1441 1 80 0 - 1921718 - ? 00:02:41 CPU 1/KVM
1350 3 S 64055 1430 1 1442 1 80 0 - 1921718 - ? 00:02:38 CPU 2/KVM
1351 3 S 64055 1430 1 1443 2 80 0 - 1921718 - ? 00:03:18 CPU 3/KVM
1353 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1357 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1358 Without timestamps, --per-thread must be specified to distinguish threads.
1360 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cy…
1367 $ perf script --guestkallsyms $KALLSYMS --insn-trace=disasm -F+ipc | grep -C10 vmresume | head -21
1397 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1399 $ mkdir -p vm0
1400 $ sshfs -o direct_io root@vm0:/ vm0
1404 $ perf buildid-cache -v --kcore vm0/proc/kcore
1410 $ ps -eLl | grep 'KVM\|PID'
1412 3 S 64055 16998 1 17005 13 80 0 - 1818189 - ? 00:00:16 CPU 0/KVM
1413 3 S 64055 16998 1 17006 4 80 0 - 1818189 - ? 00:00:05 CPU 1/KVM
1414 3 S 64055 16998 1 17007 3 80 0 - 1818189 - ? 00:00:04 CPU 2/KVM
1415 3 S 64055 16998 1 17008 4 80 0 - 1818189 - ? 00:00:05 CPU 3/KVM
1417 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1420 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1422 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/cyc=1/k -p 169…
1427 only 7 bytes, so the TSC Offset might differ from the actual value in the 8th byte. That will
1430 $ perf inject -i perf.data.kvm --vm-time-correlation=dry-run
1446 $ perf inject -i perf.data.kvm --vm-time-correlation="dry-run 0xffffe42722c64c41"
1448 Note the options for 'perf inject' --vm-time-correlation are:
1450 [ dry-run ] [ <TSC Offset> [ : <VMCS> [ , <VMCS> ]... ] ]...
1453 The option "dry-run" will cause the file to be processed but without updating it.
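The option syntax above can be illustrated by a helper that assembles the argument string. This is a sketch under assumptions: the helper name is hypothetical, VMCS values are assumed to be written in hexadecimal like the TSC Offset, and no whitespace is placed around ':' or ','.

```python
def vm_time_correlation_arg(offsets=(), dry_run=False):
    """Build a --vm-time-correlation value.

    offsets: iterable of (tsc_offset, [vmcs, ...]) pairs, all integers.
    """
    parts = ["dry-run"] if dry_run else []
    for tsc_offset, vmcs_list in offsets:
        item = hex(tsc_offset)
        if vmcs_list:
            # ASSUMPTION: VMCS values are hexadecimal and comma-separated
            # with no surrounding whitespace.
            item += ":" + ",".join(hex(v) for v in vmcs_list)
        parts.append(item)
    return " ".join(parts)


print(vm_time_correlation_arg([(0xFFFFE42722C64C41, [])], dry_run=True))
# dry-run 0xffffe42722c64c41
```

The printed value matches the quoted argument used in the dry-run example with an explicit TSC Offset.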
1454 Note it is also possible to get an intel_pt.log file by adding option --itrace=d
1458 $ perf inject -i perf.data.kvm --vm-time-correlation=0xffffe42722c64c41 --force
1462 $ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --itrace=e-o
1468 …$ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --insn-trace=disasm -F+ipc | grep -C10 vm…
1493 -----------------------------------------------
1502 Check that the no-kvmclock kernel command line option was used to boot:
1507 …BOOT_IMAGE=/boot/vmlinuz-5.10.0-16-amd64 root=UUID=cb49c910-e573-47e0-bce7-79e293df8e1d ro no-kvmc…
1516 …$ sudo perf record -o guest-sideband-testing-guest-perf.data --sample-identifier --buildid-all --s…
1524 $ sudo perf record -o guest-sideband-testing-host-perf.data -m,64M --kcore -a -e intel_pt/cyc/
1539 [ perf record: Captured and wrote 76.122 MB guest-sideband-testing-host-perf.data ]
1547 [ perf record: Captured and wrote 1.247 MB guest-sideband-testing-guest-perf.data ]
1549 And then copy guest-sideband-testing-guest-perf.data to the host (not shown here).
1557 $ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=dry-run
1565 …$ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=0xfffffa6ae070cb20 --…
1569 $ perf script -i guest-sideband-testing-host-perf.data --no-itrace --show-task-events | grep KVM
1575 Note, the QEMU option -name debug-threads=on is needed so that thread names
1580 $ mkdir -p ~/guestmount/13376
1581 $ sshfs -o direct_io vm_to_test:/ ~/guestmount/13376
1586 If needed, VDSO can be copied manually in a fashion similar to that used by the perf-archive script.
1588 …$ perf inject -i guest-sideband-testing-host-perf.data -o inj --guestmount ~/guestmount --guest-da…
1594 - the CPU displayed, [002] in this case, is always the host CPU
1595 …- events happening in the virtual machine start with VM:13376 VCPU:003, which shows the hypervisor…
1596 - only calls and errors are displayed i.e. --itrace=ce
1597 …- branches entering and exiting the virtual machine are split, and show as 2 branches to/from "0 […
1599 …$ perf script -i inj --itrace=ce -F+machine_pid,+vcpu,+addr,+pid,+tid,-period --ns --time 7919.408…
1604 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1605 … branches: 7f851c9b5a5a init_cacheinfo+0x3aa (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => …
1655 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1656 …dl_init+0x74 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) => 7f851cb7bf50 call_init.part.0+0x0 (/usr…
1665 Tracing Virtual Machines - Guest Code
1666 -------------------------------------
1673 addresses. To support that, option "--guest-code" has been added to perf script
1678 …# perf record --kcore -e intel_pt/cyc/ -- tools/testing/selftests/kselftest_install/kvm/tsc_msrs_t…
1681 # perf script --guest-code --itrace=bep --ns -F-period,+addr,+flags
1711 # perf kvm --guest-code --guest --host report -i perf.data --stdio | head -20
1713 # To display the perf.data header info, please use --header/--header-only options.
1726 ---entry_SYSCALL_64_after_hwframe
1729 |--29.44%--syscall_exit_to_user_mode
1736 -----------
1749 7 VMENTRY VM-Entry
1750 8 VMEXIT VM-Exit
1751 9 VMEXIT_INTR VM-Exit due to interrupt
1754 For more details, refer to the Intel 64 and IA-32 Architectures Software
1762 perf record -e intel_pt/event/u uname
1764 Event trace events are output using the --itrace I option. e.g.
1766 perf script --itrace=Ie
1779 iflag: t IFLAG: 1->0 via branch
1789 t interrupts become disabled IF=1 -> IF=0
1791 Dt interrupts become enabled IF=0 -> IF=1
1793 The intel-pt-events.py script illustrates how to access Event Trace information
1798 -----------
1802 perf record -e intel_pt/notnt/u uname
1804 In that case the --itrace q option is forced because walking executable code
1809 ----------------
1885 $ gcc -Wall -Wextra -O3 -g -o eg_ptw eg_ptw.c
1886 $ perf record -e intel_pt//u ./eg_ptw 0x1234567890abcdef
1889 $ perf script --itrace=ew
1895 ---------
1907 - full-trace, system wide : when buffer passes watermark
1908 - full-trace, not system-wide : when buffer passes watermark or
1910 - snapshot mode : as above but also when a snapshot is made
1911 - sample mode : as above but also when a sample is made
1913 That means finished-round ordering doesn't work. An auxtrace buffer
1925 -----------------------
1928 or resume Intel PT tracing. This is configured by using the "aux-action"
1931 "aux-action=pause" is used with events that are to pause Intel PT tracing.
1933 "aux-action=resume" is used with events that are to resume Intel PT tracing.
1935 "aux-action=start-paused" is used with the Intel PT event to start in a
1941 …$ perf record --kcore -e intel_pt/aux-action=start-paused/k,syscalls:sys_enter_newuname/aux-action…
1945 $ perf script --call-trace
1996 …ord --kcore -a -e intel_pt/aux-action=start-paused/k -e mem:0xffffffffb605bf60:x/aux-action=resume…
2002 $ sudo perf probe --add '__alloc_pages order'
2004 $ sudo perf probe --add __alloc_pages%return
2006 …cord --kcore -aR -e intel_pt/aux-action=start-paused/k -e probe:__alloc_pages/aux-action=resume/ -…
2012 $ sudo perf probe -x /usr/bin/uname main
2014 …$ sudo perf record -e intel_pt/aux-action=start-paused/u -e probe_uname:main/aux-action=resume/ -…
2021 … --kcore -a -m,64M -e intel_pt/aux-action=start-paused/k -e cycles/aux-action=pause,period=1000000…
2027 -------
2035 --------
2037 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
2038 linkperf:perf-inject[1]