Perfetto Tracing
================

Mesa has experimental support for `Perfetto <https://perfetto.dev>`__ for
GPU performance monitoring. Perfetto supports multiple
`producers <https://perfetto.dev/docs/concepts/service-model>`__, each with
one or more data-sources. Perfetto already provides various producers and
data-sources for things like:

- CPU scheduling events (``linux.ftrace``)
- CPU frequency scaling (``linux.ftrace``)
- System calls (``linux.ftrace``)
- Process memory utilization (``linux.process_stats``)

It also ships various domain-specific producers.

Mesa's Perfetto support adds further producers so that GPU performance
(frequency, utilization, performance counters, etc.) can be visualized on the
same timeline, to better understand and tune/debug system-level performance:

- pps-producer: A system-wide daemon that can collect global performance
  counters.
- mesa: Per-process producer within Mesa to capture render-stage traces
  on the GPU timeline, track events on the CPU timeline, etc.

The exact supported features vary per driver:

.. list-table:: Supported data-sources
   :header-rows: 1

   * - Driver
     - PPS Counters
     - Render Stages
   * - Freedreno
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Turnip
     - ``gpu.counters.msm``
     - ``gpu.renderstages.msm``
   * - Intel
     - ``gpu.counters.i915``
     - ``gpu.renderstages.intel``
   * - Panfrost
     - ``gpu.counters.panfrost``
     -

Run
---

To capture a trace with Perfetto you need to take the following steps:

1. Build Perfetto from sources available at ``subprojects/perfetto`` following
   `this guide <https://perfetto.dev/docs/quickstart/linux-tracing>`__.

2. Create a `trace config <https://perfetto.dev/docs/concepts/config>`__, which is
   a text file in protobuf text format with extension ``.cfg``, or use one of
   the config files under the ``src/tool/pps/cfg`` directory. More examples of
   config files can be found in ``subprojects/perfetto/test/configs``.

3. Change directory to ``subprojects/perfetto`` and run a
   `convenience script <https://perfetto.dev/docs/quickstart/linux-tracing#capturing-a-trace>`__
   to start the tracing service:

   .. code-block:: sh

      cd subprojects/perfetto
      CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n

4. Start other producers you may need, e.g. ``pps-producer``.

5. Start ``perfetto`` under the tmux session initiated in step 3.

6. Once tracing has finished, you can detach from tmux with :kbd:`Ctrl+b`,
   :kbd:`d`, and the convenience script should automatically copy the trace
   files into ``$HOME/Downloads``.

7. Go to `ui.perfetto.dev <https://ui.perfetto.dev>`__ and upload
   ``$HOME/Downloads/trace.protobuf`` by clicking on **Open trace file**.

8. Alternatively you can open the trace in `AGI <https://gpuinspector.dev/>`__
   (which despite the name can be used to view non-Android traces).

To be a bit more explicit, here is a listing of commands reproducing
the steps above:

.. code-block:: sh

   # Configure Mesa with perfetto
   mesa $ meson setup build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
   # Build mesa
   mesa $ meson compile -C build

   # Within the Mesa repo, build perfetto
   mesa $ cd subprojects/perfetto
   perfetto $ ./tools/install-build-deps
   perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
   perfetto $ ./tools/ninja -C out/linux

   # Start perfetto
   perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

   # In parallel from the Mesa repo, start the PPS producer
   mesa $ ./build/src/tool/pps/pps-producer

   # Back in the perfetto tmux, press enter to start the capture

CPU Tracing
~~~~~~~~~~~

Mesa's CPU tracepoints (``MESA_TRACE_*``) use Perfetto track events when
Perfetto is enabled. They use the ``mesa.default`` and ``mesa.slow`` categories.

Currently, only EGL and Freedreno have CPU tracepoints.

Vulkan data sources
~~~~~~~~~~~~~~~~~~~

The Vulkan API gives the application control over recording of command
buffers as well as when they are submitted to the hardware. As a
consequence, we need to ensure command buffers are properly
instrumented for the Perfetto driver data sources before Perfetto
actually starts collecting traces.

This can be achieved by setting the :envvar:`MESA_GPU_TRACES`
environment variable before starting a Vulkan application:

.. code-block:: sh

   MESA_GPU_TRACES=perfetto ./build/my_vulkan_app

Driver Specifics
~~~~~~~~~~~~~~~~

Below is driver-specific information and instructions for the PPS producer.

Freedreno / Turnip
^^^^^^^^^^^^^^^^^^

The Freedreno PPS driver needs root access to read system-wide
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Intel
^^^^^

The Intel PPS driver needs root access to read system-wide
`RenderBasic <https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/gpu-metrics-reference.html>`__
performance counters, so you can simply run it with sudo:

.. code-block:: sh

   sudo ./build/src/tool/pps/pps-producer

Another option, which enables access to system-wide data without root
permissions, is to run the following:

.. code-block:: sh

   sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the ``CAP_PERFMON`` capability to the binary should
work too.

A particular metric set can also be selected to capture a different
set of HW counters:

.. code-block:: sh

   INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers.
To enable this for a given application, set the following environment
variable:

.. code-block:: sh

   PERFETTO_TRACE=1 my_vulkan_app

Panfrost
^^^^^^^^

The Panfrost PPS driver uses unstable ioctls that behave correctly on
kernel versions `5.4.23+ <https://lwn.net/Articles/813601/>`__ and
`5.5.7+ <https://lwn.net/Articles/813600/>`__.

To run the producer, follow these two steps:

1. Enable Panfrost unstable ioctls via a kernel module parameter:

   .. code-block:: sh

      modprobe panfrost unstable_ioctls=1

   Alternatively you could add ``panfrost.unstable_ioctls=1`` to your kernel
   command line, or
   ``echo 1 > /sys/module/panfrost/parameters/unstable_ioctls``.

2. Run the producer:

   .. code-block:: sh

      ./build/pps-producer

Troubleshooting
---------------

Tmux
~~~~

If the convenience script ``tools/tmux`` keeps copying artifacts to your
``SSH_TARGET`` without starting the tmux session, make sure you have ``tmux``
installed on your system.

.. code-block:: sh

   apt install tmux

Missing counter names
~~~~~~~~~~~~~~~~~~~~~

If the trace viewer shows a list of counters with a description like
``gpu_counter(#)`` instead of their proper names, data was probably lost
because the trace buffer filled up and wrapped around.

To prevent this loss of data you can tweak the trace config file in one of
the following ways:

- Increase the size of the buffer in use:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: RING_BUFFER,
     }

- Periodically flush the trace buffer into the output file:

  .. code-block:: javascript

     write_into_file: true
     file_write_period_ms: 250

- Discard new traces when the buffer fills:

  .. code-block:: javascript

     buffers {
       size_kb: 2048,
       fill_policy: DISCARD,
     }
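
For orientation, the buffer and flush options above can be combined with a
GPU counter data source in a single trace config. The following is only a
minimal sketch and not a copy of the files shipped under
``src/tool/pps/cfg``. The buffer size, the trace duration and the counter
sampling period are illustrative assumptions, and the data-source name should
be replaced with the one matching your driver from the table of supported
data-sources:

.. code-block:: text

   # Illustrative values only; tune them for your workload.
   buffers {
     size_kb: 16384              # assumed size, larger than the 2048 shown above
     fill_policy: RING_BUFFER
   }

   data_sources {
     config {
       name: "gpu.counters.msm"  # pick the data source for your driver
       gpu_counter_config {
         counter_period_ns: 100000  # assumed sampling period
       }
     }
   }

   duration_ms: 16000            # assumed capture length
   write_into_file: true
   file_write_period_ms: 250

A larger ring buffer together with periodic writes to the output file reduces
the chance that the packets describing the counter names are overwritten
before they reach the trace file.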