.. _module-pw_bloat:

========
pw_bloat
========
.. pigweed-module::
   :name: pw_bloat

``pw_bloat`` provides tools and helpers around using
`Bloaty McBloatface <https://github.com/google/bloaty>`_, including generating
size report cards for output binaries through :ref:`Pigweed's GN build
system <module-pw_build-gn>`.

Bloat report cards allow tracking the memory usage of a system over time as code
changes are made and provide a breakdown of which parts of the code have the
largest size impact.

------------------------
``pw bloat`` CLI command
------------------------
``pw_bloat`` includes a plugin for the Pigweed command line capable of running
size reports on ELF binaries.

.. note::

   The bloat CLI plugin is still experimental and only supports a small subset
   of ``pw_bloat``'s capabilities.

Basic usage
===========

Running a size report on a single executable
--------------------------------------------
By default, ``pw bloat`` assumes that
:ref:`memoryregions <module-pw_bloat-memoryregions>` symbols are defined in
binaries, and uses them to automatically generate a Bloaty config file.

.. code-block:: sh

   $ pw bloat out/docs/obj/pw_result/size_report/bin/ladder_and_then.elf

    ▒█████▄   █▓  ▄███▒  ▒█    ▒█ ░▓████▒ ░▓████▒ ▒▓████▄
     ▒█░  █░ ░█▒ ██▒ ▀█▒ ▒█░ █ ▒█  ▒█   ▀  ▒█   ▀  ▒█  ▀█▌
     ▒█▄▄▄█░ ░█▒ █▓░ ▄▄░ ▒█░ █ ▒█  ▒███    ▒███    ░█   █▌
     ▒█▀     ░█░ ▓█   █▓ ░█░ █ ▒█  ▒█   ▄  ▒█   ▄  ░█  ▄█▌
     ▒█      ░█░ ░▓███▀   ▒█▓▀▓█░ ░▓████▒ ░▓████▒ ▒▓████▀

   +----------------------+---------+
   |    memoryregions     |  sizes  |
   +======================+=========+
   |FLASH                 |1,048,064|
   |RAM                   |  196,608|
   |VECTOR_TABLE          |      512|
   +======================+=========+
   |Total                 |1,245,184|
   +----------------------+---------+

Running a size report diff
--------------------------

.. code-block:: sh

   $ pw bloat out/docs/obj/pw_metric/size_report/bin/one_metric.elf \
         --diff out/docs/obj/pw_metric/size_report/bin/base.elf \
         -d symbols

    ▒█████▄   █▓  ▄███▒  ▒█    ▒█ ░▓████▒ ░▓████▒ ▒▓████▄
     ▒█░  █░ ░█▒ ██▒ ▀█▒ ▒█░ █ ▒█  ▒█   ▀  ▒█   ▀  ▒█  ▀█▌
     ▒█▄▄▄█░ ░█▒ █▓░ ▄▄░ ▒█░ █ ▒█  ▒███    ▒███    ░█   █▌
     ▒█▀     ░█░ ▓█   █▓ ░█░ █ ▒█  ▒█   ▄  ▒█   ▄  ░█  ▄█▌
     ▒█      ░█░ ░▓███▀   ▒█▓▀▓█░ ░▓████▒ ░▓████▒ ▒▓████▀

   +-----------------------------------------------------------------------------------+
   |                                                                                   |
   +-----------------------------------------------------------------------------------+
   | diff|    memoryregions     |                    symbols                    | sizes|
   +=====+======================+===============================================+======+
   |     |FLASH                 |                                               |    -4|
   |     |                      |[section .FLASH.unused_space]                  |  -408|
   |     |                      |main                                           |   +60|
   |     |                      |__sf_fake_stdout                               |    +4|
   |     |                      |pw_boot_PreStaticMemoryInit                    |    -2|
   |     |                      |_isatty                                        |    -2|
   |  NEW|                      |_GLOBAL__sub_I_group_foo                       |   +84|
   |  NEW|                      |pw::metric::Group::~Group()                    |   +34|
   |  NEW|                      |pw::intrusive_list_impl::List::insert_after()  |   +32|
   |  NEW|                      |pw::metric::Metric::Increment()                |   +32|
   |  NEW|                      |__cxa_atexit                                   |   +28|
   |  NEW|                      |pw::metric::Metric::Metric()                   |   +28|
   |  NEW|                      |pw::metric::Metric::as_int()                   |   +28|
   |  NEW|                      |pw::intrusive_list_impl::List::Item::unlist()  |   +20|
   |  NEW|                      |pw::metric::Group::Group()                     |   +18|
   |  NEW|                      |pw::intrusive_list_impl::List::Item::previous()|   +14|
   |  NEW|                      |pw::metric::TypedMetric<>::~TypedMetric()      |   +14|
   |  NEW|                      |__aeabi_atexit                                 |   +12|
   +-----+----------------------+-----------------------------------------------+------+
   |     |RAM                   |                                               |     0|
   |     |                      |[section .stack]                               |   -32|
   |  NEW|                      |group_foo                                      |   +16|
   |  NEW|                      |metric_x                                       |   +12|
   |  NEW|                      |[section .static_init_ram]                     |    +4|
   +=====+======================+===============================================+======+
   |Total|                      |                                               |    -4|
   +-----+----------------------+-----------------------------------------------+------+

Specifying a custom Bloaty config file
--------------------------------------
If the linker script for a target does not define memory regions, a custom
Bloaty config can be provided using the ``-c / --custom-config`` option.

.. code-block:: sh

   $ pw bloat out/pw_strict_host_clang_debug/obj/pw_status/test/status_test -c targets/host/linux.bloaty

    ▒█████▄   █▓  ▄███▒  ▒█    ▒█ ░▓████▒ ░▓████▒ ▒▓████▄
     ▒█░  █░ ░█▒ ██▒ ▀█▒ ▒█░ █ ▒█  ▒█   ▀  ▒█   ▀  ▒█  ▀█▌
     ▒█▄▄▄█░ ░█▒ █▓░ ▄▄░ ▒█░ █ ▒█  ▒███    ▒███    ░█   █▌
     ▒█▀     ░█░ ▓█   █▓ ░█░ █ ▒█  ▒█   ▄  ▒█   ▄  ░█  ▄█▌
     ▒█      ░█░ ░▓███▀   ▒█▓▀▓█░ ░▓████▒ ░▓████▒ ▒▓████▀

   +------------+---------------------+-------+
   |  segments  |      sections       | sizes |
   +============+=====================+=======+
   |LOAD #3 [RX]|                     |138,176|
   |            |.text                |137,524|
   |            |.plt                 |    608|
   |            |.init                |     24|
   |            |.fini                |     20|
   +------------+---------------------+-------+
   |LOAD #2 [R] |                     | 87,816|
   |            |.rela.dyn            | 32,664|
   |            |.rodata              | 23,176|
   |            |.eh_frame            | 23,152|
   |            |.eh_frame_hdr        |  4,236|
   |            |.gcc_except_table    |  1,140|
   |            |.dynsym              |  1,008|
   |            |.rela.plt            |    888|
   |            |[ELF Program Headers]|    616|
   |            |.dynstr              |    556|
   |            |.gnu.version_r       |    116|
   |            |.gnu.version         |     84|
   |            |[ELF Header]         |     64|
   |            |.note.ABI-tag        |     32|
   |            |.gnu.hash            |     28|
   |            |.interp              |     28|
   |            |.note.gnu.build-id   |     28|
   +------------+---------------------+-------+
   |LOAD #5 [RW]|                     | 20,216|
   |            |.bss                 | 19,824|
   |            |.got.plt             |    328|
   |            |.data                |     64|
   +------------+---------------------+-------+
   |LOAD #4 [RW]|                     | 15,664|
   |            |.data.rel.ro         | 12,240|
   |            |.relro_padding       |  2,872|
   |            |.dynamic             |    464|
   |            |.got                 |     56|
   |            |.fini_array          |     16|
   |            |.init_array          |     16|
   +============+=====================+=======+
   |Total       |                     |261,872|
   +------------+---------------------+-------+

.. _bloat-howto:

---------------------------
Defining size reports in GN
---------------------------

Diff size reports
=================
Size reports can be defined using the GN template ``pw_size_diff``. The template
requires at least two executable targets on which to perform a size diff. The
base for the size diff can be specified either globally through the top-level
``base`` argument, or individually per-binary within the ``binaries`` list.

Arguments
---------

* ``base``: Optional default base target for all listed binaries.
* ``source_filter``: Optional global regex to filter labels in the diff output.
* ``data_sources``: Optional global list of data sources from the Bloaty config
  file.
* ``binaries``: List of binaries to size diff. Each binary specifies a target,
  a label for the diff, and optionally a base target, source filter, and data
  sources that override the global ones (if specified).

.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   executable("empty_base") {
     sources = [ "empty_main.cc" ]
   }

   executable("hello_world_printf") {
     sources = [ "hello_printf.cc" ]
   }

   executable("hello_world_iostream") {
     sources = [ "hello_iostream.cc" ]
   }

   pw_size_diff("my_size_report") {
     base = ":empty_base"
     data_sources = "symbols,segments"
     binaries = [
       {
         target = ":hello_world_printf"
         label = "Hello world using printf"
       },
       {
         target = ":hello_world_iostream"
         label = "Hello world using iostream"
         data_sources = "symbols"
       },
     ]
   }

A sample ``pw_size_diff`` reStructuredText size report table can be found
within module docs. For example, see the :ref:`pw_checksum-size-report`
section of the ``pw_checksum`` module for more detail.

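
The override rule from the argument list (a per-binary ``base``,
``source_filter``, or ``data_sources`` takes precedence over the global value)
can be sketched in Python. This only illustrates the documented semantics and is
not how the GN template is implemented; ``resolve_binary_settings`` and the
dictionary layout are hypothetical names chosen for this sketch.

.. code-block:: python

   def resolve_binary_settings(global_args, binary):
       """Return the effective settings for one entry of ``binaries``."""
       resolved = {"target": binary["target"], "label": binary["label"]}
       for key in ("base", "source_filter", "data_sources"):
           # A per-binary value wins; otherwise fall back to the global one.
           if key in binary:
               resolved[key] = binary[key]
           elif key in global_args:
               resolved[key] = global_args[key]
       return resolved

   # Mirrors the values used in the pw_size_diff example.
   global_args = {"base": ":empty_base", "data_sources": "symbols,segments"}
   binaries = [
       {"target": ":hello_world_printf", "label": "Hello world using printf"},
       {
           "target": ":hello_world_iostream",
           "label": "Hello world using iostream",
           "data_sources": "symbols",
       },
   ]
   effective = [resolve_binary_settings(global_args, b) for b in binaries]

In this sketch the ``printf`` binary inherits both global values, while the
``iostream`` binary keeps the global ``base`` but uses its own
``data_sources``.
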

Single binary size reports
==========================
Size reports can also be defined using ``pw_size_report``, which provides
a size report for a single binary. The template requires a target binary.

Arguments
---------
* ``target``: Binary target to run size report on.
* ``data_sources``: Optional list of data sources to organize outputs.
* ``source_filter``: Optional regex to filter labels in the output.
* ``json_key_prefix``: Optional prefix for key names in the JSON size report.
* ``full_json_summary``: Optional boolean to print the JSON size report by
  label level hierarchy. Defaults to only using the top-level label in the size
  report.
* ``ignore_unused_labels``: Optional boolean to remove labels that have a size
  of zero in the JSON size report.

.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   executable("hello_world_iostream") {
     sources = [ "hello_iostream.cc" ]
   }

   pw_size_report("hello_world_iostream_size_report") {
     target = ":hello_world_iostream"
     data_sources = "segments,symbols"
     source_filter = "pw::hello"
     json_key_prefix = "hello_world_iostream"
     full_json_summary = true
     ignore_unused_labels = true
   }

Example of the generated ASCII table for a single binary:

.. code-block::

   ┌─────────────┬──────────────────────────────────────────────────┬──────┐
   │segment_names│                     symbols                      │ sizes│
   ├═════════════┼══════════════════════════════════════════════════┼══════┤
   │FLASH        │                                                  │12,072│
   │             │pw::kvs::KeyValueStore::InitializeMetadata()      │   684│
   │             │pw::kvs::KeyValueStore::Init()                    │   456│
   │             │pw::kvs::internal::EntryCache::Find()             │   444│
   │             │pw::kvs::FakeFlashMemory::Write()                 │   240│
   │             │pw::kvs::internal::Entry::VerifyChecksumInFlash() │   228│
   │             │pw::kvs::KeyValueStore::GarbageCollectSector()    │   220│
   │             │pw::kvs::KeyValueStore::RemoveDeletedKeyEntries() │   220│
   │             │pw::kvs::KeyValueStore::AppendEntry()             │   204│
   │             │pw::kvs::KeyValueStore::Get()                     │   194│
   │             │pw::kvs::internal::Entry::Read()                  │   188│
   │             │pw::kvs::ChecksumAlgorithm::Finish()              │    26│
   │             │pw::kvs::internal::Entry::ReadKey()               │    26│
   │             │pw::kvs::internal::Sectors::BaseAddress()         │    24│
   │             │pw::kvs::ChecksumAlgorithm::Update()              │    20│
   │             │pw::kvs::FlashTestPartition()                     │     8│
   │             │pw::kvs::FakeFlashMemory::Disable()               │     6│
   │             │pw::kvs::FakeFlashMemory::Enable()                │     6│
   │             │pw::kvs::FlashMemory::SelfTest()                  │     6│
   │             │pw::kvs::FlashPartition::Init()                   │     6│
   │             │pw::kvs::FlashPartition::sector_size_bytes()      │     6│
   │             │pw::kvs::FakeFlashMemory::IsEnabled()             │     4│
   ├─────────────┼──────────────────────────────────────────────────┼──────┤
   │RAM          │                                                  │ 1,424│
   │             │test_kvs                                          │   992│
   │             │pw::kvs::(anonymous namespace)::test_flash        │   384│
   │             │pw::kvs::(anonymous namespace)::test_partition    │    24│
   │             │pw::kvs::FakeFlashMemory::no_errors_              │    12│
   │             │borrowable_kvs                                    │     8│
   │             │kvs_entry_count                                   │     4│
   ├═════════════┼══════════════════════════════════════════════════┼══════┤
   │Total        │                                                  │13,496│
   └─────────────┴──────────────────────────────────────────────────┴──────┘


Size reports are typically included in reStructuredText, as described in
`Documentation integration`_. Size reports may also be printed in the build
output if desired. To enable this in the GN build
(``pigweed/pw_bloat/bloat.gni``), set the ``pw_bloat_SHOW_SIZE_REPORTS``
build arg to ``true``.

Collecting size report data
===========================
Each ``pw_size_report`` target outputs a JSON file containing the sizes of all
top-level labels in the binary. (By default, this represents "segments", i.e.
ELF program headers.) If ``full_json_summary`` is set to true, sizes for all
label levels are reported (i.e. default labels would show the size of each
symbol per segment). If a build produces multiple images, it may be useful to
collect all of their sizes into a single file to provide a snapshot of sizes at
some point in time, for example to display per-commit size deltas through CI.

The ``pw_size_report_aggregation`` template is provided to collect multiple size
reports' data into a single JSON file.

Arguments
---------
* ``deps``: List of ``pw_size_report`` targets whose data to collect.
* ``output``: Path to the output JSON file.

.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   pw_size_report_aggregation("image_sizes") {
     deps = [
       ":app_image_size_report",
       ":bootloader_image_size_report",
     ]
     output = "$root_gen_dir/artifacts/image_sizes.json"
   }

.. _module-pw_bloat-docs:

-------------------------
Documentation integration
-------------------------
Bloat reports are easy to add to documentation files. All ``pw_size_diff``
and ``pw_size_report`` targets output a file containing a tabular report card.
This file can be imported directly into a reStructuredText file using the
``include`` directive.

For example, the ``simple_bloat_loop`` and ``simple_bloat_function`` size
reports under ``//pw_bloat/examples`` are imported into this file as follows:

.. code-block:: rst

   Simple bloat loop example
   ^^^^^^^^^^^^^^^^^^^^^^^^^
   .. include:: examples/simple_bloat_loop

   Simple bloat function example
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   .. include:: examples/simple_bloat_function

Resulting in this output:

Simple bloat loop example
=========================
.. include:: examples/simple_bloat_loop

Simple bloat function example
=============================
.. include:: examples/simple_bloat_function

.. _module-pw_bloat-sources:

------------------------------
Additional Bloaty data sources
------------------------------
`Bloaty McBloatface <https://github.com/google/bloaty>`_ by itself cannot
answer some questions that embedded developers frequently face, such as how much
space is left. To address this, Pigweed provides Python tooling
(``pw_bloat.bloaty_config``) that generates Bloaty configuration files from the
final ELF files. It relies on small tweaks in the linker scripts that expose
extra information.

See the sections below on how to enable the additional data sources through
modifications in your linker script(s).

For example, to generate the helper configuration that enables the additional
data sources for ``example.elf`` (assuming you've updated your linker script(s)
accordingly), run
``python -m pw_bloat.bloaty_config example.elf > example.bloaty``. The resulting
``example.bloaty`` can then be passed to Bloaty with the ``-c`` flag, for
example
``bloaty -c example.bloaty example.elf --domain vm -d memoryregions,utilization``,
which may return something like:

.. code-block::

   84.2%  1023Ki    FLASH
     94.2%   963Ki    Free space
      5.8%  59.6Ki    Used space
   15.8%   192Ki    RAM
    100.0%   192Ki    Used space
    0.0%     512    VECTOR_TABLE
     96.9%     496    Free space
      3.1%      16    Used space
    0.0%       0    Not resident in memory
      NAN%       0    Used space

.. _module-pw_bloat-utilization:

``utilization`` data source
===========================
The most common question many embedded developers face when using ``bloaty`` is
how much space is in use and how much is left. To answer this correctly,
section sizes must be used so that section alignment requirements are accounted
for.

The generated ``utilization`` data source works with any ELF file;
``Used Space`` reports the sum of the virtual memory sizes of all sections.
``Padding`` captures the amount of memory that is used to enforce alignment
requirements. Tracking ``Padding`` size can help monitor application growth
for changes that are too small to force realignment.

In order for ``Free Space`` to be reported, your linker scripts must include
properly aligned sections which span the unused remaining space for the relevant
memory region with the ``unused_space`` string anywhere in their name. This
typically means creating a trailing section which is pinned to span to the end
of the memory region.

For example, imagine this partial GNU LD linker script:

.. code-block::

   MEMORY
   {
     FLASH(rx) : \
       ORIGIN = PW_BOOT_FLASH_BEGIN, \
       LENGTH = PW_BOOT_FLASH_SIZE
     RAM(rwx) : \
       ORIGIN = PW_BOOT_RAM_BEGIN, \
       LENGTH = PW_BOOT_RAM_SIZE
   }

   SECTIONS
   {
     /* Main executable code. */
     .code : ALIGN(4)
     {
       /* Application code. */
       *(.text)
       *(.text*)
       KEEP(*(.init))
       KEEP(*(.fini))

       . = ALIGN(4);
       /* Constants.*/
       *(.rodata)
       *(.rodata*)
     } >FLASH

     /* Explicitly initialized global and static data. (.data)*/
     .static_init_ram : ALIGN(4)
     {
       *(.data)
       *(.data*)
       . = ALIGN(4);
     } >RAM AT> FLASH

     /* Zero initialized global/static data. (.bss) */
     .zero_init_ram (NOLOAD) : ALIGN(4)
     {
       *(.bss)
       *(.bss*)
       *(COMMON)
       . = ALIGN(4);
     } >RAM
   }

It could be modified as follows to enable ``Free Space`` reporting:

.. code-block::

   MEMORY
   {
     FLASH(rx) : ORIGIN = PW_BOOT_FLASH_BEGIN, LENGTH = PW_BOOT_FLASH_SIZE
     RAM(rwx) : ORIGIN = PW_BOOT_RAM_BEGIN, LENGTH = PW_BOOT_RAM_SIZE

     /* Each memory region above has an associated .*.unused_space section that
      * overlays the unused space at the end of the memory segment. These
      * segments are used by pw_bloat.bloaty_config to create the utilization
      * data source for bloaty size reports.
      *
      * These sections MUST be located immediately after the last section that is
      * placed in the respective memory region or lld will issue a warning like:
      *
      *   warning: ignoring memory region assignment for non-allocatable section
      *      '.VECTOR_TABLE.unused_space'
      *
      * If this warning occurs, it's also likely that LLD will have created quite
      * large padded regions in the ELF file due to bad cursor operations. This
      * can cause ELF files to balloon from hundreds of kilobytes to hundreds of
      * megabytes.
      *
      * Attempting to add sections to the memory region AFTER the unused_space
      * section will cause the region to overflow.
      */
   }

   SECTIONS
   {
     /* Main executable code. */
     .code : ALIGN(4)
     {
       /* Application code. */
       *(.text)
       *(.text*)
       KEEP(*(.init))
       KEEP(*(.fini))

       . = ALIGN(4);
       /* Constants.*/
       *(.rodata)
       *(.rodata*)
     } >FLASH

     /* Explicitly initialized global and static data. (.data)*/
     .static_init_ram : ALIGN(4)
     {
       *(.data)
       *(.data*)
       . = ALIGN(4);
     } >RAM AT> FLASH

     /* Defines a section representing the unused space in the FLASH segment.
      * This MUST be the last section assigned to the FLASH region.
      */
     PW_BLOAT_UNUSED_SPACE(FLASH)

     /* Zero initialized global/static data. (.bss). */
     .zero_init_ram (NOLOAD) : ALIGN(4)
     {
       *(.bss)
       *(.bss*)
       *(COMMON)
       . = ALIGN(4);
     } >RAM

     /* Defines a section representing the unused space in the RAM segment. This
      * MUST be the last section assigned to the RAM region.
      */
     PW_BLOAT_UNUSED_SPACE(RAM)
   }

The preprocessor macro ``PW_BLOAT_UNUSED_SPACE`` is defined in
``pw_bloat/bloat_macros.ld``. To use this macro, include the file in your
``pw_linker_script`` as follows:

.. code-block::

   pw_linker_script("my_linker_script") {
     includes = [ "$dir_pw_bloat/bloat_macros.ld" ]
     linker_script = "my_project_linker_script.ld"
   }

Because linker scripts are not natively supported by GN and can't be provided
through ``deps``, ``bloat_macros.ld`` must be passed in the ``includes`` list.

.. _module-pw_bloat-memoryregions:

``memoryregions`` data source
=============================
Understanding how symbols, sections, and other data sources can be attributed
back to the memory regions defined in your linker script is another common
problem area. Unfortunately, the ELF format does not include the original memory
regions, meaning ``bloaty`` cannot do this today by itself. In addition, it's
relatively common that there are multiple memory regions which alias to the same
memory but through different buses, which could make attribution difficult.

Instead of taking the less portable and more brittle approach of parsing
``*.map`` files, ``pw_bloat.bloaty_config`` consumes symbols which are defined
in the linker script with a special format to extract this information from the
ELF file: ``pw_bloat_config_memory_region_NAME_{start,end}{_N,}``.

These symbols are defined by the preprocessor macros ``PW_BLOAT_MEMORY_REGION``
and ``PW_BLOAT_MEMORY_REGION_MAP`` with the right address and size for the
regions. To use these macros, include ``pw_bloat/bloat_macros.ld`` in your
``pw_linker_script`` as follows:

.. code-block::

   pw_linker_script("my_linker_script") {
     includes = [ "$dir_pw_bloat/bloat_macros.ld" ]
     linker_script = "my_project_linker_script.ld"
   }

These symbols are then used to determine how to map segments to these memory
regions. Note that segments must be used in order to account for inter-section
padding, which is not attributed to any section.

As an example, if you have a single view of a single memory region named
``FLASH``, then you should include the following macro in your linker script to
generate the symbols needed for that region:

.. code-block::

   PW_BLOAT_MEMORY_REGION(FLASH)

As another example, if you have two aliased memory regions (``DTCM`` and
``ITCM``) that map to the same effective memory, which you'd like to call
``RAM``, then you should include the following macros in your linker script to
produce the four required symbols:

.. code-block::

   PW_BLOAT_MEMORY_REGION_MAP(RAM, ITCM)
   PW_BLOAT_MEMORY_REGION_MAP(RAM, DTCM)
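
To make the symbol naming format concrete, the following Python sketch groups
symbols of the form ``pw_bloat_config_memory_region_NAME_{start,end}{_N,}`` by
region name and index. It is a minimal illustration under stated assumptions,
not the actual ``pw_bloat.bloaty_config`` implementation: the function name
``group_region_symbols``, the example addresses, and the choice to treat a
symbol without an ``_N`` suffix as index 0 are all hypothetical.

.. code-block:: python

   import re
   from collections import defaultdict

   # NAME may itself contain underscores (e.g. VECTOR_TABLE); the trailing
   # "_N" index is optional.
   _REGION_SYMBOL = re.compile(
       r"^pw_bloat_config_memory_region_"
       r"(?P<name>\w+)_(?P<kind>start|end)(?:_(?P<index>\d+))?$"
   )

   def group_region_symbols(symbols):
       """Map region name -> {index: {"start": addr, "end": addr}}."""
       regions = defaultdict(dict)
       for symbol, address in symbols.items():
           match = _REGION_SYMBOL.match(symbol)
           if match is None:
               continue  # Not a memory-region symbol.
           # Assumption for this sketch: no suffix is treated as index 0.
           index = int(match["index"] or 0)
           regions[match["name"]].setdefault(index, {})[match["kind"]] = address
       return dict(regions)

   # Made-up addresses, for illustration only.
   example_symbols = {
       "pw_bloat_config_memory_region_FLASH_start": 0x08000200,
       "pw_bloat_config_memory_region_FLASH_end": 0x08100000,
       "pw_bloat_config_memory_region_RAM_start_0": 0x20000000,
       "pw_bloat_config_memory_region_RAM_end_0": 0x20030000,
       "pw_bloat_config_memory_region_RAM_start_1": 0x00000000,
       "pw_bloat_config_memory_region_RAM_end_1": 0x00030000,
       "main": 0x08000400,
   }
   regions = group_region_symbols(example_symbols)

Here ``regions`` holds one address range for ``FLASH`` and two for ``RAM``,
matching the aliased ``ITCM``/``DTCM`` case where four symbols describe a
single logical region.
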