1# Build Performance 2 3## Debugging Build Performance 4 5### Tracing 6 7soong_ui has tracing built in, so that every build execution's trace can be 8viewed. Just open `$OUT_DIR/build.trace.gz` in Chrome's <chrome://tracing>, or 9with [catapult's trace viewer][catapult trace_viewer]. The last few traces are 10stored in `build.trace.#.gz` (larger numbers are older). The associated logs 11are stored in `soong.#.log` and `verbose.#.log.gz`. 12 13 14 15### Critical path 16 17soong_ui logs the wall time of the longest dependency chain compared to the 18elapsed wall time in `$OUT_DIR/soong.log`. For example: 19``` 20critical path took 3m10s 21elapsed time 5m16s 22perfect parallelism ratio 60% 23critical path: 24 0:00 build out/target/product/generic_arm64/obj/FAKE/sepolicy_neverallows_intermediates/policy_2.conf 25 0:04 build out/target/product/generic_arm64/obj/FAKE/sepolicy_neverallows_intermediates/sepolicy_neverallows 26 0:13 build out/target/product/generic_arm64/obj/ETC/plat_sepolicy.cil_intermediates/plat_sepolicy.cil 27 0:01 build out/target/product/generic_arm64/obj/ETC/plat_pub_versioned.cil_intermediates/plat_pub_versioned.cil 28 0:02 build out/target/product/generic_arm64/obj/ETC/vendor_sepolicy.cil_intermediates/vendor_sepolicy.cil 29 0:16 build out/target/product/generic_arm64/obj/ETC/sepolicy_intermediates/sepolicy 30 0:00 build out/target/product/generic_arm64/obj/ETC/plat_seapp_contexts_intermediates/plat_seapp_contexts 31 0:00 Install: out/target/product/generic_arm64/system/etc/selinux/plat_seapp_contexts 32 0:02 build out/target/product/generic_arm64/obj/NOTICE.txt 33 0:00 build out/target/product/generic_arm64/obj/NOTICE.xml.gz 34 0:00 build out/target/product/generic_arm64/system/etc/NOTICE.xml.gz 35 0:01 Installed file list: out/target/product/generic_arm64/installed-files.txt 36 1:00 Target system fs image: out/target/product/generic_arm64/obj/PACKAGING/systemimage_intermediates/system.img 37 0:01 Install system fs image: out/target/product/generic_arm64/system.img 38 0:01 Target vbmeta image: out/target/product/generic_arm64/vbmeta.img 39 1:26 Package target files: out/target/product/generic_arm64/obj/PACKAGING/target_files_intermediates/aosp_arm64-target_files-6663974.zip 40 0:01 Package: out/target/product/generic_arm64/aosp_arm64-img-6663974.zip 41 0:01 Dist: /buildbot/dist_dirs/aosp-master-linux-aosp_arm64-userdebug/6663974/aosp_arm64-img-6663974.zip 42``` 43 44If the elapsed time is much longer than the critical path then additional 45parallelism on the build machine will improve total build times. If there are 46long individual times listed in the critical path then improving build times 47for those steps or adjusting dependencies so that those steps can run earlier 48in the build graph will improve total build times. 49 50### Soong 51 52Soong proper (i.e., `soong_build` executable that processes the blueprint 53files) can be traced and profiled using the standard Go tools. It understands 54the `-trace`, `-cpuprofile`, and `-memprofile` command line arguments. 55Setting `SOONG_PROFILE_CPU` and/or `SOONG_PROFILE_MEM` environment variables 56for the build enables respective profiling, e.g., running 57 58```shell 59SOONG_PROFILE_CPU=/tmp/foo m ..._ 60``` 61 62saves CPU profile for each Soong invocation in /tmp/foo._step_ file, where 63_step_ is Soong execution step. The main step is `build`. The others as 64`soong_docs` (not all of them necessarily run during the build). 65The profiles can be inspected with `go tool pprof` from the command line or 66with _Run>Open Profiler Snapshot_ in IntelliJ IDEA. 67 68### Kati 69 70In general, the slow path of reading Android.mk files isn't particularly 71performance sensitive, since it doesn't need to happen on every build. It is 72important for the fast-path (detecting whether it needs to regenerate the ninja 73file) to be fast however. And it shouldn't hit the slow path too often -- so 74don't rely on output of a `$(shell)` command that includes the current timestamp, 75or read a file that's going to change on every build. 76 77#### Regen check is slow 78 79In most cases, we've found that the fast-path is slow because all of the 80`$(shell)` commands need to be re-executed to determine if their output changed. 81The `$OUT_DIR/verbose.log.gz` contains statistics from the regen check: 82 83``` 84verbose: *kati*: regen check time: 0.754030 85verbose: *kati*: glob time (regen): 0.545859 / 43840 86verbose: *kati*: shell time (regen): 0.278095 / 66 (59 unique) 87verbose: *kati*: 0.012 / 1 mkdir -p out/target/product/generic && echo Android/aosp_arm/generic:R/AOSP.MASTER/$(date -d @$(cat out/build_date.txt) +%m%d%H%M):eng/test-keys >out/target/product/generic/build_fingerprint.txt && grep " " out/target/product/generic/build_fingerprint.txt 88verbose: *kati*: 0.010 / 1 echo 'com.android.launcher3.config.FlagOverrideSampleTest com.android.launcher3.logging.FileLogTest com.android.launcher3.model.AddWorkspaceItemsTaskTest com.android.launcher3.model.CacheDataUpdatedTaskTest com.android.launcher3.model.DbDowngradeHelperTest com.android.launcher3.model.GridBackupTableTest com.android.launcher3.model.GridSizeMigrationTaskTest com.android.launcher3.model.PackageInstallStateChangedTaskTest com.android.launcher3.popup.PopupPopulatorTest com.android.launcher3.util.GridOccupancyTest com.android.launcher3.util.IntSetTest' | tr ' ' '\n' | cat 89verbose: *kati*: 0.010 / 1 cd cts/tests/framework/base/windowmanager ; find -L * -name "Components.java" -and -not -name ".*" 90verbose: *kati*: 0.010 / 1 git -C test/framework/build log -s -n 1 --format="%cd" --date=format:"%Y%m%d_%H%M%S" 2>/dev/null 91verbose: *kati*: 0.009 / 2 cd development/samples/ShortcutDemo/publisher ; find -L ../common/src -name "*.java" -and -not -name ".*" 92verbose: *kati*: 0.009 / 2 cd development/samples/ShortcutDemo/launcher ; find -L ../common/src -name "*.java" -and -not -name ".*" 93verbose: *kati*: 0.009 / 1 if ! cmp -s out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk; then mv out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk; else rm out/target/product/generic/obj/CONFIG/kati_packaging/dist.mk.tmp; fi 94verbose: *kati*: 0.008 / 1 mkdir -p out/target/product/generic && echo R/AOSP.MASTER/$(cat out/build_number.txt):eng/test-keys >out/target/product/generic/build_thumbprint.txt && grep " " out/target/product/generic/build_thumbprint.txt 95verbose: *kati*: 0.007 / 1 echo 'com.android.customization.model.clock.BaseClockManagerTest com.android.customization.model.clock.ClockManagerTest com.android.customization.model.grid.GridOptionsManagerTest com.android.customization.model.theme.ThemeManagerTest' | tr ' ' '\n' | cat 96verbose: *kati*: 0.007 / 1 uname -sm 97verbose: *kati*: stat time (regen): 0.361907 / 1241 98``` 99 100In this case, the total time spent checking was 0.75 seconds, even though the 101other "(regen)" numbers add up to more than that (some parts are parallelized 102where possible). Often times, the biggest contributor is the `$(shell)` times 103-- in this case, 66 calls took 0.27s. The top 10 longest shell functions are 104printed. 105 106All the longest commands in this case are all variants of a call to `find`, but 107this is where using pure make functions instead of calling out to the shell can 108make a performance impact -- many calls to check if `26 > 20` can add up. We've 109added some basic math functions in `math.mk` to help some common use cases that 110used to be rather expensive when they were used too often. 111 112There are some optimizations in place for find commands -- if Kati can 113understand the find command, the built-in find emulator can turn some of them 114into glob or stat checks (falling back to calling `find` if one of those imply 115that the output may change). Many of the common macros produce find commands 116that Kati can understand, but if you're writing your own, you may want to 117experiment with other options if they're showing up in this list. For example, 118if this was significantly more expensive (either in runtime, or was called 119often): 120 121``` 122.../kati.go:127: *kati*: 0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist 123``` 124 125It may be more efficient to move the grep into make, so that the `find` portion 126can be rewritten and cached: 127 128``` 129$(filter-out $(file <$(LOCAL_PATH)/java_tests_blacklist),$(call all-java-files-under,luni/src/test/java)) 130``` 131 132Others can be simplified by just switching to an equivalent find command that 133Kati understands: 134 135``` 136.../kati.go:127: *kati*: 0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort 137``` 138 139By adding the implicit `-a` and moving the `| sort` to Make, this can now be 140cached by Kati: 141 142``` 143$(sort $(shell find device vendor -type -f -a -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop)) 144``` 145 146Kati has now learned about the implicit `-a`, so this particular change is no 147longer necessary, but the basic concept holds. 148 149#### Kati regens too often 150 151Kati prints out what triggered the slow path to be taken -- this can be a 152changed file, a changed environment variable, or different output from a 153`$(shell)` command: 154 155``` 156out/soong/Android-aosp_arm.mk was modified, regenerating... 157``` 158 159The state is stored in `$OUT_DIR/.kati_stamp*` files, and can be (partially) 160read with the `ckati_stamp_dump` tool in prebuilts/build-tools. More debugging 161is available when ckati is run with `--regen_debug`, but that can be a lot of 162data to understand. 163 164#### Debugging the slow path 165 166Kati will now dump out information about which Makefiles took the most time to 167execute. This is also in the `verbose.log.gz` file: 168 169``` 170verbose: *kati*: included makefiles: 73.640833 / 232810 (1066 unique) 171verbose: *kati*: 18.389 / 1 out/soong/Android-aosp_arm.mk 172verbose: *kati*: 13.137 / 20144 build/make/core/soong_cc_prebuilt.mk 173verbose: *kati*: 11.743 / 27666 build/make/core/base_rules.mk 174verbose: *kati*: 2.289 / 1 art/Android.mk 175verbose: *kati*: 2.054 / 1 art/build/Android.cpplint.mk 176verbose: *kati*: 1.955 / 28269 build/make/core/clear_vars.mk 177verbose: *kati*: 1.795 / 283 build/make/core/package.mk 178verbose: *kati*: 1.790 / 283 build/make/core/package_internal.mk 179verbose: *kati*: 1.757 / 17382 build/make/core/link_type.mk 180verbose: *kati*: 1.078 / 297 build/make/core/aapt2.mk 181``` 182 183This shows that soong_cc_prebuilt.mk was included 20144 times, for a total time 184spent of 13.137 secounds. While Android-aosp_arm.mk was only included once, and 185took 18.389 seconds. In this case, Android-aosp_arm.mk is the only file that 186includes soong_cc_prebuilt.mk, so we can safely assume that 13 of the 18 seconds 187in Android-aosp_arm.mk was actually spent within soong_cc_prebuilt.mk (or 188something that it included, like base_rules.mk). 189 190By default this only includes the top 10 entries, but you can ask for the stats 191for any makefile to be printed with `$(KATI_profile_makefile)`: 192 193``` 194$(KATI_profile_makefile build/make/core/product.mk) 195``` 196 197With these primitives, it's possible to get the timing information for small 198chunks, or even single lines, of a makefile. Just move the lines you want to 199measure into a new makefile, and replace their use with an `include` of the 200new makefile. It's possible to analyze where the time is being spent by doing 201a binary search using this method, but you do need to be careful not to split 202conditionals across two files (the ifeq/else/endif must all be in the same file). 203 204### Ninja 205 206#### Understanding why something rebuilt 207 208Add `NINJA_ARGS="-d explain"` to your environment before a build, this will cause 209ninja to print out explanations on why actions were taken. Start reading from the 210beginning, as this much data can be hard to read: 211 212``` 213$ cd art 214$ mma 215$ touch runtime/jit/profile_compilation_info.h 216$ NINJA_ARGS="-d explain" mma 217... 218ninja explain: output out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o older than most recent input art/runtime/jit/profile_compilation_info.h ( 2191516683538 vs 1516685188) 220ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o is dirty 221ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/cpp-define-generator-data is dirty 222ninja explain: out/soong/host/linux-x86/bin/cpp-define-generator-data is dirty 223ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-asm-support/gen/asm_support_gen.h is dirty 224ninja explain: out/soong/.intermediates/art/cmdline/art_cmdline_tests/android_arm_armv7-a_core_cmdline_parser_test/obj/art/cmdline/cmdline_parser_test.o is dirty 225... 226``` 227 228In this case, art/cmdline/cmdline_parser_test.o was rebuilt because it uses 229asm_support_gen.h, which was generated by cpp-define-generator-data, which uses 230profile_compilation_info.h. 231 232You'll likely need to cross-reference this data against the build graph in the 233various .ninja files. The files are (mostly) human-readable, but a (slow) web 234interface can be used by running `NINJA_ARGS="-t browse <target>" m`. 235 236There is also `SOONG_UI_NINJA_ARGS`, which passes ninja arguments to soong ui's 237ninja invocations, e.g. to emit $OUT_DIR/soong/build.ninja, $OUT_DIR/soong/module-graph.json, etc. 238 239```bash 240$ m nothing 241$ touch Android.bp 242$ SOONG_UI_NINJA_ARGS="-d explain" m nothing 243... 244ninja explain: restat of output out/soong/build.ninja older than most recent input Android.bp 245... 246``` 247 248#### Builds take a long time 249 250If the long part in the trace view of a build is a relatively solid block, then 251the performance is probably more related to how much time the actual build 252commands are taking than having extra dependencies, or slowdowns in 253soong/kati/ninja themselves. 254 255Beyond looking at visible outliers in the trace view, we don't have any tooling 256to help in this area yet. It's possible to aggregate some of the raw data 257together, but since our builds are heavily parallelized, it's particularly easy 258for build commands to impact unrelated build commands. This is an area we'd 259like to improve -- we expect keeping track of user/system time per-action would 260provide more reliable data, but tracking some full-system data (memory/swap 261use, disk bandwidth, etc) may also be necessary. 262 263## Known Issues 264 265### Common 266 267### <= Android 10 (Q): mm 268 269Soong always loads the entire module graph, so as modules convert from Make to 270Soong, `mm` is becoming closer to `mma`. This produces more correct builds, but 271does slow down builds, as we need to verify/produce/load a larger build graph. 272 273As of Android Q, loading large build graphs is fast, and in Android R, `mm` is 274now an alias of `mma`. 275 276### Android 8.1 (Oreo MR1) 277 278In some cases, a tree would get into a state where Soong would be run twice on 279every incremental build, even if there was nothing to do. This was fixed in 280master with [these changes][blueprint_microfactory], but they were too 281significant to backport at the time. And while they fix this particular issue, 282they appear to cause ninja to spend more time during every build loading the 283`.ninja_log` / `.ninja_deps` files, especially as they become larger. 284 285A workaround to get out of this state is to remove the build.ninja entry from 286`$OUT_DIR/.ninja_log`: 287 288``` 289sed -i "/\/build.ninja/d" $(get_build_var OUT_DIR)/.ninja_log 290``` 291 292[catapult trace_viewer]: https://github.com/catapult-project/catapult/blob/master/tracing/README.md 293[ninja parse optimization]: https://android-review.googlesource.com/c/platform/external/ninja/+/461005 294[blueprint_microfactory]: https://android-review.googlesource.com/q/topic:%22blueprint_microfactory%22+status:merged 295