# Training Regalloc MLGO model for Android Clang

## Background

MLGO is a framework for integrating ML techniques systematically in Clang. It
replaces human-crafted optimization heuristics with machine-learned models. For
register allocation, the model decides which live range to evict and is trained
with Reinforcement Learning (RL) on a corpus extracted from AOSP.

This guide goes through how to re-train MLGO models on AOSP.

## Preparation

Create a working directory (e.g. `~/android-mlgo`) and set up the `WORKING_DIR`
environment variable.

```sh
mkdir ~/android-mlgo; cd ~/android-mlgo
export WORKING_DIR=`pwd`
```

### Get Repositories

#### ml-compiler-opt

```sh
cd $WORKING_DIR
git clone https://github.com/google/ml-compiler-opt --depth=1
```

#### aosp-master-plus-llvm

```sh
cd $WORKING_DIR
mkdir aosp-master-plus-llvm; cd aosp-master-plus-llvm
repo init -u https://android.googlesource.com/platform/manifest -b master-plus-llvm --partial-clone --use-superproject --depth=1
repo sync -c
```

### Set up TensorFlow

First, install Python Virtualenv:

```sh
sudo apt install python3-venv
```

You only need to run the above command once.

Now, set up a Virtualenv and install TensorFlow and other dependencies:

```sh
cd $WORKING_DIR
python3 -m venv venv
source venv/bin/activate
pip3 install tensorflow-cpu gin-config cloudpickle psutil tf_agents mlgo-utils
```

### Set up TFLite

```sh
mkdir $WORKING_DIR/tflite; cd $WORKING_DIR/tflite
$WORKING_DIR/ml-compiler-opt/buildbot/build_tflite.sh
```

## Build LLVM for ML training

ML training does not need the full Android toolchain. A Release build of Clang
configured with the TFLite settings from the previous step is enough.

```sh
mkdir $WORKING_DIR/llvm-build; cd $WORKING_DIR/llvm-build
export CLANG_VER=`grep -oP "ClangDefaultVersion.*\"\Kclang-r\S+[^\"]" $WORKING_DIR/aosp-master-plus-llvm/build/soong/cc/config/global.go`
CC=$WORKING_DIR/aosp-master-plus-llvm/prebuilts/clang/host/linux-x86/$CLANG_VER/bin/clang \
CXX=$WORKING_DIR/aosp-master-plus-llvm/prebuilts/clang/host/linux-x86/$CLANG_VER/bin/clang++ \
$WORKING_DIR/aosp-master-plus-llvm/prebuilts/cmake/linux-x86/bin/cmake -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_PROJECTS="clang" \
    -DLLVM_TARGETS_TO_BUILD="X86;ARM;AArch64" \
    -C $WORKING_DIR/tflite/tflite.cmake \
    $WORKING_DIR/aosp-master-plus-llvm/toolchain/llvm-project/llvm
$WORKING_DIR/aosp-master-plus-llvm/prebuilts/build-tools/linux-x86/bin/ninja
```

## Training

### Build AOSP

```sh
cd $WORKING_DIR/aosp-master-plus-llvm
source build/envsetup.sh
lunch aosp_cf_arm64_only_phone-trunk_staging-userdebug
USE_RBE=false \
    SOONG_GEN_COMPDB=true \
    THINLTO_EMIT_INDEXES_AND_IMPORTS=true \
    m
```

### Corpus extraction

```sh
cd $WORKING_DIR/ml-compiler-opt
extract_ir \
    --cmd_filter="^-O2|-O3" \
    --llvm_objcopy_path=$WORKING_DIR/llvm-build/bin/llvm-objcopy \
    --output_dir=$WORKING_DIR/corpus \
    --thinlto_build=local \
    --obj_base_dir=$WORKING_DIR/aosp-master-plus-llvm/out
```

Edit `$WORKING_DIR/corpus/corpus_description.json` and add the following to the
`global_command_override` section:

```
    "-march=armv8.2-a",
    "-mcpu=cortex-a55",
    "--target=aarch64-linux-android10000",
    "-fPIC",
    "-fno-exceptions",
    "-no-canonical-prefixes",
    "-O2",
    "-mllvm",
    "-import-instr-limit=40",
    "-nostdlib++",
    "-c"
```

Now you should have a properly prepared AOSP ThinLTO corpus.
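
If you prefer not to edit the JSON by hand, the edit to
`global_command_override` can be scripted. The following is a minimal sketch,
not part of the official workflow. It assumes `python3` is on the `PATH` (for
example from the Virtualenv created earlier) and that `corpus_description.json`
is a JSON object whose `global_command_override` value is a list. The function
name `add_global_command_override` is hypothetical.

```sh
# Hypothetical helper: appends the flags listed above to the
# "global_command_override" list of a corpus_description.json file.
# Usage: add_global_command_override <path-to-corpus_description.json>
add_global_command_override() {
  python3 - "$1" <<'EOF'
import json
import sys

path = sys.argv[1]
flags = [
    "-march=armv8.2-a",
    "-mcpu=cortex-a55",
    "--target=aarch64-linux-android10000",
    "-fPIC",
    "-fno-exceptions",
    "-no-canonical-prefixes",
    "-O2",
    "-mllvm",
    "-import-instr-limit=40",
    "-nostdlib++",
    "-c",
]

with open(path) as f:
    desc = json.load(f)

# Assumption: create the list if the file does not already contain one.
override = desc.setdefault("global_command_override", [])

# Skip the append when the flags are already at the end of the list,
# so running the helper twice does not duplicate them.
if override[-len(flags):] != flags:
    override.extend(flags)

with open(path, "w") as f:
    json.dump(desc, f, indent=2)
EOF
}
```

With the corpus written to `$WORKING_DIR/corpus` as above, run
`add_global_command_override "$WORKING_DIR/corpus/corpus_description.json"` and
inspect the file afterwards to confirm the flag list matches the one shown.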

### Collect the Default Trace and Generate Vocab

Follow the remaining steps listed in
[the Chromium MLGO training demo](https://github.com/google/ml-compiler-opt/blob/main/docs/regalloc-demo/demo.md#collect-the-default-trace-and-generate-vocab),
beginning from the "Collect the Default Trace and Generate Vocab" section.