# Training Regalloc MLGO model for Android Clang

## Background

MLGO is a framework for systematically integrating ML techniques into LLVM, the
compiler infrastructure behind Clang. It replaces human-crafted optimization
heuristics with machine-learned models. For register allocation (regalloc), the
model decides which live range to evict and is trained with Reinforcement
Learning (RL) on a corpus extracted from AOSP.

This guide walks through how to re-train the MLGO regalloc model on AOSP.

## Preparation

Create a working directory (e.g. `~/android-mlgo`) and point the `WORKING_DIR`
environment variable at it. All later commands refer to this variable.

```sh
mkdir ~/android-mlgo; cd ~/android-mlgo
export WORKING_DIR=$(pwd)
```
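
Every later command expands `$WORKING_DIR`, so a new shell in which the variable is unset would resolve paths such as `$WORKING_DIR/corpus` against `/`. The helper below is a minimal sketch of a guard you could run before each step; the function name `require_working_dir` is hypothetical and not part of any of the tools used in this guide.

```sh
# Print the working directory if WORKING_DIR is set and exists; fail otherwise.
# (Hypothetical helper, not part of MLGO or AOSP.)
require_working_dir() {
  if [ -z "${WORKING_DIR:-}" ]; then
    echo "WORKING_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$WORKING_DIR" ]; then
    echo "WORKING_DIR does not exist: $WORKING_DIR" >&2
    return 1
  fi
  echo "$WORKING_DIR"
}

# Example: require_working_dir && cd "$WORKING_DIR"
```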

### Get Repositories

#### ml-compiler-opt

```sh
cd $WORKING_DIR
git clone https://github.com/google/ml-compiler-opt --depth=1
```

#### aosp-master-plus-llvm

```sh
cd $WORKING_DIR
mkdir aosp-master-plus-llvm; cd aosp-master-plus-llvm
repo init -u https://android.googlesource.com/platform/manifest -b master-plus-llvm --partial-clone --use-superproject --depth=1
repo sync -c
```

### Set up TensorFlow

First, install the Python `venv` module:

```sh
sudo apt install python3-venv
```

You only need to run the above command once per machine.

Now, create a virtual environment and install TensorFlow and the other
dependencies. Re-run `source $WORKING_DIR/venv/bin/activate` in every new shell
before using the Python tools.

```sh
cd $WORKING_DIR
python3 -m venv venv
source venv/bin/activate
pip3 install tensorflow-cpu gin-config cloudpickle psutil tf_agents mlgo-utils
```
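
A failed or partial `pip3 install` tends to surface only much later, during training. As a rough sanity check, the sketch below tries to import each module and reports the ones that fail. The function name `check_python_modules` is hypothetical; the module names in the example are the import names of the packages installed above (`gin-config` is imported as `gin`, `tensorflow-cpu` as `tensorflow`).

```sh
# Report which of the given Python modules python3 can import.
# Returns 0 only if every module imports successfully.
# (Hypothetical helper, not part of MLGO.)
check_python_modules() {
  rc=0
  for mod in "$@"; do
    if python3 -c "import $mod" >/dev/null 2>&1; then
      echo "ok: $mod"
    else
      echo "MISSING: $mod"
      rc=1
    fi
  done
  return "$rc"
}

# Example (run inside the activated venv; importing tensorflow can be slow):
#   check_python_modules tensorflow gin cloudpickle psutil tf_agents
```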

### Set up TFLite

```sh
mkdir $WORKING_DIR/tflite; cd $WORKING_DIR/tflite
$WORKING_DIR/ml-compiler-opt/buildbot/build_tflite.sh
```

## Build LLVM for ML training

You do not need the full Android toolchain for ML training. A host build of
Clang with the X86, ARM, and AArch64 targets, configured with the
`tflite.cmake` cache file from the previous step, is sufficient.

```sh
mkdir $WORKING_DIR/llvm-build; cd $WORKING_DIR/llvm-build
# Use the prebuilt Clang that Soong selects by default as the host compiler.
export CLANG_VER=$(grep -oP "ClangDefaultVersion.*\"\Kclang-r\S+[^\"]" $WORKING_DIR/aosp-master-plus-llvm/build/soong/cc/config/global.go)
CC=$WORKING_DIR/aosp-master-plus-llvm/prebuilts/clang/host/linux-x86/$CLANG_VER/bin/clang \
CXX=$WORKING_DIR/aosp-master-plus-llvm/prebuilts/clang/host/linux-x86/$CLANG_VER/bin/clang++ \
$WORKING_DIR/aosp-master-plus-llvm/prebuilts/cmake/linux-x86/bin/cmake -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_TARGETS_TO_BUILD="X86;ARM;AArch64" \
  -C $WORKING_DIR/tflite/tflite.cmake \
  $WORKING_DIR/aosp-master-plus-llvm/toolchain/llvm-project/llvm
$WORKING_DIR/aosp-master-plus-llvm/prebuilts/build-tools/linux-x86/bin/ninja
```
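
The later steps use binaries from `$WORKING_DIR/llvm-build/bin`, for example `llvm-objcopy` during corpus extraction. The sketch below checks that the expected executables are present before you continue. `check_tools` is a hypothetical helper name, and the list of tools in the example is an assumption based on the commands in this guide.

```sh
# Check that each named tool exists and is executable in the given directory.
# Returns 0 only if all tools are found. (Hypothetical helper.)
check_tools() {
  bin_dir=$1
  shift
  rc=0
  for tool in "$@"; do
    if [ -x "$bin_dir/$tool" ]; then
      echo "ok: $bin_dir/$tool"
    else
      echo "MISSING: $bin_dir/$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check_tools "$WORKING_DIR/llvm-build/bin" clang llvm-objcopy
```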

## Training

### Build AOSP

```sh
cd $WORKING_DIR/aosp-master-plus-llvm
source build/envsetup.sh
lunch aosp_cf_arm64_only_phone-trunk_staging-userdebug
USE_RBE=false \
  SOONG_GEN_COMPDB=true \
  THINLTO_EMIT_INDEXES_AND_IMPORTS=true \
  m
```

### Corpus extraction

`extract_ir` is installed by the `mlgo-utils` package, so run it with the
virtual environment activated.

```sh
cd $WORKING_DIR/ml-compiler-opt
extract_ir \
  --cmd_filter="^-O2|-O3" \
  --llvm_objcopy_path=$WORKING_DIR/llvm-build/bin/llvm-objcopy \
  --output_dir=$WORKING_DIR/corpus \
  --thinlto_build=local \
  --obj_base_dir=$WORKING_DIR/aosp-master-plus-llvm/out
```
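
Before moving on, it can help to confirm that extraction actually produced modules. The sketch below counts files under the corpus directory by suffix. It assumes the layout that `extract_ir` is understood to produce for a ThinLTO corpus (`.bc` modules, `.thinlto.bc` index files, and `.cmd` command files); treat those suffixes as an assumption and adjust them if your output differs. `corpus_summary` is a hypothetical helper name.

```sh
# Summarize a corpus directory by counting files per suffix.
# Assumes modules end in .bc, ThinLTO indexes in .thinlto.bc, and
# command lines in .cmd. (Hypothetical helper.)
corpus_summary() {
  corpus_dir=$1
  # '*.bc' also matches '*.thinlto.bc', so subtract the indexes below.
  all_bc=$(find "$corpus_dir" -type f -name '*.bc' | wc -l | tr -d ' ')
  thinlto=$(find "$corpus_dir" -type f -name '*.thinlto.bc' | wc -l | tr -d ' ')
  cmds=$(find "$corpus_dir" -type f -name '*.cmd' | wc -l | tr -d ' ')
  echo "modules: $((all_bc - thinlto))"
  echo "thinlto indexes: $thinlto"
  echo "command files: $cmds"
}

# Example: corpus_summary "$WORKING_DIR/corpus"
```

A module count of zero usually means the filter or the object directory did not match anything, which is worth resolving before collecting traces.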

Edit `$WORKING_DIR/corpus/corpus_description.json` and add the following to the
`global_command_override` section:

```
  "-march=armv8.2-a",
  "-mcpu=cortex-a55",
  "--target=aarch64-linux-android10000",
  "-fPIC",
  "-fno-exceptions",
  "-no-canonical-prefixes",
  "-O2",
  "-mllvm",
  "-import-instr-limit=40",
  "-nostdlib++",
  "-c"
```
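
Hand-editing JSON makes it easy to leave a trailing comma or drop a quote, which only shows up when the training scripts try to load the file. A minimal check, assuming `python3` is on your `PATH`, is to run the file through Python's standard `json.tool` module. `validate_json` is a hypothetical wrapper name.

```sh
# Validate that a hand-edited file is still well-formed JSON, using the
# json.tool module from the Python standard library. (Hypothetical helper.)
validate_json() {
  if python3 -m json.tool "$1" >/dev/null 2>&1; then
    echo "valid JSON: $1"
  else
    echo "INVALID JSON: $1"
    return 1
  fi
}

# Example: validate_json "$WORKING_DIR/corpus/corpus_description.json"
```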

Now you should have a properly prepared AOSP ThinLTO corpus.

### Collect the Default Trace and Generate Vocab

Follow the remaining steps in
[the Chromium MLGO training demo](https://github.com/google/ml-compiler-opt/blob/main/docs/regalloc-demo/demo.md#collect-the-default-trace-and-generate-vocab),
beginning from the "Collect the Default Trace and Generate Vocab" section.