# Directory Structure

Below is the layout of the `examples/mediatek` directory, which includes the necessary files for the example applications:

```plaintext
examples/mediatek
├── aot_utils                         # Utils for AoT export
    ├── llm_utils                     # Utils for LLM models
        ├── preformatter_templates    # Model-specific prompt preformatter templates
        ├── prompts                   # Calibration prompts
        ├── tokenizers_               # Model tokenizer scripts
    ├── oss_utils                     # Utils for oss models
├── eval_utils                        # Utils for evaluating oss models
├── model_export_scripts              # Model-specific export scripts
├── models                            # Model definitions
    ├── llm_models                    # LLM model definitions
        ├── weights                   # LLM model weights location (offline) [Ensure that config.json, relevant tokenizer files and .bin or .safetensors weights file(s) are placed here]
├── executor_runner                   # Example C++ wrapper for the ExecuTorch runtime
├── pte                               # Generated .pte files location
├── shell_scripts                     # Shell scripts to quickly run model-specific exports
├── CMakeLists.txt                    # CMake build configuration file for compiling examples
├── requirements.txt                  # MTK and other required packages
├── mtk_build_examples.sh             # Script for building MediaTek backend and the examples
└── README.md                         # Documentation for the examples (this file)
```
# Examples Build Instructions

## Environment Setup
- Follow the instructions in **Prerequisites** and **Setup** in `backends/mediatek/scripts/README.md`.

## Build MediaTek Examples
1. Build the backend and the examples by executing the script:
```bash
./mtk_build_examples.sh
```

## LLaMa Example Instructions
##### Note: Verify that a localhost connection is available before running the AoT flow
1. Exporting Models to `.pte`
- In the `examples/mediatek` directory, run:
```bash
source shell_scripts/export_llama.sh <model_name> <num_chunks> <prompt_num_tokens> <cache_size> <calibration_set_name>
```
- Defaults:
    - `model_name` = llama3
    - `num_chunks` = 4
    - `prompt_num_tokens` = 128
    - `cache_size` = 1024
    - `calibration_set_name` = None
- Argument Explanations/Options:
    - `model_name`: llama2/llama3
    <sub>**Note: Currently only tested on Llama2 7B Chat and Llama3 8B Instruct.**</sub>
    - `num_chunks`: Number of chunks to split the model into. Each chunk contains the same number of decoder layers. This results in `num_chunks` `.pte` files being generated. Typical values are 1, 2 and 4.
    - `prompt_num_tokens`: Number of tokens (> 1) consumed in each forward pass of the prompt processing stage.
    - `cache_size`: Cache size.
    - `calibration_set_name`: Name of the calibration dataset, including its extension, found inside the `aot_utils/llm_utils/prompts` directory. Example: `alpaca.txt`. If `"None"`, dummy data is used for calibration.
    <sub>**Note: The export script example has only been tested with a `.txt` file.**</sub>

2. `.pte` files will be generated in `examples/mediatek/pte`
    - Expect `num_chunks*2` `.pte` files (half of them for prompt processing and half for generation).
    - Generation `.pte` files have "`1t`" in their names.
    - Additionally, an embedding bin file will be generated in the weights folder that contains `config.json`. [`examples/mediatek/models/llm_models/weights/<model_name>/embedding_<model_config_folder>_fp32.bin`]
        - e.g. for `llama3-8B-instruct`, the embedding bin is generated in `examples/mediatek/models/llm_models/weights/llama3-8B-instruct/`
    - The AoT flow takes roughly 2.5 hours and 114 GB of RAM for `num_chunks=4`. Results will vary with device and hardware configuration.

### oss
1. Exporting Model to `.pte`
```bash
bash shell_scripts/export_oss.sh <model_name>
```
- Argument Options:
    - `model_name`: deeplabv3/edsr/inceptionv3/inceptionv4/mobilenetv2/mobilenetv3/resnet18/resnet50

# Runtime
## Environment Setup

To set up the build environment for the `mtk_executor_runner`:

1. Navigate to the `backends/mediatek/scripts` directory within the repository.
2. Follow the detailed build steps provided in that location.
3. Upon successful completion of the build steps, the `mtk_executor_runner` binary will be generated.

## Deploying and Running on the Device

### Pushing Files to the Device

Transfer the `.pte` model files and the `mtk_executor_runner` binary to your Android device using the following commands:

```bash
adb push mtk_executor_runner <PHONE_PATH, e.g. /data/local/tmp>
adb push <MODEL_NAME>.pte <PHONE_PATH, e.g. /data/local/tmp>
```

Replace `<MODEL_NAME>` with the actual name of your model file and `<PHONE_PATH>` with the desired destination on the device.

##### Note: For oss models, please push additional files to your Android device
```bash
adb push mtk_oss_executor_runner <PHONE_PATH, e.g. /data/local/tmp>
adb push input_list.txt <PHONE_PATH, e.g. /data/local/tmp>
for i in input*bin; do adb push "$i" <PHONE_PATH, e.g. /data/local/tmp>; done;
```

### Executing the Model

Execute the model on your Android device by running:

```bash
adb shell "/data/local/tmp/mtk_executor_runner --model_path /data/local/tmp/<MODEL_NAME>.pte --iteration <ITER_TIMES>"
```

In the command above, replace `<MODEL_NAME>` with the name of your model file and `<ITER_TIMES>` with the desired number of iterations to run the model.

##### Note: For llama models, please use `mtk_llama_executor_runner`. Refer to `examples/mediatek/executor_runner/run_llama3_sample.sh` for reference.
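
Before pushing llama chunks to the device, it can help to confirm that the export produced the expected file set. The export section states that `num_chunks*2` `.pte` files are generated and that generation files have "`1t`" in their names. The following is a minimal sketch of that arithmetic, not part of the repository: the function name `check_pte_count` is hypothetical, and it assumes the directory holds the files of a single llama export and that "`1t`" is never preceded by another digit in a generation file name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of examples/mediatek): check that a directory
# holds num_chunks*2 .pte files, half of them generation files ("1t" in name).
check_pte_count() {
    local pte_dir="$1" num_chunks="$2"
    local total=0 gen=0 f
    for f in "$pte_dir"/*.pte; do
        [ -e "$f" ] || continue              # the glob matched nothing
        total=$((total + 1))
        case "$(basename "$f")" in
            # Assumption: "1t" not preceded by a digit marks a generation
            # file, so a prompt file such as "..._31t_..." is not counted.
            *[!0-9]1t*) gen=$((gen + 1)) ;;
        esac
    done
    echo "total=$total generation=$gen expected_total=$((num_chunks * 2))"
    # Succeed only if both counts match the documented expectation.
    [ "$total" -eq $((num_chunks * 2)) ] && [ "$gen" -eq "$num_chunks" ]
}
```

For the default export, `check_pte_count pte 4` run from `examples/mediatek` should report eight files, four of them generation files, if the assumptions above hold.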
##### Note: For oss models, please use `mtk_oss_executor_runner`.
```bash
adb shell "/data/local/tmp/mtk_oss_executor_runner --model_path /data/local/tmp/<MODEL_NAME>.pte --input_list /data/local/tmp/input_list.txt --output_folder /data/local/tmp/output_<MODEL_NAME>"
adb pull "/data/local/tmp/output_<MODEL_NAME>" ./
```

### Check oss result on PC
```bash
python3 eval_utils/eval_oss_result.py --eval_type <eval_type> --target_f <golden_folder> --output_f <prediction_folder>
```
For example:
```bash
python3 eval_utils/eval_oss_result.py --eval_type piq --target_f edsr --output_f output_edsr
```
- Argument Options:
    - `eval_type`: topk/piq/segmentation
    - `target_f`: Folder containing the golden data files, named `golden_<data_idx>_0.bin`
    - `output_f`: Folder containing the model output data files, named `output_<data_idx>_0.bin`
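
Since golden and output files are paired by `<data_idx>`, a quick check that every golden file has a matching output can catch an incomplete `adb pull` before evaluation. The sketch below is an illustration under assumptions: `missing_outputs` is a hypothetical helper, not part of `eval_utils`, and it assumes the evaluation expects exactly one `output_<data_idx>_0.bin` per `golden_<data_idx>_0.bin`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of eval_utils): print every data index that
# has a golden_<data_idx>_0.bin file but no matching output_<data_idx>_0.bin.
missing_outputs() {
    local golden_dir="$1" output_dir="$2"
    local g idx
    for g in "$golden_dir"/golden_*_0.bin; do
        [ -e "$g" ] || continue                # the glob matched nothing
        idx="$(basename "$g")"
        idx="${idx#golden_}"                   # strip the "golden_" prefix
        idx="${idx%_0.bin}"                    # strip the "_0.bin" suffix
        [ -e "$output_dir/output_${idx}_0.bin" ] || echo "$idx"
    done
}
```

For the example above, `missing_outputs edsr output_edsr` prints nothing when every golden file has a counterpart.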