README.md

# Directory Structure

Below is the layout of the `examples/mediatek` directory, which includes the necessary files for the example applications:

```plaintext
examples/mediatek
├── aot_utils                         # Utils for AoT export
    ├── llm_utils                     # Utils for LLM models
        ├── preformatter_templates    # Model specific prompt preformatter templates
        ├── prompts                   # Calibration Prompts
        ├── tokenizers_               # Model tokenizer scripts
    ├── oss_utils                     # Utils for oss models
├── eval_utils                        # Utils for eval oss models
├── model_export_scripts              # Model specific export scripts
├── models                            # Model definitions
    ├── llm_models                    # LLM model definitions
        ├── weights                   # LLM model weights location (Offline) [Ensure that config.json, relevant tokenizer files and .bin or .safetensors weights file(s) are placed here]
├── executor_runner                   # Example C++ wrapper for the ExecuTorch runtime
├── pte                               # Generated .pte files location
├── shell_scripts                     # Shell scripts to quickrun model specific exports
├── CMakeLists.txt                    # CMake build configuration file for compiling examples
├── requirements.txt                  # MTK and other required packages
├── mtk_build_examples.sh             # Script for building MediaTek backend and the examples
└── README.md                         # Documentation for the examples (this file)
```
# Examples Build Instructions

## Environment Setup
- Follow the **Prerequisites** and **Setup** instructions in `backends/mediatek/scripts/README.md`.

## Build MediaTek Examples
1. Build the backend and the examples by executing the script:
```bash
./mtk_build_examples.sh
```

## LLaMa Example Instructions
##### Note: Verify that a localhost connection is available before running the AoT flow.
1. Exporting Models to `.pte`
- In the `examples/mediatek` directory, run:
```bash
source shell_scripts/export_llama.sh <model_name> <num_chunks> <prompt_num_tokens> <cache_size> <calibration_set_name>
```
- Defaults:
    - `model_name` = llama3
    - `num_chunks` = 4
    - `prompt_num_tokens` = 128
    - `cache_size` = 1024
    - `calibration_set_name` = None
- Argument Explanations/Options:
    - `model_name`: llama2/llama3
    <sub>**Note: Currently only tested on Llama2 7B Chat and Llama3 8B Instruct.**</sub>
    - `num_chunks`: Number of chunks to split the model into. Each chunk contains the same number of decoder layers. `num_chunks` `.pte` files are generated for each of the prompt and generation stages (see step 2). Typical values are 1, 2 and 4.
    - `prompt_num_tokens`: Number of tokens (> 1) consumed in each forward pass of the prompt processing stage.
    - `cache_size`: Cache size.
    - `calibration_set_name`: File name, including extension, of a calibration dataset located in the `aot_utils/llm_utils/prompts` directory. Example: `alpaca.txt`. If `None`, dummy data is used for calibration.
    <sub>**Note: The export script example has only been tested with `.txt` files.**</sub>

2. `.pte` files will be generated in `examples/mediatek/pte`
    - Expect `num_chunks*2` `.pte` files: half for prompt processing and half for generation.
    - Generation `.pte` files have "`1t`" in their names.
    - Additionally, an embedding bin file will be generated in the weights folder that contains `config.json`. [`examples/mediatek/models/llm_models/weights/<model_name>/embedding_<model_config_folder>_fp32.bin`]
    - e.g. for `llama3-8B-instruct`, the embedding bin is generated in `examples/mediatek/models/llm_models/weights/llama3-8B-instruct/`
    - The AoT flow takes roughly 2.5 hours and 114 GB of RAM for `num_chunks=4`. Results will vary with device and hardware configuration.
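
The file-count rule above can be checked once an export has finished. The following is a minimal sketch and not part of the repository: the function names `expected_pte_count` and `check_pte_outputs` are hypothetical. It assumes the `pte` directory holds the files of a single export only, and that a generation file can be recognized by a `1t` in its name that is not preceded by another digit (so that a name containing, say, `31t` is not miscounted).

```shell
# Hypothetical helpers (not shipped with the examples).

# One prompt file and one generation file are expected per chunk.
expected_pte_count() {
    echo $(( $1 * 2 ))
}

# Count the .pte files in a directory and compare against the expectation.
# Succeeds only if the total is num_chunks*2 and exactly num_chunks of the
# files look like generation files.
check_pte_outputs() {
    local pte_dir="$1"
    local num_chunks="$2"
    local total=0
    local gen=0
    local f
    for f in "$pte_dir"/*.pte; do
        [ -e "$f" ] || continue          # the glob matched nothing
        total=$(( total + 1 ))
        case "$(basename "$f")" in
            # Assumption: "1t" not preceded by a digit marks a generation file.
            1t*|*[!0-9]1t*) gen=$(( gen + 1 )) ;;
        esac
    done
    echo "total=$total generation=$gen prompt=$(( total - gen ))"
    [ "$total" -eq "$(expected_pte_count "$num_chunks")" ] && [ "$gen" -eq "$num_chunks" ]
}
```

For example, `check_pte_outputs pte 4` run from `examples/mediatek` would succeed only if eight `.pte` files are present and four of them are generation files.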

### oss
1. Exporting Model to `.pte`
```bash
bash shell_scripts/export_oss.sh <model_name>
```
- Argument Options:
    - `model_name`: deeplabv3/edsr/inceptionv3/inceptionv4/mobilenetv2/mobilenetv3/resnet18/resnet50
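
Because the export script takes a single model name, a small guard can reject a mistyped name before the export starts. This is a sketch only: `is_supported_oss_model` and `export_oss_model` are hypothetical names, the model list is copied from the options above, and the wrapper assumes it is run from the `examples/mediatek` directory.

```shell
# Model names copied from the argument options above.
OSS_MODELS="deeplabv3 edsr inceptionv3 inceptionv4 mobilenetv2 mobilenetv3 resnet18 resnet50"

# Hypothetical helper: succeeds if the given name is in the list above.
is_supported_oss_model() {
    local name="$1"
    local m
    for m in $OSS_MODELS; do
        [ "$m" = "$name" ] && return 0
    done
    return 1
}

# Hypothetical wrapper: validate the name, then call the export script.
export_oss_model() {
    local name="$1"
    if ! is_supported_oss_model "$name"; then
        echo "unsupported model: $name" >&2
        return 1
    fi
    bash shell_scripts/export_oss.sh "$name"
}
```

With these defined, `for m in $OSS_MODELS; do export_oss_model "$m"; done` would export every listed model in turn.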

# Runtime
## Environment Setup

To set up the build environment for the `mtk_executor_runner`:

1. Navigate to the `backends/mediatek/scripts` directory within the repository.
2. Follow the detailed build steps provided in that location.
3. Upon successful completion of the build steps, the `mtk_executor_runner` binary will be generated.

## Deploying and Running on the Device

### Pushing Files to the Device

Transfer the `.pte` model files and the `mtk_executor_runner` binary to your Android device using the following commands:

```bash
adb push mtk_executor_runner <PHONE_PATH, e.g. /data/local/tmp>
adb push <MODEL_NAME>.pte <PHONE_PATH, e.g. /data/local/tmp>
```

Replace `<MODEL_NAME>` with the actual name of your model file and `<PHONE_PATH>` with the desired destination on the device.
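
When an export produces several `.pte` files, the two push commands above can be wrapped in a loop. The sketch below is illustrative only: `push_to_device` and the `DRY_RUN` variable are hypothetical, and it assumes `mtk_executor_runner` is in the current directory. With `DRY_RUN=1` it only prints the `adb` commands instead of running them.

```shell
# Hypothetical helper: push the runner and every .pte file in a directory.
# Usage: push_to_device <pte_dir> [phone_path]
push_to_device() {
    local pte_dir="$1"
    local phone_path="${2:-/data/local/tmp}"
    local run=""
    local f
    # In dry-run mode, prefix each command with echo so nothing is executed.
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    $run adb push mtk_executor_runner "$phone_path"
    for f in "$pte_dir"/*.pte; do
        [ -e "$f" ] || continue          # the glob matched nothing
        $run adb push "$f" "$phone_path"
    done
}
```

Running `DRY_RUN=1 push_to_device pte` first is a cheap way to confirm which files would be transferred.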

##### Note: For oss models, please push additional files to your Android device
```bash
adb push mtk_oss_executor_runner <PHONE_PATH, e.g. /data/local/tmp>
adb push input_list.txt <PHONE_PATH, e.g. /data/local/tmp>
for i in input*bin; do adb push "$i" <PHONE_PATH, e.g. /data/local/tmp>; done;
```

### Executing the Model

Execute the model on your Android device by running:

```bash
adb shell "/data/local/tmp/mtk_executor_runner --model_path /data/local/tmp/<MODEL_NAME>.pte --iteration <ITER_TIMES>"
```

In the command above, replace `<MODEL_NAME>` with the name of your model file and `<ITER_TIMES>` with the desired number of iterations to run the model.

##### Note: For llama models, please use `mtk_llama_executor_runner`. Refer to `examples/mediatek/executor_runner/run_llama3_sample.sh` for reference.
##### Note: For oss models, please use `mtk_oss_executor_runner`.
```bash
adb shell "/data/local/tmp/mtk_oss_executor_runner --model_path /data/local/tmp/<MODEL_NAME>.pte --input_list /data/local/tmp/input_list.txt --output_folder /data/local/tmp/output_<MODEL_NAME>"
adb pull "/data/local/tmp/output_<MODEL_NAME>" ./
```

### Check oss result on PC
```bash
python3 eval_utils/eval_oss_result.py --eval_type <eval_type> --target_f <golden_folder> --output_f <prediction_folder>
```
For example:
```bash
python3 eval_utils/eval_oss_result.py --eval_type piq --target_f edsr --output_f output_edsr
```
- Argument Options:
    - `eval_type`: topk/piq/segmentation
    - `target_f`: Folder containing the golden data files, named `golden_<data_idx>_0.bin`
    - `output_f`: Folder containing the model output data files, named `output_<data_idx>_0.bin`
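
Since golden and output files are paired by `<data_idx>`, it can help to confirm that every golden file has a matching output file before running the evaluation. The helper below is a sketch based only on the file-name patterns listed above. `missing_outputs` is a hypothetical name, and whether the evaluation script itself tolerates missing files is not covered here.

```shell
# Hypothetical helper: print each data index that has a golden file
# (golden_<data_idx>_0.bin) but no matching output file
# (output_<data_idx>_0.bin).
# Usage: missing_outputs <golden_folder> <prediction_folder>
missing_outputs() {
    local golden_dir="$1"
    local output_dir="$2"
    local g
    local idx
    for g in "$golden_dir"/golden_*_0.bin; do
        [ -e "$g" ] || continue          # the glob matched nothing
        idx="$(basename "$g")"
        idx="${idx#golden_}"             # strip the "golden_" prefix
        idx="${idx%_0.bin}"              # strip the "_0.bin" suffix
        [ -e "$output_dir/output_${idx}_0.bin" ] || echo "$idx"
    done
}
```

For the example above, `missing_outputs edsr output_edsr` prints nothing when every golden file has a counterpart.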