# XNNPACK Backend

[XNNPACK](https://github.com/google/XNNPACK) is a library of optimized neural network operators for ARM and x86 CPU platforms. Our delegate lowers models to run using these highly optimized CPU operators. You can try out lowering and running some example models in the demo. Please refer to the following docs for information on the XNNPACK Delegate:
- [XNNPACK Backend Delegate Overview](https://pytorch.org/executorch/stable/native-delegates-executorch-xnnpack-delegate.html)
- [XNNPACK Delegate Export Tutorial](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html)


## Directory structure

```bash
examples/xnnpack
├── quantization                      # Scripts to illustrate PyTorch 2 Export Quantization workflow with XNNPACKQuantizer
│   └── example.py
├── aot_compiler.py                   # The main script to illustrate the full AOT (export, quantization, delegation) workflow with the XNNPACK delegate
└── README.md                         # This file
```

## Delegating a Floating-point Model

The following command produces a floating-point XNNPACK-delegated model, `mv2_xnnpack_fp32.pte`, that can be run using XNNPACK's operators. It also prints the lowered graph, showing which parts of the model have been lowered to XNNPACK via `executorch_call_delegate`.

```bash
# For MobileNet V2
python3 -m examples.xnnpack.aot_compiler --model_name="mv2" --delegate
```
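
To make the `executorch_call_delegate` nodes in the printed graph more concrete, here is a toy sketch of the grouping idea: a partitioner collapses maximal runs of backend-supported operators into single delegate calls, while unsupported operators stay in the default graph. This is not ExecuTorch's actual partitioner, and the supported-op list is invented for illustration.

```python
# Toy illustration only: a hypothetical list of ops a backend supports.
SUPPORTED = {"conv2d", "relu", "add", "linear"}

def partition(ops):
    """Group maximal runs of supported ops into ("call_delegate", [...]) nodes."""
    result, run = [], []
    for op in ops:
        if op in SUPPORTED:
            run.append(op)  # extend the current delegated run
        else:
            if run:  # close out the run as one delegate call
                result.append(("call_delegate", run))
                run = []
            result.append(op)  # unsupported op stays in the default graph
    if run:
        result.append(("call_delegate", run))
    return result

# "topk" is not in the supported set, so it splits the graph into two
# delegated regions around it.
print(partition(["conv2d", "relu", "topk", "linear", "add"]))
```

In a real lowered graph, each such region appears as one `executorch_call_delegate` node whose payload XNNPACK executes.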

Once we have the model binary (`.pte`) file, we can run it with the ExecuTorch runtime using the `xnn_executor_runner`. With CMake, first configure the build as follows:

```bash
# cd to the root of executorch repo
cd executorch

# Get a clean cmake-out directory
rm -rf cmake-out
mkdir cmake-out

# Configure cmake
cmake \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-out .
```

Then you can build the runtime components with:

```bash
cmake --build cmake-out -j9 --target install --config Release
```

Finally, you should be able to run the model with the following command:

```bash
./cmake-out/backends/xnnpack/xnn_executor_runner --model_path ./mv2_xnnpack_fp32.pte
```

## Quantization
If you are not already familiar with it, first learn about the generic PyTorch 2 Export Quantization workflow in the [Quantization Flow Docs](https://pytorch.org/executorch/stable/quantization-overview.html).

Here we discuss quantizing a model suitable for XNNPACK delegation using `XNNPACKQuantizer`.

Though this quantized model is typically run via the XNNPACK delegate, we want to highlight that it is just another quantization flavor: the model can also run without the XNNPACK delegate, using only the standard quantized operators.

A shared library that registers the out variants of the quantized operators (e.g., `quantized_decomposed::add.out`) into EXIR is required. With CMake, follow the instructions in `test_quantize.sh` to build it; the default path is `cmake-out/kernels/quantized/libquantized_ops_lib.so`.

Then you can generate an XNNPACK-quantized model by passing the path to the shared library into the script `quantization/example.py`:
```bash
python3 -m examples.xnnpack.quantization.example --model_name "mv2" --so_library "<path/to/so/lib>" # for MobileNetV2

# This should generate ./mv2_quantized.pte file, if successful.
```
You can find more valid quantized example models by running:
```bash
python3 -m examples.xnnpack.quantization.example --help
```
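
As a rough illustration of the arithmetic behind this quantization flavor, here is a minimal per-tensor affine (asymmetric) int8 quantize/dequantize round trip in plain Python. The parameter choices and rounding details are illustrative only, not `XNNPACKQuantizer`'s exact implementation.

```python
# Sketch of per-tensor affine int8 quantization: real values are mapped to
# integers via a scale and zero point, then mapped back with small error.

def choose_qparams(x_min, x_max, qmin=-128, qmax=127):
    """Pick a scale and zero point so the range [x_min, x_max] maps onto [qmin, qmax]."""
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)  # range must include 0.0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = round(qmin - x_min / scale)  # integer that represents real 0.0
    return scale, int(max(qmin, min(qmax, zero_point)))

def quantize(xs, scale, zero_point, qmin=-128, qmax=127):
    return [int(max(qmin, min(qmax, round(v / scale) + zero_point))) for v in xs]

def dequantize(qs, scale, zero_point):
    return [(q - zero_point) * scale for q in qs]

x = [-1.0, 0.0, 0.5, 2.0]
scale, zp = choose_qparams(min(x), max(x))
q = quantize(x, scale, zp)        # int8 values in [-128, 127]
x_hat = dequantize(q, scale, zp)  # close to x, within one quantization step
```

The round-trip error is bounded by the scale (one quantization step), which is why int8 inference can track float accuracy closely on well-behaved models.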

## Running the XNNPACK Model with CMake
After exporting the XNNPACK-delegated model, we can try running it with example inputs using CMake. We can build and use the `xnn_executor_runner`, a sample wrapper for the ExecuTorch runtime and XNNPACK backend. We first configure the CMake build as follows:
```bash
# cd to the root of executorch repo
cd executorch

# Get a clean cmake-out directory
rm -rf cmake-out
mkdir cmake-out

# Configure cmake
cmake \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-out .
```
Then you can build the runtime components with:

```bash
cmake --build cmake-out -j9 --target install --config Release
```

The executable is built at `./cmake-out/backends/xnnpack/xnn_executor_runner`; you can run it with the model you generated:
```bash
./cmake-out/backends/xnnpack/xnn_executor_runner --model_path=./mv2_quantized.pte
```

## Delegating a Quantized Model

The following command produces an XNNPACK quantized and delegated model, `mv2_xnnpack_q8.pte`, that can be run using XNNPACK's operators. It also prints the lowered graph, showing which parts of the model have been lowered to XNNPACK via `executorch_call_delegate`.

```bash
python3 -m examples.xnnpack.aot_compiler --model_name "mv2" --quantize --delegate
```
122