# ExecuTorch XNNPACK Delegate

This subtree contains the XNNPACK Delegate implementation for ExecuTorch.
XNNPACK is an optimized library of neural network inference operators for ARM
and x86 CPUs. It is an open source project used by PyTorch. The delegate is the
mechanism for leveraging the XNNPACK library to accelerate operators running on
CPU.

## Layout
- `cmake/`: CMake-related files
- `operators/`: Directory containing all of the op visitors
    - `node_visitor.py`: Implementation of serializing each lowerable operator
      node
    - ...
- `partition/`: The partitioner, which identifies operators in the model's graph
  that are suitable for lowering to the XNNPACK delegate
    - `xnnpack_partitioner.py`: Contains the partitioner that tags graph patterns
      for XNNPACK lowering
    - `configs.py`: Contains lists of ops/modules for XNNPACK lowering
- `passes/`: Contains passes that are run before preprocessing to prepare the
  graph for XNNPACK lowering
- `runtime/`: Runtime logic used at inference. This contains all the C++ files
  used to build the runtime graph and execute the XNNPACK model
- `serialization/`: Contains files related to serializing the XNNPACK graph
  representation of the PyTorch model
    - `schema.fbs`: FlatBuffers schema of the serialization format
    - `xnnpack_graph_schema.py`: Python dataclasses mirroring the FlatBuffers
      schema
    - `xnnpack_graph_serialize`: Implementation for serializing dataclasses
      from the graph schema to FlatBuffers
- `test/`: Tests for the XNNPACK Delegate
- `third-party/`: Third-party libraries used by the XNNPACK Delegate
- `xnnpack_preprocess.py`: Contains the preprocess implementation, which is
  called by `to_backend` on the graph or a subgraph of a model and returns a
  preprocessed blob responsible for executing that graph or subgraph at runtime

## End to End Example

To further understand the features of the XNNPACK Delegate and how to use it, consider the following end-to-end example with MobileNetV2.

### Lowering a model to XNNPACK
```python
import torch
import torchvision.models as models

from torch.export import export, ExportedProgram
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import EdgeProgramManager, ExecutorchProgramManager, to_edge
from executorch.exir.backend.backend_api import to_backend


# Load the pretrained model in eval mode and create a sample input for tracing.
mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224), )

# Export the model, then convert the exported program with to_edge.
exported_program: ExportedProgram = export(mobilenet_v2, sample_inputs)
edge: EdgeProgramManager = to_edge(exported_program)

# Delegate the subgraphs tagged by the partitioner to XNNPACK.
edge = edge.to_backend(XnnpackPartitioner())
```

This example uses the pretrained [MobileNetV2](https://pytorch.org/hub/pytorch_vision_mobilenet_v2/) model downloaded from the TorchVision library. Lowering starts after the exported model has been passed through `to_edge`. We then call the `to_backend` API with the `XnnpackPartitioner`. The partitioner identifies the subgraphs that the XNNPACK backend delegate can consume. Each identified subgraph is then serialized with the XNNPACK Delegate FlatBuffers schema and replaced with a call to the XNNPACK Delegate.

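To make the partitioning idea concrete, here is a toy sketch in plain Python. It is not the `XnnpackPartitioner` implementation, which works on graph patterns; it only groups a flat list of operator names into consecutive runs that a backend either supports or does not. The names `SUPPORTED_OPS`, `Partition`, and `partition_ops` are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical set of operator names the toy backend accepts.
# The real partitioner matches graph patterns, not a flat name list.
SUPPORTED_OPS = frozenset({"convolution", "relu", "add", "linear"})


@dataclass
class Partition:
    delegated: bool
    ops: list[str] = field(default_factory=list)


def partition_ops(ops: list[str], supported: frozenset[str] = SUPPORTED_OPS) -> list[Partition]:
    """Group consecutive ops into runs that are all supported or all unsupported."""
    partitions: list[Partition] = []
    for op in ops:
        is_supported = op in supported
        if partitions and partitions[-1].delegated == is_supported:
            # Same kind as the previous op: extend the current run.
            partitions[-1].ops.append(op)
        else:
            # Kind changed: start a new run.
            partitions.append(Partition(delegated=is_supported, ops=[op]))
    return partitions


model_ops = ["convolution", "relu", "add", "view_copy", "clone", "linear"]
for index, part in enumerate(partition_ops(model_ops)):
    kind = "delegated" if part.delegated else "not delegated"
    print(f"partition {index} ({kind}): {part.ops}")
```

In this toy model, each delegated run would become one lowered module, and the ops in between stay in the top-level graph.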
```python
>>> print(edge.exported_program().graph_module)
GraphModule(
  (lowered_module_0): LoweredBackendModule()
  (lowered_module_1): LoweredBackendModule()
)

def forward(self, arg314_1):
    lowered_module_0 = self.lowered_module_0
    executorch_call_delegate = torch.ops.higher_order.executorch_call_delegate(lowered_module_0, arg314_1);  lowered_module_0 = arg314_1 = None
    getitem = executorch_call_delegate[0];  executorch_call_delegate = None
    aten_view_copy_default = executorch_exir_dialects_edge__ops_aten_view_copy_default(getitem, [1, 1280]);  getitem = None
    aten_clone_default = executorch_exir_dialects_edge__ops_aten_clone_default(aten_view_copy_default);  aten_view_copy_default = None
    lowered_module_1 = self.lowered_module_1
    executorch_call_delegate_1 = torch.ops.higher_order.executorch_call_delegate(lowered_module_1, aten_clone_default);  lowered_module_1 = aten_clone_default = None
    getitem_1 = executorch_call_delegate_1[0];  executorch_call_delegate_1 = None
    return (getitem_1,)
```

The graph printed above shows the new nodes that were inserted to call the XNNPACK Delegate. The subgraph delegated to XNNPACK is the first argument at each call site. Most of the `convolution-relu-add` blocks and `linear` blocks were delegated to XNNPACK. The operators that could not be lowered to the XNNPACK delegate, such as `clone` and `view_copy`, remain in the top-level graph.

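One quick way to check how much of a model was delegated is to scan the generated `forward` code for delegate call sites and for operators that are still Edge dialect calls. The sketch below does this with regular expressions over text shaped like the printout above; the helper name `summarize_lowering` is hypothetical, and the patterns assume the call names appear exactly as they do in that printout.

```python
from __future__ import annotations

import re
from collections import Counter

# Call sites such as `torch.ops.higher_order.executorch_call_delegate(lowered_module_0, ...)`.
DELEGATE_CALL = re.compile(r"torch\.ops\.higher_order\.executorch_call_delegate\((\w+)")
# Edge dialect ops such as `executorch_exir_dialects_edge__ops_aten_clone_default(...)`.
EDGE_OP_CALL = re.compile(r"executorch_exir_dialects_edge__ops_(\w+)\(")


def summarize_lowering(graph_code: str) -> tuple[list[str], Counter]:
    """Return the delegated module names and a count of ops that were not delegated."""
    delegated = DELEGATE_CALL.findall(graph_code)
    remaining = Counter(EDGE_OP_CALL.findall(graph_code))
    return delegated, remaining


# Abbreviated copy of the forward() body printed above.
sample = """
    executorch_call_delegate = torch.ops.higher_order.executorch_call_delegate(lowered_module_0, arg314_1)
    aten_view_copy_default = executorch_exir_dialects_edge__ops_aten_view_copy_default(getitem, [1, 1280])
    aten_clone_default = executorch_exir_dialects_edge__ops_aten_clone_default(aten_view_copy_default)
    executorch_call_delegate_1 = torch.ops.higher_order.executorch_call_delegate(lowered_module_1, aten_clone_default)
"""

delegated, remaining = summarize_lowering(sample)
print("delegated modules:", delegated)
print("ops left in the top-level graph:", dict(remaining))
```

With a real lowered program, the same function could be applied to `edge.exported_program().graph_module.code`, assuming that attribute holds the generated `forward` source as it does for a `torch.fx.GraphModule`.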
```python
exec_prog = edge.to_executorch()

with open("xnnpack_mobilenetv2.pte", "wb") as file:
    exec_prog.write_to_file(file)
```

After lowering to XNNPACK, we prepare the program for ExecuTorch with `to_executorch` and save the model as a `.pte` file. `.pte` is a binary format that stores the serialized ExecuTorch graph.


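Before moving on to the runtime, it can be useful to sanity-check the file that was just written. The sketch below is a lightweight check, not an official validation tool. It assumes that a `.pte` file is a FlatBuffers buffer whose 4-byte file identifier is `ET12`; FlatBuffers places the file identifier at byte offsets 4 to 7. Treat the identifier value as an assumption and adjust it if your ExecuTorch version differs. The helper name `looks_like_pte` is hypothetical.

```python
from pathlib import Path

# Assumption: ExecuTorch program files carry the FlatBuffers file identifier "ET12".
EXPECTED_IDENTIFIER = b"ET12"


def looks_like_pte(path: str) -> bool:
    """Return True if the file has at least 8 bytes and the expected identifier."""
    with open(path, "rb") as f:
        header = f.read(8)
    # FlatBuffers layout: bytes 0-3 are the root table offset, bytes 4-7 the identifier.
    return len(header) == 8 and header[4:8] == EXPECTED_IDENTIFIER


if __name__ == "__main__":
    pte_path = "xnnpack_mobilenetv2.pte"
    if Path(pte_path).exists():
        print(pte_path, "looks like a .pte file:", looks_like_pte(pte_path))
    else:
        print(pte_path, "not found; run the export step first")
```

This only inspects the header, so a passing check does not guarantee that the program loads or runs correctly.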
### Running the XNNPACK Model with CMake
After exporting the XNNPACK-delegated model, we can run it with example inputs using CMake. We build and use `xnn_executor_runner`, a sample wrapper for the ExecuTorch Runtime and the XNNPACK Backend. First, configure the CMake build:
```bash
# cd to the root of executorch repo
cd executorch

# Get a clean cmake-out directory
rm -rf cmake-out
mkdir cmake-out

# Configure cmake
cmake \
    -DCMAKE_INSTALL_PREFIX=cmake-out \
    -DCMAKE_BUILD_TYPE=Release \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-out .
```
Then build the runtime components with:

```bash
cmake --build cmake-out -j9 --target install --config Release
```

The built executable should now be at `./cmake-out/backends/xnnpack/xnn_executor_runner`. Run it with the model you generated:
```bash
./cmake-out/backends/xnnpack/xnn_executor_runner --model_path=./xnnpack_mobilenetv2.pte
```

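If you want to drive the runner from a script, for example to run several exported models in a row, a small wrapper around `subprocess` is enough. This is a minimal sketch: the runner path and the `--model_path` flag come from the steps above, while the helper names `runner_command` and `run_model` are hypothetical.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

# Location of the runner after the CMake build above, relative to the repo root.
RUNNER = Path("cmake-out/backends/xnnpack/xnn_executor_runner")


def runner_command(model_path: str, runner: Path = RUNNER) -> list[str]:
    """Build the argument list for one invocation of the runner."""
    return [str(runner), f"--model_path={model_path}"]


def run_model(model_path: str, runner: Path = RUNNER) -> subprocess.CompletedProcess:
    """Run the model and return the completed process with captured output."""
    if not runner.is_file():
        raise FileNotFoundError(f"runner not built: {runner}")
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"model not found: {model_path}")
    # check=True raises CalledProcessError if the runner exits with a non-zero status.
    return subprocess.run(
        runner_command(model_path, runner),
        capture_output=True,
        text=True,
        check=True,
    )


print(runner_command("./xnnpack_mobilenetv2.pte"))
```

Calling `run_model("./xnnpack_mobilenetv2.pte")` from the repository root then returns an object whose `stdout` and `stderr` hold whatever the runner printed.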
## Help & Improvements
If you have problems or questions, or have suggestions for ways to make
implementation and testing better, please reach out to the PyTorch Edge team or
create an issue on [GitHub](https://www.github.com/pytorch/executorch/issues).


## See Also
For more information about the XNNPACK Delegate, please check out the following resources:
- [ExecuTorch XNNPACK Delegate](https://pytorch.org/executorch/0.2/native-delegates-executorch-xnnpack-delegate.html)
- [Building and Running ExecuTorch with XNNPACK Backend](https://pytorch.org/executorch/0.2/native-delegates-executorch-xnnpack-delegate.html)