| Name | Date | Size | #Lines | LOC |
| --- | --- | --- | --- | --- |
| `_passes/` | 25-Apr-2025 | - | 1,885 | 1,401 |
| `operator_support/` | 25-Apr-2025 | - | 222 | 150 |
| `operators/` | 25-Apr-2025 | - | 2,714 | 1,883 |
| `quantizer/` | 25-Apr-2025 | - | 1,571 | 1,172 |
| `runtime/` | 25-Apr-2025 | - | 582 | 424 |
| `test/` | 25-Apr-2025 | - | 10,690 | 8,853 |
| `third-party/` | 25-Apr-2025 | - | | |
| `util/` | 25-Apr-2025 | - | 210 | 158 |
| `CMakeLists.txt` | 25-Apr-2025 | 1.2 KiB | 36 | 28 |
| `README.md` | 25-Apr-2025 | 6.4 KiB | 121 | 82 |
| `TARGETS` | 25-Apr-2025 | 2.8 KiB | 113 | 104 |
| `arm_backend.py` | 25-Apr-2025 | 10.4 KiB | 288 | 210 |
| `arm_partitioner.py` | 25-Apr-2025 | 3.3 KiB | 92 | 69 |
| `arm_vela.py` | 25-Apr-2025 | 4.1 KiB | 106 | 58 |
| `process_node.py` | 25-Apr-2025 | 7.4 KiB | 231 | 181 |
| `tosa_mapping.py` | 25-Apr-2025 | 3 KiB | 111 | 75 |
| `tosa_quant_utils.py` | 25-Apr-2025 | 13.6 KiB | 459 | 352 |
| `tosa_specification.py` | 25-Apr-2025 | 7.6 KiB | 227 | 163 |
| `tosa_utils.py` | 25-Apr-2025 | 9.2 KiB | 304 | 216 |

README.md

# ExecuTorch Arm/TOSA Delegate

This subtree contains the Arm(R) Delegate implementation for ExecuTorch.

This delegate is structured to, over time, support a number of different Arm devices
through an AoT flow which targets multiple Arm IP using the TOSA standard.

The expected flow is:
 * torch.nn.module -> TOSA -> command_stream for fully AoT flows e.g. embedded.
 * torch.nn.module -> TOSA for flows supporting a JiT compilation step.

Current backend support is being developed for TOSA to Ethos(TM)-U55/65/85 via the
ethos-u-vela compilation stack, which follows the fully AoT flow.

## Layout

Export:
- `arm_backend.py` - Main entrypoint for the ArmPartitioner and ArmBackend. For more information see the section on
[Arm Backend Architecture](#arm-backend-architecture). For examples of use see `executorch/examples/arm`.
- `tosa_mapping.py` - Utilities for mapping edge dialect to TOSA
- `tosa_quant_utils.py` - Utilities for mapping quantization information to TOSA encoding

Operators:
- `node_visitor.py` - Base class for edge operator lowering
- `op_*.py` - Edge operator lowering/serialization to TOSA

Passes:
- `arm_pass_manager.py` - Pass manager. Decides which passes need to be applied depending on the compile_spec.
- `*_pass.py` - Compiler passes derived from ExportPass

Quantization:
- `arm_quantizer.py` - Quantizer for Arm backend
- `arm_quantizer_utils.py` - Utilities for quantization

Runtime:
- `runtime/ArmBackendEthosU.cpp` - The Arm backend implementation of the ExecuTorch runtime backend (BackendInterface) for Ethos-U

Other:
- `third-party/` - Dependencies on other code, in particular the TOSA serialization_lib for compiling to TOSA and the ethos-u-core-driver for the bare-metal backend supporting Ethos-U
- `test/` - Unit tests and test support functions

## Unit tests
This is the structure of the test directory:

```
test                            #  Root test folder
├── misc                        #  Testing of debug features
├── models                      #  Full model tests
├── ops                         #  Single op tests
├── passes                      #  Compiler passes tests
├── tester                      #  Arm Tester class
├── tosautil                    #  Utility functions for TOSA artifacts
└── common.py                   #  Common functions and definitions used by many tests
```

Some example commands to run these tests follow. Run a single test:

```
python -m unittest backends.arm.test.ops.test_add.TestSimpleAdd -k test_add2_tosa_BI
```

Or all tests in "TestSimpleAdd":

```
python -m unittest backends.arm.test.ops.test_add.TestSimpleAdd
```

Or discover and run many tests:

```
python -m unittest discover -s backends/arm/test/ops/
```

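The `-k` filter shown above can also be applied from Python, which may be convenient when scripting test runs. The following is a minimal sketch and not part of this repository: `TestSimpleAdd` is a stand-in class with placeholder test bodies, and `load_matching` is a hypothetical helper. It relies only on the standard-library `unittest.TestLoader.testNamePatterns` attribute, which matches fnmatch-style patterns against the fully qualified test name.

```python
import unittest


class TestSimpleAdd(unittest.TestCase):
    """Stand-in for the real test case; the bodies are placeholders."""

    def test_add2_tosa_MI(self):
        self.assertEqual(1 + 1, 2)

    def test_add2_tosa_BI(self):
        self.assertEqual(2 + 2, 4)


def load_matching(test_case, patterns):
    """Load only the test methods whose full name matches one of the patterns.

    Hypothetical helper: patterns use fnmatch syntax and are compared against
    "<module>.<class>.<method>", so a leading "*" is needed to match a suffix.
    """
    loader = unittest.TestLoader()
    loader.testNamePatterns = patterns
    return loader.loadTestsFromTestCase(test_case)


# Select only the base inference variant, similar to "-k test_add2_tosa_BI".
suite = load_matching(TestSimpleAdd, ["*test_add2_tosa_BI"])
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike the command-line `-k` option, a pattern assigned to `testNamePatterns` directly is not wrapped in wildcards automatically, which is why the sketch adds the leading `*` itself.
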
### A note on unit tests

There are currently 3 ways we unit test our code.
1. TOSA main inference. These tests use non-quantized data and ops. The Edge IR representation of the module is lowered to a TOSA flatbuffer, which is tested for numerical correctness using the ```tosa_reference_model``` tool.
2. TOSA base inference. Same as above, but data and ops are quantized.
3. Ethos-U55. These tests use quantized data and ops (aka TOSA base inference). Edge IR is lowered to a TOSA flatbuffer, which is fed into the Vela compiler. These tests are functional tests and do not test numerical correctness, since that should be guaranteed by TOSA.

In order to distinguish between the different tests, the following suffixes have been added to the respective test case.
* ```_MI``` for main inference
* ```_BI``` for base inference
* ```_U55_BI``` for base inference on U55

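The suffix convention above can be used to group test names by target, for example when summarising a test run. The sketch below is illustrative only: `classify_test` and the table `SUFFIX_TO_TARGET` are hypothetical names, and `test_add2_U55_BI` is an assumed test name that simply follows the documented suffix.

```python
# Order matters: "_U55_BI" also ends with "_BI", so the longer suffix is checked first.
SUFFIX_TO_TARGET = (
    ("_U55_BI", "Ethos-U55 (base inference)"),
    ("_MI", "TOSA main inference"),
    ("_BI", "TOSA base inference"),
)


def classify_test(name: str) -> str:
    """Return the test target implied by the name's suffix, or "unknown"."""
    for suffix, target in SUFFIX_TO_TARGET:
        if name.endswith(suffix):
            return target
    return "unknown"


# The last two names are assumptions made for this example.
for test_name in ("test_add2_tosa_MI", "test_add2_tosa_BI", "test_add2_U55_BI", "test_helper"):
    print(f"{test_name}: {classify_test(test_name)}")
```
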
## Help & Improvements
If you have problems or questions, or have suggestions for ways to make
implementation and testing better, please reach out to the Arm team developing this delegate, or
create an issue on [GitHub](https://www.github.com/pytorch/executorch/issues).

# Arm Backend Architecture

The broad principle with the Arm backend implementation for ExecuTorch is to support multiple Arm devices and device configurations through a largely homogeneous flow with maximal sharing of class logic.

In practice for compilation, this means that the flow goes via [Arm TOSA](https://www.mlplatform.org/tosa/tosa_spec.html) to produce a common IR and quantization behaviour compatible with our various IP. Typically, device-specific backends then lower this further to a device-specific binary, which can happen ahead of time (within the Python development flow) or at runtime (during a JIT compilation stage).

In practice for the runtime, this means we will share common runtime backend functionality, with the aim for features like debugging to be available through common tooling.


## Arm Backend Status and Maturity

The Arm Backend should be considered prototype quality at this point, likely subject to significant change and improvement, and with limited coverage of functionality. We are actively developing this codebase.

## Current flows

The ArmBackend has a two-stage process:
- Compile to TOSA to rationalise the graph into known hardware support profiles. Currently this targets v0.80.0 TOSA BI, with particular attention to the subset that is supported on Ethos-U55, the target of the initial prototype efforts.
- Lower via the ethos-u-vela compilation flow, which takes TOSA v0.80.0 as input and produces a low-level command stream for the hardware. The command stream is then passed via the delegate to the ethos-u-core-driver for direct execution.

The ArmPartitioner is currently used to ensure the operations converted are Ethos-U compatible, but will be extended to offer spec-correct TOSA Base Inference and TOSA Main Inference generation in future.

### Controlling compilation

It is possible to control the compilation flow to aid in development and debug of both networks and the code itself.

Configuration of the ArmBackend export flow is controlled by CompileSpec information (essentially used as compilation flags) to determine which of these outputs is produced. In particular this allows for use of the tosa_reference_model to run intermediate output to check for correctness and quantization accuracy without a full loop via a hardware implementation.

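To illustrate how key/value compilation flags can select between the intermediate TOSA output and the full Vela flow, here is a minimal, self-contained sketch. It is not the ArmBackend implementation: the `CompileSpec` dataclass below is a local stand-in that only assumes a string key paired with a bytes value, the key `output_format` and its values `tosa` and `vela` are assumptions made for this example, and `parse_compile_specs` and `stops_at_tosa` are hypothetical helpers. Defaulting to the full flow when the flag is absent is likewise a choice made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CompileSpec:
    """Local stand-in for a compile flag: a string key with a bytes value."""

    key: str
    value: bytes


def parse_compile_specs(specs: List[CompileSpec]) -> Dict[str, str]:
    """Decode the flags into a plain dict; later entries override earlier ones."""
    return {spec.key: spec.value.decode("utf-8") for spec in specs}


def stops_at_tosa(specs: List[CompileSpec]) -> bool:
    """Return True when the (assumed) flags ask for the TOSA output only.

    The "output_format" key and the "vela" default are assumptions of this
    sketch, not documented ArmBackend behaviour.
    """
    options = parse_compile_specs(specs)
    return options.get("output_format", "vela") == "tosa"


# Request the intermediate TOSA output, e.g. to run it through tosa_reference_model.
debug_specs = [CompileSpec("output_format", b"tosa")]
print(stops_at_tosa(debug_specs))
```
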
As this is in active development, see the ArmBackend for accurate information on [compilation flags](https://github.com/pytorch/executorch/blob/29f6dc9353e90951ed3fae3c57ae416de0520067/backends/arm/arm_backend.py#L319-L324).

You can also refer to the [example TOSA end-to-end code](/examples/arm/arm_tosa_e2e.py).
