This subtree contains operator implementations that ExecuTorch clients can use and
contribute to. For internal users, please see `executorch/kernels/fb/README.md`.

## Layout

- `kernels`: Contains implementations and tests for the operators defined
  in the YAML files.
  - `kernels/portable/cpu`: Pure C++ implementations of the operators defined in the
    YAML files.
  - `kernels/optimized/cpu`: Optimized C++ implementations of the operators defined in the
    YAML files, for specific hardware platforms.
  - `kernels/aten`: A thin wrapper layer that hooks the ATen library into ExecuTorch.
  - `kernels/test`: Tests for all operator implementations. Since all
    implementations should behave identically, the same tests should pass for
    all target types.

## Help & Improvements

If you have problems or questions, or have suggestions for ways to make
implementation and testing better, please contact [Dave
Bort](https://fb.workplace.com/profile.php?id=100042415022179), [Mengwei
Liu](https://fb.workplace.com/profile.php?id=100024007250862), or [Martin
Yuan](https://fb.workplace.com/profile.php?id=100020734910364) on the PyTorch
Edge team.

## Contributing

Please follow these steps and guidelines when adding a new operator
implementation to this library. The goals of these guidelines are to:
- Make it straightforward to add new operator implementations.
- Ensure that the operator implementations are of high quality, and are easy to
  maintain.
- Make it easy for users to find available operator implementations, and to
  trust in their quality and behavioral stability.

### Your code must be compatible with ExecuTorch types

ExecuTorch does not use `at::Tensor`, `at::ScalarType`, `c10::Scalar`, or any of
the types defined by PyTorch core in the `at` or `c10` namespaces. To retain
tighter control over CPU and memory runtime behavior, ExecuTorch reimplements
compatible but restricted subsets of those types.

[`//runtime/core/exec_aten/exec_aten.h`](https://github.com/pytorch/executorch/blob/main/runtime/core/exec_aten/exec_aten.h)
contains the mapping between ATen/c10 types and the ExecuTorch types. The
ExecuTorch types are defined in headers under
[`//runtime/core/portable_type/`](https://github.com/pytorch/executorch/tree/main/runtime/core/portable_type).

The ExecuTorch types are source-compatible with the ATen/c10 types; if you write
code that works with the ExecuTorch types, then that same code should work when
built against ATen/c10. However, some features of `at::Tensor` and other
ATen/c10 types may not be present. In many cases this is intentional, but
in other cases we can consider adding the missing features.

### Declare the operator in a YAML file

We use YAML files to declare the ATen operators or custom operators being implemented by this kernel library.

Before implementing, the operator must be declared in exactly one of the
operator YAML files:
- [`//kernels/portable/functions.yaml`](https://github.com/pytorch/executorch/blob/main/kernels/portable/functions.yaml)
  - Add your entry here if your operator overload (e.g., `op: add.out`)
    appears in the core PyTorch file
    [`pytorch/aten/src/ATen/native/native_functions.yaml`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/native_functions.yaml).
  - Also add your entry to [`//kernels/aten/functions.yaml`](https://github.com/pytorch/executorch/blob/main/kernels/aten/functions.yaml) for test coverage.
- [`//kernels/portable/custom_ops.yaml`](https://github.com/pytorch/executorch/blob/main/kernels/portable/custom_ops.yaml)
  - Add your entry here if your operator overload does *not* appear in the core PyTorch `native_functions.yaml`.

The next sections describe how to add a YAML entry.

#### YAML Schema

This YAML file schema is a DSL to describe the operators and the kernels that implement them. The YAML file is a contract between AOT model export and runtime execution: when followed correctly, it ensures that the ExecuTorch runtime can link the C++ implementation of an operator to the exported model artifact. Here are some rules for writing your own YAML files.

**Out variants only**

ExecuTorch only supports out-style operators, where:
- The caller provides the output Tensor or Tensor list in the final position
  with the name `out`.
- The C++ function modifies and returns the same `out` argument.
  - If the return type in the YAML file is `()` (which maps to void), the C++
    function should still modify `out` but does not need to return anything.
- The `out` argument must be keyword-only, which means it needs to follow an
  argument named `*` like in the `add.out` example below.
- Conventionally, these out operators are named using the pattern `<name>.out`
  or `<name>.<overload>_out`.

Since all output values are returned via an `out` parameter, ExecuTorch ignores
the actual C++ function return value. However, to be consistent, functions should
always return `out` when the return type is non-`void`.
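As a rough illustration of the out-style calling convention, here is a minimal, self-contained sketch. `SketchTensor` is a hypothetical stand-in for a tensor type, not the ExecuTorch `Tensor` class, and the function body is only an example of "modify `out`, then return it".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor; NOT the ExecuTorch Tensor class.
struct SketchTensor {
  std::vector<float> data;
};

// Out-style operator: the caller owns `out`, the function fills it in and
// returns a reference to that same object.
SketchTensor& add_out(
    const SketchTensor& self,
    const SketchTensor& other,
    SketchTensor& out) {
  // This sketch requires all three tensors to have the same element count.
  assert(self.data.size() == other.data.size());
  assert(self.data.size() == out.data.size());
  for (std::size_t i = 0; i < out.data.size(); ++i) {
    out.data[i] = self.data[i] + other.data[i];
  }
  return out;
}
```

The caller allocates `out` up front and reads the result from it afterwards; the returned reference is just an alias for the same object.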

**Can only return `Tensor` or `()`**

ExecuTorch only supports operators that return a single `Tensor`, or the unit
type `()` (which maps to `void`). It does not support returning any other types,
including lists, optionals, tuples, or scalars like `bool`.

**Supported argument types**

ExecuTorch does not support all of the argument types that core PyTorch
supports. See [this
spreadsheet](https://docs.google.com/spreadsheets/d/1uArc0r1Yq1QSeyRJZKzZ8Wkz0eS9TsM39ghmMAZCXDA/edit#gid=0)
for the list of supported and unsupported types.
<!-- TODO(dbort): Once that list stabilizes, move to a table in this file
so that external users can see it. -->

**Functions only, no methods**

ExecuTorch does not support Tensor methods, and assumes `variants: function` for
all operators. Entries like `variants: method` or `variants: function, method`
will be ignored.

#### Add your operator entry

Some examples of operator entries:

ATen operator with a default kernel:
```
- op: add.out
  kernels:
    - arg_meta: null
      kernel_name: torch::executor::add_out
```

ATen operator with a dtype/dim order specialized kernel (works for the `Double` dtype, and the dim order must be (0, 1, 2, 3)):
```
- op: add.out
  type_alias:
    T0: [Double]
  dim_order_alias:
    D0: [[0, 1, 2, 3]]
  kernels:
    - arg_meta:
        self: [T0, D0]
        other: [T0, D0]
        out: [T0, D0]
      kernel_name: torch::executor::add_out
```

Custom operator with a default kernel:
```
- func: allclose.out(Tensor self, Tensor other, float rtol=1e-05, float atol=1e-08, bool equal_nan=False, bool dummy_param=False, *, Tensor(a!) out) -> Tensor(a!)
  kernels:
    - arg_meta: null
      kernel_name: torch::executor::allclose_out
```

Top level attributes:
* `op` (if the operator appears in `native_functions.yaml`) or `func` (for a custom operator). For the `op` key, the value must be the full operator name, including the overload name. For the `func` key, the value must be a full operator schema (namespace, operator name, operator overload name, and schema string). For the schema syntax, please refer to these [instructions](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/README.md).
* `kernels`: defines the kernel information. It consists of `arg_meta` and `kernel_name`, which are bound together to describe "for input tensors with these metadata, use this kernel".
* `type_alias` (optional): gives aliases to possible dtype options. `T0: [Double, Float]` means `T0` can be one of `Double` or `Float`.
* `dim_order_alias` (optional): similar to `type_alias`, gives names to possible dim order options.
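To make the "for input tensors with these metadata, use this kernel" binding more concrete, here is a small self-contained sketch of such a lookup. Everything in it (`DType`, `ArgMeta`, `KernelEntry`, `select_kernel`) is hypothetical and only illustrates the idea; it is not the ExecuTorch registry, it assumes every tensor argument shares one dtype and one dim order, and the "specialized first, then default" preference is an assumption of the sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not ExecuTorch types.
enum class DType { Float, Double, Int };
using DimOrder = std::vector<int>;

// Mirrors `type_alias` / `dim_order_alias`: the values each alias allows.
struct ArgMeta {
  std::vector<DType> dtypes;         // e.g. T0: [Double]
  std::vector<DimOrder> dim_orders;  // e.g. D0: [[0, 1, 2, 3]]
};

// One item of the `kernels` list: `arg_meta` bound to `kernel_name`.
// `has_arg_meta == false` plays the role of `arg_meta: null`.
struct KernelEntry {
  bool has_arg_meta;
  ArgMeta arg_meta;
  std::string kernel_name;
};

bool matches(const ArgMeta& meta, DType dtype, const DimOrder& order) {
  const bool dtype_ok =
      std::find(meta.dtypes.begin(), meta.dtypes.end(), dtype) !=
      meta.dtypes.end();
  const bool order_ok =
      std::find(meta.dim_orders.begin(), meta.dim_orders.end(), order) !=
      meta.dim_orders.end();
  return dtype_ok && order_ok;
}

// Prefer a specialized kernel whose metadata matches; otherwise fall back to
// the entry without metadata. Returns "" when nothing applies.
std::string select_kernel(
    const std::vector<KernelEntry>& kernels,
    DType dtype,
    const DimOrder& order) {
  std::string fallback;
  for (const KernelEntry& entry : kernels) {
    if (!entry.has_arg_meta) {
      fallback = entry.kernel_name;
    } else if (matches(entry.arg_meta, dtype, order)) {
      return entry.kernel_name;
    }
  }
  return fallback;
}
```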

Attributes under `kernels`:
* `arg_meta`: a list of "tensor arg name" entries. The values for these keys are the dtype and dim order aliases that are implemented by the corresponding `kernel_name`. A value of `null` means the kernel will be used for all types of input.
* `kernel_name`: the expected name of the C++ function that will implement this
  operator. You can use any name here, but you should follow the convention of
  replacing the `.` in the overload name with an underscore, and lowercasing all
  characters. In this example, `add.out` uses the C++ function named `add_out`.
  `add.Scalar_out` would become `add_scalar_out`, with a lowercase `S`. Kernels
  may be namespaced, but note that `native::` is inserted as the last level of
  the namespace. So `custom::add_out` in the `kernel_name` will point to
  `custom::native::add_out`.
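The naming convention can be sketched as two small string helpers. Both function names are hypothetical and exist only for illustration; the codegen tools do not expose them. How an unqualified `kernel_name` is handled is not stated here, so prepending `native::` in that case is an assumption of the sketch.

```cpp
#include <cctype>
#include <string>

// Derives the conventional C++ function name from an operator overload name:
// '.' becomes '_' and every character is lowercased.
std::string conventional_kernel_name(const std::string& op_name) {
  std::string result;
  result.reserve(op_name.size());
  for (unsigned char c : op_name) {
    result.push_back(c == '.' ? '_' : static_cast<char>(std::tolower(c)));
  }
  return result;
}

// Shows where `native::` ends up: just before the final name component.
std::string with_native_namespace(const std::string& kernel_name) {
  const std::string::size_type pos = kernel_name.rfind("::");
  if (pos == std::string::npos) {
    // Assumption: an unqualified name simply gains a `native::` prefix.
    return "native::" + kernel_name;
  }
  return kernel_name.substr(0, pos + 2) + "native::" +
      kernel_name.substr(pos + 2);
}
```

For example, `torch::executor::add_out` maps to `torch::executor::native::add_out`, which is why operator implementations live in a `native` namespace.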

### Find operator base name

The base name is the part of the operator name before the `.`, excluding any
trailing underscores. The rest of this document refers to this as `<name>`.

E.g., these operator overloads all have a base name of `add`:
- `add.Scalar`
- `add.Tensor`
- `add.out`
- `add_.Tensor`

So, if you were implementing `add.out` then your operator base name would be
`add`, and you would replace `<name>` with `add` everywhere below.
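The rule is simple enough to express as a short helper. The function name `operator_base_name` is hypothetical and only illustrates the definition above.

```cpp
#include <string>

// Returns the base name of an operator overload name such as "add_.Tensor".
std::string operator_base_name(const std::string& op_name) {
  // Keep only the part before the first '.'; without a '.', keep everything.
  std::string base = op_name.substr(0, op_name.find('.'));
  // Drop trailing underscores, such as the in-place marker in `add_`.
  while (!base.empty() && base.back() == '_') {
    base.pop_back();
  }
  return base;
}
```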

### Selective build

When using macros that require a `NAME` argument, e.g., `#define ET_SWITCH_REAL_TYPES_AND(ADDITIONAL, TYPE, CONTEXT, NAME, CTYPE_ALIAS, ...)`, make sure to pass in the same operator name defined in `functions.yaml`. This is the base name plus the variant, e.g., `add.out` or `add.Scalar_out`. The name is required for dtype selective build, which matches against the operator names and dtypes present in a model.
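To see why the `NAME` must match the YAML operator name exactly, consider this simplified model of the matching step. `SelectedKernels` and `is_selected` are hypothetical names; the real selective-build machinery is produced by the build and is not shown here.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical record of what a model needs: (operator name, dtype) pairs.
// This only illustrates the matching step, not the real implementation.
using SelectedKernels = std::set<std::pair<std::string, std::string>>;

// True when the (operator name, dtype) pair is present in the model's list.
bool is_selected(
    const SelectedKernels& selected,
    const std::string& op_name,
    const std::string& dtype) {
  return selected.count(std::make_pair(op_name, dtype)) > 0;
}
```

In this model, a kernel that passes `add_out` (the C++ function name) instead of `add.out` (the operator name) would never match, so its dtype branches would be treated as unused.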

### Overview of files and targets

For the operator base name `<name>`, you should work with these files. Sections below give more details about what they should contain.

- `./kernels/portable/cpu/op_<name>.cpp`: The implementations of operator overloads
  with base name `<name>`. This is the file that clients will link into their
  runtimes.
- `./kernels/portable/CMakeLists.txt`: The CMake build file for all the
  `op_<name>.cpp` files in the same directory.
- `./kernels/test/op_<name>_test.cpp`: Unit tests for the operator overloads
  with base name `<name>`.
  - Note that tests under this directory are specific to portable kernels. To
    share tests between multiple kernels, put the tests in `../test`.
  - Note that the tests do not live under `cpu`; tests should be
    implementation-agnostic. This will let us run the same tests against all
    implementations of a given operator, which should behave identically.
- `./kernels/test/CMakeLists.txt`: The CMake build file for all the
  `op_<name>_test.cpp` files in the same directory.

For an example, see the `add` operator (note that these are slightly different
from the `add` examples in this doc):
- [`executorch/kernels/portable/cpu/op_add.cpp`](https://github.com/pytorch/executorch/blob/main/kernels/portable/cpu/op_add.cpp):
  Implementations.
- [`./kernels/portable/CMakeLists.txt`](https://github.com/pytorch/executorch/blob/main/kernels/portable/CMakeLists.txt):
  Build portable ops.
- [`executorch/kernels/test/op_add_test.cpp`](https://github.com/pytorch/executorch/blob/main/kernels/test/op_add_test.cpp):
  Unit tests.
- [`./kernels/test/CMakeLists.txt`](https://github.com/pytorch/executorch/blob/main/kernels/test/CMakeLists.txt):
  Build kernel tests.

### Add the operator implementation to CMakeLists.txt

The portable operator files are collected by [`./kernels/portable/CMakeLists.txt`](https://github.com/pytorch/executorch/blob/main/kernels/portable/CMakeLists.txt) with a glob on `./kernels/portable/cpu/*.cpp`. Ensure your operator file is in that directory.

NOTE: A given `op_<name>` cannot implement both ATen-compatible and
non-ATen-compatible (i.e., custom) operators. We suggest adding the suffix
`_custom` if necessary: e.g., `op_add` for ATen-compatible overloads of
the `add` operator, and `op_add_custom` for non-ATen-compatible overloads.

NOTE: An `op_<name>` may not have dependencies outside of `//executorch`.
This library is intended to be portable, open-sourceable, and self-contained.

### Create a skeleton .cpp file for the operator implementation

If not already present, create the file
`executorch/kernels/portable/cpu/op_<name>.cpp`, which should follow the
pattern:
```
// Copyright (c) Meta Platforms, Inc. and affiliates.
#include <executorch/runtime/kernel/kernel_includes.h>

namespace torch {
namespace executor {
namespace native {

namespace {
  // <helper code>
} // namespace

// <operator overload implementations>

} // namespace native
} // namespace executor
} // namespace torch
```

### Find the function signature for the operator overload

When you add an entry to the YAML file, the codegen tools will generate an
expected function signature for you to implement in a file called
`NativeFunctions.h`. To build and find that generated header:

1. Build ExecuTorch.
```
cmake -DCMAKE_INSTALL_PREFIX=cmake-out \
  -DCMAKE_BUILD_TYPE=Release \
  -DPYTHON_EXECUTABLE=python \
  -Bcmake-out .
cmake --build cmake-out -j9 --target install --config Release
```
2. The generated `NativeFunctions.h` file is located in
```
cmake-out/kernels/portable/portable_ops_lib/NativeFunctions.h
```

Since this header is generated from the YAML files, re-run the build if you have modified your
operator's entry in those files.

Open the file and look for the function with the same name that you earlier
added in the YAML file. For `add_out`, this might look like
```
TORCH_API torch::executor::Tensor & add_out(const at::Tensor & self, const at::Tensor & other, at::Tensor & out);
```

This is the function signature that you will need to implement.

### Add a stub implementation

Now that you have your function signature, add a stub to the `op_<name>.cpp`
file that just returns the `out` argument. For example:
```
Tensor& add_out(
    const Tensor& self,
    const Tensor& other,
    Tensor& out) {
  return out;
}
```

Note that you should drop the `TORCH_API` attribute, and should drop `at::`.

### Create a skeleton test .cpp file

If not already present, create the file
`executorch/kernels/test/op_<name>_test.cpp`. Here's a suggested
starting point:
```
// Copyright (c) Meta Platforms, Inc. and affiliates.

#include <executorch/kernels/test/FunctionHeaderWrapper.h> // Declares the operator
#include <executorch/runtime/core/exec_aten/exec_aten.h>
#include <executorch/runtime/core/exec_aten/testing_util/tensor_factory.h>
#include <executorch/runtime/core/exec_aten/testing_util/tensor_util.h>

#include <gtest/gtest.h>

using namespace ::testing;
using exec_aten::ScalarType;
using exec_aten::Tensor;
using torch::executor::native::<operator_function_name>;
using torch::executor::testing::IsCloseTo;
using torch::executor::testing::TensorFactory;

TEST(Op<Name>Test, SmokeTest) {
  TensorFactory<ScalarType::Int> tf;

  Tensor a = tf.make(/*sizes=*/{2, 2}, /*data=*/{1, 1, 1, 1});
  Tensor b = tf.ones(/*sizes=*/{2, 2});
  Tensor z = tf.zeros(/*sizes=*/{2, 2});

  EXPECT_EQ(a, b); // Exact equality
  EXPECT_THAT(a, IsCloseTo(b)); // For floating-point tensors

  EXPECT_NE(a, z);
  EXPECT_THAT(a, Not(IsCloseTo(z)));
}
```

### Add operator test to CMakeLists.txt

Now, add the test to [`executorch/kernels/test/CMakeLists.txt`](https://github.com/pytorch/executorch/blob/main/kernels/test/CMakeLists.txt). Note that this builds all the kernel tests.

For portable kernels, add your test file to [`all_test_sources`](https://github.com/pytorch/executorch/blob/main/kernels/test/CMakeLists.txt#L69).

For optimized kernels, add your test file to [`_optimized_kernels_test_sources`](https://github.com/pytorch/executorch/blob/main/kernels/test/CMakeLists.txt#L230).

### Implement and test the operator

You should now be able to implement and test your operator. It's helpful to see
how other operators do it, so take a look at `op_add`:
- [`executorch/kernels/portable/cpu/op_add.cpp`](https://github.com/pytorch/executorch/blob/main/kernels/portable/cpu/op_add.cpp)
- [`executorch/kernels/test/op_add_test.cpp`](https://github.com/pytorch/executorch/blob/main/kernels/test/op_add_test.cpp)

Check out how it uses helper macros like `ET_CHECK_SAME_SHAPE_AND_DTYPE` and
`ET_FORALL_REAL_TYPES` when implementing the operator, and test helpers like
`TensorFactory` and `IsCloseTo()` when testing.
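If the "for all types" macro style is unfamiliar, the following self-contained sketch shows the general X-macro technique. The names `SketchScalarType`, `SKETCH_FORALL_TYPES`, and `element_size` are hypothetical stand-ins modeled on `ET_FORALL_REAL_TYPES`, not copied from it, and the type list is deliberately shortened.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical dtype tags; the real list of dtypes is longer.
enum class SketchScalarType { Byte, Int, Float, Double };

// X-macro: applies the macro `_` to every (C type, dtype name) pair.
#define SKETCH_FORALL_TYPES(_) \
  _(std::uint8_t, Byte)        \
  _(std::int32_t, Int)         \
  _(float, Float)              \
  _(double, Double)

// Returns the element size in bytes for a dtype tag. Each pair in the list
// above expands into one `case` of the switch.
std::size_t element_size(SketchScalarType type) {
  switch (type) {
#define SKETCH_SIZE_CASE(ctype, name) \
  case SketchScalarType::name:        \
    return sizeof(ctype);
    SKETCH_FORALL_TYPES(SKETCH_SIZE_CASE)
#undef SKETCH_SIZE_CASE
  }
  return 0; // Only reached for an out-of-range enum value.
}
```

The same list can drive other per-dtype code, such as instantiating a templated kernel body once per C type, which keeps the set of supported dtypes in one place.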

Once you have your operator and corresponding tests in place, we can try it out.

1. Build ExecuTorch.
```
cmake . \
  -DCMAKE_INSTALL_PREFIX=cmake-out \
  -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_BUILD_DEVTOOLS=ON \
  -DEXECUTORCH_BUILD_VULKAN=OFF \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -Bcmake-out

cmake --build cmake-out -j9 --target install
```
2. Build gtest.
```
mkdir -p third-party/googletest/build
cd third-party/googletest/build
cmake .. -DCMAKE_INSTALL_PREFIX=.
make -j4
make install
cd ../../../
```

3. Build kernel tests.
```
cmake kernels/test \
  -DCMAKE_BUILD_TYPE=Debug \
  -DCMAKE_INSTALL_PREFIX=cmake-out \
  -DEXECUTORCH_USE_CPP_CODE_COVERAGE=ON \
  -DCMAKE_PREFIX_PATH="$(pwd)/third-party/googletest/build" \
  -Bcmake-out/kernels/test
cmake --build cmake-out/kernels/test -j9
```
4. Run the tests. Your new test should appear in the output.
```
./cmake-out/kernels/test/portable_kernels_test
./cmake-out/kernels/test/optimized_kernels_test
```

#### Implementation restrictions

To reduce dependencies and size, to ensure portability, and to conform to the
restrictions of embedded environments, your operator implementations:

- Must not include C++ stdlib headers, or use C++ stdlib types. For example,
  `string`/`basic_string`, `vector`, `unordered_map`, `cout`, `unique_ptr`
  must not be used.
- Must not dynamically allocate memory, or cause memory to be dynamically
  allocated. All non-stack memory must be provided as a function parameter by
  the caller, typically via an `out` parameter or another tensor parameter to be
  used as scratch space.
  - This includes direct calls to `new`, `malloc`, `realloc`, etc., as well as
    operations that allocate under the hood like `make_unique`, or the creation
    of `vector` or `string`, for example.
- Must be stateless.
- Must be thread-safe. Note that the ExecuTorch environment does not provide
  a locking construct, so this means that operator implementations must not
  modify global memory.
- Must work in an environment without threads. This, along with the stateless
  requirement, means that thread local storage must not be used.
- Must not use `stdout`, `stderr`, or other file/stream IO via `printf`/`cout`
  etc.; instead, use `ET_LOG` from `executorch/runtime/platform/log.h`.
- Must not use `assert()`. Instead use `ET_CHECK` and other macros from
  `executorch/runtime/platform/assert.h`.
- Must not raise exceptions. Instead use `ET_CHECK` and other macros from
  `executorch/runtime/platform/assert.h`.

Note that not all of these apply to *every* ExecuTorch-compatible operator
implementation, only those included in this portable library.

For example, a target-specific custom operator that initiates a DMA copy would be
stateful, and would probably modify global memory, but it would need to use
target-specific APIs to do so. But since this library is only for portable
operator implementations, the operators it contains can't depend on
target-specific APIs like that.
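
As a rough illustration of a function shaped to stay within the restrictions above, here is a minimal self-contained sketch. `DemoSpan` and `demo_mul_out` are hypothetical names, and the struct is a much simpler stand-in for a real tensor type. The `bool` return is also only a placeholder assumption for this sketch: a real kernel in this library would report failures through `ET_CHECK` and the related macros.

```cpp
#include <stddef.h>

// Hypothetical, simplified stand-in for a tensor: a view over a buffer that
// the caller owns. The function below never allocates or frees memory.
struct DemoSpan {
  float* data;
  size_t size;
};

// Element-wise multiply that writes into a caller-provided `out` buffer.
// It keeps no state between calls, touches no globals, performs no IO, and
// does not throw; it signals a size mismatch or a null buffer by returning
// false.
bool demo_mul_out(const DemoSpan& a, const DemoSpan& b, DemoSpan& out) {
  if (a.size != b.size || a.size != out.size) {
    return false;
  }
  if (a.data == nullptr || b.data == nullptr || out.data == nullptr) {
    return false;
  }
  for (size_t i = 0; i < a.size; ++i) {
    out.data[i] = a.data[i] * b.data[i];
  }
  return true;
}
```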

### Shared kernel tests (executorch/kernels/test)

The portable kernel implementation and its corresponding tests can be used as a
reference for other kernels. We can also share the test cases in
`//executorch/kernels/test`, which contains common resources for kernel testing.

*generate_wrapper* generates a header, `FunctionHeaderWrapper.h`, which simply
includes the corresponding `Functions.h` file for the specified kernel:
`#include <executorch/kernels/{}/Functions.h>`. With that, the test sources don't need to know
which kernel we are testing or which `Functions.h` we should use.

With *_common_op_test* we use a single test source file (`op_<op>_test.cpp`) in this directory.
We automatically find the corresponding registered dispatch function through `Functions.h`, so
it can be used to test multiple kernels.

Kernel-specific test cases go in `<kernel>/test/`.

*supported_features* is used to distinguish between different kernel features. For example,
ATen supports mixing input and output dtypes while portable doesn't. When a test expects death
for such a case under portable, it can check the features supported by the running kernel and
skip that check if the feature is supported.
- The default value of each supported feature is in `test/supported_features.yaml`.
- Each kernel needs to override its supported features in `<kernel>/test/supported_features_def.yaml`.
  See the example in `supported_features_def_example.yaml`.
- This ensures that all kernels can share the same C++ test case source.
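
The bypass pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: `DemoSupportedFeatures`, the `mixed_dtype` field, and the `demo_*` functions are all hypothetical, and the real feature names are whatever the YAML files define. A real shared test would also rely on the test framework's death-test and skip facilities instead of a bare boolean.

```cpp
// Hypothetical stand-in for a per-kernel feature set. The field name is
// illustrative and is not taken from supported_features.yaml.
struct DemoSupportedFeatures {
  bool mixed_dtype = false;  // shared default, overridden per kernel
};

// Hypothetical feature set for a kernel that keeps the defaults.
inline DemoSupportedFeatures demo_portable_features() {
  return DemoSupportedFeatures{};
}

// Hypothetical feature set for a kernel that overrides one default.
inline DemoSupportedFeatures demo_aten_features() {
  DemoSupportedFeatures features;
  features.mixed_dtype = true;
  return features;
}

// A shared test that expects failure on mixed dtypes would run that check
// only when the kernel under test does not support the feature.
inline bool demo_should_expect_death_on_mixed_dtype(
    const DemoSupportedFeatures& features) {
  return !features.mixed_dtype;
}
```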