# ExecuTorch Vulkan Delegate

The ExecuTorch Vulkan delegate is a native GPU delegate for ExecuTorch that is
built on top of the cross-platform Vulkan GPU API standard. It is primarily
designed to leverage the GPU to accelerate model inference on Android devices,
but can be used on any platform that supports an implementation of Vulkan:
laptops, servers, and edge devices.

::::{note}
The Vulkan delegate is currently under active development, and its components
are subject to change.
::::

## What is Vulkan?

Vulkan is a low-level GPU API specification developed as a successor to OpenGL.
It is designed to offer developers more explicit control over GPUs than
previous specifications in order to reduce overhead and make full use of the
capabilities of modern graphics hardware.

Vulkan has been widely adopted among GPU vendors, and most modern GPUs (both
desktop and mobile) on the market support Vulkan. Vulkan is also included in
Android from Android 7.0 onwards.

**Note that Vulkan is a GPU API, not a GPU Math Library**. That is, it
provides a way to execute compute and graphics operations on a GPU, but does not
come with a built-in library of performant compute kernels.

## The Vulkan Compute Library

The ExecuTorch Vulkan Delegate is a wrapper around a standalone runtime known as
the **Vulkan Compute Library**. The aim of the Vulkan Compute Library is to
provide GPU implementations for PyTorch operators via GLSL compute shaders.

The Vulkan Compute Library is a fork/iteration of the [PyTorch Vulkan Backend](https://pytorch.org/tutorials/prototype/vulkan_workflow.html).
The core components of the PyTorch Vulkan backend were forked into ExecuTorch
and adapted for an AOT graph-mode style of model inference (as opposed to
PyTorch, which adopted an eager execution style of model inference).

The components of the Vulkan Compute Library are contained in the
`executorch/backends/vulkan/runtime/` directory. The core components are listed
and described below:

```
runtime/
├── api/ .................... Wrapper API around Vulkan to manage Vulkan objects
└── graph/ .................. ComputeGraph class which implements graph mode inference
    └── ops/ ................ Base directory for operator implementations
        ├── glsl/ ........... GLSL compute shaders
        │   ├── *.glsl
        │   └── conv2d.glsl
        └── impl/ ........... C++ code to dispatch GPU compute shaders
            ├── *.cpp
            └── Conv2d.cpp
```

## Features

The Vulkan delegate currently supports the following features:

* **Memory Planning**
  * Intermediate tensors whose lifetimes do not overlap will share memory allocations. This reduces the peak memory usage of model inference.
* **Capability Based Partitioning**
  * A graph can be partially lowered to the Vulkan delegate via a partitioner, which identifies nodes (i.e. operators) that are supported by the Vulkan delegate and lowers only the supported subgraphs.
* **Support for upper-bound dynamic shapes**
  * A tensor can change shape between inferences as long as its current shape stays within the bounds specified during lowering.
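
To make the memory planning idea concrete, here is a minimal sketch of lifetime-based buffer sharing. It is not the delegate's actual planner: `TensorSpec`, `plan_shared_buffers`, and the greedy first-fit strategy are all assumptions made for illustration, and lifetimes are expressed as hypothetical operator indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorSpec:
    """Hypothetical description of an intermediate tensor."""
    name: str
    nbytes: int
    first_use: int  # index of the op that produces the tensor
    last_use: int   # index of the last op that reads the tensor


def plan_shared_buffers(tensors):
    """Greedily assign tensors to buffers; disjoint lifetimes may share a buffer.

    Returns (assignment, buffer_sizes), where assignment maps a tensor name to
    a buffer index and buffer_sizes[i] is the size in bytes of buffer i.
    """
    buffer_sizes = []    # size of each buffer
    buffer_busy_until = []  # last_use of the tensor currently held by each buffer
    assignment = {}
    for t in sorted(tensors, key=lambda t: t.first_use):
        for i, busy_until in enumerate(buffer_busy_until):
            if busy_until < t.first_use:  # previous occupant is no longer live
                buffer_busy_until[i] = t.last_use
                buffer_sizes[i] = max(buffer_sizes[i], t.nbytes)
                assignment[t.name] = i
                break
        else:
            # No free buffer: allocate a new one for this tensor.
            buffer_busy_until.append(t.last_use)
            buffer_sizes.append(t.nbytes)
            assignment[t.name] = len(buffer_sizes) - 1
    return assignment, buffer_sizes


tensors = [
    TensorSpec("a", 1024, first_use=0, last_use=1),
    TensorSpec("b", 2048, first_use=1, last_use=2),
    TensorSpec("c", 512, first_use=2, last_use=3),
]
assignment, buffer_sizes = plan_shared_buffers(tensors)
naive_bytes = sum(t.nbytes for t in tensors)  # one allocation per tensor
planned_bytes = sum(buffer_sizes)             # allocations after sharing
```

In this toy input, `a` and `c` are never live at the same time, so they share one buffer and the total allocation drops from 3584 to 3072 bytes.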

In addition to increasing operator coverage, the following features are
currently in development:

* **Quantization Support**
  * We are currently working on support for 8-bit dynamic quantization, with plans to extend to other quantization schemes in the future.
* **Memory Layout Management**
  * Memory layout is an important factor in optimizing performance. We plan to add graph passes that insert memory layout transitions throughout a graph to optimize memory-layout-sensitive operators such as Convolution and Matrix Multiplication.
* **Selective Build**
  * We plan to make it possible to control build size by selecting which operators/shaders to build.

## End to End Example

To further understand the features of the Vulkan Delegate and how to use it,
consider the following end to end example with a simple single operator model.

### Compile and lower a model to the Vulkan Delegate

Once ExecuTorch has been set up and installed, the following script can be used
to generate a simple model, lower it to the Vulkan delegate, and save it as
`vk_add.pte`.

```python
# Note: this script is the same as the script from the "Setting up ExecuTorch"
# page, with one minor addition to lower to the Vulkan backend.
import torch
from torch.export import export
from executorch.exir import to_edge

from executorch.backends.vulkan.partitioner.vulkan_partitioner import VulkanPartitioner

# Start with a PyTorch model that adds two input tensors (matrices)
class Add(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y

# 1. torch.export: Defines the program with the ATen operator set.
aten_dialect = export(Add(), (torch.ones(1), torch.ones(1)))

# 2. to_edge: Make optimizations for Edge devices
edge_program = to_edge(aten_dialect)
# 2.1 Lower to the Vulkan backend
edge_program = edge_program.to_backend(VulkanPartitioner())

# 3. to_executorch: Convert the graph to an ExecuTorch program
executorch_program = edge_program.to_executorch()

# 4. Save the compiled .pte program
with open("vk_add.pte", "wb") as file:
    file.write(executorch_program.buffer)
```

Like other ExecuTorch delegates, a model can be lowered to the Vulkan Delegate
using the `to_backend()` API. The Vulkan Delegate implements the
`VulkanPartitioner` class, which identifies nodes (i.e. operators) in the graph
that are supported by the Vulkan delegate, and separates compatible sections of
the model to be executed on the GPU.

This means that a model can be lowered to the Vulkan delegate even if it contains
some unsupported operators. In that case, only parts of the graph will be
executed on the GPU.
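
The partitioning behavior described above can be illustrated with a small, self-contained sketch. This is not the `VulkanPartitioner` implementation: the `SUPPORTED_OPS` set, the `partition` helper, and the `"vulkan"`/`"portable"` labels are hypothetical, and a real graph is a DAG rather than the flat operator sequence assumed here.

```python
from itertools import groupby

# Hypothetical set of supported operator names, for illustration only.
SUPPORTED_OPS = {"add", "sub", "mul", "div"}


def partition(ops, supported=SUPPORTED_OPS):
    """Split a linear sequence of op names into (target, ops) segments.

    Consecutive supported ops are grouped into one segment for the delegate;
    everything else stays in segments that fall back to other operators.
    """
    segments = []
    for is_supported, run in groupby(ops, key=lambda op: op in supported):
        target = "vulkan" if is_supported else "portable"
        segments.append((target, list(run)))
    return segments


segments = partition(["add", "mul", "softmax", "sub"])
for target, ops in segments:
    print(f"{target}: {ops}")
```

Here the unsupported `softmax` splits the sequence into two delegated segments with a fallback segment between them, which is the sense in which only parts of a graph run on the GPU.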

::::{note}
The [supported ops list](https://github.com/pytorch/executorch/blob/main/backends/vulkan/partitioner/supported_ops.py)
in the Vulkan partitioner code can be inspected to see which ops are currently
implemented in the Vulkan delegate.
::::

### Build Vulkan Delegate libraries

The easiest way to build and test the Vulkan Delegate is to build for Android
and test on a local Android device. Android devices have built-in support for
Vulkan, and the Android NDK ships with a GLSL compiler, which is needed to
compile the Vulkan Compute Library's GLSL compute shaders.

The Vulkan Delegate libraries can be built by setting `-DEXECUTORCH_BUILD_VULKAN=ON`
when building with CMake.

First, make sure that you have the Android NDK installed; any NDK version past
NDK r19c should work. Note that the examples in this doc have been validated with
NDK r27b. The Android SDK should also be installed so that you have access to `adb`.

The instructions on this page assume that the following environment variables
are set.

```shell
export ANDROID_NDK=<path_to_ndk>
# Select the appropriate Android ABI for your device
export ANDROID_ABI=arm64-v8a
# All subsequent commands should be performed from ExecuTorch repo root
cd <path_to_executorch_root>
# Make sure adb works
adb --version
```

To build and install ExecuTorch libraries (for Android) with the Vulkan
Delegate:

```shell
# From executorch root directory
(rm -rf cmake-android-out && \
  cmake . -DCMAKE_INSTALL_PREFIX=cmake-android-out \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=$ANDROID_ABI \
    -DEXECUTORCH_BUILD_VULKAN=ON \
    -DPYTHON_EXECUTABLE=python \
    -Bcmake-android-out && \
  cmake --build cmake-android-out -j16 --target install)
```

### Run the Vulkan model on device

::::{note}
Since operator support is currently limited, only binary arithmetic operators
will run on the GPU. Expect inference to be slow, as the majority of operators
are executed via Portable operators.
::::

Now, the partially delegated model can be executed (partially) on your device's
GPU!

```shell
# Build a model runner binary linked with the Vulkan delegate libs
cmake --build cmake-android-out --target vulkan_executor_runner -j32

# Push model to device
adb push vk_add.pte /data/local/tmp/vk_add.pte
# Push binary to device
adb push cmake-android-out/backends/vulkan/vulkan_executor_runner /data/local/tmp/runner_bin

# Run the model
adb shell /data/local/tmp/runner_bin --model_path /data/local/tmp/vk_add.pte
```
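
If you prefer to script the push-and-run steps, the sketch below assembles the same `adb` commands from Python. The helper names `build_adb_commands` and `run_on_device` are hypothetical and not part of ExecuTorch; the sketch assumes `adb` is on your `PATH` and that the runner has already been built as shown above.

```python
import posixpath
import subprocess


def build_adb_commands(model_path, runner_path, device_dir="/data/local/tmp"):
    """Return the adb argument lists that push the model and runner, then run the model."""
    device_model = posixpath.join(device_dir, posixpath.basename(model_path))
    device_runner = posixpath.join(device_dir, "runner_bin")
    return [
        ["adb", "push", model_path, device_model],
        ["adb", "push", runner_path, device_runner],
        ["adb", "shell", device_runner, "--model_path", device_model],
    ]


def run_on_device(model_path, runner_path):
    """Execute the commands in order, raising on the first one that fails."""
    for cmd in build_adb_commands(model_path, runner_path):
        subprocess.run(cmd, check=True)


# Building the command lists does not touch the device; call run_on_device()
# with the same arguments to actually execute them.
commands = build_adb_commands(
    "vk_add.pte",
    "cmake-android-out/backends/vulkan/vulkan_executor_runner",
)
```

Keeping command construction separate from execution makes it easy to print or review the commands before anything is pushed to a device.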