TensorFlow Lite delegate
========================

Mesa contains a TensorFlow Lite delegate that can make use of NPUs to accelerate ML inference. It is implemented as an *external delegate*, a shared library that the TensorFlow Lite runtime can load at startup. See https://www.tensorflow.org/api_docs/python/tf/lite/experimental/load_delegate.

.. list-table:: Supported acceleration hardware
   :header-rows: 1

   * - Gallium driver
     - NPU supported
     - Hardware tested
   * - Etnaviv
     - ``VeriSilicon VIPNano-QI.7120``
     - ``Amlogic A311D on Libre Computer AML-A311D-CC Alta and Khadas VIM3``

.. list-table:: Tested models
   :header-rows: 1

   * - Model name
     - Data type
     - Link (may be outdated)
     - Status
     - Inference speed on AML-A311D-CC Alta
   * - MobileNet V1
     - UINT8
     - http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz
     - Fully supported
     - ~15 ms
   * - MobileNet V2
     - UINT8
     - https://storage.googleapis.com/mobilenet_v2/checkpoints/quantized_v2_224_100.tgz
     - Fully supported
     - ~15.5 ms
   * - SSDLite MobileDet
     - UINT8
     - https://raw.githubusercontent.com/google-coral/test_data/master/ssdlite_mobiledet_coco_qat_postprocess.tflite
     - Fully supported
     - ~53 ms

Build
-----

Build Mesa as usual, adding the ``-Dteflon=true`` argument.

Example instructions:

.. code-block:: console

   # Install build dependencies
   ~ # apt-get -y build-dep mesa
   ~ # apt-get -y install git cmake

   # Download sources
   ~ $ git clone https://gitlab.freedesktop.org/mesa/mesa.git

   # Build Mesa
   ~ $ cd mesa
   mesa $ meson setup build -Dgallium-drivers=etnaviv -Dvulkan-drivers= -Dteflon=true
   mesa $ meson compile -C build

Install runtime dependencies
----------------------------

Your board should be running a mainline kernel, version 6.7 or later, with the etnaviv driver loaded. You will also need to enable the NPU device in the device tree, either by means of an overlay or by a change such as the one below (rebuild the DTB afterwards):

.. code-block:: diff

   diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   index 4aa2b20bfbf2..4e8266056bca 100644
   --- a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   +++ b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   @@ -50,6 +50,10 @@ galcore {
                   };
    };

   +&npu {
   +       status = "okay";
   +};
   +
    /*
     * The VIM3 on-board MCU can mux the PCIe/USB3.0 shared differential
     * lines using a FUSB340TMX USB 3.1 SuperSpeed Data Switch between

.. code-block:: console

   # Install Python 3.10 and dependencies (as root)
   ~ # echo deb-src http://deb.debian.org/debian testing main >> /etc/apt/sources.list
   ~ # echo deb http://deb.debian.org/debian unstable main >> /etc/apt/sources.list
   ~ # echo 'APT::Default-Release "testing";' >> /etc/apt/apt.conf
   ~ # apt-get update
   ~ # apt-get -y install python3.10 python3-pytest python3-exceptiongroup

   # Install TensorFlow Lite Python package (as non-root)
   ~ $ python3.10 -m pip install --break-system-packages tflite-runtime==2.13.0

Do some inference with MobileNetV1
----------------------------------

.. code-block:: console

   ~ $ cd mesa/
   mesa $ TEFLON_DEBUG=verbose ETNA_MESA_DEBUG=ml_dbgs python3.10 src/gallium/frontends/teflon/tests/classification.py -i ~/tensorflow/assets/grace_hopper.bmp -m src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite -l src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt -e build/src/gallium/targets/teflon/libteflon.so

   Loading external delegate from build/src/gallium/targets/teflon/libteflon.so with args: {}
   Teflon delegate: loaded etnaviv driver

   teflon: compiling graph: 89 tensors 28 operations
   idx scale     zp has_data size
   =======================================
     0 0.023528   0 no       1x1x1x1024
     1 0.166099  42 no       1x1x1x1001
     2 0.000117   0 yes      1001x0x0x0
     3 0.004987  4a yes      1001x1x1x1024
     4 0.166099  42 no       1x1001x0x0
     5 0.166099  42 yes      2x0x0x0
     6 0.000171   0 yes      32x0x0x0
     7 0.023528   0 no       1x112x112x32
     8 0.021827  97 yes      32x3x3x3
     9 0.023528   0 no       1x14x14x512
   ...

   idx type    in out  operation type-specific
   ================================================================================================
     0 CONV    88   7  w: 8 b: 6 stride: 2 pad: SAME
     1 DWCONV   7  33  w: 35 b: 34 stride: 1 pad: SAME
     2 CONV    33  37  w: 38 b: 36 stride: 1 pad: SAME
     3 DWCONV  37  39  w: 41 b: 40 stride: 2 pad: SAME
     4 CONV    39  43  w: 44 b: 42 stride: 1 pad: SAME
     5 DWCONV  43  45  w: 47 b: 46 stride: 1 pad: SAME
     6 CONV    45  49  w: 50 b: 48 stride: 1 pad: SAME
     7 DWCONV  49  51  w: 53 b: 52 stride: 2 pad: SAME
     8 CONV    51  55  w: 56 b: 54 stride: 1 pad: SAME
     9 DWCONV  55  57  w: 59 b: 58 stride: 1 pad: SAME
    10 CONV    57  61  w: 62 b: 60 stride: 1 pad: SAME
    11 DWCONV  61  63  w: 65 b: 64 stride: 2 pad: SAME
    12 CONV    63  67  w: 68 b: 66 stride: 1 pad: SAME
    13 DWCONV  67  69  w: 71 b: 70 stride: 1 pad: SAME
    14 CONV    69  73  w: 74 b: 72 stride: 1 pad: SAME
    15 DWCONV  73  75  w: 77 b: 76 stride: 1 pad: SAME
    16 CONV    75  79  w: 80 b: 78 stride: 1 pad: SAME
    17 DWCONV  79  81  w: 83 b: 82 stride: 1 pad: SAME
    18 CONV    81  85  w: 86 b: 84 stride: 1 pad: SAME
    19 DWCONV  85   9  w: 11 b: 10 stride: 1 pad: SAME
    20 CONV     9  13  w: 14 b: 12 stride: 1 pad: SAME
    21 DWCONV  13  15  w: 17 b: 16 stride: 1 pad: SAME
    22 CONV    15  19  w: 20 b: 18 stride: 1 pad: SAME
    23 DWCONV  19  21  w: 23 b: 22 stride: 2 pad: SAME
    24 CONV    21  25  w: 26 b: 24 stride: 1 pad: SAME
    25 DWCONV  25  27  w: 29 b: 28 stride: 1 pad: SAME
    26 CONV    27  31  w: 32 b: 30 stride: 1 pad: SAME
    27 POOL    31   0  filter: 0x0 stride: 0 pad: VALID

   teflon: compiled graph, took 10307 ms
   teflon: invoked graph, took 21 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 16 ms
   0.866667: military uniform
   0.031373: Windsor tie
   0.015686: mortarboard
   0.007843: bow tie
   0.007843: academic
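
The delegate can also be loaded from your own Python script instead of through ``classification.py``. The following is a minimal sketch, not part of Mesa. It assumes the ``tflite-runtime`` package installed earlier; the helper names ``load_teflon_interpreter`` and ``time_invocations`` are hypothetical and exist only for illustration. The input tensor is left unset, so only the timings are meaningful, not the model output. The times are wall-clock measurements around ``invoke()`` and may differ from the figures that Teflon itself logs.

.. code-block:: python

   import time


   def load_teflon_interpreter(model_path, delegate_path):
       """Create a TFLite interpreter that offloads work to the external delegate."""
       # Imported lazily so that this module can be loaded on machines
       # without tflite-runtime installed.
       from tflite_runtime.interpreter import Interpreter, load_delegate

       # load_delegate() opens the shared library (here: libteflon.so).
       delegate = load_delegate(delegate_path)
       interpreter = Interpreter(
           model_path=model_path,
           experimental_delegates=[delegate],
       )
       # Tensors must be allocated before the first invoke().
       interpreter.allocate_tensors()
       return interpreter


   def time_invocations(interpreter, runs=5):
       """Invoke the graph `runs` times and return each duration in milliseconds."""
       timings_ms = []
       for _ in range(runs):
           start = time.perf_counter()
           interpreter.invoke()
           timings_ms.append((time.perf_counter() - start) * 1000.0)
       return timings_ms

Run from the ``mesa`` directory, ``time_invocations(load_teflon_interpreter("src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite", "build/src/gallium/targets/teflon/libteflon.so"))`` would return a list of per-invocation times. The first entry is likely to be higher than the rest, as in the log above.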