TensorFlow Lite delegate
========================

Mesa contains a TensorFlow Lite delegate that can use NPUs to accelerate ML inference. It is implemented as an *external delegate*, a shared library that the TensorFlow Lite runtime can load at startup. See https://www.tensorflow.org/api_docs/python/tf/lite/experimental/load_delegate.

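As a rough sketch of what loading an external delegate looks like from Python, the snippet below uses ``load_delegate`` from the ``tflite_runtime`` package. The helper names ``make_interpreter`` and ``run_once`` are illustrative assumptions and not part of Mesa; the delegate path is wherever your build places ``libteflon.so``.

.. code-block:: python

   def make_interpreter(model_path, delegate_path):
       """Create a TFLite interpreter that hands supported ops to the delegate."""
       # Imported lazily so the helper can be defined without tflite-runtime installed.
       from tflite_runtime.interpreter import Interpreter, load_delegate

       # Load the shared library as an external delegate.
       delegate = load_delegate(delegate_path)
       interpreter = Interpreter(model_path=model_path,
                                 experimental_delegates=[delegate])
       interpreter.allocate_tensors()
       return interpreter


   def run_once(interpreter):
       """Run a single inference on an all-zero input and return the first output."""
       import numpy as np

       input_details = interpreter.get_input_details()[0]
       # Dummy input matching the shape and dtype the model expects.
       dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
       interpreter.set_tensor(input_details["index"], dummy)
       interpreter.invoke()
       output_details = interpreter.get_output_details()[0]
       return interpreter.get_tensor(output_details["index"])

Calling ``run_once(make_interpreter(model, delegate))`` with a quantized model and the path to the delegate library should exercise the NPU path. Operations that a delegate does not claim generally remain on the TensorFlow Lite CPU path.
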
.. list-table:: Supported acceleration hardware
   :header-rows: 1

   * - Gallium driver
     - NPU supported
     - Hardware tested
   * - Etnaviv
     - ``VeriSilicon VIPNano-QI.7120``
     - ``Amlogic A311D on Libre Computer AML-A311D-CC Alta and Khadas VIM3``

.. list-table:: Tested models
   :header-rows: 1

   * - Model name
     - Data type
     - Link (may be outdated)
     - Status
     - Inference speed on AML-A311D-CC Alta
   * - MobileNet V1
     - UINT8
     - http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224_quant.tgz
     - Fully supported
     - ~15 ms
   * - MobileNet V2
     - UINT8
     - https://storage.googleapis.com/mobilenet_v2/checkpoints/quantized_v2_224_100.tgz
     - Fully supported
     - ~15.5 ms
   * - SSDLite MobileDet
     - UINT8
     - https://raw.githubusercontent.com/google-coral/test_data/master/ssdlite_mobiledet_coco_qat_postprocess.tflite
     - Fully supported
     - ~53 ms

Build
-----

Build Mesa as usual, with the ``-Dteflon=true`` argument.

Example instructions:

.. code-block:: console

   # Install build dependencies
   ~ # apt-get -y build-dep mesa
   ~ # apt-get -y install git cmake

   # Download sources
   ~ $ git clone https://gitlab.freedesktop.org/mesa/mesa.git

   # Build Mesa
   ~ $ cd mesa
   mesa $ meson setup build -Dgallium-drivers=etnaviv -Dvulkan-drivers= -Dteflon=true
   mesa $ meson compile -C build

Install runtime dependencies
----------------------------

Your board should be running a mainline kernel, version 6.7 or later, with the etnaviv driver loaded. You will also need to enable the NPU device in the device tree, either with an overlay or with a change such as the one below (rebuilding the DTB afterwards):

.. code-block:: diff

   diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   index 4aa2b20bfbf2..4e8266056bca 100644
   --- a/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   +++ b/arch/arm64/boot/dts/amlogic/meson-g12b-a311d-khadas-vim3.dts
   @@ -50,6 +50,10 @@ galcore {
         };
   };

   +&npu {
   +       status = "okay";
   +};
   +
   /*
   * The VIM3 on-board  MCU can mux the PCIe/USB3.0 shared differential
   * lines using a FUSB340TMX USB 3.1 SuperSpeed Data Switch between


.. code-block:: console

   # Install Python 3.10 and dependencies (as root)
   ~ # echo deb-src http://deb.debian.org/debian testing main >> /etc/apt/sources.list
   ~ # echo deb http://deb.debian.org/debian unstable main >> /etc/apt/sources.list
   ~ # echo 'APT::Default-Release "testing";' >> /etc/apt/apt.conf
   ~ # apt-get update
   ~ # apt-get -y install python3.10 python3-pytest python3-exceptiongroup

   # Install TensorFlow Lite Python package (as non-root)
   ~ $ python3.10 -m pip install --break-system-packages tflite-runtime==2.13.0

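The kernel requirements in this section can be sanity-checked on the board before running anything. The following is a minimal sketch, not part of Mesa: the helper names ``kernel_at_least`` and ``module_loaded`` are made up for illustration, and the module check assumes etnaviv is built as a loadable module (a built-in driver does not appear in ``/proc/modules``).

.. code-block:: python

   import re


   def kernel_at_least(release, minimum=(6, 7)):
       """Return True if a kernel release string such as '6.7.0-rc1' is >= minimum."""
       match = re.match(r"(\d+)\.(\d+)", release)
       if match is None:
           raise ValueError(f"unrecognized kernel release: {release!r}")
       # Compare (major, minor) as a tuple so that 6.12 sorts after 6.7.
       return (int(match.group(1)), int(match.group(2))) >= minimum


   def module_loaded(name, proc_modules="/proc/modules"):
       """Return True if a loadable module called `name` is listed in /proc/modules."""
       try:
           with open(proc_modules) as f:
               # The first whitespace-separated field of each line is the module name.
               return any(line.split()[0] == name for line in f if line.strip())
       except OSError:
           # File missing or unreadable: report the module as not found.
           return False

On the board, ``kernel_at_least(platform.release())`` (after ``import platform``) and ``module_loaded("etnaviv")`` would both be expected to return ``True``.
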
Do some inference with MobileNetV1
----------------------------------

.. code-block:: console

   ~ $ cd mesa/
   mesa $ TEFLON_DEBUG=verbose ETNA_MESA_DEBUG=ml_dbgs python3.10 src/gallium/frontends/teflon/tests/classification.py -i ~/tensorflow/assets/grace_hopper.bmp -m src/gallium/targets/teflon/tests/mobilenet_v1_1.0_224_quant.tflite -l src/gallium/frontends/teflon/tests/labels_mobilenet_quant_v1_224.txt -e build/src/gallium/targets/teflon/libteflon.so

   Loading external delegate from build/src/gallium/targets/teflon/libteflon.so with args: {}
   Teflon delegate: loaded etnaviv driver

   teflon: compiling graph: 89 tensors 28 operations
   idx scale     zp has_data size
   =======================================
   0 0.023528   0 no       1x1x1x1024
   1 0.166099  42 no       1x1x1x1001
   2 0.000117   0 yes      1001x0x0x0
   3 0.004987  4a yes      1001x1x1x1024
   4 0.166099  42 no       1x1001x0x0
   5 0.166099  42 yes      2x0x0x0
   6 0.000171   0 yes      32x0x0x0
   7 0.023528   0 no       1x112x112x32
   8 0.021827  97 yes      32x3x3x3
   9 0.023528   0 no       1x14x14x512
   ...

   idx type    in out  operation type-specific
   ================================================================================================
   0 CONV    88   7  w: 8 b: 6 stride: 2 pad: SAME
   1 DWCONV   7  33  w: 35 b: 34 stride: 1 pad: SAME
   2 CONV    33  37  w: 38 b: 36 stride: 1 pad: SAME
   3 DWCONV  37  39  w: 41 b: 40 stride: 2 pad: SAME
   4 CONV    39  43  w: 44 b: 42 stride: 1 pad: SAME
   5 DWCONV  43  45  w: 47 b: 46 stride: 1 pad: SAME
   6 CONV    45  49  w: 50 b: 48 stride: 1 pad: SAME
   7 DWCONV  49  51  w: 53 b: 52 stride: 2 pad: SAME
   8 CONV    51  55  w: 56 b: 54 stride: 1 pad: SAME
   9 DWCONV  55  57  w: 59 b: 58 stride: 1 pad: SAME
   10 CONV    57  61  w: 62 b: 60 stride: 1 pad: SAME
   11 DWCONV  61  63  w: 65 b: 64 stride: 2 pad: SAME
   12 CONV    63  67  w: 68 b: 66 stride: 1 pad: SAME
   13 DWCONV  67  69  w: 71 b: 70 stride: 1 pad: SAME
   14 CONV    69  73  w: 74 b: 72 stride: 1 pad: SAME
   15 DWCONV  73  75  w: 77 b: 76 stride: 1 pad: SAME
   16 CONV    75  79  w: 80 b: 78 stride: 1 pad: SAME
   17 DWCONV  79  81  w: 83 b: 82 stride: 1 pad: SAME
   18 CONV    81  85  w: 86 b: 84 stride: 1 pad: SAME
   19 DWCONV  85   9  w: 11 b: 10 stride: 1 pad: SAME
   20 CONV     9  13  w: 14 b: 12 stride: 1 pad: SAME
   21 DWCONV  13  15  w: 17 b: 16 stride: 1 pad: SAME
   22 CONV    15  19  w: 20 b: 18 stride: 1 pad: SAME
   23 DWCONV  19  21  w: 23 b: 22 stride: 2 pad: SAME
   24 CONV    21  25  w: 26 b: 24 stride: 1 pad: SAME
   25 DWCONV  25  27  w: 29 b: 28 stride: 1 pad: SAME
   26 CONV    27  31  w: 32 b: 30 stride: 1 pad: SAME
   27 POOL    31   0  filter: 0x0 stride: 0 pad: VALID

   teflon: compiled graph, took 10307 ms
   teflon: invoked graph, took 21 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 16 ms
   0.866667: military uniform
   0.031373: Windsor tie
   0.015686: mortarboard
   0.007843: bow tie
   0.007843: academic

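The ``invoked graph`` lines in a log like the one above can be summarized with a few lines of Python. This is only a sketch: it assumes the debug output has been captured to a string, the helper names are hypothetical, and treating the first invocation as warm-up is a choice made for this example rather than something Mesa prescribes.

.. code-block:: python

   import re
   from statistics import mean

   # Matches the per-invocation timing lines printed with TEFLON_DEBUG=verbose.
   INVOKE_RE = re.compile(r"teflon: invoked graph, took (\d+) ms")


   def invoke_times_ms(log_text):
       """Return every reported invocation time, in milliseconds, in log order."""
       return [int(ms) for ms in INVOKE_RE.findall(log_text)]


   def steady_state_mean_ms(log_text, warmup=1):
       """Average the invocation times after skipping the first `warmup` runs."""
       times = invoke_times_ms(log_text)[warmup:]
       if not times:
           raise ValueError("no invocation times left after skipping warm-up runs")
       return mean(times)


   # Timing lines copied from the sample run shown in this section.
   SAMPLE = """\
   teflon: compiled graph, took 10307 ms
   teflon: invoked graph, took 21 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 17 ms
   teflon: invoked graph, took 16 ms
   """

   print(invoke_times_ms(SAMPLE))       # [21, 17, 17, 17, 16]
   print(steady_state_mean_ms(SAMPLE))  # 16.75

The one-off graph compilation time (10307 ms in the sample) is deliberately excluded, since it does not match the ``invoked graph`` pattern.
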