# RenderScript Intrinsics Replacement Toolkit - v0.8 BETA

This Toolkit provides a collection of high-performance image manipulation functions
like blur, blend, and resize. It can be used as a stand-alone replacement for most
of the deprecated RenderScript Intrinsics functions.

The Toolkit provides ten image manipulation functions:
* blend,
* blur,
* color matrix,
* convolve,
* histogram and histogramDot,
* LUT (lookup table) and LUT 3D,
* resize, and
* YUV to RGB.

The Toolkit provides a C++ and a Java/Kotlin interface. It is packaged as an Android
library that you can add to your project.

These functions execute multithreaded on the CPU. They take advantage of Neon/AdvSimd
on Arm processors and SSE on Intel processors.

Compared to the RenderScript Intrinsics, this Toolkit is simpler to use and twice as fast
when executing on the CPU. However, the RenderScript Intrinsics allow more flexibility in
the types of allocations supported. This Toolkit does not support allocations of floats;
most of the functions support ByteArrays and Bitmaps.

You should instantiate the Toolkit once and reuse it throughout your application.
On instantiation, the Toolkit creates a thread pool that's used for processing all the functions.
You can limit the number of pool threads used by the Toolkit via the constructor. The pool threads
are destroyed once the Toolkit is destroyed, after any pending work is done.

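A minimal sketch of the lifetime described above, assuming a hypothetical `SketchPool` class rather than the Toolkit's actual implementation. The constructor takes the thread count, and the destructor lets the workers drain queued tasks before joining them, which is why one long-lived instance is cheaper than creating a pool per call. `SketchPool`, `submit`, and `runPoolDemo` are illustrative names and not part of the Toolkit's API.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool for illustration only; not the Toolkit's real class.
class SketchPool {
  public:
    // The caller chooses how many worker threads to create.
    explicit SketchPool(std::size_t threadCount) {
        assert(threadCount > 0);
        for (std::size_t i = 0; i < threadCount; ++i) {
            mThreads.emplace_back([this] { workerLoop(); });
        }
    }

    SketchPool(const SketchPool&) = delete;
    SketchPool& operator=(const SketchPool&) = delete;

    // Signals shutdown, lets workers finish the queued tasks, then joins them.
    ~SketchPool() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStopping = true;
        }
        mWorkAvailable.notify_all();
        for (std::thread& thread : mThreads) {
            thread.join();
        }
    }

    // Queues one task and wakes a single worker.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push(std::move(task));
        }
        mWorkAvailable.notify_one();
    }

  private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mWorkAvailable.wait(lock, [this] { return mStopping || !mTasks.empty(); });
                if (mTasks.empty()) {
                    return;  // Only reached once stopping is set and the queue is drained.
                }
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();  // Run outside the lock so other workers can dequeue.
        }
    }

    std::mutex mMutex;
    std::condition_variable mWorkAvailable;
    std::queue<std::function<void()>> mTasks;
    bool mStopping = false;
    std::vector<std::thread> mThreads;
};

// Submits taskCount increments to one pool and returns the count
// observed after the pool has been destroyed.
int runPoolDemo(int taskCount) {
    std::atomic<int> completed{0};
    {
        SketchPool pool(2);
        for (int i = 0; i < taskCount; ++i) {
            pool.submit([&completed] { completed.fetch_add(1); });
        }
    }  // The destructor returns only after all pending tasks have run.
    return completed.load();
}
```
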
This library is thread safe. You can call methods from different pool threads. The functions will
execute sequentially.


## Future improvement ideas:

* Turn the Java version of the Toolkit into a singleton, to reduce the chance that someone inadvertently
creates multiple thread pools.

* Support ByteBuffer. It should be straightforward to use GetDirectBufferAddress in JniEntryPoints.cpp.
See https://developer.android.com/training/articles/perf-jni and jni_helper.h.

* The RenderScript Intrinsics support floats for colorMatrix, convolve, and resize. The Toolkit does not.

* Allow in-place updates of buffers, or writing to an existing byte array.

* For Blur, we could have a version that accepts a mask. This is commonly used for background
blurring. We should allow the mask to be smaller than the original, since neural network models
that do segmentation typically operate on downscaled images.

* Allow yuvToRgb to have a Restriction.

* Add support for YUV_420_888, the YUV format favored by Camera2. Allow various strides to be specified.

* When passing a Restriction, it would be nice to say "Create a smaller output".
The original RenderScript does not allow that. It's not that useful when outputting new buffers, as
our Java library does.

* For Resize, a Restriction that applies to the input buffer would be more useful, but that's not how RenderScript behaves.

* Integrate and test with imageprocessing_jb. Do the same for [github/renderscript-samples/](https://github.com/android/renderscript-samples/tree/main/RenderScriptIntrinsic)

* Allow Bitmaps with rowSize != width * vectorSize. We could also do this for ByteArray.

* In TaskProcessor.cpp, the code below is fine and clean, but probably a bit inefficient.
When this wakes up another thread, that thread may have to immediately go back to sleep, since we still hold the lock.
It could instead set a need_to_notify flag and test that after releasing the lock (both places).
That might avoid some context switches.
```cpp
if (mTilesInProcess == 0 && mTilesNotYetStarted == 0) {
    mWorkIsFinished.notify_one();
}
```

* When compiled as part of Android, librenderscript_toolkit.so is 101,456 bytes. When compiled by Android Studio as part of an .aar, it's 387K. Figure out why and slim it down.
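
The TaskProcessor.cpp idea in the list above (record a flag while holding the lock, then notify after releasing it) could be sketched roughly as follows. This is not the code from TaskProcessor.cpp: only `mTilesInProcess`, `mTilesNotYetStarted`, and `mWorkIsFinished` come from the snippet above, while `TileTracker`, `startTile`, `finishTile`, `waitForCompletion`, `mTilesCompleted`, `mMutex`, and `runTileDemo` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical tile bookkeeping; not the real TaskProcessor class.
class TileTracker {
  public:
    explicit TileTracker(int tiles) : mTilesNotYetStarted(tiles) { assert(tiles >= 0); }

    // Claims one tile. Returns false when no tiles are left to start.
    bool startTile() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mTilesNotYetStarted == 0) {
            return false;
        }
        --mTilesNotYetStarted;
        ++mTilesInProcess;
        return true;
    }

    // Marks one tile as done. The decision to notify is made under the
    // lock, but the notification itself is sent after the lock is released.
    void finishTile() {
        bool needToNotify = false;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            --mTilesInProcess;
            ++mTilesCompleted;
            needToNotify = (mTilesInProcess == 0 && mTilesNotYetStarted == 0);
        }
        if (needToNotify) {
            mWorkIsFinished.notify_one();
        }
    }

    // Blocks until every tile has been processed; returns the completed count.
    int waitForCompletion() {
        std::unique_lock<std::mutex> lock(mMutex);
        mWorkIsFinished.wait(lock, [this] {
            return mTilesInProcess == 0 && mTilesNotYetStarted == 0;
        });
        return mTilesCompleted;
    }

  private:
    std::mutex mMutex;
    std::condition_variable mWorkIsFinished;
    int mTilesNotYetStarted;
    int mTilesInProcess = 0;
    int mTilesCompleted = 0;
};

// Processes `tiles` empty tiles on `workers` threads and returns how many completed.
int runTileDemo(int tiles, int workers) {
    TileTracker tracker(tiles);
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back([&tracker] {
            while (tracker.startTile()) {
                tracker.finishTile();
            }
        });
    }
    int completed = tracker.waitForCompletion();
    // Join before the tracker goes out of scope so no worker can touch it afterwards.
    for (std::thread& thread : threads) {
        thread.join();
    }
    return completed;
}
```

Notifying outside the lock is only safe while the condition variable is guaranteed to outlive the `notify_one()` call. In this sketch the tracker is destroyed only after every worker has been joined; whether the same guarantee holds in TaskProcessor.cpp would need to be checked before applying the change.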