# Uniform Block to StructuredBuffer Translation

## Background
In ANGLE's D3D11 backend, we normally translate GLSL uniform blocks to
HLSL constant buffers. This runs into a compile performance issue when
[fxc](https://docs.microsoft.com/en-us/windows/win32/direct3dtools/fxc)
compiles shaders that index constant buffers dynamically; see
[anglebug.com/40096608](https://bugs.chromium.org/p/angleproject/issues/detail?id=3682).

## Solution
We translate a uniform block into a StructuredBuffer when the following three
conditions are satisfied:
* The uniform block has only one array member, and the array size is larger than or
equal to 50;
* In the shader, all accesses of the member go through the indexing operator;
* The type of the array member is one of the following:
  * a scalar or vector type.
  * a mat2x4, mat3x4 or mat4x4 matrix type in column major layout, or a mat4x2,
  mat4x3 or mat4x4 matrix type in row major layout.
  * a structure type with no array or structure members, where all of the structure's
  fields satisfy the prior type conditions.

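The three conditions can be read as a single predicate. The following C++ is a minimal sketch of that reading, not ANGLE's implementation: `FieldType`, `BlockInfo`, `kMinArraySize` and `shouldTranslateToStructuredBuffer` are hypothetical names introduced here for illustration. It also assumes that "only one array member" means the block has exactly one member and that member is an array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified description of a type. Illustrative only;
// these are not ANGLE's data structures.
enum class Kind { Scalar, Vector, Matrix, Struct };

struct FieldType
{
    Kind kind = Kind::Scalar;
    int columns = 1;                // GLSL matCxR: C columns (matrices only)
    int rows = 1;                   // GLSL matCxR: R rows (matrices only)
    bool rowMajor = false;          // matrix layout qualifier (matrices only)
    bool isArray = false;           // checked for struct fields
    std::vector<FieldType> fields;  // struct fields (structs only)
};

struct BlockInfo
{
    int memberCount = 0;             // number of members in the uniform block
    int arraySize = 0;               // 0 if the member is not an array
    bool onlyIndexedAccess = false;  // every use of the member is buf[i]
    FieldType elementType;           // element type of the array member
};

constexpr int kMinArraySize = 50;

// Scalar, vector, column_major matCx4, or row_major mat4xR.
bool isSupportedBasicType(const FieldType &t)
{
    switch (t.kind)
    {
        case Kind::Scalar:
        case Kind::Vector:
            return true;
        case Kind::Matrix:
            return t.rowMajor ? t.columns == 4 : t.rows == 4;
        default:
            return false;
    }
}

// A struct qualifies only if no field is an array or a struct and every
// field is itself a supported basic type.
bool isSupportedElementType(const FieldType &t)
{
    if (t.kind != Kind::Struct)
        return isSupportedBasicType(t);
    for (const FieldType &f : t.fields)
    {
        if (f.isArray || f.kind == Kind::Struct || !isSupportedBasicType(f))
            return false;
    }
    return true;
}

bool shouldTranslateToStructuredBuffer(const BlockInfo &b)
{
    return b.memberCount == 1 && b.arraySize >= kMinArraySize &&
           b.onlyIndexedAccess && isSupportedElementType(b.elementType);
}
```

Under these assumptions, a block holding `vec4 buf[100]` that is only ever indexed qualifies, while the same block with `mat3x2 buf[100]` in column major layout does not.
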
## Analysis
A typical use case for uniform block to StructuredBuffer translation is a shader with
one large array member in a uniform block. For example:
```
// GLSL code
uniform buffer {
    TYPE buf[100];
};
```
will be translated into
```
// HLSL code
StructuredBuffer<TYPETRANSLATED> bufTranslated : register(tN);
```

However, even with the above limitation, there are still many shaders where we cannot
apply our translation. They fall into two classes. The first is when the shader operates
on the uniform block array member as a whole entity. The second comes from the std140
layout rules that uniform blocks follow.

### Operating on the uniform block array member as a whole entity
According to the ESSL 3.0 spec, section 5.7 "Structure and Array Operations", the following
operators are allowed to operate on arrays as whole entities:

|  Operator Name             |    Operator    |
|  :----------------         |    :------:    |
|  field or method selector  |        .       |
|  equality                  |     ==  !=     |
|  assignment                |       =        |
|  Ternary operator          |       ?:       |
|  Sequence operator         |       ,        |
|  indexing                  |       []       |

However, after translating to StructuredBuffer, the uniform array member cannot be used as
a whole entity since its type has been changed. The member is no longer an array.
After the change, we only support the indexing operation since it is the most common use case.
All other operator usage is unsupported. Examples of unsupported usage:

| Operator on the uniform array member | Examples                                                            |
| :------              | :------                                                             |
| method selector      | buf.length();   // ANGLE does not support this either.              |
| equality == !=       | TYPE var[NUMBER] = {…}; <br> if (var == buf);                       |
| assignment =         | TYPE var[NUMBER] = {…}; <br> var = buf;                             |
| Ternary operator ?:  | // ANGLE does not support this either.                              |
| Sequence operator ,  | TYPE var1[NUMBER] = {…}; <br> TYPE var2[NUMBER] = (var1, buf);      |
| Function arguments   | void func(TYPE a[NUMBER]); <br> func(buf);                          |
| Function return type | TYPE[NUMBER] func() { return buf;}  <br> TYPE var[NUMBER] = func(); |

### Std140 limitation
GLSL uniform blocks follow std140 layout packing rules. StructuredBuffer has a different set
of packing rules, so we may need to explicitly pad the type `TYPETRANSLATED` to follow the
std140 rules. Alternatively, we can support only those types which don't need to be padded.
These are the supported translation types which do not require translation emulation:

|         GLSL TYPE          |     TRANSLATED HLSL TYPE      |
|         :------            |          :------              |
|   vec4/ivec4/uvec4/bvec4   |     float4/int4/uint4/bool4   |
|   mat2x4 (column_major)    |     float2x4 (row_major)      |
|   mat3x4 (column_major)    |     float3x4 (row_major)      |
|   mat4x4 (column_major)    |     float4x4 (row_major)      |
|   mat4x2 (row_major)       |     float4x2 (column_major)   |
|   mat4x3 (row_major)       |     float4x3 (column_major)   |
|   mat4x4 (row_major)       |     float4x4 (column_major)   |

These are the supported translation types which require some basic translation emulation:

|         GLSL TYPE          |     TRANSLATED HLSL TYPE      |     Examples        |
|         :------            |          :------              |     :------         |
|   float/int/uint/bool      |     float4/int4/uint4/bool4   | GLSL: float var = buf[0]; <br> HLSL: float var = buf[0].x;   |
|   vec2/ivec2/uvec2/bvec2   |     float4/int4/uint4/bool4   | GLSL: vec2 var = buf[0]; <br> HLSL: float2 var = buf[0].xy;  |
|   vec3/ivec3/uvec3/bvec3   |     float4/int4/uint4/bool4   | GLSL: vec3 var = buf[0]; <br> HLSL: float3 var = buf[0].xyz; |

These translation types are unsupported because they would require more complex translation
emulation:

|         GLSL TYPE          |     TRANSLATED HLSL TYPE      |
|         :------            |          :------              |
|   mat2x2 (column_major)    |     float2x4 (row_major)      |
|   mat2x3 (column_major)    |     float2x4 (row_major)      |
|   mat3x2 (column_major)    |     float3x4 (row_major)      |
|   mat3x3 (column_major)    |     float3x4 (row_major)      |
|   mat4x2 (column_major)    |     float4x4 (row_major)      |
|   mat4x3 (column_major)    |     float4x4 (row_major)      |
|   mat2x2 (row_major)       |     float4x2 (column_major)   |
|   mat2x3 (row_major)       |     float4x3 (column_major)   |
|   mat2x4 (row_major)       |     float4x4 (column_major)   |
|   mat3x2 (row_major)       |     float4x2 (column_major)   |
|   mat3x3 (row_major)       |     float4x3 (column_major)   |
|   mat3x4 (row_major)       |     float4x4 (column_major)   |

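Which types need padding follows from the std140 array rules: for 4-byte component types, every vector stored in an array (a scalar or vector element, or one column or row of a matrix element) occupies a full 16-byte slot. The C++ below is a rough sketch of that arithmetic, useful for cross-checking the tables above. `GlslType`, `std140ArrayStride`, `tightlyPackedSize` and `needsPadding` are hypothetical helpers written for this illustration and are not part of ANGLE. A type needs no padding exactly when its std140 stride equals its tightly packed size, which is the case for vec4, column_major matNx4 and row_major mat4xN.

```cpp
#include <cassert>

// Hypothetical description of a GLSL type with 4-byte components.
// A scalar is 1 column x 1 row, a vecN is 1 column x N rows,
// and matCxR has C columns and R rows.
struct GlslType
{
    int columns;
    int rows;
    bool rowMajor;  // only meaningful for matrices (columns > 1)
};

// Bytes occupied by one array element under std140. A column_major matrix is
// stored as `columns` vectors, a row_major matrix as `rows` vectors, and
// each of those vectors is padded to 16 bytes.
int std140ArrayStride(const GlslType &t)
{
    const bool storedByRows = t.rowMajor && t.columns > 1;
    const int vectorCount = storedByRows ? t.rows : t.columns;
    return vectorCount * 16;
}

// Bytes of actual data in one element.
int tightlyPackedSize(const GlslType &t)
{
    return t.columns * t.rows * 4;
}

// True when std140 inserts padding that the translated type must account for.
bool needsPadding(const GlslType &t)
{
    return std140ArrayStride(t) != tightlyPackedSize(t);
}
```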

Taking mat3x2 (column_major) as an example, the uniform buffer's memory layout is shown below,
where `x` marks a padding component.

|index|0                      |1                         |...      |
|:--- |      :------          |        :------           |  :---   |
|data |1 2 x x 3 4 x x 5 6 x x|7 8 x x 9 10 x x 11 12 x x|...      |

The declaration of the uniform block in the vertex shader may look as shown below.
```
layout(std140) uniform buffer {
    mat3x2 buf[100];
};

void main(void) {
    ...
    vec2 var = buf[0][2];
    ...
}
```
It will be translated to

```
#pragma pack_matrix(row_major)
StructuredBuffer<float3x4> bufTranslated : register(t0);

// Keeps the .xy components of each padded row and drops the std140 padding.
float3x2 GetFloat3x2FromFloat3x4Rowmajor(float3x4 mat)
{
    float3x2 res = { 0.0 };
    res[0] = mat[0].xy;
    res[1] = mat[1].xy;
    res[2] = mat[2].xy;
    return res;
}

VS_OUTPUT main(VS_INPUT input) {
    ...
    float2 var = GetFloat3x2FromFloat3x4Rowmajor(bufTranslated[0])[2];
}
```
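To make the cost of this emulation concrete, the same extraction can be mimicked on the CPU. The C++ below is only an illustrative sketch: `PaddedElement`, `Mat3x2` and `extractMat3x2` are hypothetical names, and the data follows the memory layout table for mat3x2 (column_major) shown earlier. Each padded element holds three rows of four floats, and only the first two floats of each row carry data, so `extractMat3x2(element)[2]` plays the role of `buf[0][2]` in the GLSL source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One StructuredBuffer<float3x4> element with row-major packing: three rows
// of four floats. Under std140 each row holds one mat3x2 column (a vec2)
// followed by two padding floats.
using PaddedElement = std::array<std::array<float, 4>, 3>;

// The three GLSL columns (vec2) of one mat3x2 element.
using Mat3x2 = std::array<std::array<float, 2>, 3>;

// CPU counterpart of GetFloat3x2FromFloat3x4Rowmajor: keep the first two
// components of each row and ignore the padding.
Mat3x2 extractMat3x2(const PaddedElement &element)
{
    Mat3x2 result{};
    for (std::size_t column = 0; column < 3; ++column)
    {
        result[column][0] = element[column][0];
        result[column][1] = element[column][1];
    }
    return result;
}
```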

When accessing an element of the `buf` variable, we would need to extract a float3x2 from
a float3x4 for every element.