GL Dispatch
===========

Several factors combine to make efficient dispatch of OpenGL functions
fairly complicated. This document attempts to explain some of the issues
and introduce the reader to Mesa's implementation. Readers already
familiar with the issues around GL dispatch can safely skip ahead to the
:ref:`overview of Mesa's implementation <overview>`.

1. Complexity of GL Dispatch
----------------------------

Every GL application has at least one object called a GL *context*. This
object, which is an implicit parameter to every GL function, stores all
of the GL-related state for the application. Every texture, every buffer
object, every enable, and much, much more is stored in the context.
Since an application can have more than one context, the context to be
used is selected by a window-system dependent function such as
``glXMakeContextCurrent``.

In environments that implement OpenGL on the X Window System using GLX,
every GL function, including the pointers returned by
``glXGetProcAddress``, is *context independent*. This means that no
matter what context is currently active, the same ``glVertex3fv``
function is used.

This creates the first bit of dispatch complexity. An application can
have two GL contexts. One context is a direct rendering context where
function calls are routed directly to a driver loaded within the
application's address space. The other context is an indirect rendering
context where function calls are converted to GLX protocol and sent to a
server. The same ``glVertex3fv`` has to do the right thing depending on
which context is current.

Highly optimized drivers or GLX protocol implementations may want to
change the behavior of GL functions depending on current state. For
example, ``glFogCoordf`` may operate differently depending on whether or
not fog is enabled.

In multi-threaded environments, it is possible for each thread to have a
different GL context current.
This means that poor old ``glVertex3fv``
has to know which GL context is current in the thread where it is being
called.

.. _overview:

2. Overview of Mesa's Implementation
------------------------------------

Mesa uses two per-thread pointers. The first pointer stores the address
of the context current in the thread, and the second pointer stores the
address of the *dispatch table* associated with that context. The
dispatch table stores pointers to functions that actually implement
specific GL functions. Each time a new context is made current in a
thread, these pointers are updated.

The implementation of functions such as ``glVertex3fv`` becomes
conceptually simple:

- Fetch the current dispatch table pointer.
- Fetch the pointer to the real ``glVertex3fv`` function from the
  table.
- Call the real function.

This can be implemented in just a few lines of C code.
The file
``src/mesa/glapi/glapitemp.h`` contains code very similar to this.

.. code-block:: c
   :caption: Sample dispatch function

   void glVertex3f(GLfloat x, GLfloat y, GLfloat z)
   {
       const struct _glapi_table * const dispatch = GET_DISPATCH();

       dispatch->Vertex3f(x, y, z);
   }

The problem with this simple implementation is the large amount of
overhead that it adds to every GL function call.

In a multithreaded environment, a naive implementation of
``GET_DISPATCH()`` involves either a call to ``_glapi_get_dispatch()``
or a read of the ``_glapi_tls_Dispatch`` variable.

3. Optimizations
----------------

A number of optimizations have been made over the years to diminish the
performance hit imposed by GL dispatch. This section describes these
optimizations. The benefits of each optimization and the situations
where each can or cannot be used are listed.

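
The dispatch scheme from the overview, which the optimizations below
refine, can be sketched in portable C11. This is only an illustration
under stated assumptions, not Mesa's code: ``demo_table``,
``demo_make_current``, ``demo_glVertex3f``, and the two backends are
hypothetical names, and C11 ``_Thread_local`` stands in for whatever
per-thread storage the platform provides.

.. code-block:: c
   :caption: Hypothetical sketch of per-thread table dispatch

   /* Hypothetical, cut-down stand-in for Mesa's struct _glapi_table. */
   struct demo_table {
       void (*Vertex3f)(float x, float y, float z);
   };

   /* Per-thread pointer to the dispatch table of the current context.
    * It starts out NULL; this sketch assumes demo_make_current() is
    * called before any dispatched function. */
   _Thread_local const struct demo_table *demo_current_dispatch;

   /* Records which backend handled the last call (demonstration only). */
   _Thread_local int demo_last_backend;

   /* Stand-in for a direct rendering driver entry point. */
   void demo_direct_Vertex3f(float x, float y, float z)
   {
       (void)x; (void)y; (void)z;
       demo_last_backend = 1;
   }

   /* Stand-in for an indirect (protocol-encoding) entry point. */
   void demo_indirect_Vertex3f(float x, float y, float z)
   {
       (void)x; (void)y; (void)z;
       demo_last_backend = 2;
   }

   const struct demo_table demo_direct_table = {
       .Vertex3f = demo_direct_Vertex3f,
   };
   const struct demo_table demo_indirect_table = {
       .Vertex3f = demo_indirect_Vertex3f,
   };

   /* Stand-in for making a context current in the calling thread:
    * only the per-thread table pointer changes. */
   void demo_make_current(const struct demo_table *table)
   {
       demo_current_dispatch = table;
   }

   /* The context-independent entry point seen by the application. */
   void demo_glVertex3f(float x, float y, float z)
   {
       /* Fetch the current dispatch table pointer. */
       const struct demo_table *const dispatch = demo_current_dispatch;

       /* Fetch the real function from the table and call it. */
       dispatch->Vertex3f(x, y, z);
   }

Calling ``demo_make_current(&demo_direct_table)`` and then
``demo_glVertex3f(1.0f, 2.0f, 3.0f)`` reaches ``demo_direct_Vertex3f``;
switching to ``demo_indirect_table`` sends the same entry point to the
other backend.
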

3.1. ELF TLS
~~~~~~~~~~~~

Starting with the 2.4.20 Linux kernel, each thread is allocated an area
of per-thread, global storage. Variables can be put in this area using
some extensions to GCC that are called ``ELF TLS``. By storing the
dispatch table pointer in this area, the expensive call to
``pthread_getspecific`` and the test of ``_glapi_Dispatch`` can be
avoided. Since Mesa does not support Linux kernels earlier than 2.4.20,
``ELF TLS`` can always be used.

The dispatch table pointer is stored in a new variable called
``_glapi_tls_Dispatch``. A new variable name is used so that a single
libGL can implement both interfaces. This allows the libGL to operate
with direct rendering drivers that use either interface. Once the
pointer is properly declared, ``GET_DISPATCH`` becomes a simple variable
reference.

.. code-block:: c
   :caption: TLS ``GET_DISPATCH`` Implementation

   extern __THREAD_INITIAL_EXEC struct _glapi_table *_glapi_tls_Dispatch;

   #define GET_DISPATCH() _glapi_tls_Dispatch

3.2. Assembly Language Dispatch Stubs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many platforms have difficulty properly optimizing the tail-call in the
dispatch stubs. Platforms like x86 that pass parameters on the stack
seem to have even more difficulty optimizing these routines. All of the
dispatch routines are very short, and it is trivial to create optimal
assembly language versions. The amount of optimization provided by using
assembly stubs varies from platform to platform and application to
application. However, by using the assembly stubs, many platforms can
use an additional space optimization (see :ref:`below <fixedsize>`).

The biggest hurdle to creating assembly stubs is handling the various
ways that the dispatch table pointer can be accessed.
There are three
different methods that can be used:

#. Using ``_glapi_Dispatch`` directly in builds for non-multithreaded
   environments.
#. Using ``_glapi_Dispatch`` and ``_glapi_get_dispatch`` in
   multithreaded environments.
#. Using ``_glapi_tls_Dispatch`` directly in TLS enabled multithreaded
   environments.

People wishing to implement assembly stubs for new platforms should
focus on #3 if the new platform supports TLS. Otherwise implement #2.
Environments that do not support multithreading are
uncommon and not terribly relevant.

Selection of the dispatch table pointer access method is controlled by a
few preprocessor defines.

- If ``HAVE_PTHREAD`` is defined, method #2 is used.
- If none of the preceding are defined, method #1 is used.

Two different techniques are used to handle the various different cases.
On x86 and SPARC, a macro called ``GL_STUB`` is used.
In the preamble of
the assembly source file different implementations of the macro are
selected based on the defined preprocessor variables. The assembly code
then consists of a series of invocations of the macros such as:

.. code-block:: c
   :caption: SPARC Assembly Implementation of ``glColor3fv``

   GL_STUB(Color3fv, _gloffset_Color3fv)

The benefit of this technique is that changes to the calling pattern
(i.e., addition of a new dispatch table pointer access method) require
fewer changed lines in the assembly code.

However, this technique can only be used on platforms where the function
implementation does not change based on the parameters passed to the
function. For example, since x86 passes all parameters on the stack, no
additional code is needed to save and restore function parameters around
a call to ``pthread_getspecific``. Since x86-64 passes parameters in
registers, varying amounts of code need to be inserted around the call
to ``pthread_getspecific`` to save and restore the GL function's
parameters.

The other technique, used by platforms like x86-64 that cannot use the
first technique, is to insert ``#ifdef`` within the assembly
implementation of each function. This makes the assembly file
considerably larger (e.g., 29,332 lines for ``glapi_x86-64.S`` versus
1,155 lines for ``glapi_x86.S``) and causes simple changes to the
function implementation to generate many lines of diffs. Since the
assembly files are typically generated by scripts, this isn't a
significant problem.

Once a new assembly file is created, it must be inserted in the build
system. There are two steps to this. The file must first be added to
``src/mesa/sources``. That gets the file built and linked. The second
step is to add the correct ``#ifdef`` magic to
``src/mesa/glapi/glapi_dispatch.c`` to prevent the C version of the
dispatch functions from being built.

.. _fixedsize:

3.3. Fixed-Length Dispatch Stubs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To implement ``glXGetProcAddress``, Mesa stores a table that associates
function names with pointers to those functions. This table is stored in
``src/mesa/glapi/glprocs.h``. For different reasons on different
platforms, storing all of those pointers is inefficient. On most
platforms, including all known platforms that support TLS, we can avoid
this added overhead.

If the assembly stubs are all the same size, the pointer need not be
stored for every function. The location of the function can instead be
calculated by multiplying the size of the dispatch stub by the offset of
the function in the table. This value is then added to the address of
the first dispatch stub.

This path is activated by adding the correct ``#ifdef`` magic to
``src/mesa/glapi/glapi.c`` just before ``glprocs.h`` is included.
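
The address calculation for fixed-length stubs can be sketched as plain
pointer arithmetic. This is a minimal illustration, not the code in
``glapi.c``: ``DEMO_STUB_SIZE`` and ``demo_stub_address`` are
hypothetical names, and the 16-byte stub size is an assumed value chosen
only for the example.

.. code-block:: c
   :caption: Hypothetical sketch of the fixed-length stub address calculation

   #include <stddef.h>

   /* Assumed stub size in bytes. The real value depends on the
    * platform's assembly stubs. */
   #define DEMO_STUB_SIZE 16u

   /* Address of the stub for the function at position `offset` in the
    * table: address of the first stub plus stub size times offset. */
   const unsigned char *demo_stub_address(const unsigned char *first_stub,
                                          size_t offset)
   {
       return first_stub + (size_t)DEMO_STUB_SIZE * offset;
   }

With these assumptions, the stub for the function at offset 3 starts 48
bytes after the first stub, so only the address of the first stub has to
be stored.
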