# Android ELF TLS

App developers probably just want to read the
[quick ELF TLS status summary](../android-changes-for-ndk-developers.md#elf-tls-available-for-api-level-29)
instead.

This document covers the detailed design and implementation choices.

[TOC]

# Overview

ELF TLS is a system for automatically allocating thread-local variables with cooperation among the
compiler, linker, dynamic loader, and libc.

Thread-local variables are declared in C and C++ with a specifier, e.g.:

```cpp
thread_local int tls_var;
```

At run-time, TLS variables are allocated on a module-by-module basis, where a module is a shared
object or executable. At program startup, TLS for all initially-loaded modules comprises the "Static
TLS Block". TLS variables within the Static TLS Block exist at fixed offsets from an
architecture-specific thread pointer (TP) and can be accessed very efficiently -- typically just a
few instructions. TLS variables belonging to dlopen'ed shared objects, on the other hand, may be
allocated lazily, and accessing them typically requires a function call.
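As a minimal illustration of the per-thread semantics (not of the allocation mechanism itself), the
sketch below shows that each thread sees its own copy of a `thread_local` variable at its own
address. The variable `tls_counter` and both helper functions are hypothetical names made up for
this example; they are not part of Bionic.

```cpp
#include <cassert>
#include <thread>

thread_local int tls_counter = 0;  // hypothetical variable for this sketch

// Increments the calling thread's copy `n` times and returns its final value.
int bump_counter(int n) {
  for (int i = 0; i < n; ++i) ++tls_counter;
  return tls_counter;
}

// Runs bump_counter(n) on a new thread. Returns that thread's result and
// stores the address of that thread's copy of tls_counter in *addr_out.
// The stored address is only meaningful for comparison: the copy is gone
// once the thread has exited.
int bump_on_new_thread(int n, int** addr_out) {
  assert(addr_out != nullptr);
  int result = 0;
  std::thread worker([&] {
    result = bump_counter(n);
    *addr_out = &tls_counter;
  });
  worker.join();
  return result;
}
```

Calling `bump_counter(3)` on the main thread and then `bump_on_new_thread(5, &addr)` leaves the
main thread's copy at 3, while the worker's copy reached 5 at a different address.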

# Thread-Specific Memory Layout

Ulrich Drepper's ELF TLS document specifies two ways of organizing memory pointed at by the
architecture-specific thread-pointer ([`__get_tls()`] in Bionic):

![TLS Variant 1 Layout](img/tls-variant1.png)

![TLS Variant 2 Layout](img/tls-variant2.png)

Variant 1 places the static TLS block after the TP, whereas variant 2 places it before the TP.
According to Drepper, variant 2 was motivated by backwards compatibility, and variant 1 was designed
for Itanium. The choice has effects on the toolchain, loader, and libc. In particular, when linking
an executable, the linker needs to know where an executable's TLS segment is relative to the TP so
it can correctly relocate TLS accesses. Both variants are incompatible with Bionic's current
thread-specific data layout, but variant 1 is more problematic than variant 2.

Each thread has a "Dynamic Thread Vector" (DTV) with a pointer to each module's TLS block (or NULL
if it hasn't been allocated yet). If the executable has a TLS segment, then it will always be module
1, and its storage will always be immediately after (or before) the TP. In variant 1, the TP is
expected to point immediately at the DTV pointer, whereas in variant 2, the DTV pointer's offset
from TP is implementation-defined.

The DTV's "generation" field is used to lazily update/reallocate the DTV when new modules are loaded
or unloaded.

[`__get_tls()`]: https://android.googlesource.com/platform/bionic/+/7245c082658182c15d2a423fe770388fec707cbc/libc/private/__get_tls.h
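To make the variant 1 and variant 2 placements in this section concrete, the following sketch
computes the TP-relative offset of a variable in the executable's TLS segment under each variant.
It is an illustration under stated assumptions, not Bionic's or any linker's actual code:
`TlsSegment`, `variant1_tpoff`, and `variant2_tpoff` are hypothetical names, and `tcb_size` (the
space reserved at the TP before the segment in variant 1) is ABI-specific and is simply a
parameter here.

```cpp
#include <cstddef>

// Rounds `value` up to a multiple of `align`, which must be a power of two.
constexpr std::ptrdiff_t align_up(std::ptrdiff_t value, std::ptrdiff_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Hypothetical summary of an executable's PT_TLS segment.
struct TlsSegment {
  std::ptrdiff_t size;   // p_memsz
  std::ptrdiff_t align;  // p_align, a power of two
};

// Variant 1: the segment starts after the reserved area at the TP, so
// offsets from the TP are positive. `tcb_size` is an assumed, ABI-specific
// value supplied by the caller.
constexpr std::ptrdiff_t variant1_tpoff(TlsSegment seg, std::ptrdiff_t tcb_size,
                                        std::ptrdiff_t var_offset) {
  return align_up(tcb_size, seg.align) + var_offset;
}

// Variant 2: the segment ends at the TP, so offsets from the TP are negative.
constexpr std::ptrdiff_t variant2_tpoff(TlsSegment seg, std::ptrdiff_t var_offset) {
  return var_offset - align_up(seg.size, seg.align);
}
```

For a 24-byte, 8-byte-aligned segment and a variable at segment offset 4, a 16-byte reserved area
gives a variant 1 offset of 20, while variant 2 gives -20. Either way the static linker can only
compute the value if it knows which layout the target uses.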

# Access Models

When a C/C++ file references a TLS variable, the toolchain generates instructions to find its
address using a TLS "access model". The access models trade generality against efficiency. The four
models are:

 * GD: General Dynamic (aka Global Dynamic)
 * LD: Local Dynamic
 * IE: Initial Exec
 * LE: Local Exec

A TLS variable may be in a different module than the reference.

## General Dynamic (or Global Dynamic) (GD)

A GD access can refer to a TLS variable anywhere. To access a variable `tls_var` using the
"traditional" non-TLSDESC design described in Drepper's TLS document, the compiler emits a call to a
`__tls_get_addr` function provided by libc.

For example, if we have this C code in a shared object:

```cpp
extern thread_local char tls_var;
char* get_tls_var() {
  return &tls_var;
}
```

The toolchain generates code like this:

```cpp
struct TlsIndex {
  long module; // starts counting at 1
  long offset;
};

char* get_tls_var() {
  static TlsIndex tls_var_idx = { // allocated in the .got
    R_TLS_DTPMOD(tls_var), // dynamic TP module ID
    R_TLS_DTPOFF(tls_var), // dynamic TP offset
  };
  return __tls_get_addr(&tls_var_idx);
}
```

`R_TLS_DTPMOD` is a dynamic relocation to the index of the module containing `tls_var`, and
`R_TLS_DTPOFF` is a dynamic relocation to the offset of `tls_var` within its module's `PT_TLS`
segment.

`__tls_get_addr` looks up `TlsIndex::module`'s entry in the DTV and adds `TlsIndex::offset` to
the module's TLS block. Before it can do this, it ensures that the module's TLS block is allocated.
A simple approach is to allocate memory lazily:

1. If the current thread's DTV generation count is less than the current global TLS generation, then
   `__tls_get_addr` may reallocate the DTV or free blocks for unloaded modules.

2. If the DTV's entry for the given module is `NULL`, then `__tls_get_addr` allocates the module's
   memory.

If an allocation fails, `__tls_get_addr` calls `abort` (like emutls).

musl, on the other hand, preallocates TLS memory in `pthread_create` and in `dlopen`, and each can
report out-of-memory.
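The lazy-allocation steps for `__tls_get_addr` can be sketched as a small user-space simulation.
This is a toy model under simplifying assumptions, not Bionic's implementation: `ToyLoader`,
`ToyDtv`, `ToyTlsIndex`, and `toy_tls_get_addr` are hypothetical names, the DTV is passed in
explicitly instead of being found through the thread pointer, and module unloading, freeing, and
synchronization are all omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <utility>
#include <vector>

struct ToyTlsIndex {
  std::size_t module;  // starts counting at 1
  std::size_t offset;
};

// Stand-in for the loader's global TLS state.
struct ToyLoader {
  std::size_t generation = 0;                   // global TLS generation
  std::vector<std::vector<char>> init_images;   // init_images[i]: module i + 1

  // Registers a module's TLS initialization image and returns its module ID.
  std::size_t load(std::vector<char> image) {
    init_images.push_back(std::move(image));
    ++generation;
    return init_images.size();
  }
};

// Stand-in for one thread's DTV.
struct ToyDtv {
  std::size_t generation = 0;
  std::vector<char*> blocks;  // blocks[i]: module i + 1's block, or nullptr
};

char* toy_tls_get_addr(const ToyLoader& loader, ToyDtv& dtv, const ToyTlsIndex& idx) {
  assert(idx.module >= 1 && idx.module <= loader.init_images.size());
  // Step 1: the DTV is stale, so grow it to cover newly loaded modules.
  if (dtv.generation < loader.generation) {
    dtv.blocks.resize(loader.init_images.size(), nullptr);
    dtv.generation = loader.generation;
  }
  // Step 2: allocate and initialize the module's block on first use.
  char*& block = dtv.blocks[idx.module - 1];
  if (block == nullptr) {
    const std::vector<char>& image = loader.init_images[idx.module - 1];
    block = static_cast<char*>(std::malloc(image.empty() ? 1 : image.size()));
    if (block == nullptr) std::abort();  // allocation failure is fatal
    std::memcpy(block, image.data(), image.size());
  }
  return block + idx.offset;
}
```

Two `ToyDtv` objects stand in for two threads: each gets its own copy of a module's block on first
access, and a DTV created before a later `load` call is only resized the next time it is used.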

## Local Dynamic (LD)

LD is a specialization of GD that's useful when a function has references to two or more TLS
variables that are both part of the same module as the reference. Instead of a call to
`__tls_get_addr` for each variable, the compiler calls `__tls_get_addr` once to get the current
module's TLS block, then adds each variable's DTPOFF to the result.

For example, suppose we have this C code:

```cpp
static thread_local int x;
static thread_local int y;
int sum() {
  return x + y;
}
```

The toolchain generates code like this:

```cpp
int sum() {
  static TlsIndex tls_module_idx = { // allocated in the .got
    // a dynamic relocation against symbol 0 => current module ID
    R_TLS_DTPMOD(NULL),
    0,
  };
  char* base = __tls_get_addr(&tls_module_idx);
  // These R_TLS_DTPOFF() relocations are resolved at link-time.
  int* px = (int*)(base + R_TLS_DTPOFF(x));
  int* py = (int*)(base + R_TLS_DTPOFF(y));
  return *px + *py;
}
```

(XXX: LD might be important for C++ `thread_local` variables -- even a single `thread_local`
variable with a dynamic initializer has an associated TLS guard variable.)

## Initial Exec (IE)

If the variable is part of the Static TLS Block (i.e. the executable or an initially-loaded shared
object), then its offset from the TP is known at load-time. The variable can be accessed with a few
loads.

Example: a C file for an executable:

```cpp
// tls_var could be defined in the executable, or it could be defined
// in a shared object the executable links against.
extern thread_local char tls_var;
char* get_addr() { return &tls_var; }
```

Compiles to:

```cpp
// allocated in the .got, resolved at load-time with a dynamic reloc.
// Unlike DTPOFF, which is relative to the start of the module’s block,
// TPOFF is directly relative to the thread pointer.
static long tls_var_gotoff = R_TLS_TPOFF(tls_var);

char* get_addr() {
  return (char*)__get_tls() + tls_var_gotoff;
}
```

## Local Exec (LE)

LE is a specialization of IE. If the variable is not just part of the Static TLS Block, but is also
part of the executable (and referenced from the executable), then a GOT access can be avoided. The
IE example compiles to:

```cpp
char* get_addr() {
  // R_TLS_TPOFF() is resolved at (static) link-time
  return (char*)__get_tls() + R_TLS_TPOFF(tls_var);
}
```

## Selecting an Access Model

The compiler selects an access model for each variable reference using these factors:
 * The absence of `-fpic` implies an executable, so use IE/LE.
 * Code compiled with `-fpic` could be in a shared object, so use GD/LD.
 * The per-file default can be overridden with `-ftls-model=<model>`.
 * Specifiers on the variable (`static`, `extern`, ELF visibility attributes).
 * A variable can be annotated with `__attribute__((tls_model(...)))`. Clang may still use a more
   efficient model than the one specified.
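As a small illustration of the per-variable annotation, the sketch below declares one variable
with the compiler's default model and one that requests initial-exec through the `tls_model`
attribute. The variable and function names are hypothetical. The attribute only influences the
generated access sequence; the program's observable behavior is the same for both variables.

```cpp
#include <cassert>

// Uses whatever model the compiler picks from -fpic, -ftls-model, linkage,
// and visibility.
thread_local int default_model_var = 1;

// Requests the initial-exec model for accesses to this variable. The
// compiler may still choose a more efficient model (for example local-exec).
thread_local int ie_var __attribute__((tls_model("initial-exec"))) = 2;

// Returns the sum of the calling thread's copies.
int sum_models() {
  return default_model_var + ie_var;
}

// A thread that has not written to the variables sees their initializers.
void expect_initial_values() {
  assert(default_model_var == 1);
  assert(ie_var == 2);
}
```

Requesting initial-exec is safe in an executable. In a shared object it ties the variable to the
Static TLS Block, with the `dlopen` consequences discussed under "Shared Objects with Static TLS".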

# Shared Objects with Static TLS

Shared objects are sometimes compiled with `-ftls-model=initial-exec` (i.e. "static TLS") for better
performance. On Ubuntu, for example, `libc.so.6` and `libOpenGL.so.0` are compiled this way. Shared
objects using static TLS can't be loaded with `dlopen` unless libc has reserved enough surplus
memory in the static TLS block. glibc reserves a kilobyte or two (`TLS_STATIC_SURPLUS`) with the
intent that only a few core system libraries would use static TLS. Non-core libraries also sometimes
use it, which can break `dlopen` if the surplus area is exhausted. See:
 * https://bugzilla.redhat.com/show_bug.cgi?id=1124987
 * web search: [`"dlopen: cannot load any more object with static TLS"`][glibc-static-tls-error]

Neither bionic nor musl currently allocates any surplus TLS memory.

In general, supporting surplus TLS memory probably requires maintaining a thread list so that
`dlopen` can initialize the new static TLS memory in all existing threads. A thread list could be
omitted if the loader only allowed zero-initialized TLS segments and didn't reclaim memory on
`dlclose`.

As long as a shared object is one of the initially-loaded modules, a better option is to use
TLSDESC.

[glibc-static-tls-error]: https://www.google.com/search?q=%22dlopen:+cannot+load+any+more+object+with+static+TLS%22

# TLS Descriptors (TLSDESC)

The code fragments above match the "traditional" TLS design from Drepper's document. For the GD and
LD models, there is a newer, more efficient design that uses "TLS descriptors". Each TLS variable
reference has a corresponding descriptor, which contains a resolver function address and an argument
to pass to the resolver.

For example, if we have this C code in a shared object:

```cpp
extern thread_local char tls_var;
char* get_tls_var() {
  return &tls_var;
}
```

The toolchain generates code like this:

```cpp
struct TlsDescriptor { // NB: arm32 reverses these fields
  long (*resolver)(long);
  long arg;
};

char* get_tls_var() {
  // allocated in the .got, uses a dynamic relocation
  static TlsDescriptor desc = R_TLS_DESC(tls_var);
  return (char*)__get_tls() + desc.resolver(desc.arg);
}
```

The dynamic loader fills in the TLS descriptors. For a reference to a variable allocated in the
Static TLS Block, it can use a simple resolver function:

```cpp
long static_tls_resolver(long arg) {
  return arg;
}
```

The loader writes `tls_var@TPOFF` into the descriptor's argument.

To support modules loaded with `dlopen`, the loader must use a resolver function that calls
`__tls_get_addr`. In principle, this simple implementation would work:

```cpp
long dynamic_tls_resolver(TlsIndex* arg) {
  return (long)__tls_get_addr(arg) - (long)__get_tls();
}
```

There are optimizations that complicate the design a little:
 * Unlike `__tls_get_addr`, the resolver function has a special calling convention that preserves
   almost all registers, reducing register pressure in the caller
   ([example](https://godbolt.org/g/gywcxk)).
 * In general, the resolver function must call `__tls_get_addr`, so it must save and restore all
   registers.
 * To keep the fast path fast, the resolver inlines the fast path of `__tls_get_addr`.
 * By storing the module's initial generation alongside the TlsIndex, the resolver function doesn't
   need to use an atomic or synchronized access of the global TLS generation counter.

The resolver must be written in assembly, but in C, the function looks like so:

```cpp
struct TlsDescDynamicArg {
  unsigned long first_generation;
  TlsIndex idx;
};

struct TlsDtv { // DTV == dynamic thread vector
  unsigned long generation;
  char* modules[];
};

long dynamic_tls_resolver(TlsDescDynamicArg* arg) {
  TlsDtv* dtv = __get_dtv();
  char* addr;
  if (dtv->generation >= arg->first_generation &&
      dtv->modules[arg->idx.module] != nullptr) {
    addr = dtv->modules[arg->idx.module] + arg->idx.offset;
  } else {
    addr = __tls_get_addr(&arg->idx);
  }
  return (long)addr - (long)__get_tls();
}
```

The loader needs to allocate a table of `TlsDescDynamicArg` objects for each TLS module with dynamic
TLSDESC relocations.

The static linker can still relax a TLSDESC-based access to an IE/LE access.

The traditional TLS design is implemented everywhere, but the TLSDESC design has less toolchain
support:
 * GCC and the BFD linker support both designs on all supported Android architectures (arm32, arm64,
   x86, x86-64).
 * GCC can select the design at compile time using `-mtls-dialect=<dialect>` (`trad`-vs-`desc` on
   arm64, otherwise `gnu`-vs-`gnu2`). Clang always uses the default mode.
 * GCC and Clang default to TLSDESC on arm64 and the traditional design on other architectures.
 * Gold and LLD support for TLSDESC is spotty (except when targeting arm64).

# Linker Relaxations

The (static) linker frequently has more information about the location of a referenced TLS variable
than the compiler, so it can "relax" TLS accesses to more efficient models. For example, if an
object file compiled with `-fpic` is linked into an executable, the linker could relax GD accesses
to IE or LE. To relax a TLS access, the linker looks for an expected sequence of instructions and
static relocations, then replaces the sequence with a different one of equal size. It may need to
add or remove no-op instructions.

## Current Support for GD->LE Relaxations Across Linkers

Versions tested:
 * BFD and Gold linkers: version 2.30
 * LLD version 6.0.0 (upstream)

Linker support for GD->LE relaxation with `-mtls-dialect=gnu/trad` (traditional):

Architecture    | BFD | Gold | LLD
--------------- | --- | ---- | ---
arm32           | no  | no   | no
arm64 (unusual) | yes | yes  | no
x86             | yes | yes  | yes
x86_64          | yes | yes  | yes

Linker support for GD->LE relaxation with `-mtls-dialect=gnu2/desc` (TLSDESC):

Architecture          | BFD | Gold               | LLD
--------------------- | --- | ------------------ | ------------------
arm32 (experimental)  | yes | unsupported relocs | unsupported relocs
arm64                 | yes | yes                | yes
x86 (experimental)    | yes | yes                | unsupported relocs
x86_64 (experimental) | yes | yes                | unsupported relocs

arm32 linkers can't relax traditional TLS accesses. BFD can relax an arm32 TLSDESC access, but LLD
can't link code using TLSDESC at all, except on arm64, where it's used by default.

# dlsym

Calling `dlsym` on a TLS variable returns the address of the current thread's variable.

# Debugger Support

## gdb

gdb uses a libthread_db plugin library to retrieve thread-related information from a target. This
library is typically a shared object, but for Android, we link our own `libthread_db.a` into
gdbserver. We will need to implement at least 2 APIs in `libthread_db.a` to find TLS variables, and
gdb provides APIs for looking up symbols, reading or writing memory, and retrieving the current
thread pointer (e.g. `ps_get_thread_area`).
 * Reference: [gdb_proc_service.h]: APIs gdb provides to libthread_db
 * Reference: [Currently unimplemented TLS functions in Android's libthread_db][libthread_db.c]

[gdb_proc_service.h]: https://android.googlesource.com/toolchain/gdb/+/a7e49fd02c21a496095c828841f209eef8ae2985/gdb-8.0.1/gdb/gdb_proc_service.h#41
[libthread_db.c]: https://android.googlesource.com/platform/ndk/+/e1f0ad12fc317c0ca3183529cc9625d3f084d981/sources/android/libthread_db/libthread_db.c#115

## LLDB

LLDB more-or-less implemented Linux TLS debugging in [r192922][rL192922] ([D1944]) for x86 and
x86-64. [arm64 support came later][D5073]. However, the Linux TLS functionality no longer does
anything: the `GetThreadPointer` function is no longer implemented. Code for reading the thread
pointer was removed in [D10661] ([this function][r240543]). (arm32 was apparently never supported.)

[rL192922]: https://reviews.llvm.org/rL192922
[D1944]: https://reviews.llvm.org/D1944
[D5073]: https://reviews.llvm.org/D5073
[D10661]: https://reviews.llvm.org/D10661
[r240543]: https://github.com/llvm-mirror/lldb/commit/79246050b0f8d6b54acb5366f153d07f235d2780#diff-52dee3d148892cccfcdab28bc2165548L962

## Threading Library Metadata

Both debuggers need metadata from the threading library (`libc.so` / `libpthread.so`) to find TLS
variables. From [LLDB r192922][rL192922]'s commit message:

> ... All OSes use basically the same algorithm (a per-module lookup table) as detailed in Ulrich
> Drepper's TLS ELF ABI document, so we can easily write code to decode it ourselves. The only
> question therefore is the exact field layouts required. Happily, the implementors of libpthread
408*8d67ca89SAndroid Build Coastguard Worker> expose the structure of the DTV via metadata exported as symbols from the .so itself, designed
409*8d67ca89SAndroid Build Coastguard Worker> exactly for this kind of thing. So this patch simply reads that metadata in, and re-implements
410*8d67ca89SAndroid Build Coastguard Worker> libthread_db's algorithm itself. We thereby get cross-platform TLS lookup without either requiring
411*8d67ca89SAndroid Build Coastguard Worker> third-party libraries, while still being independent of the version of libpthread being used.
412*8d67ca89SAndroid Build Coastguard Worker
LLDB uses these variables:
414*8d67ca89SAndroid Build Coastguard Worker
415*8d67ca89SAndroid Build Coastguard WorkerName                              | Notes
416*8d67ca89SAndroid Build Coastguard Worker--------------------------------- | ---------------------------------------------------------------------------------------
417*8d67ca89SAndroid Build Coastguard Worker`_thread_db_pthread_dtvp`         | Offset from TP to DTV pointer (0 for variant 1, implementation-defined for variant 2)
418*8d67ca89SAndroid Build Coastguard Worker`_thread_db_dtv_dtv`              | Size of a DTV slot (typically/always sizeof(void*))
419*8d67ca89SAndroid Build Coastguard Worker`_thread_db_dtv_t_pointer_val`    | Offset within a DTV slot to the pointer to the allocated TLS block (typically/always 0)
420*8d67ca89SAndroid Build Coastguard Worker`_thread_db_link_map_l_tls_modid` | Offset of a `link_map` field containing the module's 1-based TLS module ID
421*8d67ca89SAndroid Build Coastguard Worker
422*8d67ca89SAndroid Build Coastguard WorkerThe metadata variables are local symbols in glibc's `libpthread.so` symbol table (but not its
423*8d67ca89SAndroid Build Coastguard Workerdynamic symbol table). Debuggers can access them, but applications can't.
424*8d67ca89SAndroid Build Coastguard Worker
425*8d67ca89SAndroid Build Coastguard WorkerThe debugger lookup process is straightforward:
426*8d67ca89SAndroid Build Coastguard Worker * Find the `link_map` object and module-relative offset for a TLS variable.
427*8d67ca89SAndroid Build Coastguard Worker * Use `_thread_db_link_map_l_tls_modid` to find the TLS variable's module ID.
428*8d67ca89SAndroid Build Coastguard Worker * Read the target thread pointer.
429*8d67ca89SAndroid Build Coastguard Worker * Use `_thread_db_pthread_dtvp` to find the thread's DTV.
430*8d67ca89SAndroid Build Coastguard Worker * Use `_thread_db_dtv_dtv` and `_thread_db_dtv_t_pointer_val` to find the desired module's block
431*8d67ca89SAndroid Build Coastguard Worker   within the DTV.
432*8d67ca89SAndroid Build Coastguard Worker * Add the module-relative offset to the module pointer.
433*8d67ca89SAndroid Build Coastguard Worker
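The lookup steps above can be sketched against a simulated target. This is only an illustration for a 64-bit target: `TlsMetadata`, `TargetMemory`, and `FindTlsVariable` are hypothetical names, the flat byte image stands in for whatever memory-read API a debugger provides, and indexing the DTV directly by the 1-based module ID (with slot 0 reserved) is an assumption modeled on glibc's layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the four _thread_db_* values a debugger reads
// from the target's threading library.
struct TlsMetadata {
  uint64_t pthread_dtvp;          // offset from TP to the DTV pointer
  uint64_t dtv_slot_size;         // size of one DTV slot
  uint64_t dtv_pointer_val;       // offset of the block pointer within a slot
  uint64_t link_map_l_tls_modid;  // offset of the module ID within link_map
};

// Hypothetical view of the inferior's memory: a flat byte image whose
// addresses are simply indexes into `bytes`.
struct TargetMemory {
  std::vector<uint8_t> bytes;

  uint64_t ReadWord(uint64_t addr) const {
    assert(addr + sizeof(uint64_t) <= bytes.size());
    uint64_t value;
    std::memcpy(&value, bytes.data() + addr, sizeof(value));
    return value;
  }

  void WriteWord(uint64_t addr, uint64_t value) {
    assert(addr + sizeof(uint64_t) <= bytes.size());
    std::memcpy(bytes.data() + addr, &value, sizeof(value));
  }
};

// Follows the listed steps in order. It performs no staleness or bounds
// checking on the DTV.
uint64_t FindTlsVariable(const TargetMemory& mem, const TlsMetadata& md,
                         uint64_t link_map_addr, uint64_t thread_pointer,
                         uint64_t offset_in_module) {
  // Module ID from the link_map object.
  uint64_t module_id = mem.ReadWord(link_map_addr + md.link_map_l_tls_modid);
  // DTV pointer, found at a fixed offset from the thread pointer.
  uint64_t dtv = mem.ReadWord(thread_pointer + md.pthread_dtvp);
  // The module's slot, then the block pointer stored inside that slot.
  uint64_t slot = dtv + module_id * md.dtv_slot_size;
  uint64_t block = mem.ReadWord(slot + md.dtv_pointer_val);
  // The variable lives at its module-relative offset within the block.
  return block + offset_in_module;
}
```

A real debugger would replace `TargetMemory` with its own memory reader and obtain the thread pointer from the kernel.
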
434*8d67ca89SAndroid Build Coastguard WorkerThis process doesn't appear robust in the face of lazy DTV initialization -- presumably it could
435*8d67ca89SAndroid Build Coastguard Workerread past the end of an out-of-date DTV or access an unloaded module. To be robust, it needs to
436*8d67ca89SAndroid Build Coastguard Workercompare a module's initial generation count against the DTV's generation count. (XXX: Does gdb have
437*8d67ca89SAndroid Build Coastguard Workerthese sorts of problems with glibc's libpthread?)
438*8d67ca89SAndroid Build Coastguard Worker
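One possible shape for the generation comparison is sketched here. The structures and field names are hypothetical, not taken from any threading library, and the sketch assumes that a DTV records the generation at which it was last updated and how many slots it holds.

```cpp
#include <cstdint>

// Hypothetical snapshot of the DTV bookkeeping a debugger would read.
struct DtvSnapshot {
  uint64_t generation;  // generation at which this DTV was last updated
  uint64_t slot_count;  // number of module slots the DTV currently holds
};

// Hypothetical per-module record.
struct ModuleSnapshot {
  uint64_t module_id;         // 1-based TLS module ID
  uint64_t first_generation;  // generation at which the module was loaded
};

// Returns true only if the DTV is new enough, and large enough, for the
// module's slot to be meaningful. A false result means the slot may be out
// of range or may still describe a previously unloaded module.
bool DtvSlotIsTrustworthy(const DtvSnapshot& dtv, const ModuleSnapshot& mod) {
  if (mod.module_id == 0 || mod.module_id > dtv.slot_count) return false;
  return dtv.generation >= mod.first_generation;
}
```

When the check fails, a debugger could report that the variable has not been allocated for that thread instead of dereferencing the slot.
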
439*8d67ca89SAndroid Build Coastguard Worker## Reading the Thread Pointer with Ptrace
440*8d67ca89SAndroid Build Coastguard Worker
441*8d67ca89SAndroid Build Coastguard WorkerThere are ptrace interfaces for reading the thread pointer for each of arm32, arm64, x86, and x86-64
442*8d67ca89SAndroid Build Coastguard Worker(XXX: check 32-vs-64-bit for inferiors, debuggers, and kernels):
443*8d67ca89SAndroid Build Coastguard Worker * arm32: `PTRACE_GET_THREAD_AREA`
444*8d67ca89SAndroid Build Coastguard Worker * arm64: `PTRACE_GETREGSET`, `NT_ARM_TLS`
445*8d67ca89SAndroid Build Coastguard Worker * x86_32: `PTRACE_GET_THREAD_AREA`
446*8d67ca89SAndroid Build Coastguard Worker * x86_64: use `PTRACE_PEEKUSER` to read the `{fs,gs}_base` fields of `user_regs_struct`
447*8d67ca89SAndroid Build Coastguard Worker
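As a rough sketch, two of the interfaces listed above could be wrapped as follows. `ReadThreadPointer` is a hypothetical helper. It assumes a 64-bit Linux debugger tracing an inferior of the same architecture, that the thread is already ptrace-attached and stopped, and that `fs_base` (rather than `gs_base`) holds the thread pointer on x86-64. The 32-bit cases are left out.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <elf.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/user.h>

// Reads the thread pointer of the stopped, traced thread `tid`.
// Returns false on failure or on an architecture this sketch doesn't cover.
bool ReadThreadPointer(pid_t tid, uintptr_t* out_tp) {
#if defined(__x86_64__)
  // PTRACE_PEEKUSER returns the word itself, so errno distinguishes a
  // legitimate -1 value from an error.
  errno = 0;
  long value = ptrace(
      PTRACE_PEEKUSER, tid,
      reinterpret_cast<void*>(offsetof(struct user_regs_struct, fs_base)),
      static_cast<void*>(nullptr));
  if (value == -1 && errno != 0) return false;
  *out_tp = static_cast<uintptr_t>(value);
  return true;
#elif defined(__aarch64__)
  // The NT_ARM_TLS register set is read into a caller-provided buffer.
  uint64_t tp = 0;
  struct iovec iov = {&tp, sizeof(tp)};
  if (ptrace(PTRACE_GETREGSET, tid,
             reinterpret_cast<void*>(static_cast<uintptr_t>(NT_ARM_TLS)),
             &iov) == -1) {
    return false;
  }
  *out_tp = static_cast<uintptr_t>(tp);
  return true;
#else
  (void)tid;
  (void)out_tp;
  return false;
#endif
}
```
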
448*8d67ca89SAndroid Build Coastguard Worker# C/C++ Specifiers
449*8d67ca89SAndroid Build Coastguard Worker
450*8d67ca89SAndroid Build Coastguard WorkerC/C++ TLS variables are declared with a specifier:
451*8d67ca89SAndroid Build Coastguard Worker
452*8d67ca89SAndroid Build Coastguard WorkerSpecifier       | Notes
453*8d67ca89SAndroid Build Coastguard Worker--------------- | -----------------------------------------------------------------------------------------------------------------------------
454*8d67ca89SAndroid Build Coastguard Worker`__thread`      |  - non-standard, but ubiquitous in GCC and Clang<br/> - cannot have dynamic initialization or destruction
455*8d67ca89SAndroid Build Coastguard Worker`_Thread_local` |  - a keyword standardized in C11<br/> - cannot have dynamic initialization or destruction
456*8d67ca89SAndroid Build Coastguard Worker`thread_local`  |  - C11: a macro for `_Thread_local` via `threads.h`<br/> - C++11: a keyword, allows dynamic initialization and/or destruction
457*8d67ca89SAndroid Build Coastguard Worker
458*8d67ca89SAndroid Build Coastguard WorkerThe dynamic initialization and destruction of C++ `thread_local` variables is layered on top of ELF
459*8d67ca89SAndroid Build Coastguard WorkerTLS (or emutls), so this design document mostly ignores it. Like emutls, ELF TLS variables either
460*8d67ca89SAndroid Build Coastguard Workerhave a static initializer or are zero-initialized.
461*8d67ca89SAndroid Build Coastguard Worker
462*8d67ca89SAndroid Build Coastguard WorkerAside: Because a `__thread` variable cannot have dynamic initialization, `__thread` is more
463*8d67ca89SAndroid Build Coastguard Workerefficient in C++ than `thread_local` when the compiler cannot see the definition of a declared TLS
464*8d67ca89SAndroid Build Coastguard Workervariable. The compiler assumes the variable could have a dynamic initializer and generates code, at
465*8d67ca89SAndroid Build Coastguard Workereach access, to call a function to initialize the variable.
466*8d67ca89SAndroid Build Coastguard Worker
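A small illustration of the difference between the specifiers follows. The variable and function names are made up for this example. `_Thread_local` is omitted because it is a C keyword.

```cpp
// Hypothetical helper. Its result is not a constant expression, so a
// variable initialized from it needs dynamic initialization.
int ComputeInitialValue() { return 42; }

// Static initializer only. Writing `= ComputeInitialValue()` here would be
// rejected, because __thread variables cannot be dynamically initialized.
__thread int fast_counter = 7;

// C++11 keyword. The dynamic initializer is layered on top of ELF TLS (or
// emutls) and runs per thread.
thread_local int lazy_value = ComputeInitialValue();

// For a declaration whose definition is in another translation unit, e.g.
//   extern thread_local int other_tu_value;
// the compiler must assume a dynamic initializer may exist and guards each
// access with a call, whereas
//   extern __thread int other_tu_counter;
// can be accessed directly.

int ReadBoth() { return fast_counter + lazy_value; }
```
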
467*8d67ca89SAndroid Build Coastguard Worker# Graceful Failure on Old Platforms
468*8d67ca89SAndroid Build Coastguard Worker
469*8d67ca89SAndroid Build Coastguard WorkerELF TLS isn't implemented on older Android platforms, so dynamic executables and shared objects
470*8d67ca89SAndroid Build Coastguard Workerusing it generally won't work on them. Ideally, the older platforms would reject these binaries
471*8d67ca89SAndroid Build Coastguard Workerrather than experience memory corruption at run-time.
472*8d67ca89SAndroid Build Coastguard Worker
473*8d67ca89SAndroid Build Coastguard WorkerStatic executables aren't a problem--the necessary runtime support is part of the executable, so TLS
474*8d67ca89SAndroid Build Coastguard Workerjust works.
475*8d67ca89SAndroid Build Coastguard Worker
476*8d67ca89SAndroid Build Coastguard WorkerXXX: Shared objects are less of a problem.
477*8d67ca89SAndroid Build Coastguard Worker * On arm32, x86, and x86_64, the loader [should reject a TLS relocation]. (XXX: I haven't verified
478*8d67ca89SAndroid Build Coastguard Worker   this.)
479*8d67ca89SAndroid Build Coastguard Worker * On arm64, the primary TLS relocation (R_AARCH64_TLSDESC) is [confused with an obsolete
480*8d67ca89SAndroid Build Coastguard Worker   R_AARCH64_TLS_DTPREL32 relocation][R_AARCH64_TLS_DTPREL32] and is [quietly ignored].
481*8d67ca89SAndroid Build Coastguard Worker * Android P [added compatibility checks] for TLS symbols and `DT_TLSDESC_{GOT|PLT}` entries.
482*8d67ca89SAndroid Build Coastguard Worker
483*8d67ca89SAndroid Build Coastguard WorkerXXX: A dynamic executable using ELF TLS would have a PT_TLS segment and no other distinguishing
484*8d67ca89SAndroid Build Coastguard Workermarks, so running it on an older platform would result in memory corruption. Should we add something
485*8d67ca89SAndroid Build Coastguard Workerto these executables that only newer platforms recognize? (e.g. maybe an entry in .dynamic, a
486*8d67ca89SAndroid Build Coastguard Workerreference to a symbol only a new libc.so has...)
487*8d67ca89SAndroid Build Coastguard Worker
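Since the `PT_TLS` segment is the only distinguishing mark, a build-time or packaging check could at least detect affected executables. The helper below is a hypothetical sketch for 64-bit ELF files. It assumes the caller has already located and validated the program header table.

```cpp
#include <cstddef>
#include <elf.h>

// Returns true if any program header describes a TLS segment.
bool HasTlsSegment(const Elf64_Phdr* phdrs, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (phdrs[i].p_type == PT_TLS) return true;
  }
  return false;
}
```

This only reports that ELF TLS is in use. It doesn't make an older platform reject the binary.
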
488*8d67ca89SAndroid Build Coastguard Worker[should reject a TLS relocation]: https://android.googlesource.com/platform/bionic/+/android-8.1.0_r48/linker/linker.cpp#2852
489*8d67ca89SAndroid Build Coastguard Worker[R_AARCH64_TLS_DTPREL32]: https://android-review.googlesource.com/c/platform/bionic/+/723696
490*8d67ca89SAndroid Build Coastguard Worker[quietly ignored]: https://android.googlesource.com/platform/bionic/+/android-8.1.0_r48/linker/linker.cpp#2784
491*8d67ca89SAndroid Build Coastguard Worker[added compatibility checks]: https://android-review.googlesource.com/c/platform/bionic/+/648760
492*8d67ca89SAndroid Build Coastguard Worker
493*8d67ca89SAndroid Build Coastguard Worker## Loader/libc Communication
494*8d67ca89SAndroid Build Coastguard Worker
495*8d67ca89SAndroid Build Coastguard WorkerThe loader exposes a list of TLS modules ([`struct TlsModules`][TlsModules]) to `libc.so` using the
496*8d67ca89SAndroid Build Coastguard Worker`__libc_shared_globals` variable (see `tls_modules()` in [linker_tls.cpp][tls_modules-linker] and
497*8d67ca89SAndroid Build Coastguard Worker[elf_tls.cpp][tls_modules-libc]). `__tls_get_addr` in libc.so acquires the `TlsModules::mutex` and
498*8d67ca89SAndroid Build Coastguard Workeriterates its module list to lazily allocate and free TLS blocks.
499*8d67ca89SAndroid Build Coastguard Worker
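The interaction can be pictured with a heavily simplified model. None of the names below are Bionic's: `ModelTlsModules`, `ModelDtv`, and `ModelTlsGetAddr` are invented for illustration, and the real `struct TlsModules` has a different layout. The model ignores alignment, generation tracking, and freeing blocks. It takes the lock only on the slow path and aborts if allocation fails, both of which are choices of this sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical description of one module's TLS segment.
struct ModelTlsModule {
  size_t size = 0;                  // total segment size, assumed > 0
  std::vector<unsigned char> init;  // initialization image; the rest is zero
};

// Hypothetical loader-owned table shared with libc.
struct ModelTlsModules {
  std::mutex mutex;
  std::vector<ModelTlsModule> modules;  // index = module ID - 1
};

// Hypothetical per-thread DTV: one lazily filled block pointer per module.
struct ModelDtv {
  std::vector<void*> blocks;
};

void* ModelTlsGetAddr(ModelTlsModules& table, ModelDtv& dtv, size_t module_id,
                      size_t offset) {
  if (module_id == 0) std::abort();
  // Fast path: this thread already has a block for the module.
  if (module_id <= dtv.blocks.size() && dtv.blocks[module_id - 1] != nullptr) {
    return static_cast<char*>(dtv.blocks[module_id - 1]) + offset;
  }
  // Slow path: consult the shared table under its lock, then allocate.
  std::lock_guard<std::mutex> lock(table.mutex);
  if (module_id > table.modules.size()) std::abort();
  const ModelTlsModule& mod = table.modules[module_id - 1];
  if (mod.size == 0 || mod.init.size() > mod.size) std::abort();
  if (dtv.blocks.size() < table.modules.size()) {
    dtv.blocks.resize(table.modules.size(), nullptr);
  }
  void* block = std::calloc(1, mod.size);  // zero-filled
  if (block == nullptr) std::abort();
  if (!mod.init.empty()) std::memcpy(block, mod.init.data(), mod.init.size());
  dtv.blocks[module_id - 1] = block;
  return static_cast<char*>(block) + offset;
}
```
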
500*8d67ca89SAndroid Build Coastguard Worker[TlsModules]: https://android-review.googlesource.com/c/platform/bionic/+/723698/1/libc/bionic/elf_tls.h#53
501*8d67ca89SAndroid Build Coastguard Worker[tls_modules-linker]: https://android-review.googlesource.com/c/platform/bionic/+/723698/1/linker/linker_tls.cpp#45
502*8d67ca89SAndroid Build Coastguard Worker[tls_modules-libc]: https://android-review.googlesource.com/c/platform/bionic/+/723698/1/libc/bionic/elf_tls.cpp#49
503*8d67ca89SAndroid Build Coastguard Worker
504*8d67ca89SAndroid Build Coastguard Worker## TLS Allocator
505*8d67ca89SAndroid Build Coastguard Worker
Bionic currently allocates a `pthread_internal_t` object and static TLS in a single mmap'ed
507*8d67ca89SAndroid Build Coastguard Workerregion, along with a thread's stack if it needs one allocated. It doesn't place TLS memory on a
508*8d67ca89SAndroid Build Coastguard Workerpreallocated stack (either the main thread's stack or one provided with `pthread_attr_setstack`).
509*8d67ca89SAndroid Build Coastguard Worker
510*8d67ca89SAndroid Build Coastguard WorkerThe DTV and blocks for dlopen'ed modules are instead allocated using the Bionic loader's
511*8d67ca89SAndroid Build Coastguard Worker`LinkerMemoryAllocator`, adapted to avoid the STL and to provide `memalign`.
512*8d67ca89SAndroid Build Coastguard WorkerThe implementation tries to achieve async-signal safety by blocking signals and
513*8d67ca89SAndroid Build Coastguard Workeracquiring a lock.
514*8d67ca89SAndroid Build Coastguard Worker
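The "block signals, then lock" pattern might look roughly like this. The class and function names are hypothetical and this is not Bionic's code. It uses `sigprocmask` for brevity. Portable multithreaded code would use `pthread_sigmask` instead.

```cpp
#include <mutex>
#include <signal.h>

// Blocks every signal for the current thread and restores the previous mask
// on destruction.
class ScopedSignalBlocker {
 public:
  ScopedSignalBlocker() {
    sigset_t all;
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old_mask_);
  }
  ~ScopedSignalBlocker() { sigprocmask(SIG_SETMASK, &old_mask_, nullptr); }
  ScopedSignalBlocker(const ScopedSignalBlocker&) = delete;
  ScopedSignalBlocker& operator=(const ScopedSignalBlocker&) = delete;

 private:
  sigset_t old_mask_;
};

// Reports whether `sig` is in the calling thread's current signal mask.
bool IsSignalBlocked(int sig) {
  sigset_t current;
  sigprocmask(SIG_BLOCK, nullptr, &current);
  return sigismember(&current, sig) == 1;
}

std::mutex g_allocator_lock;  // hypothetical allocator lock

// Runs `fn` with signals blocked and the lock held. Signals are blocked
// before the lock is taken and restored only after it is released, so a
// handler on this thread can never run while the lock is held.
template <typename Fn>
auto WithAllocatorLocked(Fn fn) -> decltype(fn()) {
  ScopedSignalBlocker block_signals;
  std::lock_guard<std::mutex> lock(g_allocator_lock);
  return fn();
}
```
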
515*8d67ca89SAndroid Build Coastguard WorkerThere are three "entry points" to dynamically locate a TLS variable's address:
516*8d67ca89SAndroid Build Coastguard Worker * libc.so: `__tls_get_addr`
517*8d67ca89SAndroid Build Coastguard Worker * loader: TLSDESC dynamic resolver
518*8d67ca89SAndroid Build Coastguard Worker * loader: dlsym
519*8d67ca89SAndroid Build Coastguard Worker
520*8d67ca89SAndroid Build Coastguard WorkerThe loader's entry points need to call `__tls_get_addr`, which needs to allocate memory. Currently,
521*8d67ca89SAndroid Build Coastguard Workerthe implementation uses a [special function pointer] to call libc.so's `__tls_get_addr` from the loader.
522*8d67ca89SAndroid Build Coastguard Worker(This should probably be removed.)
523*8d67ca89SAndroid Build Coastguard Worker
524*8d67ca89SAndroid Build Coastguard WorkerThe implementation currently allows for arbitrarily-large TLS variable alignment. IIRC, different
525*8d67ca89SAndroid Build Coastguard Workerimplementations (glibc, musl, FreeBSD) vary in their level of respect for TLS alignment. It looks
526*8d67ca89SAndroid Build Coastguard Workerlike the Bionic loader ignores segments' alignment and aligns loaded libraries to 256 KiB. See
527*8d67ca89SAndroid Build Coastguard Worker`ReserveAligned`.
528*8d67ca89SAndroid Build Coastguard Worker
529*8d67ca89SAndroid Build Coastguard Worker[special function pointer]: https://android-review.googlesource.com/c/platform/bionic/+/723698/1/libc/private/bionic_globals.h#52
530*8d67ca89SAndroid Build Coastguard Worker
531*8d67ca89SAndroid Build Coastguard Worker## Async-Signal Safety
532*8d67ca89SAndroid Build Coastguard Worker
533*8d67ca89SAndroid Build Coastguard WorkerThe implementation's `__tls_get_addr` might be async-signal safe. Making it AS-safe is a good idea if
534*8d67ca89SAndroid Build Coastguard Workerit's feasible. musl's function is AS-safe, but glibc's isn't (or wasn't). Google had a patch to make
535*8d67ca89SAndroid Build Coastguard Workerglibc AS-safe back in 2012-2013. See:
536*8d67ca89SAndroid Build Coastguard Worker * https://sourceware.org/glibc/wiki/TLSandSignals
537*8d67ca89SAndroid Build Coastguard Worker * https://sourceware.org/ml/libc-alpha/2012-06/msg00335.html
538*8d67ca89SAndroid Build Coastguard Worker * https://sourceware.org/ml/libc-alpha/2013-09/msg00563.html
539*8d67ca89SAndroid Build Coastguard Worker
540*8d67ca89SAndroid Build Coastguard Worker## Out-of-Memory Handling (abort)
541*8d67ca89SAndroid Build Coastguard Worker
542*8d67ca89SAndroid Build Coastguard WorkerThe implementation lazily allocates TLS memory for dlopen'ed modules (see `__tls_get_addr`), and an
543*8d67ca89SAndroid Build Coastguard Workerout-of-memory error on a TLS access aborts the process. musl, on the other hand, preallocates TLS
544*8d67ca89SAndroid Build Coastguard Workermemory on `pthread_create` and `dlopen`, so either function can return out-of-memory. Both functions
545*8d67ca89SAndroid Build Coastguard Workerprobably need to acquire the same lock.
546*8d67ca89SAndroid Build Coastguard Worker
Maybe Bionic should do the same as musl? Perhaps musl's robustness argument doesn't hold for
Bionic, though, because Bionic (at least the linker) probably already aborts on OOM. musl doesn't
support `dlclose`/unloading, so it might have an easier time.
550*8d67ca89SAndroid Build Coastguard Worker
551*8d67ca89SAndroid Build Coastguard WorkerOn the other hand, maybe lazy allocation is a feature, because not all threads will use a dlopen'ed
552*8d67ca89SAndroid Build Coastguard Workersolib's TLS variables. Drepper makes this argument in his TLS document:
553*8d67ca89SAndroid Build Coastguard Worker
554*8d67ca89SAndroid Build Coastguard Worker> In addition the run-time support should avoid creating the thread-local storage if it is not
555*8d67ca89SAndroid Build Coastguard Worker> necessary. For instance, a loaded module might only be used by one thread of the many which make
556*8d67ca89SAndroid Build Coastguard Worker> up the process. It would be a waste of memory and time to allocate the storage for all threads. A
557*8d67ca89SAndroid Build Coastguard Worker> lazy method is wanted. This is not much extra burden since the requirement to handle dynamically
558*8d67ca89SAndroid Build Coastguard Worker> loaded objects already requires recognizing storage which is not yet allocated. This is the only
559*8d67ca89SAndroid Build Coastguard Worker> alternative to stopping all threads and allocating storage for all threads before letting them run
560*8d67ca89SAndroid Build Coastguard Worker> again.
561*8d67ca89SAndroid Build Coastguard Worker
562*8d67ca89SAndroid Build Coastguard WorkerFWIW: emutls also aborts on out-of-memory.
563*8d67ca89SAndroid Build Coastguard Worker
564*8d67ca89SAndroid Build Coastguard Worker## ELF TLS Not Usable in libc Itself
565*8d67ca89SAndroid Build Coastguard Worker
566*8d67ca89SAndroid Build Coastguard WorkerThe dynamic loader currently can't use ELF TLS, so any part of libc linked into the loader (i.e.
567*8d67ca89SAndroid Build Coastguard Workermost of it) also can't use ELF TLS. It might be possible to lift this restriction, perhaps with
568*8d67ca89SAndroid Build Coastguard Workerspecialized `__tls_get_addr` and TLSDESC resolver functions.
569*8d67ca89SAndroid Build Coastguard Worker
570*8d67ca89SAndroid Build Coastguard Worker# Open Issues
571*8d67ca89SAndroid Build Coastguard Worker
572*8d67ca89SAndroid Build Coastguard Worker## Bionic Memory Layout Conflicts with Common TLS Layout
573*8d67ca89SAndroid Build Coastguard Worker
574*8d67ca89SAndroid Build Coastguard WorkerBionic already allocates thread-specific data in a way that conflicts with TLS variants 1 and 2:
575*8d67ca89SAndroid Build Coastguard Worker![Bionic TLS Layout in Android P](img/bionic-tls-layout-in-p.png)
576*8d67ca89SAndroid Build Coastguard Worker
577*8d67ca89SAndroid Build Coastguard WorkerTLS variant 1 allocates everything after the TP to ELF TLS (except the first two words), and variant
578*8d67ca89SAndroid Build Coastguard Worker2 allocates everything before the TP. Bionic currently allocates memory before and after the TP to
579*8d67ca89SAndroid Build Coastguard Workerthe `pthread_internal_t` struct.
580*8d67ca89SAndroid Build Coastguard Worker
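To make the two layouts concrete, the TP-relative offset of the executable's TLS segment can be approximated as below. The function names are illustrative, and the sketch ignores the adjustment needed when the segment's address is not itself a multiple of its alignment.

```cpp
#include <cstdint>

// Rounds `value` up to a multiple of `align`, which must be a power of two.
constexpr uint64_t AlignUp(uint64_t value, uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Variant 1: the segment starts after the TCB, so the offset is positive.
constexpr int64_t Variant1Offset(uint64_t tcb_size, uint64_t seg_align) {
  return static_cast<int64_t>(AlignUp(tcb_size, seg_align));
}

// Variant 2: the segment ends at the TP, so the offset is negative.
constexpr int64_t Variant2Offset(uint64_t seg_size, uint64_t seg_align) {
  return -static_cast<int64_t>(AlignUp(seg_size, seg_align));
}
```

With a two-word (16-byte) TCB on a 64-bit target, `Variant1Offset(16, 8)` is 16, while `Variant2Offset(20, 8)` is -24. Memory that Bionic places on either side of the TP therefore collides with one variant or the other.
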
581*8d67ca89SAndroid Build Coastguard WorkerThe `bionic_tls.h` header is marked with a warning:
582*8d67ca89SAndroid Build Coastguard Worker
583*8d67ca89SAndroid Build Coastguard Worker```cpp
584*8d67ca89SAndroid Build Coastguard Worker/** WARNING WARNING WARNING
585*8d67ca89SAndroid Build Coastguard Worker **
586*8d67ca89SAndroid Build Coastguard Worker ** This header file is *NOT* part of the public Bionic ABI/API
587*8d67ca89SAndroid Build Coastguard Worker ** and should not be used/included by user-serviceable parts of
588*8d67ca89SAndroid Build Coastguard Worker ** the system (e.g. applications).
589*8d67ca89SAndroid Build Coastguard Worker **
590*8d67ca89SAndroid Build Coastguard Worker ** It is only provided here for the benefit of the system dynamic
591*8d67ca89SAndroid Build Coastguard Worker ** linker and the OpenGL sub-system (which needs to access the
592*8d67ca89SAndroid Build Coastguard Worker ** pre-allocated slot directly for performance reason).
593*8d67ca89SAndroid Build Coastguard Worker **/
594*8d67ca89SAndroid Build Coastguard Worker```
595*8d67ca89SAndroid Build Coastguard Worker
596*8d67ca89SAndroid Build Coastguard WorkerThere are issues with rearranging this memory:
597*8d67ca89SAndroid Build Coastguard Worker
598*8d67ca89SAndroid Build Coastguard Worker * `TLS_SLOT_STACK_GUARD` is used for `-fstack-protector`. The location (word #5) was initially used
599*8d67ca89SAndroid Build Coastguard Worker   by GCC on x86 (and x86-64), where it is compatible with x86's TLS variant 2. We [modified Clang
600*8d67ca89SAndroid Build Coastguard Worker   to use this slot for arm64 in 2016][D18632], though, and the slot isn't compatible with ARM's
601*8d67ca89SAndroid Build Coastguard Worker   variant 1 layout. This change shipped in NDK r14, and the NDK's build systems (ndk-build and the
602*8d67ca89SAndroid Build Coastguard Worker   CMake toolchain file) enable `-fstack-protector-strong` by default.
603*8d67ca89SAndroid Build Coastguard Worker
604*8d67ca89SAndroid Build Coastguard Worker * `TLS_SLOT_TSAN` is used for more than just TSAN -- it's also used by [HWASAN and
605*8d67ca89SAndroid Build Coastguard Worker   Scudo](https://reviews.llvm.org/D53906#1285002).
606*8d67ca89SAndroid Build Coastguard Worker
607*8d67ca89SAndroid Build Coastguard Worker * The Go runtime allocates a thread-local "g" variable on Android by creating a pthread key and
608*8d67ca89SAndroid Build Coastguard Worker   searching for its TP-relative offset, which it assumes is nonnegative:
609*8d67ca89SAndroid Build Coastguard Worker    * On arm32/arm64, it creates a pthread key, sets it to a magic value, then scans forward from
610*8d67ca89SAndroid Build Coastguard Worker      the thread pointer looking for it. [The scan count was bumped to 384 to fix a reported
611*8d67ca89SAndroid Build Coastguard Worker      breakage happening with Android N.](https://go-review.googlesource.com/c/go/+/38636) (XXX: I
612*8d67ca89SAndroid Build Coastguard Worker      suspect the actual platform breakage happened with Android M's [lock-free pthread key
613*8d67ca89SAndroid Build Coastguard Worker      work][bionic-lockfree-keys].)
614*8d67ca89SAndroid Build Coastguard Worker    * On x86/x86-64, it uses a fixed offset from the thread pointer (TP+0xf8 or TP+0x1d0) and
615*8d67ca89SAndroid Build Coastguard Worker      creates pthread keys until one of them hits the fixed offset.
616*8d67ca89SAndroid Build Coastguard Worker    * CLs:
617*8d67ca89SAndroid Build Coastguard Worker       * arm32: https://codereview.appspot.com/106380043
618*8d67ca89SAndroid Build Coastguard Worker       * arm64: https://go-review.googlesource.com/c/go/+/17245
619*8d67ca89SAndroid Build Coastguard Worker       * x86: https://go-review.googlesource.com/c/go/+/16678
620*8d67ca89SAndroid Build Coastguard Worker       * x86-64: https://go-review.googlesource.com/c/go/+/15991
621*8d67ca89SAndroid Build Coastguard Worker    * Moving the pthread keys before the thread pointer breaks Go-based apps.
622*8d67ca89SAndroid Build Coastguard Worker    * It's unclear how many Android apps use Go. There are at least two with 1,000,000+ installs.
623*8d67ca89SAndroid Build Coastguard Worker    * [Some motivation for Go's design][golang-post], [runtime/HACKING.md][go-hacking]
624*8d67ca89SAndroid Build Coastguard Worker    * [On x86/x86-64 Darwin, Go uses a TLS slot reserved for both Go and Wine][go-darwin-x86] (On
625*8d67ca89SAndroid Build Coastguard Worker      [arm32][go-darwin-arm32]/[arm64][go-darwin-arm64] Darwin, Go scans for pthread keys like it
626*8d67ca89SAndroid Build Coastguard Worker      does on Android.)
627*8d67ca89SAndroid Build Coastguard Worker
628*8d67ca89SAndroid Build Coastguard Worker * Android's "native bridge" system allows the Zygote to load an app solib of a non-native ABI. (For
629*8d67ca89SAndroid Build Coastguard Worker   example, it could be used to load an arm32 solib into an x86 Zygote.) The solib is translated
630*8d67ca89SAndroid Build Coastguard Worker   into the host architecture. TLS accesses in the app solib (whether ELF TLS, Bionic slots, or
631*8d67ca89SAndroid Build Coastguard Worker   `pthread_internal_t` fields) become host accesses. Laying out TLS memory differently across
632*8d67ca89SAndroid Build Coastguard Worker   architectures could complicate this translation.
633*8d67ca89SAndroid Build Coastguard Worker
634*8d67ca89SAndroid Build Coastguard Worker * A `pthread_t` is practically just a `pthread_internal_t*`, and some apps directly access the
635*8d67ca89SAndroid Build Coastguard Worker   `pthread_internal_t::tid` field. Past examples: http://b/17389248, [aosp/107467]. Reorganizing
636*8d67ca89SAndroid Build Coastguard Worker   the initial `pthread_internal_t` fields could break those apps.
637*8d67ca89SAndroid Build Coastguard Worker
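The Go runtime's arm32/arm64 scan described in the list above can be modeled on an ordinary array, which shows why moving the pthread keys before the TP breaks it. This is a simulation, not Go's code. `FindSlotAfterTp` is a hypothetical name, and treating the limit of 384 as a count of pointer-sized words is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

constexpr size_t kScanWords = 384;  // scan limit mentioned above

// Scans forward from `tp` for `magic`. Returns the word index at which it
// was found, or -1 if it is not within the scanned range. Words before `tp`
// are never examined.
long FindSlotAfterTp(const uintptr_t* tp, size_t readable_words,
                     uintptr_t magic) {
  size_t limit = readable_words < kScanWords ? readable_words : kScanWords;
  for (size_t i = 0; i < limit; ++i) {
    if (tp[i] == magic) return static_cast<long>(i);
  }
  return -1;
}
```

If the key's value is stored at a nonnegative offset within the limit, the scan finds it. If the storage moves to a negative offset, the scan fails no matter how large the limit is.
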
638*8d67ca89SAndroid Build Coastguard WorkerIt seems easy to fix the incompatibility for variant 2 (x86 and x86_64) by splitting out the Bionic
639*8d67ca89SAndroid Build Coastguard Workerslots into a new data structure. Variant 1 is a harder problem.
640*8d67ca89SAndroid Build Coastguard Worker
641*8d67ca89SAndroid Build Coastguard WorkerThe TLS prototype used a patched LLD that uses a variant 1 TLS layout with a 16-word TCB
642*8d67ca89SAndroid Build Coastguard Workeron all architectures.
643*8d67ca89SAndroid Build Coastguard Worker
644*8d67ca89SAndroid Build Coastguard WorkerAside: gcc's arm64ilp32 target uses a 32-bit unsigned offset for a TLS IE access
645*8d67ca89SAndroid Build Coastguard Worker(https://godbolt.org/z/_NIXjF). If Android ever supports this target, and in a configuration with
646*8d67ca89SAndroid Build Coastguard Workervariant 2 TLS, we might need to change the compiler to emit a sign-extending load.
647*8d67ca89SAndroid Build Coastguard Worker
648*8d67ca89SAndroid Build Coastguard Worker[D18632]: https://reviews.llvm.org/D18632
649*8d67ca89SAndroid Build Coastguard Worker[bionic-lockfree-keys]: https://android-review.googlesource.com/c/platform/bionic/+/134202
650*8d67ca89SAndroid Build Coastguard Worker[golang-post]: https://groups.google.com/forum/#!msg/golang-nuts/EhndTzcPJxQ/i-w7kAMfBQAJ
651*8d67ca89SAndroid Build Coastguard Worker[go-hacking]: https://github.com/golang/go/blob/master/src/runtime/HACKING.md
652*8d67ca89SAndroid Build Coastguard Worker[go-darwin-x86]: https://github.com/golang/go/issues/23617
653*8d67ca89SAndroid Build Coastguard Worker[go-darwin-arm32]: https://github.com/golang/go/blob/15c106d99305411b587ec0d9e80c882e538c9d47/src/runtime/cgo/gcc_darwin_arm.c
654*8d67ca89SAndroid Build Coastguard Worker[go-darwin-arm64]: https://github.com/golang/go/blob/15c106d99305411b587ec0d9e80c882e538c9d47/src/runtime/cgo/gcc_darwin_arm64.c
655*8d67ca89SAndroid Build Coastguard Worker[aosp/107467]: https://android-review.googlesource.com/c/platform/bionic/+/107467
656*8d67ca89SAndroid Build Coastguard Worker
657*8d67ca89SAndroid Build Coastguard Worker### Workaround: Use Variant 2 on arm32/arm64
658*8d67ca89SAndroid Build Coastguard Worker
659*8d67ca89SAndroid Build Coastguard WorkerPros: simplifies Bionic
660*8d67ca89SAndroid Build Coastguard Worker
661*8d67ca89SAndroid Build Coastguard WorkerCons:
662*8d67ca89SAndroid Build Coastguard Worker * arm64: requires either subtle reinterpretation of a TLS relocation or addition of a new
663*8d67ca89SAndroid Build Coastguard Worker   relocation
 * arm64: a new TLS relocation reduces compiler/assembler compatibility with non-Android targets
665*8d67ca89SAndroid Build Coastguard Worker
666*8d67ca89SAndroid Build Coastguard WorkerThe point of variant 2 was backwards-compatibility, and ARM Android needs to remain
667*8d67ca89SAndroid Build Coastguard Workerbackwards-compatible, so we could use variant 2 for ARM. Problems:
668*8d67ca89SAndroid Build Coastguard Worker
669*8d67ca89SAndroid Build Coastguard Worker * When linking an executable, the static linker needs to know how TLS is allocated because it
670*8d67ca89SAndroid Build Coastguard Worker   writes TP-relative offsets for IE/LE-model accesses. Clang doesn't tell the linker to target
671*8d67ca89SAndroid Build Coastguard Worker   Android, so it could pass an `--tls-variant2` flag to configure lld.
672*8d67ca89SAndroid Build Coastguard Worker
673*8d67ca89SAndroid Build Coastguard Worker * On arm64, there are different sets of static LE relocations accommodating different ranges of
674*8d67ca89SAndroid Build Coastguard Worker   offsets from TP:
675*8d67ca89SAndroid Build Coastguard Worker
676*8d67ca89SAndroid Build Coastguard Worker   Size | TP offset range   | Static LE relocation types
677*8d67ca89SAndroid Build Coastguard Worker   ---- | ----------------- | ---------------------------------------
678*8d67ca89SAndroid Build Coastguard Worker   12   | 0 <= x < 2^12     | `R_AARCH64_TLSLE_ADD_TPREL_LO12`
679*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST8_TPREL_LO12`
680*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST16_TPREL_LO12`
681*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST32_TPREL_LO12`
682*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST64_TPREL_LO12`
683*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST128_TPREL_LO12`
684*8d67ca89SAndroid Build Coastguard Worker   16   | -2^16 <= x < 2^16 | `R_AARCH64_TLSLE_MOVW_TPREL_G0`
685*8d67ca89SAndroid Build Coastguard Worker   24   | 0 <= x < 2^24     | `R_AARCH64_TLSLE_ADD_TPREL_HI12`
686*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_ADD_TPREL_LO12_NC`
687*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST8_TPREL_LO12_NC`
688*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST16_TPREL_LO12_NC`
689*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC`
690*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC`
691*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_LDST128_TPREL_LO12_NC`
692*8d67ca89SAndroid Build Coastguard Worker   32   | -2^32 <= x < 2^32 | `R_AARCH64_TLSLE_MOVW_TPREL_G1`
693*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_MOVW_TPREL_G0_NC`
694*8d67ca89SAndroid Build Coastguard Worker   48   | -2^48 <= x < 2^48 | `R_AARCH64_TLSLE_MOVW_TPREL_G2`
695*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_MOVW_TPREL_G1_NC`
696*8d67ca89SAndroid Build Coastguard Worker   "    | "                 | `R_AARCH64_TLSLE_MOVW_TPREL_G0_NC`
697*8d67ca89SAndroid Build Coastguard Worker
   GCC for arm64 defaults to the 24-bit model and has an `-mtls-size=SIZE` option for setting other
   supported sizes. (It supports 12, 24, 32, and 48.) Clang has only implemented the 24-bit model,
   but that could change. (Clang [briefly used][D44355] load/store relocations, but the change was
   reverted because no linker supported them: [BFD], [Gold], [LLD].)

   The 16-, 32-, and 48-bit models use a `movn` or `movz` instruction to set the highest 16 bits to
   a positive or negative value, then `movk` to set the remaining 16-bit chunks. In principle, these
   relocations should be able to accommodate a negative TP offset.

   The 24-bit model uses `add` to set the high 12 bits, then places the low 12 bits into another
   `add` or a load/store instruction.

Maybe we could modify the `R_AARCH64_TLSLE_ADD_TPREL_HI12` relocation to allow a negative TP offset
by converting the relocated `add` instruction to a `sub`. Alternatively, we could add a new
`R_AARCH64_TLSLE_SUB_TPREL_HI12` relocation, and Clang would use a different TLS LE instruction
sequence when targeting Android/arm64.

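
To make the sign problem in the 24-bit model concrete, here is a minimal C++ sketch of how a TP
offset might be split into the two 12-bit immediates. All names are hypothetical and none of this
is linker code. `SplitForSub` shows just one conceivable way a `sub`-based HI12 relocation could
work (subtract whole 4 KiB units, then add back a non-negative low part); that scheme is an
assumption made for illustration, not an existing ABI.

```cpp
#include <cstdint>

// Illustrative only: these names are hypothetical and are not taken from any linker.
struct HiLo12 {
  std::uint32_t hi12;  // immediate for the HI12 instruction (applied shifted left by 12)
  std::uint32_t lo12;  // immediate for the LO12 `add` or load/store
};

// The 24-bit model accepts 0 <= x < 2^24, so any negative TP offset is rejected.
constexpr bool FitsTlsLe24(std::int64_t tp_offset) {
  return tp_offset >= 0 && tp_offset < (std::int64_t{1} << 24);
}

// Existing scheme: `add` the high 12 bits, then `add` (or load/store with) the low 12 bits.
constexpr HiLo12 SplitForAdd(std::int64_t tp_offset) {
  return HiLo12{static_cast<std::uint32_t>((tp_offset >> 12) & 0xfff),
                static_cast<std::uint32_t>(tp_offset & 0xfff)};
}

// Assumed scheme for a negative offset: `sub` enough 4 KiB units to go below the
// target, then add back a low part in [0, 4095] with the unchanged LO12 instruction.
constexpr HiLo12 SplitForSub(std::int64_t tp_offset) {
  return HiLo12{static_cast<std::uint32_t>((-tp_offset + 0xfff) >> 12),
                static_cast<std::uint32_t>(tp_offset & 0xfff)};
}

static_assert(!FitsTlsLe24(-8), "a negative TP offset does not fit the `add` form");
static_assert(SplitForAdd(0x12345).hi12 == 0x12, "high 12 bits of a positive offset");
static_assert(SplitForSub(-8).hi12 == 1 && SplitForSub(-8).lo12 == 0xff8,
              "-8 == -(1 << 12) + 0xff8");
```
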
 * LLD's arm64 relaxations from GD and IE to LE would need to use `movn` instead of `movz` for
   Android.

 * Binaries linked with the flag crash on non-Bionic, and binaries without the flag crash on Bionic.
   We might want to mark the binaries somehow to indicate the non-standard TLS ABI. Suggestions:
    * Use an `--android-tls-variant2` flag (or `--bionic-tls-variant2`, because we're trying to make
      [Bionic run on the host](http://b/31559095))
    * Add a `PT_ANDROID_TLS_TPOFF` segment?
    * Add a [`.note.gnu.property`](https://reviews.llvm.org/D53906#1283425) with a
      "`GNU_PROPERTY_TLS_TPOFF`" property value?

[D44355]: https://reviews.llvm.org/D44355
[BFD]: https://sourceware.org/bugzilla/show_bug.cgi?id=22970
[Gold]: https://sourceware.org/bugzilla/show_bug.cgi?id=22969
[LLD]: https://bugs.llvm.org/show_bug.cgi?id=36727

### Workaround: Reserve an Extra-Large TCB on ARM

Pros: Minimal linker change, no change to TLS relocations.

Cons: The reserved amount becomes an arbitrary but immutable part of the Android ABI.

Add an LLD option: `--android-tls[-tcb=SIZE]`

As with the first workaround, we'd probably want to mark the binary to indicate the non-standard
TP-to-TLS-segment offset.

Reservation amount:
 * We would reserve at least 6 words to cover the stack guard.
 * Reserving 16 words covers all the existing Bionic slots and gives a little room for expansion.
   (If we ever needed more than 16 slots, we could allocate the space before TP.)
 * 16 words isn't enough for the pthread keys, so the Go runtime is still a problem.
 * Reserving 138 words is enough for existing slots and pthread keys.

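
To compare the candidate reservations in bytes, the sketch below converts a word count into the
offset from TP at which the executable's TLS segment would begin. The helper names are
hypothetical. Rounding the reserved size up to the segment's alignment is an assumption based on
the usual variant 1 layout, not something an LLD option defines today.

```cpp
#include <cstddef>

// Hypothetical helper: round `value` up to `align`, which must be a power of two.
constexpr std::size_t AlignUp(std::size_t value, std::size_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Offset from TP to the executable's TLS segment when `tcb_words` words are reserved.
// Assumes the segment starts at the first suitably aligned offset past the reserved area.
constexpr std::size_t TlsSegmentOffset(std::size_t tcb_words, std::size_t word_size,
                                       std::size_t segment_align) {
  return AlignUp(tcb_words * word_size, segment_align);
}

// arm64 (8-byte words) for the word counts discussed in the list above:
static_assert(TlsSegmentOffset(6, 8, 8) == 48, "stack guard only");
static_assert(TlsSegmentOffset(16, 8, 8) == 128, "all existing Bionic slots");
static_assert(TlsSegmentOffset(138, 8, 64) == 1152, "slots and pthread keys, 64-byte aligned");
```
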
### Workaround: Use Variant 1 Everywhere with an Extra-Large TCB

Pros:
 * memory layout is the same on all architectures, avoids native bridge complications
 * x86/x86-64 relocations probably handle positive offsets without issue

Cons:
 * The reserved amount is still arbitrary.

### Workaround: No LE Model in Android Executables

Pros:
 * Keeps options open. We can allow LE later if we want.
 * Bionic's existing memory layout doesn't change, and arm32 and 32-bit x86 have the same layout
 * Fixes everything but static executables

Cons:
 * more intrusive toolchain changes (affects both Clang and LLD)
 * statically-linked executables still need another workaround
 * somewhat larger/slower executables (they must use IE, not LE)

The layout conflict is apparently only a problem because an executable assumes that its TLS segment
is located at a statically-known offset from the TP (i.e. it uses the LE model). An initially-loaded
shared object can still use the efficient IE access model, but its TLS segment offset is known at
load-time, not link-time. If we can guarantee that Android's executables also use the IE model, not
LE, then the Bionic loader can place the executable's TLS segment at any offset from the TP, leaving
the existing thread-specific memory layout untouched.

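
For illustration, GCC and Clang already let source code request the IE model for a single variable
through the `tls_model` attribute, or for a whole translation unit through
`-ftls-model=initial-exec`. The sketch below only shows that source-level request; the variable
and function names are made up. A linker may still relax the resulting IE sequence to LE when
producing an executable, which is the behavior this workaround would need to prevent.

```cpp
#include <cassert>

// Hypothetical variable: ask the compiler for initial-exec accesses instead of local-exec.
thread_local int tls_counter __attribute__((tls_model("initial-exec"))) = 0;

// With IE, each access loads the variable's TP offset from the GOT at run time,
// unless the linker relaxes the access to a link-time constant offset (LE).
int BumpCounter() {
  return ++tls_counter;
}

// Consecutive calls on one thread see the same per-thread variable.
void CheckCounterOnThisThread() {
  int first = BumpCounter();
  assert(BumpCounter() == first + 1);
}
```
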
This workaround doesn't help with statically-linked executables, but they're probably less of a
problem, because the linker and `libc.a` are usually packaged together.

A likely problem: LD is normally relaxed to LE, not to IE. We'd either have to disable LD usage in
the compiler (bad for performance) or add LD->IE relaxation. This relaxation requires that IE code
sequences be no larger than LD code sequences, which may not be the case on some architectures.
(XXX: In some past testing, it looked feasible for TLSDESC but not the traditional design.)

To implement:
 * Clang would need to stop generating LE accesses.
 * LLD would need to relax GD and LD to IE instead of LE.
 * LLD should abort if it sees a TLS LE relocation.
 * LLD must not statically resolve an executable's IE relocation in the GOT. (It might assume that
   it knows its value.)
 * Perhaps LLD should mark executables specially, because a normal ELF linker's output would quietly
   trample on `pthread_internal_t`. We need something like `DF_STATIC_TLS`, but instead of
   indicating IE in a shared object, we want to indicate the lack of LE in an executable.

### (Non-)workaround for Go: Allocate a Slot with Go's Magic Values

The Go runtime allocates its thread-local "g" variable by searching for a hard-coded magic constant
(`0x23581321` for arm32 and `0x23581321345589` for arm64). As long as it finds its constant at a
small positive offset from TP (within the first 384 words), it will think it has found the pthread
key it allocated.

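
The search described above can be modeled with a short C++ sketch (C++14 or later). This is not
Go's code: the function name and signature are hypothetical, and only the arm64 constant and the
384-word limit come from the text.

```cpp
#include <cstddef>
#include <cstdint>

// Values quoted in the text above.
constexpr std::uint64_t kGoMagicArm64 = 0x23581321345589;
constexpr std::size_t kGoScanWords = 384;

// Hypothetical model of the search: return the index of the first word at or after
// `tp` that holds `magic`, or -1 if none of the first `n_words` words does.
constexpr std::ptrdiff_t FindMagicSlot(const std::uint64_t* tp, std::size_t n_words,
                                       std::uint64_t magic) {
  for (std::size_t i = 0; i < n_words; ++i) {
    if (tp[i] == magic) return static_cast<std::ptrdiff_t>(i);
  }
  return -1;
}
```

Any word in the scanned range that happens to hold the constant satisfies this search, which is
why a reserved slot pre-filled with the magic value would be mistaken for the pthread key.
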
As a temporary compatibility hack, we might try to keep these programs running by reserving a TLS
slot with this magic value. This hack doesn't appear to work, however. The runtime finds its pthread
key, but apps segfault. Perhaps the Go runtime expects its "g" variable to be zero-initialized ([one
example][go-tlsg-zero]). With this hack, the slot is never zero, whereas with Go's current
allocation strategy it is typically zero. After [Bionic's pthread key system was rewritten to be
lock-free][bionic-lockfree-keys] for Android M, though, a zero value is not guaranteed, because a
key could be recycled.

[go-tlsg-zero]: https://go.googlesource.com/go/+/5bc1fd42f6d185b8ff0201db09fb82886978908b/src/runtime/asm_arm64.s#980

### Workaround for Go: place pthread keys after the executable's TLS

Most Android executables do not use any `thread_local` variables. In the prototype, with the
AOSP hikey960 build, only `/system/bin/netd` had a TLS segment, and it was only 32 bytes. As long as
`/system/bin/app_process{32,64}` limits its use of TLS memory, then the pthread keys could be
allocated after `app_process`' TLS segment, and Go will still find them.

Go scans 384 words from the thread pointer. If there are at most 16 Bionic slots and 130 pthread
keys (2 words per key), then `app_process` can use at most 108 words of TLS memory.

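
The 108-word budget follows directly from the numbers above. A small C++ sketch of the arithmetic,
with a hypothetical function name and assuming the slots and keys together fit inside the scanned
range:

```cpp
#include <cstddef>

// Words left for the executable's TLS segment inside the range that Go scans.
// Assumes bionic_slots + pthread_keys * words_per_key <= scan_words.
constexpr std::size_t MaxExecutableTlsWords(std::size_t scan_words, std::size_t bionic_slots,
                                            std::size_t pthread_keys,
                                            std::size_t words_per_key) {
  return scan_words - bionic_slots - pthread_keys * words_per_key;
}

// 384 - 16 - 130 * 2 = 108 words, i.e. 864 bytes on arm64 or 432 bytes on arm32.
static_assert(MaxExecutableTlsWords(384, 16, 130, 2) == 108, "budget in words");
static_assert(MaxExecutableTlsWords(384, 16, 130, 2) * 8 == 864, "budget in bytes on arm64");
```
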
Drawback: In principle, this might make pthread key accesses slower, because Bionic can't assume
that pthread keys are at a fixed offset from the thread pointer anymore. It must load an offset from
somewhere (a global variable, another TLS slot, ...). `__get_thread()` already uses a TLS slot to
find `pthread_internal_t`, though, rather than assume a fixed offset. (XXX: I think it could be
optimized.)

## TODO: Memory Layout Querying APIs (Proposed)

 * https://sourceware.org/glibc/wiki/ThreadPropertiesAPI
 * http://b/30609580

## TODO: Sanitizers

XXX: Maybe a sanitizer would want to intercept allocations of TLS memory, and that could be hard if
the loader is allocating it.
 * It looks like glibc's ld.so re-relocates itself after loading a program, so a program's symbols
   can interpose calls in the loader: https://sourceware.org/ml/libc-alpha/2014-01/msg00501.html

## TODO: Other

Missing:
 * `dlsym` of a TLS variable
 * debugger support

# References

General (and x86/x86-64):
 * Ulrich Drepper's TLS document, ["ELF Handling For Thread-Local Storage."][drepper] Describes the
   overall ELF TLS design and ABI details for x86 and x86-64 (as well as several other architectures
   that Android doesn't target).
 * Alexandre Oliva's TLSDESC proposal with details for x86 and x86-64: ["Thread-Local Storage
   Descriptors for IA32 and AMD64/EM64T."][tlsdesc-x86]
 * [x86 and x86-64 SystemV psABIs][psabi-x86].

arm32:
 * Alexandre Oliva's TLSDESC proposal for arm32: ["Thread-Local Storage Descriptors for the ARM
   platform."][tlsdesc-arm]
 * ["Addenda to, and Errata in, the ABI for the ARM® Architecture."][arm-addenda] Section 3,
   "Addendum: Thread Local Storage" has details for arm32 non-TLSDESC ELF TLS.
 * ["Run-time ABI for the ARM® Architecture."][arm-rtabi] Documents `__aeabi_read_tp`.
 * ["ELF for the ARM® Architecture."][arm-elf] Lists TLS relocations (traditional and TLSDESC).

arm64:
 * [2015 LLVM bugtracker comment][llvm22408] with an excerpt from an unnamed ARM draft specification
   describing arm64 code sequences necessary for linker relaxation
 * ["ELF for the ARM® 64-bit Architecture (AArch64)."][arm64-elf] Lists TLS relocations (traditional
   and TLSDESC).

[drepper]: https://www.akkadia.org/drepper/tls.pdf
[tlsdesc-x86]: https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt
[psabi-x86]: https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
[tlsdesc-arm]: https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-ARM.txt
[arm-addenda]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045e/IHI0045E_ABI_addenda.pdf
[arm-rtabi]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0043d/IHI0043D_rtabi.pdf
[arm-elf]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf
[llvm22408]: https://bugs.llvm.org/show_bug.cgi?id=22408#c10
[arm64-elf]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0056b/IHI0056B_aaelf64.pdf