=======================================
LLVM's Optional Rich Disassembly Output
=======================================

.. contents::
   :local:

Introduction
============

LLVM's default disassembly output is raw text. To give consumers more ability
to introspect the instructions' textual representation, or to reformat it for a
more user-friendly display, there is an optional rich disassembly output.

This optional output is sufficient to reference into individual portions of the
instruction text. It is intended for clients like disassemblers, list file
generators, and pretty-printers, which need more than the raw instructions and
the ability to print them.

To provide this functionality the assembly text is marked up with annotations.
The markup is simple enough in syntax to be robust even in the case of version
mismatches between consumers and producers.
That is, the syntax generally does
not carry semantics beyond "this text has an annotation," so consumers can
simply ignore annotations they do not understand or do not care about.

After calling ``LLVMCreateDisasm()`` to create a disassembler context, the
optional output is enabled with this call:

.. code-block:: c

  LLVMSetDisasmOptions(DC, LLVMDisassembler_Option_UseMarkup);

Subsequent calls to ``LLVMDisasmInstruction()`` will then return output strings
with the marked-up annotations.

Instruction Annotations
=======================

.. _contextual markups:

Contextual markups
------------------

Annotated assembly display will supply contextual markup to help clients more
efficiently implement things like pretty printers. Most markup will be target
independent, so clients can effectively provide good display without any target
specific knowledge.


Annotated assembly goes through the normal instruction printer, but optionally
includes contextual tags on portions of the instruction string. An annotation
is any '<' '>' delimited section of text(1).

.. code-block:: bat

  annotation: '<' tag-name tag-modifier-list ':' annotated-text '>'
  tag-name: identifier
  tag-modifier-list: comma delimited identifier list

The tag-name is an identifier which gives the type of the annotation. For the
first pass, this will be very simple, with memory references, registers, and
immediates having the tag names "mem", "reg", and "imm", respectively.

The tag-modifier-list is typically additional target-specific context, such as
register class.

Clients should accept and ignore any tag-names or tag-modifiers they do not
understand, allowing the annotations to grow in richness without breaking older
clients.

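
As an illustration of a client that ignores every annotation, the following
minimal sketch recovers the plain instruction text from a marked-up string. The
function name ``strip_markup`` and its error convention are hypothetical and
not part of LLVM. The sketch treats any doubled '<' or '>' as an escaped
literal, following footnote 1, and therefore assumes that two annotations never
close back-to-back.

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical helper, not part of LLVM: copy `in` to `out` with all
   * annotation markup removed. Returns the number of characters written
   * (excluding the terminating NUL), or (size_t)-1 if `out` is too small
   * or the markup is unbalanced. */
  static size_t strip_markup(const char *in, char *out, size_t out_size)
  {
      size_t n = 0;
      int depth = 0; /* number of annotations currently open */

      while (*in != '\0') {
          char c = *in;

          if ((c == '<' || c == '>') && in[1] == c) {
              /* Doubled character: an escaped literal '<' or '>'. */
              in += 2;
          } else if (c == '<') {
              /* Start of an annotation: skip the tag-name and the
               * tag-modifier-list, whatever they are, through the ':'. */
              while (*in != '\0' && *in != ':')
                  in++;
              if (*in == '\0')
                  return (size_t)-1;
              in++;
              depth++;
              continue;
          } else if (c == '>') {
              /* End of the innermost open annotation. */
              if (depth == 0)
                  return (size_t)-1;
              depth--;
              in++;
              continue;
          } else {
              in++;
          }

          /* Keep room for this character and the terminating NUL. */
          if (n + 1 >= out_size)
              return (size_t)-1;
          out[n++] = c;
      }

      if (depth != 0 || out_size == 0)
          return (size_t)-1;
      out[n] = '\0';
      return n;
  }

Because the tag-name and modifiers are skipped without being interpreted, a
client written this way keeps working when a producer introduces tags it has
never seen.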

For example, an ARM load of a stack-relative location might be annotated as:

.. code-block:: nasm

  ldr <reg gpr:r0>, <mem regoffset:[<reg gpr:sp>, <imm:#4>]>


1: For assembly dialects in which '<' and/or '>' are legal tokens, a literal token is escaped by following immediately with a repeat of the character. For example, a literal '<' character is output as '<<' in an annotated assembly string.

C API Details
-------------

The intended consumers of this information use the C API. The disassembler's C
API therefore provides an option to produce disassembled instructions with
annotations: ``LLVMSetDisasmOptions()`` with the
``LLVMDisassembler_Option_UseMarkup`` option (see above).
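
The call sequence described in this document can be sketched end to end as
follows. This is a minimal sketch, not a reference implementation: the helper
name ``disasm_one_with_markup`` is hypothetical, and it assumes an LLVM build
that includes the target named by the triple the caller passes in. No symbolic
operand information or symbol lookup callbacks are supplied.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <llvm-c/Disassembler.h>
  #include <llvm-c/Target.h>

  /* Hypothetical helper: disassemble one instruction with markup enabled.
   * Returns the number of bytes consumed, or 0 on failure. */
  static size_t disasm_one_with_markup(const char *triple, uint8_t *bytes,
                                       uint64_t size, uint64_t pc,
                                       char *out, size_t out_size)
  {
      size_t consumed = 0;
      LLVMDisasmContextRef DC;

      /* Register the targets that were built into this copy of LLVM. */
      LLVMInitializeAllTargetInfos();
      LLVMInitializeAllTargetMCs();
      LLVMInitializeAllDisassemblers();

      /* No DisInfo block, tag type, or callbacks are used in this sketch. */
      DC = LLVMCreateDisasm(triple, NULL, 0, NULL, NULL);
      if (DC == NULL)
          return 0;

      /* LLVMSetDisasmOptions() returns 1 when all requested options
       * could be set, and 0 otherwise. */
      if (LLVMSetDisasmOptions(DC, LLVMDisassembler_Option_UseMarkup) == 1)
          consumed = LLVMDisasmInstruction(DC, bytes, size, pc,
                                           out, out_size);

      LLVMDisasmDispose(DC);
      return consumed;
  }

A real client would normally create the context once and reuse it across many
``LLVMDisasmInstruction()`` calls rather than rebuilding it per instruction;
the single-shot form is used here only to keep the sketch short.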