# Tutorial
The simplest use for the STG tools is to extract, store and compare ABI
representations.
This tutorial uses long options throughout. Equivalent short options can be
found in the manual pages for [`stg`](stg.md) and [`stgdiff`](stgdiff.md). Both
tools understand `-` as a shorthand for `/dev/stdout`.
## Working Example - code and compilation
This small code sample will be used as a working example. Copy it into a file
called `tree.c`.
```c
struct N {
  struct N * left;
  struct N * right;
  int value;
};

unsigned int count(struct N * tree) {
  return tree ? count(tree->left) + count(tree->right) + 1 : 0;
}

int sum(struct N * tree) {
  return tree ? sum(tree->left) + sum(tree->right) + tree->value : 0;
}
```
Compile it:
```shell
gcc -Wall -Wextra -g -c tree.c -o tree.o
```
## Extraction from ELF / DWARF
`stg` is the tool for extracting ABI representations, though it can do more
sophisticated things as well. The simplest invocation of `stg` looks something
like this:
```shell
stg --elf library.so --output library.stg
```
Adding the `--annotate` option can be useful, especially if trying to debug ABI
issues or when experimenting with the tools, like now.
If the output consists of just symbols and you get a warning about missing DWARF
information, then `library.so` has no DWARF debugging information. For
meaningful results, run `stg` on an *unstripped* ELF file, which may require
build system adjustments.
### Working Example - ABI extraction
Run this:
```shell
stg --elf tree.o --annotate --output -
```
And you should get something like this:
```proto
version: 0x00000002
root_id: 0x84ea5130 # interface
pointer_reference {
  id: 0x32b38621
  kind: POINTER
  pointee_type_id: 0xe08efe1a # struct N
}
primitive {
  id: 0x4585663f
  name: "unsigned int"
  encoding: UNSIGNED_INTEGER
  bytesize: 0x00000004
}
primitive {
  id: 0x6720d32f
  name: "int"
  encoding: SIGNED_INTEGER
  bytesize: 0x00000004
}
member {
  id: 0x35cbdb23
  name: "left"
  type_id: 0x32b38621 # struct N*
}
member {
  id: 0x0b440ffb
  name: "right"
  type_id: 0x32b38621 # struct N*
  offset: 64
}
member {
  id: 0xa06f75d5
  name: "value"
  type_id: 0x6720d32f # int
  offset: 128
}
struct_union {
  id: 0xe08efe1a
  kind: STRUCT
  name: "N"
  definition {
    bytesize: 24
    member_id: 0x35cbdb23 # struct N* left
    member_id: 0x0b440ffb # struct N* right
    member_id: 0xa06f75d5 # int value
  }
}
function {
  id: 0x912c02a7
  return_type_id: 0x6720d32f # int
  parameter_id: 0x32b38621 # struct N*
}
function {
  id: 0xc2779f73
  return_type_id: 0x4585663f # unsigned int
  parameter_id: 0x32b38621 # struct N*
}
elf_symbol {
  id: 0xbb237197
  name: "count"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0xc2779f73 # unsigned int(struct N*)
  full_name: "count"
}
elf_symbol {
  id: 0x4fdeca38
  name: "sum"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0x912c02a7 # int(struct N*)
  full_name: "sum"
}
interface {
  id: 0x84ea5130
  symbol_id: 0xbb237197 # unsigned int count(struct N*)
  symbol_id: 0x4fdeca38 # int sum(struct N*)
}
```
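The numbers in the annotated output can be cross-checked against the compiler. In the output, member `offset` values read as bit offsets while `bytesize` is in bytes: 64 and 128 bits correspond to byte offsets 8 and 16. The following is a minimal sketch, assuming an LP64 target such as x86-64 Linux (8-byte pointers, 4-byte `int`); the helper names `right_offset_bits` and `value_offset_bits` are hypothetical and not part of STG.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Same definition as in tree.c. */
struct N {
  struct N * left;
  struct N * right;
  int value;
};

/* Hypothetical helpers: member offsets converted from bytes to bits. */
size_t right_offset_bits(void) { return offsetof(struct N, right) * CHAR_BIT; }
size_t value_offset_bits(void) { return offsetof(struct N, value) * CHAR_BIT; }

/* Assumption: LP64 target (8-byte pointers, 4-byte int, 8-bit bytes). */
static_assert(offsetof(struct N, right) * CHAR_BIT == 64, "right at bit 64");
static_assert(offsetof(struct N, value) * CHAR_BIT == 128, "value at bit 128");
static_assert(sizeof(struct N) == 24, "struct N occupies 24 bytes");
```

Compiling this with `gcc -c` on such a target succeeds silently. A failed assertion would indicate a different data model, in which case the extracted offsets and sizes would differ as well.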
## Filtering
When first starting to manage the ABI of a binary, a common goal is to restrict
the interface surface to the necessary minimum. Any superfluous symbols or type
definitions in the ABI representation can result in spurious ABI differences in
later reports.
When it comes to the symbols exposed, it's common to control symbol
*visibility*. Type definitions can be either exposed in public header files or
hidden in private header files, with perhaps only public forward declarations,
but this does not remove any type definitions in the DWARF information.
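The two conventions just described can be sketched in C. This is an illustration only, shown as a single file for brevity: the split into "public" and "private" parts stands in for separate header and source files, the helper `node_value` is hypothetical, and `__attribute__((visibility("hidden")))` is a GCC and Clang extension whose effect is only observable once the code is linked into a shared object.

```c
/* Public part: what a hypothetical public header would contain. */
struct N;                  /* forward declaration only, so the type is opaque */
int sum(struct N * tree);  /* exported with default visibility */

/* Private part: the full definition stays with the implementation. */
struct N {
  struct N * left;
  struct N * right;
  int value;
};

/* Hidden helper: not exported from a shared library built from this code. */
__attribute__((visibility("hidden")))
int node_value(const struct N * node) {
  return node ? node->value : 0;
}

int sum(struct N * tree) {
  return tree ? sum(tree->left) + sum(tree->right) + node_value(tree) : 0;
}
```

Even with this arrangement, a build with `-g` still records the full definition of `struct N` in the DWARF information, which is why type filtering is needed in addition to symbol visibility.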
STG provides filtering facilities for both symbols and types, for example:
```shell
stg --files '*.h' --elf library.so --output library.stg
```
This ensures that the definitions of any types defined outside header files,
and perhaps used only as opaque pointer handles, are omitted from the ABI
representation. If you separate public and private headers, use a glob pattern
that distinguishes the two.
Sets of symbol or file names can be read from a file. In this example, all
symbols whose names begin with `api_`, except those in the `obsolete` file, are
kept.
```shell
stg --symbols 'api_* & ! :obsolete' --elf library.so --output library.stg
```
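The meaning of the filter expression `api_* & ! :obsolete` can be modelled as a predicate on symbol names: a glob match combined with a set exclusion. The sketch below is not how STG implements filters. It uses POSIX `fnmatch` as a stand-in for STG's glob matching, and the array contents are made-up names standing in for the `obsolete` file.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the names listed in the `obsolete` file. */
static const char * const obsolete[] = {"api_old_init", "api_legacy_read"};

/* True if the name appears in the obsolete set. */
static bool in_obsolete(const char * name) {
  for (size_t i = 0; i < sizeof obsolete / sizeof obsolete[0]; ++i) {
    if (strcmp(obsolete[i], name) == 0) {
      return true;
    }
  }
  return false;
}

/* Models 'api_* & ! :obsolete': glob match AND NOT set membership. */
bool keep_symbol(const char * name) {
  return fnmatch("api_*", name, 0) == 0 && !in_obsolete(name);
}
```

Under this model, `api_open` would be kept, while `api_old_init` (listed as obsolete) and `internal_helper` (no `api_` prefix) would be dropped.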
For historical reasons, the file format for literal filter lists is compatible
with libabigail's symbol list format, but this is subject to change.
```ini
[list]
# one symbol per line
foo # comments, whitespace and empty lines are all ignored
bar
baz
```
### Working Example - filtering the ABI
Let's say that `struct N` is supposed to be an opaque type that user code only
gets pointers to and, additionally, the function `count` should be excluded from
the ABI (perhaps due to an argument over its return type). We can exclude the
definition of `struct N`, along with that of any other types defined in
`tree.c`, using a file filter. The symbol can be excluded by name.
Run this:
```shell
stg --elf tree.o --files '*.h' --symbols '!count' --output -
```
The result should be something like this:
```proto
version: 0x00000002
root_id: 0x84ea5130
pointer_reference {
  id: 0x26944aa7
  kind: POINTER
  pointee_type_id: 0xb011cc02
}
primitive {
  id: 0x6720d32f
  name: "int"
  encoding: SIGNED_INTEGER
  bytesize: 0x00000004
}
struct_union {
  id: 0xb011cc02
  kind: STRUCT
  name: "N"
}
function {
  id: 0x9425f186
  return_type_id: 0x6720d32f
  parameter_id: 0x26944aa7
}
elf_symbol {
  id: 0x4fdeca38
  name: "sum"
  is_defined: true
  symbol_type: FUNCTION
  type_id: 0x9425f186
  full_name: "sum"
}
interface {
  id: 0x84ea5130
  symbol_id: 0x4fdeca38
}
```
## ABI Comparison
`stgdiff` is the tool for comparing ABI representations and reporting
differences, though it has some other, more specialised, uses. The simplest
invocation of `stgdiff` looks something like this:
```shell
stgdiff --stg old/library.stg new/library.stg --output -
```
This will report ABI differences in the default (`small`) format.
### Working Example - ABI differences - small format
The function `sum` has a type that depends on `struct N`. Any change to either
might affect the ABI exposed via `sum`. For example, if the type of the `value`
member is changed to `short` and the file is recompiled, STG can detect this
difference.
First rerun the STG extraction, specifying `--output tree-old.stg`. Make the
source code change, recompile and extract the ABI with `--output tree-new.stg`.
Then run this:
```shell
stgdiff --stg tree-old.stg tree-new.stg --output -
```
To get this:
```text
type 'struct N' changed
  member changed from 'int value' to 'short int value'
    type changed from 'int' to 'short int'
```
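The report mentions only the member's type, not the size of `struct N` or the offset of `value`. A plausible explanation is trailing padding: on common ABIs the struct is aligned to its pointer members, so shrinking the last member from `int` to `short` leaves the overall size unchanged. The sketch below checks this under that assumption; the names `N_old`, `N_new` and the helper functions are hypothetical and exist only so that both versions fit in one file.

```c
#include <stddef.h>

/* Before the change: 'int value'. */
struct N_old {
  struct N_old * left;
  struct N_old * right;
  int value;
};

/* After the change: 'short value'. */
struct N_new {
  struct N_new * left;
  struct N_new * right;
  short value;
};

/* Nonzero if the struct size and the offset of 'value' are both unchanged. */
int same_size_and_offset(void) {
  return sizeof(struct N_old) == sizeof(struct N_new)
      && offsetof(struct N_old, value) == offsetof(struct N_new, value);
}

/* Size of the 'value' member in each version (operand is not evaluated). */
size_t value_size_old(void) { return sizeof(((struct N_old *)0)->value); }
size_t value_size_new(void) { return sizeof(((struct N_new *)0)->value); }
```

On an LP64 target both structs should occupy 24 bytes with `value` at byte offset 16, so the only difference left to report is the member's type.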
The `small` format omits parts of the ABI graph which haven't changed.[^1] To
see all impacted nodes, use `--format flat` instead.
[^1]: The similarly named `short` format goes a bit further and will omit and
summarise certain repetitive differences.
### Working Example - ABI differences - flat format
```text
function symbol 'int sum(struct N*)' changed
  type 'int(struct N*)' changed
    parameter 1 type 'struct N*' changed
      pointed-to type 'struct N' changed
type 'struct N' changed
  member 'struct N* left' changed
    type 'struct N*' changed
      pointed-to type 'struct N' changed
  member 'struct N* right' changed
    type 'struct N*' changed
      pointed-to type 'struct N' changed
  member changed from 'int value' to 'short int value'
    type changed from 'int' to 'short int'
```
And if you really want to see more of the graph structure, use `--format plain`.
### Working Example - ABI differences - plain format
```text
function symbol 'int sum(struct N*)' changed
  type 'int(struct N*)' changed
    parameter 1 type 'struct N*' changed
      pointed-to type 'struct N' changed
        member 'struct N* left' changed
          type 'struct N*' changed
            pointed-to type 'struct N' changed
              (being reported)
        member 'struct N* right' changed
          type 'struct N*' changed
            pointed-to type 'struct N' changed
              (being reported)
        member changed from 'int value' to 'short int value'
          type changed from 'int' to 'short int'
```
Or just use `--format viz` which generates input for
[Graphviz](https://graphviz.org/).
### Working Example - ABI differences - viz format
```dot
digraph "ABI diff" {
  "0" [shape=rectangle, label="'interface'"]
  "1" [label="'int sum(struct N*)'"]
  "2" [label="'int(struct N*)'"]
  "3" [label="'struct N*'"]
  "4" [shape=rectangle, label="'struct N'"]
  "5" [label="'struct N* left'"]
  "5" -> "3" [label=""]
  "4" -> "5" [label=""]
  "6" [label="'struct N* right'"]
  "6" -> "3" [label=""]
  "4" -> "6" [label=""]
  "7" [label="'int value' → 'short int value'"]
  "8" [color=red, label="'int' → 'short int'"]
  "7" -> "8" [label=""]
  "4" -> "7" [label=""]
  "3" -> "4" [label="pointed-to"]
  "2" -> "3" [label="parameter 1"]
  "1" -> "2" [label=""]
  "0" -> "1" [label=""]
}
```