Demonstrations of compactsnoop, the Linux eBPF/bcc version.


compactsnoop traces zone compaction system-wide and prints various details.
Example output (manually triggered by echo 1 > /proc/sys/vm/compact_memory):

# ./compactsnoop
COMM           PID    NODE ZONE         ORDER MODE      LAT(ms)           STATUS
zsh            23685  0    ZONE_DMA     -1    SYNC        0.025         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        3.925         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      113.975         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC        81.57         complete
zsh            23685  0    ZONE_DMA     -1    SYNC         0.02         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        4.631         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      113.975         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC       80.647         complete
zsh            23685  0    ZONE_DMA     -1    SYNC        0.020         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        3.367         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC       115.18         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC       81.766         complete
zsh            23685  0    ZONE_DMA     -1    SYNC        0.025         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        4.346         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      114.570         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC       80.820         complete
zsh            23685  0    ZONE_DMA     -1    SYNC        0.026         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        4.611         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      113.993         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC       80.928         complete
zsh            23685  0    ZONE_DMA     -1    SYNC         0.02         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        3.889         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      113.776         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC       80.727         complete
^C

When a process allocates pages and memory is too fragmented to satisfy the
request for contiguous memory, compact zone events occur, and these add to
the time the process spends waiting.
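
To put a number on that waiting time, the LAT(ms) column can be totalled per
process. The following is only a sketch and is not part of compactsnoop: the
helper name total_latency_by_process is made up for illustration, and it
assumes the eight-column default layout shown above with no spaces in COMM.

```python
from collections import defaultdict


def total_latency_by_process(output):
    """Sum LAT(ms) per (COMM, PID) from default compactsnoop output text."""
    totals = defaultdict(float)
    for line in output.splitlines():
        fields = line.split()
        # Assumed default layout: COMM PID NODE ZONE ORDER MODE LAT(ms) STATUS
        if len(fields) != 8 or fields[0] == "COMM":
            continue  # skip the header, ^C, and anything unexpected
        comm, pid, lat_ms = fields[0], int(fields[1]), float(fields[6])
        totals[(comm, pid)] += lat_ms
    return dict(totals)


# First four rows of the example output above.
sample = """\
COMM           PID    NODE ZONE         ORDER MODE      LAT(ms)           STATUS
zsh            23685  0    ZONE_DMA     -1    SYNC        0.025         complete
zsh            23685  0    ZONE_DMA32   -1    SYNC        3.925         complete
zsh            23685  0    ZONE_NORMAL  -1    SYNC      113.975         complete
zsh            23685  1    ZONE_NORMAL  -1    SYNC        81.57         complete
"""

for (comm, pid), total_ms in total_latency_by_process(sample).items():
    print("%s %d %.3f ms" % (comm, pid, total_ms))
```

The sketch does not handle the -T, -e or -K layouts, since those add columns
or extra lines.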

When compact_stall (in /proc/vmstat) keeps increasing, compactsnoop can help
determine whether some critical processes are responsible.
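
One way to notice that compact_stall is increasing is to compare two readings
of /proc/vmstat. This is a minimal sketch, assuming the usual "name value"
line format of that file; the function names are hypothetical and the sample
numbers are invented.

```python
def read_vmstat_counter(text, name="compact_stall"):
    """Return one counter from /proc/vmstat-style text, or None if absent."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == name:
            return int(parts[1])
    return None


def stall_delta(before_text, after_text):
    """Return how much compact_stall grew between two snapshots."""
    before = read_vmstat_counter(before_text)
    after = read_vmstat_counter(after_text)
    if before is None or after is None:
        return None  # counter not exposed in one of the snapshots
    return after - before


# Invented snapshots; on a live system these would be two reads of /proc/vmstat.
snapshot_1 = "nr_free_pages 123456\ncompact_stall 100\ncompact_fail 7\n"
snapshot_2 = "nr_free_pages 120000\ncompact_stall 112\ncompact_fail 7\n"

print("compact_stall grew by", stall_delta(snapshot_1, snapshot_2))
```

If the delta keeps growing, running compactsnoop over the same interval shows
which processes are behind the events.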

The STATUS values include (CentOS 7.6's kernel):

    compact_status = {
        # COMPACT_SKIPPED: compaction didn't start as it was not possible or direct reclaim was more suitable
        0: "skipped",
        # COMPACT_CONTINUE: compaction should continue to another pageblock
        1: "continue",
        # COMPACT_PARTIAL: direct compaction partially compacted a zone and there are suitable pages
        2: "partial",
        # COMPACT_COMPLETE: The full zone was compacted
        3: "complete",
    }

or (kernel 4.7 and above):

    compact_status = {
        # COMPACT_NOT_SUITABLE_ZONE: For more detailed tracepoint output - internal to compaction
        0: "not_suitable_zone",
        # COMPACT_SKIPPED: compaction didn't start as it was not possible or direct reclaim was more suitable
        1: "skipped",
        # COMPACT_DEFERRED: compaction didn't start as it was deferred due to past failures
        2: "deferred",
        # COMPACT_NOT_SUITABLE_PAGE: For more detailed tracepoint output - internal to compaction
        3: "no_suitable_page",
        # COMPACT_CONTINUE: compaction should continue to another pageblock
        4: "continue",
        # COMPACT_COMPLETE: The full zone was scanned but compaction did not yield suitable pages
        5: "complete",
        # COMPACT_PARTIAL_SKIPPED: direct compaction has scanned part of the zone but did not yield suitable pages
        6: "partial_skipped",
        # COMPACT_CONTENDED: compaction terminated prematurely due to lock contention
        7: "contended",
        # COMPACT_SUCCESS: direct compaction terminated after concluding that the allocation should now succeed
        8: "success",
    }
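
Because the same numeric code means different things in the two tables, a
decoder has to pick a table first. The sketch below chooses by the kernel
release string; this is an illustration, not necessarily how compactsnoop
itself decides. The helper names are hypothetical, and treating every kernel
older than 4.7 like CentOS 7.6 is an assumption.

```python
import re

# Tables copied from the text above.
COMPACT_STATUS_OLD = {0: "skipped", 1: "continue", 2: "partial", 3: "complete"}
COMPACT_STATUS_4_7 = {
    0: "not_suitable_zone",
    1: "skipped",
    2: "deferred",
    3: "no_suitable_page",
    4: "continue",
    5: "complete",
    6: "partial_skipped",
    7: "contended",
    8: "success",
}


def kernel_major_minor(release):
    """Extract (major, minor) from a release string such as '5.4.0-42'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def decode_status(code, release):
    """Map a numeric compaction status to its name for the given kernel."""
    # Assumption: anything older than 4.7 uses the CentOS 7.6 table.
    if kernel_major_minor(release) >= (4, 7):
        table = COMPACT_STATUS_4_7
    else:
        table = COMPACT_STATUS_OLD
    return table.get(code, "unknown(%d)" % code)


print(decode_status(3, "3.10.0-957.el7.x86_64"))  # a CentOS 7.6 style release
print(decode_status(3, "5.4.0"))
```

On a live system the release string could come from platform.release().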

The -p option filters on a PID; the filtering is done in-kernel. Here I've
used it with -T to print timestamps:

# ./compactsnoop -Tp 24376
TIME(s)         COMM           PID    NODE ZONE         ORDER MODE      LAT(ms)           STATUS
101.364115000   zsh            24376  0    ZONE_DMA     -1    SYNC        0.025         complete
101.364555000   zsh            24376  0    ZONE_DMA32   -1    SYNC        3.925         complete
^C

This shows compact zone events occurring as the zsh process allocates pages;
the delays are not much affected.

A maximum tracing duration can be set with the -d option. For example, to trace
for 2 seconds:

# ./compactsnoop -d 2
COMM           PID    NODE ZONE         ORDER MODE       LAT(ms)           STATUS
zsh            26385  0    ZONE_DMA     -1    SYNC      0.025444         complete
^C

The -e option prints extra columns:

# ./compactsnoop -e
COMM           PID    NODE ZONE         ORDER MODE    FRAGIDX  MIN      LOW      HIGH     FREE       LAT(ms)           STATUS
summ           28276  1    ZONE_NORMAL  3     ASYNC   0.728    11284    14105    16926    14193         3.58          partial
summ           28276  0    ZONE_NORMAL  2     ASYNC   -1.000   11043    13803    16564    14479          0.0         complete
summ           28276  1    ZONE_NORMAL  2     ASYNC   -1.000   11284    14105    16926    14785        0.019         complete
summ           28276  0    ZONE_NORMAL  2     ASYNC   -1.000   11043    13803    16564    15199        0.006          partial
summ           28276  1    ZONE_NORMAL  2     ASYNC   -1.000   11284    14105    16926    17360        0.030         complete
summ           28276  0    ZONE_NORMAL  2     ASYNC   -1.000   11043    13803    16564    15443        0.024         complete
summ           28276  1    ZONE_NORMAL  2     ASYNC   -1.000   11284    14105    16926    15634        0.018         complete
summ           28276  1    ZONE_NORMAL  3     ASYNC   0.832    11284    14105    16926    15301        0.006          partial
summ           28276  0    ZONE_NORMAL  2     ASYNC   -1.000   11043    13803    16564    14774        0.005          partial
summ           28276  1    ZONE_NORMAL  3     ASYNC   0.733    11284    14105    16926    19888        0.012          partial
^C

FRAGIDX is short for fragmentation index, which only makes sense if an
allocation of the requested size would fail. If it would fail, the
fragmentation index indicates whether external fragmentation or a lack of
memory was the problem. The value can be used to determine whether page
reclaim or compaction should be used.

The index is between 0 and 1, so it is reported to 3 decimal places:

0 => allocation would fail due to lack of memory
1 => allocation would fail due to fragmentation
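
As an illustration of how such an index can be computed, here is a sketch
modeled on my reading of __fragmentation_index() in the kernel's mm/vmstat.c.
Treat it as an approximation rather than the exact code of any particular
kernel. The parameter names are chosen for this example: free_blocks_total
counts free blocks of every order, and free_blocks_suitable counts free blocks
large enough for the request. Under this reading, -1.000 in the FRAGIDX column
means the allocation would have succeeded, so the index does not apply.

```python
def fragmentation_index(order, free_pages, free_blocks_total,
                        free_blocks_suitable):
    """Return the fragmentation index for a 2**order page request.

    -1.0 means a suitable free block exists, so the request would succeed.
    Values near 0 point to lack of memory, values near 1 to fragmentation.
    """
    requested = 1 << order
    if free_blocks_total == 0:
        return 0.0
    if free_blocks_suitable > 0:
        return -1.0
    # Integer arithmetic scaled by 1000, then reported to 3 decimal places.
    scaled = 1000 - (1000 + free_pages * 1000 // requested) // free_blocks_total
    return scaled / 1000.0


# 100 free pages, all in single-page blocks, and an order-3 (8 page) request:
# plenty of memory in total, but no block is big enough.
print("%.3f" % fragmentation_index(3, 100, 100, 0))

# Same request with one suitable block available.
print("%.3f" % fragmentation_index(3, 108, 101, 1))
```

The first call gives a value close to 1 because many small free blocks exist
but none can serve the request, which is the case compaction is meant to fix.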

The fragmentation index of the whole buddy system is available in
/sys/kernel/debug/extfrag/extfrag_index.

MIN/LOW/HIGH show the zone's watermarks, which are also available in
/proc/zoneinfo. FREE is nr_free_pages (also found in /proc/zoneinfo).
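
To cross-check those columns, the same numbers can be pulled out of
/proc/zoneinfo. The sketch below assumes the common layout in which each zone
starts with a "Node N, zone NAME" line followed by "pages free", "min", "low"
and "high" lines; the exact layout varies between kernel versions. The
function name is hypothetical and the sample text is hand-written, not
captured from a real system.

```python
import re


def parse_zoneinfo(text):
    """Return {(node, zone): {"free", "min", "low", "high"}} from zoneinfo text."""
    zones = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)", line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = zones.setdefault(key, {})
            continue
        if current is None:
            continue
        fields = line.split()
        if len(fields) == 3 and fields[:2] == ["pages", "free"]:
            current["free"] = int(fields[2])
        elif (len(fields) == 2 and fields[0] in ("min", "low", "high")
              and fields[0] not in current):
            # Per-CPU lines use "high:" with a colon, so they do not match.
            current[fields[0]] = int(fields[1])
    return zones


# Hand-written sample in the assumed layout.
sample = """\
Node 1, zone   Normal
  pages free     14193
        min      11284
        low      14105
        high     16926
  pagesets
    cpu: 0
              count: 92
              high:  186
              batch: 31
"""

for (node, zone), marks in parse_zoneinfo(sample).items():
    print(node, zone, marks["min"], marks["low"], marks["high"], marks["free"])
```

On a live system the input would be the contents of /proc/zoneinfo. Note that
this file names the zone "Normal", while compactsnoop prints ZONE_NORMAL.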


The -K option prints the kernel stack:

# ./compactsnoop -K -e

summ           28276  0    ZONE_NORMAL  3     ASYNC   0.528    11043    13803    16564    22654       13.258          partial
               kretprobe_trampoline+0x0
               try_to_compact_pages+0x121
               __alloc_pages_direct_compact+0xac
               __alloc_pages_slowpath+0x3e9
               __alloc_pages_nodemask+0x404
               alloc_pages_current+0x98
               new_slab+0x2c5
               ___slab_alloc+0x3ac
               __slab_alloc+0x40
               kmem_cache_alloc_node+0x8b
               copy_process+0x18e
               do_fork+0x91
               sys_clone+0x16
               stub_clone+0x44

summ           28276  1    ZONE_NORMAL  3     ASYNC   -1.000   11284    14105    16926    22074        0.008          partial
               kretprobe_trampoline+0x0
               try_to_compact_pages+0x121
               __alloc_pages_direct_compact+0xac
               __alloc_pages_slowpath+0x3e9
               __alloc_pages_nodemask+0x404
               alloc_pages_current+0x98
               new_slab+0x2c5
               ___slab_alloc+0x3ac
               __slab_alloc+0x40
               kmem_cache_alloc_node+0x8b
               copy_process+0x18e
               do_fork+0x91
               sys_clone+0x16
               stub_clone+0x44

summ           28276  0    ZONE_NORMAL  3     ASYNC   0.527    11043    13803    16564    25653        9.812          partial
               kretprobe_trampoline+0x0
               try_to_compact_pages+0x121
               __alloc_pages_direct_compact+0xac
               __alloc_pages_slowpath+0x3e9
               __alloc_pages_nodemask+0x404
               alloc_pages_current+0x98
               new_slab+0x2c5
               ___slab_alloc+0x3ac
               __slab_alloc+0x40
               kmem_cache_alloc_node+0x8b
               copy_process+0x18e
               do_fork+0x91
               sys_clone+0x16
               stub_clone+0x44

# ./compactsnoop -h
usage: compactsnoop.py [-h] [-T] [-p PID] [-d DURATION] [-K] [-e]

Trace compact zone

optional arguments:
  -h, --help            show this help message and exit
  -T, --timestamp       include timestamp on output
  -p PID, --pid PID     trace this PID only
  -d DURATION, --duration DURATION
                        total duration of trace in seconds
  -K, --kernel-stack    output kernel stack trace
  -e, --extended_fields
                        show system memory state

examples:
    ./compactsnoop          # trace all compact stall
    ./compactsnoop -T       # include timestamps
    ./compactsnoop -d 10    # trace for 10 seconds only
    ./compactsnoop -K       # output kernel stack trace
    ./compactsnoop -e       # show extended fields