xref: /aosp_15_r20/external/bcc/tools/stackcount_example.txt (revision 387f9dfdfa2baef462e92476d413c7bc2470293e)
1Demonstrations of stackcount, the Linux eBPF/bcc version.
2
3
4This program traces functions and frequency counts them with their entire
5stack trace, summarized in-kernel for efficiency. For example, counting
6stack traces that led to the submit_bio() kernel function, which creates
7block device I/O:
8
9# ./stackcount submit_bio
10Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
11^C
12  submit_bio
13  submit_bh
14  journal_submit_commit_record.isra.13
15  jbd2_journal_commit_transaction
16  kjournald2
17  kthread
18  ret_from_fork
19  mb_cache_list
20    1
21
22  submit_bio
23  __block_write_full_page.constprop.39
24  block_write_full_page
25  blkdev_writepage
26  __writepage
27  write_cache_pages
28  generic_writepages
29  do_writepages
30  __writeback_single_inode
31  writeback_sb_inodes
32  __writeback_inodes_wb
33    2
34
35  submit_bio
36  __block_write_full_page.constprop.39
37  block_write_full_page
38  blkdev_writepage
39  __writepage
40  write_cache_pages
41  generic_writepages
42  do_writepages
43  __filemap_fdatawrite_range
44  filemap_fdatawrite
45  fdatawrite_one_bdev
46    36
47
48  submit_bio
49  submit_bh
50  jbd2_journal_commit_transaction
51  kjournald2
52  kthread
53  ret_from_fork
54  mb_cache_list
55    38
56
57  submit_bio
58  ext4_writepages
59  do_writepages
60  __filemap_fdatawrite_range
61  filemap_flush
62  ext4_alloc_da_blocks
63  ext4_rename
64  ext4_rename2
65  vfs_rename
66  sys_rename
67  entry_SYSCALL_64_fastpath
68    79
69
70Detaching...
71
72The output shows unique stack traces, in order from leaf (on-CPU) to root,
73followed by their occurrence count. The last stack trace in the above output
74shows syscall handling, ext4_rename(), and filemap_flush(): looks like an
75application issued file rename has caused back end disk I/O due to ext4
76block allocation and a filemap_flush().
77
78
79Now adding the -P option to display stacks separately for each process:
80
81# ./stackcount -P submit_bio
82Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
83^C
84  submit_bio
85  ext4_writepages
86  do_writepages
87  __filemap_fdatawrite_range
88  filemap_flush
89  ext4_alloc_da_blocks
90  ext4_release_file
91  __fput
92  ____fput
93  task_work_run
94  exit_to_usermode_loop
95  syscall_return_slowpath
96  entry_SYSCALL_64_fastpath
97  [unknown]
98  [unknown]
99    tar [15069]
100    5
101
102  submit_bio
103  ext4_bio_write_page
104  mpage_submit_page
105  mpage_map_and_submit_buffers
106  ext4_writepages
107  do_writepages
108  __filemap_fdatawrite_range
109  filemap_flush
110  ext4_alloc_da_blocks
111  ext4_release_file
112  __fput
113  ____fput
114  task_work_run
115  exit_to_usermode_loop
116  syscall_return_slowpath
117  entry_SYSCALL_64_fastpath
118  [unknown]
119  [unknown]
120    tar [15069]
121    15
122
123  submit_bio
124  ext4_readpages
125  __do_page_cache_readahead
126  ondemand_readahead
127  page_cache_async_readahead
128  generic_file_read_iter
129  __vfs_read
130  vfs_read
131  sys_read
132  entry_SYSCALL_64_fastpath
133  [unknown]
134    tar [15069]
135    113
136
137Detaching...
138
139The last stack trace in the above output shows syscall handling, sys_read(),
140vfs_read(), and then "readahead" functions: looks like an application issued
141file read has triggered read ahead. With "-P", the application can be seen
142after the stack trace, in this case, "tar [15069]" for the "tar" command,
143PID 15069.
144
145The order of printed stack traces is from least to most frequent. The most
146frequent in this case, the ext4_readpages() stack, was taken 113 times during
147tracing.
148
149The "[unknown]" frames are from user-level, since this simple workload is
150the tar command, which apparently has been compiled without frame pointers.
151It's a common compiler optimization, but it breaks frame pointer-based stack
152walkers. Similar broken stacks will be seen by other profilers and debuggers
153that use frame pointers. Hopefully your application preserves them so that
154the user-level stack trace is visible. So how does one get frame pointers, if
155your application doesn't have them to start with? For the current bcc (until
156it supports other stack walkers), you need to be running an application binaries
157that preserves frame pointers, eg, using gcc's -fno-omit-frame-pointer. That's
158about all I'll say here: this is a big topic that is not bcc/BPF specific.
159
160It can be useful to trace the path to submit_bio to explain unusual rates of
161disk IOPS. These could have in-kernel origins (eg, background scrub).
162
163
164Now adding the -d option to delimit kernel and user stacks:
165
166# ./stackcount -P -d submit_bio
167Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
168^C
169  submit_bio
170  submit_bh
171  journal_submit_commit_record
172  jbd2_journal_commit_transaction
173  kjournald2
174  kthread
175  ret_from_fork
176    --
177    jbd2/xvda1-8 [405]
178    1
179
180  submit_bio
181  submit_bh
182  jbd2_journal_commit_transaction
183  kjournald2
184  kthread
185  ret_from_fork
186    --
187    jbd2/xvda1-8 [405]
188    2
189
190  submit_bio
191  ext4_writepages
192  do_writepages
193  __filemap_fdatawrite_range
194  filemap_flush
195  ext4_alloc_da_blocks
196  ext4_release_file
197  __fput
198  ____fput
199  task_work_run
200  exit_to_usermode_loop
201  syscall_return_slowpath
202  entry_SYSCALL_64_fastpath
203    --
204  [unknown]
205  [unknown]
206    tar [15187]
207    5
208
209  submit_bio
210  ext4_bio_write_page
211  mpage_submit_page
212  mpage_map_and_submit_buffers
213  ext4_writepages
214  do_writepages
215  __filemap_fdatawrite_range
216  filemap_flush
217  ext4_alloc_da_blocks
218  ext4_release_file
219  __fput
220  ____fput
221  task_work_run
222  exit_to_usermode_loop
223  syscall_return_slowpath
224  entry_SYSCALL_64_fastpath
225    --
226  [unknown]
227  [unknown]
228    tar [15187]
229    15
230
231  submit_bio
232  ext4_readpages
233  __do_page_cache_readahead
234  ondemand_readahead
235  page_cache_async_readahead
236  generic_file_read_iter
237  __vfs_read
238  vfs_read
239  sys_read
240  entry_SYSCALL_64_fastpath
241    --
242  [unknown]
243  [unknown]
244  [unknown]
245    tar [15187]
246    171
247
248Detaching...
249
250A "--" is printed between the kernel and user stacks.
251
252
253As a different example, here is the kernel function hrtimer_init_sleeper():
254
255# ./stackcount.py -P -d hrtimer_init_sleeper
256Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end.
257^C
258  hrtimer_init_sleeper
259  do_futex
260  SyS_futex
261  entry_SYSCALL_64_fastpath
262    --
263  [unknown]
264    containerd [16020]
265    1
266
267  hrtimer_init_sleeper
268  do_futex
269  SyS_futex
270  entry_SYSCALL_64_fastpath
271    --
272  __pthread_cond_timedwait
273  Monitor::IWait(Thread*, long)
274  Monitor::wait(bool, long, bool)
275  CompileQueue::get()
276  CompileBroker::compiler_thread_loop()
277  JavaThread::thread_main_inner()
278  JavaThread::run()
279  java_start(Thread*)
280  start_thread
281    java [4996]
282    1
283
284  hrtimer_init_sleeper
285  do_futex
286  SyS_futex
287  entry_SYSCALL_64_fastpath
288    --
289  [unknown]
290  [unknown]
291    containerd [16020]
292    1
293
294  hrtimer_init_sleeper
295  do_futex
296  SyS_futex
297  entry_SYSCALL_64_fastpath
298    --
299  __pthread_cond_timedwait
300  VMThread::loop()
301  VMThread::run()
302  java_start(Thread*)
303  start_thread
304    java [4996]
305    3
306
307  hrtimer_init_sleeper
308  do_futex
309  SyS_futex
310  entry_SYSCALL_64_fastpath
311    --
312  [unknown]
313    dockerd [16008]
314    4
315
316  hrtimer_init_sleeper
317  do_futex
318  SyS_futex
319  entry_SYSCALL_64_fastpath
320    --
321  [unknown]
322  [unknown]
323    dockerd [16008]
324    4
325
326  hrtimer_init_sleeper
327  do_futex
328  SyS_futex
329  entry_SYSCALL_64_fastpath
330    --
331  __pthread_cond_timedwait
332  Lio/netty/util/ThreadDeathWatcher$Watcher;::run
333  Interpreter
334  Interpreter
335  call_stub
336  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
337  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
338  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
339  thread_entry(JavaThread*, Thread*)
340  JavaThread::thread_main_inner()
341  JavaThread::run()
342  java_start(Thread*)
343  start_thread
344    java [4996]
345    4
346
347  hrtimer_init_sleeper
348  do_futex
349  SyS_futex
350  entry_SYSCALL_64_fastpath
351    --
352  __pthread_cond_timedwait
353  clock_gettime
354  [unknown]
355    java [4996]
356    79
357
358Detaching...
359
360I was just trying to find a more interesting example. This output includes
361some Java stacks where user-level has been walked correctly (even includes a
362JIT symbol translation). dockerd and containerd don't have frame pointers
363(grumble), but Java does (which is running with -XX:+PreserveFramePointer).
364
365
366Here's another kernel function, ip_output():
367
368# ./stackcount.py -P -d ip_output
369Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
370^C
371  ip_output
372  ip_queue_xmit
373  tcp_transmit_skb
374  tcp_write_xmit
375  __tcp_push_pending_frames
376  tcp_push
377  tcp_sendmsg
378  inet_sendmsg
379  sock_sendmsg
380  sock_write_iter
381  __vfs_write
382  vfs_write
383  SyS_write
384  entry_SYSCALL_64_fastpath
385    --
386  __write_nocancel
387  [unknown]
388    sshd [15015]
389    5
390
391  ip_output
392  ip_queue_xmit
393  tcp_transmit_skb
394  tcp_write_xmit
395  __tcp_push_pending_frames
396  tcp_push
397  tcp_sendmsg
398  inet_sendmsg
399  sock_sendmsg
400  sock_write_iter
401  __vfs_write
402  vfs_write
403  SyS_write
404  entry_SYSCALL_64_fastpath
405    --
406  __write_nocancel
407  [unknown]
408  [unknown]
409    sshd [8234]
410    5
411
412  ip_output
413  ip_queue_xmit
414  tcp_transmit_skb
415  tcp_write_xmit
416  __tcp_push_pending_frames
417  tcp_push
418  tcp_sendmsg
419  inet_sendmsg
420  sock_sendmsg
421  sock_write_iter
422  __vfs_write
423  vfs_write
424  SyS_write
425  entry_SYSCALL_64_fastpath
426    --
427  __write_nocancel
428    sshd [15015]
429    7
430
431Detaching...
432
433This time just sshd is triggering ip_output() calls.
434
435
436Watch what happens if I filter on kernel stacks only (-K) for ip_output():
437
438# ./stackcount.py -K ip_output
439Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
440^C
441  ip_output
442  ip_queue_xmit
443  tcp_transmit_skb
444  tcp_write_xmit
445  __tcp_push_pending_frames
446  tcp_push
447  tcp_sendmsg
448  inet_sendmsg
449  sock_sendmsg
450  sock_write_iter
451  __vfs_write
452  vfs_write
453  SyS_write
454  entry_SYSCALL_64_fastpath
455    13
456
457Detaching...
458
459They have grouped together as a single unique stack, since the kernel part
460was the same.
461
462
463Here is just the user stacks, fetched during the kernel function ip_output():
464
465# ./stackcount.py -P -U ip_output
466Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
467^C
468  [unknown]
469    snmpd [1645]
470    1
471
472  __write_nocancel
473  [unknown]
474  [unknown]
475    sshd [8234]
476    3
477
478  __write_nocancel
479    sshd [15015]
480    4
481
482I should really run a custom sshd with frame pointers so we can see its
483stack trace...
484
485
486User-space functions can also be traced if a library name is provided. For
487example, to quickly identify code locations that allocate heap memory for
488PID 4902 (using -p), by tracing malloc from libc ("c:malloc"):
489
490# ./stackcount -p 4902 c:malloc
491Tracing 1 functions for "malloc"... Hit Ctrl-C to end.
492^C
493  malloc
494  rbtree_new
495  main
496  [unknown]
497    12
498
499  malloc
500  _rbtree_node_new_internal
501  _rbtree_node_insert
502  rbtree_insert
503  main
504  [unknown]
505    1189
506
507Detaching...
508
509Kernel stacks are absent as this didn't enter kernel code.
510
511Note that user-space uses of stackcount can be somewhat more limited because
512a lot of user-space libraries and binaries are compiled without frame-pointers
513as discussed earlier (the -fomit-frame-pointer compiler default) or are used
514without debuginfo.
515
516
517In addition to kernel and user-space functions, kernel tracepoints and USDT
518tracepoints are also supported.
519
520For example, to determine where threads are being created in a particular
521process, use the pthread_create USDT tracepoint:
522
523# ./stackcount -P -p $(pidof parprimes) u:pthread:pthread_create
524Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end.
525^C
526
527    parprimes [11923]
528  pthread_create@@GLIBC_2.2.5
529  main
530  __libc_start_main
531  [unknown]
532    7
533
534You can use "readelf -n file" to see if it has USDT tracepoints.
535
536
537Similarly, to determine where context switching is happening in the kernel,
538use the sched:sched_switch kernel tracepoint:
539
540# ./stackcount -P t:sched:sched_switch
541  __schedule
542  schedule
543  worker_thread
544  kthread
545  ret_from_fork
546    kworker/0:2 [25482]
547    1
548
549  __schedule
550  schedule
551  schedule_hrtimeout_range_clock
552  schedule_hrtimeout_range
553  ep_poll
554  SyS_epoll_wait
555  entry_SYSCALL_64_fastpath
556  epoll_wait
557  Lsun/nio/ch/SelectorImpl;::lockAndDoSelect
558  Lsun/nio/ch/SelectorImpl;::select
559  Lio/netty/channel/nio/NioEventLoop;::select
560  Lio/netty/channel/nio/NioEventLoop;::run
561  Interpreter
562  Interpreter
563  call_stub
564  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
565  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
566  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
567  thread_entry(JavaThread*, Thread*)
568  JavaThread::thread_main_inner()
569  JavaThread::run()
570  java_start(Thread*)
571  start_thread
572    java [4996]
573    1
574
575... (omitted for brevity)
576
577  __schedule
578  schedule
579  schedule_preempt_disabled
580  cpu_startup_entry
581  xen_play_dead
582  arch_cpu_idle_dead
583  cpu_startup_entry
584  cpu_bringup_and_idle
585    swapper/1 [0]
586    289
587
588
589A -i option can be used to set an output interval, and -T to include a
590timestamp. For example:
591
592# ./stackcount.py -P -Tdi 1 submit_bio
593Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
594
59506:05:13
596
59706:05:14
598  submit_bio
599  xfs_do_writepage
600  write_cache_pages
601  xfs_vm_writepages
602  do_writepages
603  __writeback_single_inode
604  writeback_sb_inodes
605  __writeback_inodes_wb
606  wb_writeback
607  wb_workfn
608  process_one_work
609  worker_thread
610  kthread
611  ret_from_fork
612    --
613    kworker/u16:1 [15686]
614    1
615
616  submit_bio
617  process_one_work
618  worker_thread
619  kthread
620  ret_from_fork
621    --
622    kworker/u16:0 [16007]
623    1
624
625  submit_bio
626  xfs_buf_submit
627  xlog_bdstrat
628  xlog_sync
629  xlog_state_release_iclog
630  _xfs_log_force
631  xfs_log_force
632  xfs_fs_sync_fs
633  sync_fs_one_sb
634  iterate_supers
635  sys_sync
636  entry_SYSCALL_64_fastpath
637    --
638  [unknown]
639    sync [16039]
640    1
641
642  submit_bio
643  submit_bh
644  journal_submit_commit_record
645  jbd2_journal_commit_transaction
646  kjournald2
647  kthread
648  ret_from_fork
649    --
650    jbd2/xvda1-8 [405]
651    1
652
653  submit_bio
654  process_one_work
655  worker_thread
656  kthread
657  ret_from_fork
658    --
659    kworker/0:2 [25482]
660    2
661
662  submit_bio
663  ext4_writepages
664  do_writepages
665  __writeback_single_inode
666  writeback_sb_inodes
667  __writeback_inodes_wb
668  wb_writeback
669  wb_workfn
670  process_one_work
671  worker_thread
672  kthread
673  ret_from_fork
674    --
675    kworker/u16:0 [16007]
676    4
677
678  submit_bio
679  xfs_vm_writepages
680  do_writepages
681  __writeback_single_inode
682  writeback_sb_inodes
683  __writeback_inodes_wb
684  wb_writeback
685  wb_workfn
686  process_one_work
687  worker_thread
688  kthread
689  ret_from_fork
690    --
691    kworker/u16:1 [15686]
692    5
693
694  submit_bio
695  __block_write_full_page
696  block_write_full_page
697  blkdev_writepage
698  __writepage
699  write_cache_pages
700  generic_writepages
701  blkdev_writepages
702  do_writepages
703  __filemap_fdatawrite_range
704  filemap_fdatawrite
705  fdatawrite_one_bdev
706  iterate_bdevs
707  sys_sync
708  entry_SYSCALL_64_fastpath
709    --
710  [unknown]
711    sync [16039]
712    7
713
714  submit_bio
715  submit_bh
716  jbd2_journal_commit_transaction
717  kjournald2
718  kthread
719  ret_from_fork
720    --
721    jbd2/xvda1-8 [405]
722    8
723
724  submit_bio
725  ext4_bio_write_page
726  mpage_submit_page
727  mpage_map_and_submit_buffers
728  ext4_writepages
729  do_writepages
730  __writeback_single_inode
731  writeback_sb_inodes
732  __writeback_inodes_wb
733  wb_writeback
734  wb_workfn
735  process_one_work
736  worker_thread
737  kthread
738  ret_from_fork
739    --
740    kworker/u16:0 [16007]
741    8
742
743  submit_bio
744  __block_write_full_page
745  block_write_full_page
746  blkdev_writepage
747  __writepage
748  write_cache_pages
749  generic_writepages
750  blkdev_writepages
751  do_writepages
752  __writeback_single_inode
753  writeback_sb_inodes
754  __writeback_inodes_wb
755  wb_writeback
756  wb_workfn
757  process_one_work
758  worker_thread
759  kthread
760  ret_from_fork
761    --
762    kworker/u16:0 [16007]
763    60
764
765
76606:05:15
767
76806:05:16
769
770Detaching...
771
772This only included output for the 06:05:14 interval. The other internals
773did not span block device I/O.
774
775
776The -s output prints the return instruction offset for each function (aka
777symbol offset). Eg:
778
779# ./stackcount.py -P -s tcp_sendmsg
780Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
781^C
782  tcp_sendmsg+0x1
783  sock_sendmsg+0x38
784  sock_write_iter+0x85
785  __vfs_write+0xe3
786  vfs_write+0xb8
787  SyS_write+0x55
788  entry_SYSCALL_64_fastpath+0x1e
789  __write_nocancel+0x7
790    sshd [15015]
791    3
792
793  tcp_sendmsg+0x1
794  sock_sendmsg+0x38
795  sock_write_iter+0x85
796  __vfs_write+0xe3
797  vfs_write+0xb8
798  SyS_write+0x55
799  entry_SYSCALL_64_fastpath+0x1e
800  __write_nocancel+0x7
801    sshd [8234]
802    3
803
804Detaching...
805
806If it wasn't clear how one function called another, knowing the instruction
807offset can help you locate the lines of code from a disassembly dump.
808
809
810The -v output is verbose, and shows raw addresses:
811
812./stackcount.py -P -v tcp_sendmsg
813Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
814^C
815  ffffffff817b05c1 tcp_sendmsg
816  ffffffff8173ea48 sock_sendmsg
817  ffffffff8173eae5 sock_write_iter
818  ffffffff81232b33 __vfs_write
819  ffffffff812331b8 vfs_write
820  ffffffff81234625 SyS_write
821  ffffffff818739bb entry_SYSCALL_64_fastpath
822  7f120511e6e0     __write_nocancel
823    sshd [8234]
824    3
825
826  ffffffff817b05c1 tcp_sendmsg
827  ffffffff8173ea48 sock_sendmsg
828  ffffffff8173eae5 sock_write_iter
829  ffffffff81232b33 __vfs_write
830  ffffffff812331b8 vfs_write
831  ffffffff81234625 SyS_write
832  ffffffff818739bb entry_SYSCALL_64_fastpath
833  7f919c5a26e0     __write_nocancel
834    sshd [15015]
835    11
836
837Detaching...
838
839
840A wildcard can also be used. Eg, all functions beginning with "tcp_send",
841kernel stacks only (-K) with offsets (-s):
842
843# ./stackcount -Ks 'tcp_send*'
844Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end.
845^C
846  tcp_send_delayed_ack0x1
847  tcp_rcv_established0x3b1
848  tcp_v4_do_rcv0x130
849  tcp_v4_rcv0x8e0
850  ip_local_deliver_finish0x9f
851  ip_local_deliver0x51
852  ip_rcv_finish0x8a
853  ip_rcv0x29d
854  __netif_receive_skb_core0x637
855  __netif_receive_skb0x18
856  netif_receive_skb_internal0x23
857    1
858
859  tcp_send_delayed_ack0x1
860  tcp_rcv_established0x222
861  tcp_v4_do_rcv0x130
862  tcp_v4_rcv0x8e0
863  ip_local_deliver_finish0x9f
864  ip_local_deliver0x51
865  ip_rcv_finish0x8a
866  ip_rcv0x29d
867  __netif_receive_skb_core0x637
868  __netif_receive_skb0x18
869  netif_receive_skb_internal0x23
870    4
871
872  tcp_send_mss0x1
873  inet_sendmsg0x67
874  sock_sendmsg0x38
875  sock_write_iter0x78
876  __vfs_write0xaa
877  vfs_write0xa9
878  sys_write0x46
879  entry_SYSCALL_64_fastpath0x16
880    7
881
882  tcp_sendmsg0x1
883  sock_sendmsg0x38
884  sock_write_iter0x78
885  __vfs_write0xaa
886  vfs_write0xa9
887  sys_write0x46
888  entry_SYSCALL_64_fastpath0x16
889    7
890
891Detaching...
892
893Use -r to allow regular expressions.
894
895
896The -f option will emit folded output, which can be used as input to other
897tools including flame graphs. For example, with delimiters as well:
898
899# ./stackcount.py -P -df t:sched:sched_switch
900^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
901kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
902watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1
903snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
904svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1
905python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
906kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
907[...]
908
909Flame graphs visualize stack traces. For information about them and links
910to open source software, see http://www.brendangregg.com/flamegraphs.html .
911This folded output can be piped directly into flamegraph.pl (the Perl version).
912
913
914USAGE message:
915
916# ./stackcount -h
917usage: stackcount [-h] [-p PID] [-c CPU] [-i INTERVAL] [-D DURATION] [-T] [-r]
918                  [-s] [-P] [-K] [-U] [-v] [-d] [-f] [--debug]
919                  pattern
920
921Count events and their stack traces
922
923positional arguments:
924  pattern               search expression for events
925
926optional arguments:
927  -h, --help            show this help message and exit
928  -p PID, --pid PID     trace this PID only
929  -c CPU, --cpu CPU     trace this CPU only
930  -i INTERVAL, --interval INTERVAL
931                        summary interval, seconds
932  -D DURATION, --duration DURATION
933                        total duration of trace, seconds
934  -T, --timestamp       include timestamp on output
935  -r, --regexp          use regular expressions. Default is "*" wildcards
936                        only.
937  -s, --offset          show address offsets
938  -P, --perpid          display stacks separately for each process
939  -K, --kernel-stacks-only
940                        kernel stack only
941  -U, --user-stacks-only
942                        user stack only
943  -v, --verbose         show raw addresses
944  -d, --delimited       insert delimiter between kernel/user stacks
945  -f, --folded          output folded format
946  --debug               print BPF program before starting (for debugging
947                        purposes)
948
949examples:
950    ./stackcount submit_bio         # count kernel stack traces for submit_bio
951    ./stackcount -d ip_output       # include a user/kernel stack delimiter
952    ./stackcount -s ip_output       # show symbol offsets
953    ./stackcount -sv ip_output      # show offsets and raw addresses (verbose)
954    ./stackcount 'tcp_send*'        # count stacks for funcs matching tcp_send*
955    ./stackcount -r '^tcp_send.*'   # same as above, using regular expressions
956    ./stackcount -Ti 5 ip_output    # output every 5 seconds, with timestamps
957    ./stackcount -p 185 ip_output   # count ip_output stacks for PID 185 only
958    ./stackcount -p 185 c:malloc    # count stacks for malloc in PID 185
959    ./stackcount t:sched:sched_fork # count stacks for sched_fork tracepoint
960    ./stackcount -p 185 u:node:*    # count stacks for all USDT probes in node
961    ./stackcount -c 1 put_prev_entity   # count put_prev_entity stacks for CPU 1 only
962    ./stackcount -K t:sched:sched_switch   # kernel stacks only
963    ./stackcount -U t:sched:sched_switch   # user stacks only
964