Demonstrations of stackcount, the Linux eBPF/bcc version.


This program traces functions and frequency counts them with their entire
stack trace, summarized in-kernel for efficiency. For example, counting
stack traces that led to the submit_bio() kernel function, which creates
block device I/O:

# ./stackcount submit_bio
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  submit_bio
  submit_bh
  journal_submit_commit_record.isra.13
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
  mb_cache_list
    1

  submit_bio
  __block_write_full_page.constprop.39
  block_write_full_page
  blkdev_writepage
  __writepage
  write_cache_pages
  generic_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
    2

  submit_bio
  __block_write_full_page.constprop.39
  block_write_full_page
  blkdev_writepage
  __writepage
  write_cache_pages
  generic_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_fdatawrite
  fdatawrite_one_bdev
    36

  submit_bio
  submit_bh
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
  mb_cache_list
    38

  submit_bio
  ext4_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_flush
  ext4_alloc_da_blocks
  ext4_rename
  ext4_rename2
  vfs_rename
  sys_rename
  entry_SYSCALL_64_fastpath
    79

Detaching...

The output shows unique stack traces, in order from leaf (on-CPU) to root,
followed by their occurrence count. The last stack trace in the above output
shows syscall handling, ext4_rename(), and filemap_flush(): looks like an
application issued file rename has caused back end disk I/O due to ext4
block allocation and a filemap_flush().


Now adding the -P option to display stacks separately for each process:

# ./stackcount -P submit_bio
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  submit_bio
  ext4_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_flush
  ext4_alloc_da_blocks
  ext4_release_file
  __fput
  ____fput
  task_work_run
  exit_to_usermode_loop
  syscall_return_slowpath
  entry_SYSCALL_64_fastpath
  [unknown]
  [unknown]
    tar [15069]
    5

  submit_bio
  ext4_bio_write_page
  mpage_submit_page
  mpage_map_and_submit_buffers
  ext4_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_flush
  ext4_alloc_da_blocks
  ext4_release_file
  __fput
  ____fput
  task_work_run
  exit_to_usermode_loop
  syscall_return_slowpath
  entry_SYSCALL_64_fastpath
  [unknown]
  [unknown]
    tar [15069]
    15

  submit_bio
  ext4_readpages
  __do_page_cache_readahead
  ondemand_readahead
  page_cache_async_readahead
  generic_file_read_iter
  __vfs_read
  vfs_read
  sys_read
  entry_SYSCALL_64_fastpath
  [unknown]
    tar [15069]
    113

Detaching...

The last stack trace in the above output shows syscall handling, sys_read(),
vfs_read(), and then "readahead" functions: looks like an application issued
file read has triggered read ahead. With "-P", the application can be seen
after the stack trace, in this case, "tar [15069]" for the "tar" command,
PID 15069.

The order of printed stack traces is from least to most frequent. The most
frequent in this case, the ext4_readpages() stack, was taken 113 times during
tracing.

The "[unknown]" frames are from user-level, since this simple workload is
the tar command, which apparently has been compiled without frame pointers.
It's a common compiler optimization, but it breaks frame pointer-based stack
walkers. Similar broken stacks will be seen by other profilers and debuggers
that use frame pointers. Hopefully your application preserves them so that
the user-level stack trace is visible.
So how does one get frame pointers, if
your application doesn't have them to start with? For the current bcc (until
it supports other stack walkers), you need to be running application binaries
that preserve frame pointers, eg, using gcc's -fno-omit-frame-pointer. That's
about all I'll say here: this is a big topic that is not bcc/BPF specific.

It can be useful to trace the path to submit_bio to explain unusual rates of
disk IOPS. These could have in-kernel origins (eg, background scrub).


Now adding the -d option to delimit kernel and user stacks:

# ./stackcount -P -d submit_bio
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  submit_bio
  submit_bh
  journal_submit_commit_record
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
    --
    jbd2/xvda1-8 [405]
    1

  submit_bio
  submit_bh
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
    --
    jbd2/xvda1-8 [405]
    2

  submit_bio
  ext4_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_flush
  ext4_alloc_da_blocks
  ext4_release_file
  __fput
  ____fput
  task_work_run
  exit_to_usermode_loop
  syscall_return_slowpath
  entry_SYSCALL_64_fastpath
    --
  [unknown]
  [unknown]
    tar [15187]
    5

  submit_bio
  ext4_bio_write_page
  mpage_submit_page
  mpage_map_and_submit_buffers
  ext4_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_flush
  ext4_alloc_da_blocks
  ext4_release_file
  __fput
  ____fput
  task_work_run
  exit_to_usermode_loop
  syscall_return_slowpath
  entry_SYSCALL_64_fastpath
    --
  [unknown]
  [unknown]
    tar [15187]
    15

  submit_bio
  ext4_readpages
  __do_page_cache_readahead
  ondemand_readahead
  page_cache_async_readahead
  generic_file_read_iter
  __vfs_read
  vfs_read
  sys_read
  entry_SYSCALL_64_fastpath
    --
  [unknown]
  [unknown]
  [unknown]
    tar [15187]
    171

Detaching...

A "--" is printed between the kernel and user stacks.


As a different example, here is the kernel function hrtimer_init_sleeper():

# ./stackcount.py -P -d hrtimer_init_sleeper
Tracing 1 functions for "hrtimer_init_sleeper"... Hit Ctrl-C to end.
^C
  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  [unknown]
    containerd [16020]
    1

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  __pthread_cond_timedwait
  Monitor::IWait(Thread*, long)
  Monitor::wait(bool, long, bool)
  CompileQueue::get()
  CompileBroker::compiler_thread_loop()
  JavaThread::thread_main_inner()
  JavaThread::run()
  java_start(Thread*)
  start_thread
    java [4996]
    1

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  [unknown]
  [unknown]
    containerd [16020]
    1

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  __pthread_cond_timedwait
  VMThread::loop()
  VMThread::run()
  java_start(Thread*)
  start_thread
    java [4996]
    3

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  [unknown]
    dockerd [16008]
    4

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  [unknown]
  [unknown]
    dockerd [16008]
    4

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  __pthread_cond_timedwait
  Lio/netty/util/ThreadDeathWatcher$Watcher;::run
  Interpreter
  Interpreter
  call_stub
  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
  thread_entry(JavaThread*, Thread*)
  JavaThread::thread_main_inner()
  JavaThread::run()
  java_start(Thread*)
  start_thread
    java [4996]
    4

  hrtimer_init_sleeper
  do_futex
  SyS_futex
  entry_SYSCALL_64_fastpath
    --
  __pthread_cond_timedwait
  clock_gettime
  [unknown]
    java [4996]
    79

Detaching...

I was just trying to find a more interesting example. This output includes
some Java stacks where user-level has been walked correctly (even includes a
JIT symbol translation). dockerd and containerd don't have frame pointers
(grumble), but Java does (which is running with -XX:+PreserveFramePointer).


Here's another kernel function, ip_output():

# ./stackcount.py -P -d ip_output
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
^C
  ip_output
  ip_queue_xmit
  tcp_transmit_skb
  tcp_write_xmit
  __tcp_push_pending_frames
  tcp_push
  tcp_sendmsg
  inet_sendmsg
  sock_sendmsg
  sock_write_iter
  __vfs_write
  vfs_write
  SyS_write
  entry_SYSCALL_64_fastpath
    --
  __write_nocancel
  [unknown]
    sshd [15015]
    5

  ip_output
  ip_queue_xmit
  tcp_transmit_skb
  tcp_write_xmit
  __tcp_push_pending_frames
  tcp_push
  tcp_sendmsg
  inet_sendmsg
  sock_sendmsg
  sock_write_iter
  __vfs_write
  vfs_write
  SyS_write
  entry_SYSCALL_64_fastpath
    --
  __write_nocancel
  [unknown]
  [unknown]
    sshd [8234]
    5

  ip_output
  ip_queue_xmit
  tcp_transmit_skb
  tcp_write_xmit
  __tcp_push_pending_frames
  tcp_push
  tcp_sendmsg
  inet_sendmsg
  sock_sendmsg
  sock_write_iter
  __vfs_write
  vfs_write
  SyS_write
  entry_SYSCALL_64_fastpath
    --
  __write_nocancel
    sshd [15015]
    7

Detaching...

This time just sshd is triggering ip_output() calls.
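
The three stacks above share identical kernel frames and differ only in their
user-level frames and PID. As a rough illustration of why that matters, here
is a minimal Python sketch (not stackcount's actual implementation; the Sample
record and the count_by helper are hypothetical names) showing how the choice
of key decides which stacks count as unique. The counts are copied from the
output above:

```python
from collections import Counter, namedtuple

# Hypothetical record for one entry of summarized output; not a bcc type.
Sample = namedtuple("Sample", "kernel user comm pid count")

KERNEL = ("ip_output", "ip_queue_xmit", "tcp_transmit_skb", "tcp_write_xmit",
          "__tcp_push_pending_frames", "tcp_push", "tcp_sendmsg",
          "inet_sendmsg", "sock_sendmsg", "sock_write_iter", "__vfs_write",
          "vfs_write", "SyS_write", "entry_SYSCALL_64_fastpath")

samples = [
    Sample(KERNEL, ("__write_nocancel", "[unknown]"), "sshd", 15015, 5),
    Sample(KERNEL, ("__write_nocancel", "[unknown]", "[unknown]"), "sshd", 8234, 5),
    Sample(KERNEL, ("__write_nocancel",), "sshd", 15015, 7),
]

def count_by(samples, key):
    """Sum occurrence counts for samples that share the same key."""
    totals = Counter()
    for s in samples:
        totals[key(s)] += s.count
    return totals

# Keyed on kernel frames, user frames, and PID: three unique stacks.
per_process = count_by(samples, lambda s: (s.kernel, s.user, s.pid))
# Keyed on kernel frames alone: everything collapses into one stack.
kernel_only = count_by(samples, lambda s: s.kernel)
# Keyed on PID alone: total ip_output() calls per process.
per_pid = count_by(samples, lambda s: s.pid)

print(len(per_process), len(kernel_only), dict(per_pid))
```

Keying on fewer fields merges more stacks, which trades detail for a shorter
summary.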


Watch what happens if I filter on kernel stacks only (-K) for ip_output():

# ./stackcount.py -K ip_output
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
^C
  ip_output
  ip_queue_xmit
  tcp_transmit_skb
  tcp_write_xmit
  __tcp_push_pending_frames
  tcp_push
  tcp_sendmsg
  inet_sendmsg
  sock_sendmsg
  sock_write_iter
  __vfs_write
  vfs_write
  SyS_write
  entry_SYSCALL_64_fastpath
    13

Detaching...

They have grouped together as a single unique stack, since the kernel part
was the same.


Here are just the user stacks, fetched during the kernel function ip_output():

# ./stackcount.py -P -U ip_output
Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
^C
  [unknown]
    snmpd [1645]
    1

  __write_nocancel
  [unknown]
  [unknown]
    sshd [8234]
    3

  __write_nocancel
    sshd [15015]
    4

I should really run a custom sshd with frame pointers so we can see its
stack trace...


User-space functions can also be traced if a library name is provided. For
example, to quickly identify code locations that allocate heap memory for
PID 4902 (using -p), by tracing malloc from libc ("c:malloc"):

# ./stackcount -p 4902 c:malloc
Tracing 1 functions for "malloc"... Hit Ctrl-C to end.
^C
  malloc
  rbtree_new
  main
  [unknown]
    12

  malloc
  _rbtree_node_new_internal
  _rbtree_node_insert
  rbtree_insert
  main
  [unknown]
    1189

Detaching...

Kernel stacks are absent as this didn't enter kernel code.

Note that user-space uses of stackcount can be somewhat more limited because
a lot of user-space libraries and binaries are compiled without frame-pointers
as discussed earlier (the -fomit-frame-pointer compiler default) or are used
without debuginfo.


In addition to kernel and user-space functions, kernel tracepoints and USDT
tracepoints are also supported.

For example, to determine where threads are being created in a particular
process, use the pthread_create USDT tracepoint:

# ./stackcount -P -p $(pidof parprimes) u:pthread:pthread_create
Tracing 1 functions for "u:pthread:pthread_create"... Hit Ctrl-C to end.
^C

    parprimes [11923]
  pthread_create@@GLIBC_2.2.5
  main
  __libc_start_main
  [unknown]
    7

You can use "readelf -n file" to see if it has USDT tracepoints.


Similarly, to determine where context switching is happening in the kernel,
use the sched:sched_switch kernel tracepoint:

# ./stackcount -P t:sched:sched_switch
  __schedule
  schedule
  worker_thread
  kthread
  ret_from_fork
    kworker/0:2 [25482]
    1

  __schedule
  schedule
  schedule_hrtimeout_range_clock
  schedule_hrtimeout_range
  ep_poll
  SyS_epoll_wait
  entry_SYSCALL_64_fastpath
  epoll_wait
  Lsun/nio/ch/SelectorImpl;::lockAndDoSelect
  Lsun/nio/ch/SelectorImpl;::select
  Lio/netty/channel/nio/NioEventLoop;::select
  Lio/netty/channel/nio/NioEventLoop;::run
  Interpreter
  Interpreter
  call_stub
  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)
  JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*)
  JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*)
  thread_entry(JavaThread*, Thread*)
  JavaThread::thread_main_inner()
  JavaThread::run()
  java_start(Thread*)
  start_thread
    java [4996]
    1

... (omitted for brevity)

  __schedule
  schedule
  schedule_preempt_disabled
  cpu_startup_entry
  xen_play_dead
  arch_cpu_idle_dead
  cpu_startup_entry
  cpu_bringup_and_idle
    swapper/1 [0]
    289


A -i option can be used to set an output interval, and -T to include a
timestamp. For example:

# ./stackcount.py -P -Tdi 1 submit_bio
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.

06:05:13

06:05:14
  submit_bio
  xfs_do_writepage
  write_cache_pages
  xfs_vm_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
  wb_writeback
  wb_workfn
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:1 [15686]
    1

  submit_bio
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:0 [16007]
    1

  submit_bio
  xfs_buf_submit
  xlog_bdstrat
  xlog_sync
  xlog_state_release_iclog
  _xfs_log_force
  xfs_log_force
  xfs_fs_sync_fs
  sync_fs_one_sb
  iterate_supers
  sys_sync
  entry_SYSCALL_64_fastpath
    --
  [unknown]
    sync [16039]
    1

  submit_bio
  submit_bh
  journal_submit_commit_record
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
    --
    jbd2/xvda1-8 [405]
    1

  submit_bio
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/0:2 [25482]
    2

  submit_bio
  ext4_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
  wb_writeback
  wb_workfn
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:0 [16007]
    4

  submit_bio
  xfs_vm_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
  wb_writeback
  wb_workfn
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:1 [15686]
    5

  submit_bio
  __block_write_full_page
  block_write_full_page
  blkdev_writepage
  __writepage
  write_cache_pages
  generic_writepages
  blkdev_writepages
  do_writepages
  __filemap_fdatawrite_range
  filemap_fdatawrite
  fdatawrite_one_bdev
  iterate_bdevs
  sys_sync
  entry_SYSCALL_64_fastpath
    --
  [unknown]
    sync [16039]
    7

  submit_bio
  submit_bh
  jbd2_journal_commit_transaction
  kjournald2
  kthread
  ret_from_fork
    --
    jbd2/xvda1-8 [405]
    8

  submit_bio
  ext4_bio_write_page
  mpage_submit_page
  mpage_map_and_submit_buffers
  ext4_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
  wb_writeback
  wb_workfn
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:0 [16007]
    8

  submit_bio
  __block_write_full_page
  block_write_full_page
  blkdev_writepage
  __writepage
  write_cache_pages
  generic_writepages
  blkdev_writepages
  do_writepages
  __writeback_single_inode
  writeback_sb_inodes
  __writeback_inodes_wb
  wb_writeback
  wb_workfn
  process_one_work
  worker_thread
  kthread
  ret_from_fork
    --
    kworker/u16:0 [16007]
    60


06:05:15

06:05:16

Detaching...

This only included output for the 06:05:14 interval. The other intervals
did not span block device I/O.


The -s output prints the return instruction offset for each function (aka
symbol offset). Eg:

# ./stackcount.py -P -s tcp_sendmsg
Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
^C
  tcp_sendmsg+0x1
  sock_sendmsg+0x38
  sock_write_iter+0x85
  __vfs_write+0xe3
  vfs_write+0xb8
  SyS_write+0x55
  entry_SYSCALL_64_fastpath+0x1e
  __write_nocancel+0x7
    sshd [15015]
    3

  tcp_sendmsg+0x1
  sock_sendmsg+0x38
  sock_write_iter+0x85
  __vfs_write+0xe3
  vfs_write+0xb8
  SyS_write+0x55
  entry_SYSCALL_64_fastpath+0x1e
  __write_nocancel+0x7
    sshd [8234]
    3

Detaching...

If it wasn't clear how one function called another, knowing the instruction
offset can help you locate the lines of code from a disassembly dump.


The -v output is verbose, and shows raw addresses:

# ./stackcount.py -P -v tcp_sendmsg
Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
^C
  ffffffff817b05c1 tcp_sendmsg
  ffffffff8173ea48 sock_sendmsg
  ffffffff8173eae5 sock_write_iter
  ffffffff81232b33 __vfs_write
  ffffffff812331b8 vfs_write
  ffffffff81234625 SyS_write
  ffffffff818739bb entry_SYSCALL_64_fastpath
  7f120511e6e0     __write_nocancel
    sshd [8234]
    3

  ffffffff817b05c1 tcp_sendmsg
  ffffffff8173ea48 sock_sendmsg
  ffffffff8173eae5 sock_write_iter
  ffffffff81232b33 __vfs_write
  ffffffff812331b8 vfs_write
  ffffffff81234625 SyS_write
  ffffffff818739bb entry_SYSCALL_64_fastpath
  7f919c5a26e0     __write_nocancel
    sshd [15015]
    11

Detaching...


A wildcard can also be used. Eg, all functions beginning with "tcp_send",
kernel stacks only (-K) with offsets (-s):

# ./stackcount -Ks 'tcp_send*'
Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end.
^C
  tcp_send_delayed_ack+0x1
  tcp_rcv_established+0x3b1
  tcp_v4_do_rcv+0x130
  tcp_v4_rcv+0x8e0
  ip_local_deliver_finish+0x9f
  ip_local_deliver+0x51
  ip_rcv_finish+0x8a
  ip_rcv+0x29d
  __netif_receive_skb_core+0x637
  __netif_receive_skb+0x18
  netif_receive_skb_internal+0x23
    1

  tcp_send_delayed_ack+0x1
  tcp_rcv_established+0x222
  tcp_v4_do_rcv+0x130
  tcp_v4_rcv+0x8e0
  ip_local_deliver_finish+0x9f
  ip_local_deliver+0x51
  ip_rcv_finish+0x8a
  ip_rcv+0x29d
  __netif_receive_skb_core+0x637
  __netif_receive_skb+0x18
  netif_receive_skb_internal+0x23
    4

  tcp_send_mss+0x1
  inet_sendmsg+0x67
  sock_sendmsg+0x38
  sock_write_iter+0x78
  __vfs_write+0xaa
  vfs_write+0xa9
  sys_write+0x46
  entry_SYSCALL_64_fastpath+0x16
    7

  tcp_sendmsg+0x1
  sock_sendmsg+0x38
  sock_write_iter+0x78
  __vfs_write+0xaa
  vfs_write+0xa9
  sys_write+0x46
  entry_SYSCALL_64_fastpath+0x16
    7

Detaching...

Use -r to allow regular expressions.


The -f option will emit folded output, which can be used as input to other
tools including flame graphs.
For example, with delimiters as well:

# ./stackcount.py -P -df t:sched:sched_switch
^Csnmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1
snmp-pass;[unknown];[unknown];[unknown];[unknown];[unknown];-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1
python;[unknown];__select_nocancel;-;entry_SYSCALL_64_fastpath;SyS_select;core_sys_select;do_select;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule 1
kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1
[...]

Flame graphs visualize stack traces. For information about them and links
to open source software, see http://www.brendangregg.com/flamegraphs.html .
This folded output can be piped directly into flamegraph.pl (the Perl version).
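
Folded output is also easy to post-process yourself. Below is a minimal Python
sketch, assuming each line is a ';'-separated stack (root first, leaf last)
followed by a space and a count, and that "-" is the delimiter inserted by -d,
as in the sample above. The helper names parse_folded and kernel_frames are
hypothetical and not part of bcc. It regroups the per-process lines by their
kernel path:

```python
from collections import Counter

def parse_folded(lines):
    """Map each folded stack (root to leaf, ';'-separated) to its total count."""
    totals = Counter()
    for line in lines:
        stack, _, count = line.strip().rpartition(" ")
        if stack and count.isdigit():      # skips blanks and "[...]" markers
            totals[stack] += int(count)
    return totals

def kernel_frames(stack):
    """Frames after the "-" delimiter that -d inserts (assumed kernel side)."""
    frames = stack.split(";")
    return ";".join(frames[frames.index("-") + 1:]) if "-" in frames else stack

# Sample lines copied from the folded output above.
sample = [
    "kworker/7:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1",
    "watchdog/0;-;ret_from_fork;kthread;smpboot_thread_fn;schedule;__schedule 1",
    "svscan;[unknown];-;entry_SYSCALL_64_fastpath;SyS_nanosleep;"
    "hrtimer_nanosleep;do_nanosleep;schedule;__schedule 1",
    "kworker/2:0;-;ret_from_fork;kthread;worker_thread;schedule;__schedule 1",
    "[...]",
]

# Merge stacks from different processes that share the same kernel path.
by_kernel_path = Counter()
for stack, count in parse_folded(sample).items():
    by_kernel_path[kernel_frames(stack)] += count

# Print the most frequent kernel paths first.
for path, count in by_kernel_path.most_common():
    print(count, path)
```

Here the two kworker lines merge into a single kernel path, since they differ
only in the process name.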


USAGE message:

# ./stackcount -h
usage: stackcount [-h] [-p PID] [-c CPU] [-i INTERVAL] [-D DURATION] [-T] [-r]
                  [-s] [-P] [-K] [-U] [-v] [-d] [-f] [--debug]
                  pattern

Count events and their stack traces

positional arguments:
  pattern               search expression for events

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     trace this PID only
  -c CPU, --cpu CPU     trace this CPU only
  -i INTERVAL, --interval INTERVAL
                        summary interval, seconds
  -D DURATION, --duration DURATION
                        total duration of trace, seconds
  -T, --timestamp       include timestamp on output
  -r, --regexp          use regular expressions. Default is "*" wildcards
                        only.
  -s, --offset          show address offsets
  -P, --perpid          display stacks separately for each process
  -K, --kernel-stacks-only
                        kernel stack only
  -U, --user-stacks-only
                        user stack only
  -v, --verbose         show raw addresses
  -d, --delimited       insert delimiter between kernel/user stacks
  -f, --folded          output folded format
  --debug               print BPF program before starting (for debugging
                        purposes)

examples:
    ./stackcount submit_bio               # count kernel stack traces for submit_bio
    ./stackcount -d ip_output             # include a user/kernel stack delimiter
    ./stackcount -s ip_output             # show symbol offsets
    ./stackcount -sv ip_output            # show offsets and raw addresses (verbose)
    ./stackcount 'tcp_send*'              # count stacks for funcs matching tcp_send*
    ./stackcount -r '^tcp_send.*'         # same as above, using regular expressions
    ./stackcount -Ti 5 ip_output          # output every 5 seconds, with timestamps
    ./stackcount -p 185 ip_output         # count ip_output stacks for PID 185 only
    ./stackcount -p 185 c:malloc          # count stacks for malloc in PID 185
    ./stackcount t:sched:sched_fork       # count stacks for sched_fork tracepoint
    ./stackcount -p 185 u:node:*          # count stacks for all USDT probes in node
    ./stackcount -c 1 put_prev_entity     # count put_prev_entity stacks for CPU 1 only
    ./stackcount -K t:sched:sched_switch  # kernel stacks only
    ./stackcount -U t:sched:sched_switch  # user stacks only