Demonstrations of dirtop, the Linux eBPF/bcc version.


dirtop shows reads and writes by directory. For example:

# ./dirtop.py -d '/hdfs/uuid/*/yarn'
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:28:12 loadavg: 25.00 22.85 21.22 31/2921 66450

READS  WRITES R_Kb     W_Kb     PATH
1030   2852   8        147341   /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn
3308   2459   10980    24893    /hdfs/uuid/bf829d08-1455-45b8-81fa-05c3303e8c45/yarn
2227   7165   6484     11157    /hdfs/uuid/76dc0b77-e2fd-4476-818f-2b5c3c452396/yarn
1985   9576   6431     6616     /hdfs/uuid/99c178d5-a209-4af2-8467-7382c7f03c1b/yarn
1986   398    6474     6486     /hdfs/uuid/7d512fe7-b20d-464c-a75a-dbf8b687ee1c/yarn
764    3685   5        7069     /hdfs/uuid/250b21c8-1714-45fe-8c08-d45d0271c6bd/yarn
432    1603   259      6402     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn
993    5856   320      129      /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
612    5645   4        249      /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
818    21     6        166      /hdfs/uuid/fada8004-53ff-48df-9396-165d8e42925b/yarn
174    23     1        171      /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
376    6281   2        97       /hdfs/uuid/0cc3683f-4800-4c73-8075-8d77dc7cf116/yarn
370    4588   2        96       /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
190    6420   1        86       /hdfs/uuid/2c6a7223-cb18-4916-a1b6-8cd02bda1d31/yarn
178    123    1        17       /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
[...]

This shows various directories read and written when Hadoop runs.
By default the output is sorted using the "all" sort key (see the -s option in
the USAGE message). The sort order can be changed with the -s option.
This instruments the VFS interface, so these are reads and writes that
may be served entirely from the file system cache (page cache).

While not printed, the average read and write size can be calculated by
dividing R_Kb by READS, and W_Kb by WRITES.

This script works by tracing the vfs_read() and vfs_write() functions using
kernel dynamic tracing, which instruments explicit read and write calls.
If files are read or written using another means (e.g., via mmap()), then they
will not be visible using this tool.

This should be useful for file system workload characterization when analyzing
the performance of applications.

Note that tracing VFS level reads and writes can be a frequent activity, and
this tool can begin to cost measurable overhead at high I/O rates.


A -C option will stop clearing the screen, and -r with a number will restrict
the output to that many rows (20 by default). For example, not clearing
the screen and showing the top 5 only:

# ./dirtop -d '/hdfs/uuid/*/yarn' -Cr 5
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:29:08 loadavg: 25.66 23.42 21.51 17/2850 67167

READS  WRITES R_Kb     W_Kb     PATH
100    8429   0        48243    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2066   4091   8176     26457    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
10     2043   0        8172     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
38     1368   0        2652     /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
86     19     0        123      /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn

14:29:09 loadavg: 25.66 23.42 21.51 15/2849 67170

READS  WRITES R_Kb     W_Kb     PATH
1204   5619   4388     33767    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2208   3511   8744     22992    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
62     4010   0        21181    /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
22     2187   0        8748     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
74     1097   0        4388     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn

[..]
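
The average I/O sizes mentioned earlier (R_Kb divided by READS, W_Kb divided
by WRITES) are not printed by the tool, but they can be derived from any
output row. The following is a minimal sketch of that arithmetic, not part of
dirtop itself: the function name average_io_sizes is hypothetical, and it
assumes a row has the five whitespace-separated columns shown above. Because
the Kb columns are printed as whole numbers, the results are approximate.

```python
def average_io_sizes(row):
    """Return (avg_read_kb, avg_write_kb) for one dirtop output row.

    Hypothetical helper: assumes the row has the five columns shown in
    the output above (READS WRITES R_Kb W_Kb PATH).
    """
    reads, writes, r_kb, w_kb, _path = row.split(None, 4)
    reads, writes, r_kb, w_kb = int(reads), int(writes), int(r_kb), int(w_kb)
    # Guard against directories that saw no reads or no writes.
    avg_read = r_kb / reads if reads else 0.0
    avg_write = w_kb / writes if writes else 0.0
    return avg_read, avg_write


if __name__ == "__main__":
    # A row taken from the 14:29:08 summary above.
    row = "2066 4091 8176 26457 /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn"
    avg_r, avg_w = average_io_sizes(row)
    print("avg read: %.2f Kb, avg write: %.2f Kb" % (avg_r, avg_w))
```

For the row used here, this works out to roughly 4 Kb per read and 6.5 Kb per
write.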


USAGE message:

# ./dirtop.py -h
usage: dirtop.py [-h] [-C] [-r MAXROWS] [-s {all,reads,writes,rbytes,wbytes}]
                 [-p PID] -d ROOTDIRS
                 [interval] [count]

File reads and writes by process

positional arguments:
  interval              output interval, in seconds
  count                 number of outputs

optional arguments:
  -h, --help            show this help message and exit
  -C, --noclear         don't clear the screen
  -r MAXROWS, --maxrows MAXROWS
                        maximum rows to print, default 20
  -s {all,reads,writes,rbytes,wbytes}, --sort {all,reads,writes,rbytes,wbytes}
                        sort column, default all
  -p PID, --pid PID     trace this PID only
  -d ROOTDIRS, --root-directories ROOTDIRS
                        select the directories to observe, separated by commas

examples:
    ./dirtop -d '/hdfs/uuid/*/yarn'       # directory I/O top, 1 second refresh
    ./dirtop -d '/hdfs/uuid/*/yarn' -C    # don't clear the screen
    ./dirtop -d '/hdfs/uuid/*/yarn' 5     # 5 second summaries
    ./dirtop -d '/hdfs/uuid/*/yarn' 5 10  # 5 second summaries, 10 times only
    ./dirtop -d '/hdfs/uuid/*/yarn,/hdfs/uuid/*/data' # Running dirtop on two set of directories
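
The -d argument takes a comma-separated list of directory patterns such as
'/hdfs/uuid/*/yarn,/hdfs/uuid/*/data'. To preview which directories a given
argument would select before tracing, the list can be expanded with Python's
glob module. This is only a sketch under assumptions: the helper name
expand_root_directories is hypothetical, and whether dirtop.py expands
patterns in exactly this way is not stated in this document.

```python
import glob
import os


def expand_root_directories(rootdirs):
    """Expand a comma-separated list of glob patterns into directories.

    Hypothetical helper for previewing a -d argument; it is not taken
    from dirtop.py.
    """
    matches = []
    for pattern in rootdirs.split(","):
        # sorted() makes the order within each pattern deterministic.
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                matches.append(path)
    return matches


if __name__ == "__main__":
    for directory in expand_root_directories("/hdfs/uuid/*/yarn,/hdfs/uuid/*/data"):
        print(directory)
```

On a host without matching paths this prints nothing, which is itself a quick
check that a pattern is quoted and spelled as intended.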