1*387f9dfdSAndroid Build Coastguard Worker#!/usr/bin/env python 2*387f9dfdSAndroid Build Coastguard Worker# 3*387f9dfdSAndroid Build Coastguard Worker# deadlock Detects potential deadlocks (lock order inversions) 4*387f9dfdSAndroid Build Coastguard Worker# on a running process. For Linux, uses BCC, eBPF. 5*387f9dfdSAndroid Build Coastguard Worker# 6*387f9dfdSAndroid Build Coastguard Worker# USAGE: deadlock.py [-h] [--binary BINARY] [--dump-graph DUMP_GRAPH] 7*387f9dfdSAndroid Build Coastguard Worker# [--verbose] [--lock-symbols LOCK_SYMBOLS] 8*387f9dfdSAndroid Build Coastguard Worker# [--unlock-symbols UNLOCK_SYMBOLS] 9*387f9dfdSAndroid Build Coastguard Worker# pid 10*387f9dfdSAndroid Build Coastguard Worker# 11*387f9dfdSAndroid Build Coastguard Worker# This traces pthread mutex lock and unlock calls to build a directed graph 12*387f9dfdSAndroid Build Coastguard Worker# representing the mutex wait graph: 13*387f9dfdSAndroid Build Coastguard Worker# 14*387f9dfdSAndroid Build Coastguard Worker# - Nodes in the graph represent mutexes. 15*387f9dfdSAndroid Build Coastguard Worker# - Edge (A, B) exists if there exists some thread T where lock(A) was called 16*387f9dfdSAndroid Build Coastguard Worker# and lock(B) was called before unlock(A) was called. 17*387f9dfdSAndroid Build Coastguard Worker# 18*387f9dfdSAndroid Build Coastguard Worker# If the program finds a potential lock order inversion, the program will dump 19*387f9dfdSAndroid Build Coastguard Worker# the cycle of mutexes and the stack traces where each mutex was acquired, and 20*387f9dfdSAndroid Build Coastguard Worker# then exit. 21*387f9dfdSAndroid Build Coastguard Worker# 22*387f9dfdSAndroid Build Coastguard Worker# This program can only find potential deadlocks that occur while the program 23*387f9dfdSAndroid Build Coastguard Worker# is tracing the process. It cannot find deadlocks that may have occurred 24*387f9dfdSAndroid Build Coastguard Worker# before the program was attached to the process. 25*387f9dfdSAndroid Build Coastguard Worker# 26*387f9dfdSAndroid Build Coastguard Worker# Since this traces all mutex lock and unlock events and all thread creation 27*387f9dfdSAndroid Build Coastguard Worker# events on the traced process, the overhead of this bpf program can be very 28*387f9dfdSAndroid Build Coastguard Worker# high if the process has many threads and mutexes. You should only run this on 29*387f9dfdSAndroid Build Coastguard Worker# a process where the slowdown is acceptable. 30*387f9dfdSAndroid Build Coastguard Worker# 31*387f9dfdSAndroid Build Coastguard Worker# Note: This tool does not work for shared mutexes or recursive mutexes. 32*387f9dfdSAndroid Build Coastguard Worker# 33*387f9dfdSAndroid Build Coastguard Worker# For shared (read-write) mutexes, a deadlock requires a cycle in the wait 34*387f9dfdSAndroid Build Coastguard Worker# graph where at least one of the mutexes in the cycle is acquiring exclusive 35*387f9dfdSAndroid Build Coastguard Worker# (write) ownership. 36*387f9dfdSAndroid Build Coastguard Worker# 37*387f9dfdSAndroid Build Coastguard Worker# For recursive mutexes, lock() is called multiple times on the same mutex. 38*387f9dfdSAndroid Build Coastguard Worker# However, there is no way to determine if a mutex is a recursive mutex 39*387f9dfdSAndroid Build Coastguard Worker# after the mutex has been created. As a result, this tool will not find 40*387f9dfdSAndroid Build Coastguard Worker# potential deadlocks that involve only one mutex. 41*387f9dfdSAndroid Build Coastguard Worker# 42*387f9dfdSAndroid Build Coastguard Worker# Copyright 2017 Facebook, Inc. 43*387f9dfdSAndroid Build Coastguard Worker# Licensed under the Apache License, Version 2.0 (the "License") 44*387f9dfdSAndroid Build Coastguard Worker# 45*387f9dfdSAndroid Build Coastguard Worker# 01-Feb-2017 Kenny Yu Created this. 46*387f9dfdSAndroid Build Coastguard Worker 47*387f9dfdSAndroid Build Coastguard Workerfrom __future__ import ( 48*387f9dfdSAndroid Build Coastguard Worker absolute_import, division, unicode_literals, print_function 49*387f9dfdSAndroid Build Coastguard Worker) 50*387f9dfdSAndroid Build Coastguard Workerfrom bcc import BPF 51*387f9dfdSAndroid Build Coastguard Workerfrom collections import defaultdict 52*387f9dfdSAndroid Build Coastguard Workerimport argparse 53*387f9dfdSAndroid Build Coastguard Workerimport json 54*387f9dfdSAndroid Build Coastguard Workerimport os 55*387f9dfdSAndroid Build Coastguard Workerimport subprocess 56*387f9dfdSAndroid Build Coastguard Workerimport sys 57*387f9dfdSAndroid Build Coastguard Workerimport time 58*387f9dfdSAndroid Build Coastguard Worker 59*387f9dfdSAndroid Build Coastguard Worker 60*387f9dfdSAndroid Build Coastguard Workerclass DiGraph(object): 61*387f9dfdSAndroid Build Coastguard Worker ''' 62*387f9dfdSAndroid Build Coastguard Worker Adapted from networkx: http://networkx.github.io/ 63*387f9dfdSAndroid Build Coastguard Worker Represents a directed graph. Edges can store (key, value) attributes. 64*387f9dfdSAndroid Build Coastguard Worker ''' 65*387f9dfdSAndroid Build Coastguard Worker 66*387f9dfdSAndroid Build Coastguard Worker def __init__(self): 67*387f9dfdSAndroid Build Coastguard Worker # Map of node -> set of nodes 68*387f9dfdSAndroid Build Coastguard Worker self.adjacency_map = {} 69*387f9dfdSAndroid Build Coastguard Worker # Map of (node1, node2) -> map string -> arbitrary attribute 70*387f9dfdSAndroid Build Coastguard Worker # This will not be copied in subgraph() 71*387f9dfdSAndroid Build Coastguard Worker self.attributes_map = {} 72*387f9dfdSAndroid Build Coastguard Worker 73*387f9dfdSAndroid Build Coastguard Worker def neighbors(self, node): 74*387f9dfdSAndroid Build Coastguard Worker return self.adjacency_map.get(node, set()) 75*387f9dfdSAndroid Build Coastguard Worker 76*387f9dfdSAndroid Build Coastguard Worker def edges(self): 77*387f9dfdSAndroid Build Coastguard Worker edges = [] 78*387f9dfdSAndroid Build Coastguard Worker for node, neighbors in self.adjacency_map.items(): 79*387f9dfdSAndroid Build Coastguard Worker for neighbor in neighbors: 80*387f9dfdSAndroid Build Coastguard Worker edges.append((node, neighbor)) 81*387f9dfdSAndroid Build Coastguard Worker return edges 82*387f9dfdSAndroid Build Coastguard Worker 83*387f9dfdSAndroid Build Coastguard Worker def nodes(self): 84*387f9dfdSAndroid Build Coastguard Worker return self.adjacency_map.keys() 85*387f9dfdSAndroid Build Coastguard Worker 86*387f9dfdSAndroid Build Coastguard Worker def attributes(self, node1, node2): 87*387f9dfdSAndroid Build Coastguard Worker return self.attributes_map[(node1, node2)] 88*387f9dfdSAndroid Build Coastguard Worker 89*387f9dfdSAndroid Build Coastguard Worker def add_edge(self, node1, node2, **kwargs): 90*387f9dfdSAndroid Build Coastguard Worker if node1 not in self.adjacency_map: 91*387f9dfdSAndroid Build Coastguard Worker self.adjacency_map[node1] = set() 92*387f9dfdSAndroid Build Coastguard Worker if node2 not in self.adjacency_map: 93*387f9dfdSAndroid Build Coastguard Worker self.adjacency_map[node2] = set() 94*387f9dfdSAndroid Build Coastguard Worker self.adjacency_map[node1].add(node2) 95*387f9dfdSAndroid Build Coastguard Worker self.attributes_map[(node1, node2)] = kwargs 96*387f9dfdSAndroid Build Coastguard Worker 97*387f9dfdSAndroid Build Coastguard Worker def remove_node(self, node): 98*387f9dfdSAndroid Build Coastguard Worker self.adjacency_map.pop(node, None) 99*387f9dfdSAndroid Build Coastguard Worker for _, neighbors in self.adjacency_map.items(): 100*387f9dfdSAndroid Build Coastguard Worker neighbors.discard(node) 101*387f9dfdSAndroid Build Coastguard Worker 102*387f9dfdSAndroid Build Coastguard Worker def subgraph(self, nodes): 103*387f9dfdSAndroid Build Coastguard Worker graph = DiGraph() 104*387f9dfdSAndroid Build Coastguard Worker for node in nodes: 105*387f9dfdSAndroid Build Coastguard Worker for neighbor in self.neighbors(node): 106*387f9dfdSAndroid Build Coastguard Worker if neighbor in nodes: 107*387f9dfdSAndroid Build Coastguard Worker graph.add_edge(node, neighbor) 108*387f9dfdSAndroid Build Coastguard Worker return graph 109*387f9dfdSAndroid Build Coastguard Worker 110*387f9dfdSAndroid Build Coastguard Worker def node_link_data(self): 111*387f9dfdSAndroid Build Coastguard Worker ''' 112*387f9dfdSAndroid Build Coastguard Worker Returns the graph as a dictionary in a format that can be 113*387f9dfdSAndroid Build Coastguard Worker serialized. 114*387f9dfdSAndroid Build Coastguard Worker ''' 115*387f9dfdSAndroid Build Coastguard Worker data = { 116*387f9dfdSAndroid Build Coastguard Worker 'directed': True, 117*387f9dfdSAndroid Build Coastguard Worker 'multigraph': False, 118*387f9dfdSAndroid Build Coastguard Worker 'graph': {}, 119*387f9dfdSAndroid Build Coastguard Worker 'links': [], 120*387f9dfdSAndroid Build Coastguard Worker 'nodes': [], 121*387f9dfdSAndroid Build Coastguard Worker } 122*387f9dfdSAndroid Build Coastguard Worker 123*387f9dfdSAndroid Build Coastguard Worker # Do one pass to build a map of node -> position in nodes 124*387f9dfdSAndroid Build Coastguard Worker node_to_number = {} 125*387f9dfdSAndroid Build Coastguard Worker for node in self.adjacency_map.keys(): 126*387f9dfdSAndroid Build Coastguard Worker node_to_number[node] = len(data['nodes']) 127*387f9dfdSAndroid Build Coastguard Worker data['nodes'].append({'id': node}) 128*387f9dfdSAndroid Build Coastguard Worker 129*387f9dfdSAndroid Build Coastguard Worker # Do another pass to build the link information 130*387f9dfdSAndroid Build Coastguard Worker for node, neighbors in self.adjacency_map.items(): 131*387f9dfdSAndroid Build Coastguard Worker for neighbor in neighbors: 132*387f9dfdSAndroid Build Coastguard Worker link = self.attributes_map[(node, neighbor)].copy() 133*387f9dfdSAndroid Build Coastguard Worker link['source'] = node_to_number[node] 134*387f9dfdSAndroid Build Coastguard Worker link['target'] = node_to_number[neighbor] 135*387f9dfdSAndroid Build Coastguard Worker data['links'].append(link) 136*387f9dfdSAndroid Build Coastguard Worker return data 137*387f9dfdSAndroid Build Coastguard Worker 138*387f9dfdSAndroid Build Coastguard Worker 139*387f9dfdSAndroid Build Coastguard Workerdef strongly_connected_components(G): 140*387f9dfdSAndroid Build Coastguard Worker ''' 141*387f9dfdSAndroid Build Coastguard Worker Adapted from networkx: http://networkx.github.io/ 142*387f9dfdSAndroid Build Coastguard Worker Parameters 143*387f9dfdSAndroid Build Coastguard Worker ---------- 144*387f9dfdSAndroid Build Coastguard Worker G : DiGraph 145*387f9dfdSAndroid Build Coastguard Worker Returns 146*387f9dfdSAndroid Build Coastguard Worker ------- 147*387f9dfdSAndroid Build Coastguard Worker comp : generator of sets 148*387f9dfdSAndroid Build Coastguard Worker A generator of sets of nodes, one for each strongly connected 149*387f9dfdSAndroid Build Coastguard Worker component of G. 150*387f9dfdSAndroid Build Coastguard Worker ''' 151*387f9dfdSAndroid Build Coastguard Worker preorder = {} 152*387f9dfdSAndroid Build Coastguard Worker lowlink = {} 153*387f9dfdSAndroid Build Coastguard Worker scc_found = {} 154*387f9dfdSAndroid Build Coastguard Worker scc_queue = [] 155*387f9dfdSAndroid Build Coastguard Worker i = 0 # Preorder counter 156*387f9dfdSAndroid Build Coastguard Worker for source in G.nodes(): 157*387f9dfdSAndroid Build Coastguard Worker if source not in scc_found: 158*387f9dfdSAndroid Build Coastguard Worker queue = [source] 159*387f9dfdSAndroid Build Coastguard Worker while queue: 160*387f9dfdSAndroid Build Coastguard Worker v = queue[-1] 161*387f9dfdSAndroid Build Coastguard Worker if v not in preorder: 162*387f9dfdSAndroid Build Coastguard Worker i = i + 1 163*387f9dfdSAndroid Build Coastguard Worker preorder[v] = i 164*387f9dfdSAndroid Build Coastguard Worker done = 1 165*387f9dfdSAndroid Build Coastguard Worker v_nbrs = G.neighbors(v) 166*387f9dfdSAndroid Build Coastguard Worker for w in v_nbrs: 167*387f9dfdSAndroid Build Coastguard Worker if w not in preorder: 168*387f9dfdSAndroid Build Coastguard Worker queue.append(w) 169*387f9dfdSAndroid Build Coastguard Worker done = 0 170*387f9dfdSAndroid Build Coastguard Worker break 171*387f9dfdSAndroid Build Coastguard Worker if done == 1: 172*387f9dfdSAndroid Build Coastguard Worker lowlink[v] = preorder[v] 173*387f9dfdSAndroid Build Coastguard Worker for w in v_nbrs: 174*387f9dfdSAndroid Build Coastguard Worker if w not in scc_found: 175*387f9dfdSAndroid Build Coastguard Worker if preorder[w] > preorder[v]: 176*387f9dfdSAndroid Build Coastguard Worker lowlink[v] = min([lowlink[v], lowlink[w]]) 177*387f9dfdSAndroid Build Coastguard Worker else: 178*387f9dfdSAndroid Build Coastguard Worker lowlink[v] = min([lowlink[v], preorder[w]]) 179*387f9dfdSAndroid Build Coastguard Worker queue.pop() 180*387f9dfdSAndroid Build Coastguard Worker if lowlink[v] == preorder[v]: 181*387f9dfdSAndroid Build Coastguard Worker scc_found[v] = True 182*387f9dfdSAndroid Build Coastguard Worker scc = {v} 183*387f9dfdSAndroid Build Coastguard Worker while ( 184*387f9dfdSAndroid Build Coastguard Worker scc_queue and preorder[scc_queue[-1]] > preorder[v] 185*387f9dfdSAndroid Build Coastguard Worker ): 186*387f9dfdSAndroid Build Coastguard Worker k = scc_queue.pop() 187*387f9dfdSAndroid Build Coastguard Worker scc_found[k] = True 188*387f9dfdSAndroid Build Coastguard Worker scc.add(k) 189*387f9dfdSAndroid Build Coastguard Worker yield scc 190*387f9dfdSAndroid Build Coastguard Worker else: 191*387f9dfdSAndroid Build Coastguard Worker scc_queue.append(v) 192*387f9dfdSAndroid Build Coastguard Worker 193*387f9dfdSAndroid Build Coastguard Worker 194*387f9dfdSAndroid Build Coastguard Workerdef simple_cycles(G): 195*387f9dfdSAndroid Build Coastguard Worker ''' 196*387f9dfdSAndroid Build Coastguard Worker Adapted from networkx: http://networkx.github.io/ 197*387f9dfdSAndroid Build Coastguard Worker Parameters 198*387f9dfdSAndroid Build Coastguard Worker ---------- 199*387f9dfdSAndroid Build Coastguard Worker G : DiGraph 200*387f9dfdSAndroid Build Coastguard Worker Returns 201*387f9dfdSAndroid Build Coastguard Worker ------- 202*387f9dfdSAndroid Build Coastguard Worker cycle_generator: generator 203*387f9dfdSAndroid Build Coastguard Worker A generator that produces elementary cycles of the graph. 204*387f9dfdSAndroid Build Coastguard Worker Each cycle is represented by a list of nodes along the cycle. 205*387f9dfdSAndroid Build Coastguard Worker ''' 206*387f9dfdSAndroid Build Coastguard Worker 207*387f9dfdSAndroid Build Coastguard Worker def _unblock(thisnode, blocked, B): 208*387f9dfdSAndroid Build Coastguard Worker stack = set([thisnode]) 209*387f9dfdSAndroid Build Coastguard Worker while stack: 210*387f9dfdSAndroid Build Coastguard Worker node = stack.pop() 211*387f9dfdSAndroid Build Coastguard Worker if node in blocked: 212*387f9dfdSAndroid Build Coastguard Worker blocked.remove(node) 213*387f9dfdSAndroid Build Coastguard Worker stack.update(B[node]) 214*387f9dfdSAndroid Build Coastguard Worker B[node].clear() 215*387f9dfdSAndroid Build Coastguard Worker 216*387f9dfdSAndroid Build Coastguard Worker # Johnson's algorithm requires some ordering of the nodes. 217*387f9dfdSAndroid Build Coastguard Worker # We assign the arbitrary ordering given by the strongly connected comps 218*387f9dfdSAndroid Build Coastguard Worker # There is no need to track the ordering as each node removed as processed. 219*387f9dfdSAndroid Build Coastguard Worker # save the actual graph so we can mutate it here 220*387f9dfdSAndroid Build Coastguard Worker # We only take the edges because we do not want to 221*387f9dfdSAndroid Build Coastguard Worker # copy edge and node attributes here. 222*387f9dfdSAndroid Build Coastguard Worker subG = G.subgraph(G.nodes()) 223*387f9dfdSAndroid Build Coastguard Worker sccs = list(strongly_connected_components(subG)) 224*387f9dfdSAndroid Build Coastguard Worker while sccs: 225*387f9dfdSAndroid Build Coastguard Worker scc = sccs.pop() 226*387f9dfdSAndroid Build Coastguard Worker # order of scc determines ordering of nodes 227*387f9dfdSAndroid Build Coastguard Worker startnode = scc.pop() 228*387f9dfdSAndroid Build Coastguard Worker # Processing node runs 'circuit' routine from recursive version 229*387f9dfdSAndroid Build Coastguard Worker path = [startnode] 230*387f9dfdSAndroid Build Coastguard Worker blocked = set() # vertex: blocked from search? 231*387f9dfdSAndroid Build Coastguard Worker closed = set() # nodes involved in a cycle 232*387f9dfdSAndroid Build Coastguard Worker blocked.add(startnode) 233*387f9dfdSAndroid Build Coastguard Worker B = defaultdict(set) # graph portions that yield no elementary circuit 234*387f9dfdSAndroid Build Coastguard Worker stack = [(startnode, list(subG.neighbors(startnode)))] 235*387f9dfdSAndroid Build Coastguard Worker while stack: 236*387f9dfdSAndroid Build Coastguard Worker thisnode, nbrs = stack[-1] 237*387f9dfdSAndroid Build Coastguard Worker if nbrs: 238*387f9dfdSAndroid Build Coastguard Worker nextnode = nbrs.pop() 239*387f9dfdSAndroid Build Coastguard Worker if nextnode == startnode: 240*387f9dfdSAndroid Build Coastguard Worker yield path[:] 241*387f9dfdSAndroid Build Coastguard Worker closed.update(path) 242*387f9dfdSAndroid Build Coastguard Worker elif nextnode not in blocked: 243*387f9dfdSAndroid Build Coastguard Worker path.append(nextnode) 244*387f9dfdSAndroid Build Coastguard Worker stack.append((nextnode, list(subG.neighbors(nextnode)))) 245*387f9dfdSAndroid Build Coastguard Worker closed.discard(nextnode) 246*387f9dfdSAndroid Build Coastguard Worker blocked.add(nextnode) 247*387f9dfdSAndroid Build Coastguard Worker continue 248*387f9dfdSAndroid Build Coastguard Worker # done with nextnode... look for more neighbors 249*387f9dfdSAndroid Build Coastguard Worker if not nbrs: # no more nbrs 250*387f9dfdSAndroid Build Coastguard Worker if thisnode in closed: 251*387f9dfdSAndroid Build Coastguard Worker _unblock(thisnode, blocked, B) 252*387f9dfdSAndroid Build Coastguard Worker else: 253*387f9dfdSAndroid Build Coastguard Worker for nbr in subG.neighbors(thisnode): 254*387f9dfdSAndroid Build Coastguard Worker if thisnode not in B[nbr]: 255*387f9dfdSAndroid Build Coastguard Worker B[nbr].add(thisnode) 256*387f9dfdSAndroid Build Coastguard Worker stack.pop() 257*387f9dfdSAndroid Build Coastguard Worker path.pop() 258*387f9dfdSAndroid Build Coastguard Worker # done processing this node 259*387f9dfdSAndroid Build Coastguard Worker subG.remove_node(startnode) 260*387f9dfdSAndroid Build Coastguard Worker H = subG.subgraph(scc) # make smaller to avoid work in SCC routine 261*387f9dfdSAndroid Build Coastguard Worker sccs.extend(list(strongly_connected_components(H))) 262*387f9dfdSAndroid Build Coastguard Worker 263*387f9dfdSAndroid Build Coastguard Worker 264*387f9dfdSAndroid Build Coastguard Workerdef find_cycle(graph): 265*387f9dfdSAndroid Build Coastguard Worker ''' 266*387f9dfdSAndroid Build Coastguard Worker Looks for a cycle in the graph. If found, returns the first cycle. 267*387f9dfdSAndroid Build Coastguard Worker If nodes a1, a2, ..., an are in a cycle, then this returns: 268*387f9dfdSAndroid Build Coastguard Worker [(a1,a2), (a2,a3), ... (an-1,an), (an, a1)] 269*387f9dfdSAndroid Build Coastguard Worker Otherwise returns an empty list. 270*387f9dfdSAndroid Build Coastguard Worker ''' 271*387f9dfdSAndroid Build Coastguard Worker cycles = list(simple_cycles(graph)) 272*387f9dfdSAndroid Build Coastguard Worker if cycles: 273*387f9dfdSAndroid Build Coastguard Worker nodes = cycles[0] 274*387f9dfdSAndroid Build Coastguard Worker nodes.append(nodes[0]) 275*387f9dfdSAndroid Build Coastguard Worker edges = [] 276*387f9dfdSAndroid Build Coastguard Worker prev = nodes[0] 277*387f9dfdSAndroid Build Coastguard Worker for node in nodes[1:]: 278*387f9dfdSAndroid Build Coastguard Worker edges.append((prev, node)) 279*387f9dfdSAndroid Build Coastguard Worker prev = node 280*387f9dfdSAndroid Build Coastguard Worker return edges 281*387f9dfdSAndroid Build Coastguard Worker else: 282*387f9dfdSAndroid Build Coastguard Worker return [] 283*387f9dfdSAndroid Build Coastguard Worker 284*387f9dfdSAndroid Build Coastguard Worker 285*387f9dfdSAndroid Build Coastguard Workerdef print_cycle(binary, graph, edges, thread_info, print_stack_trace_fn): 286*387f9dfdSAndroid Build Coastguard Worker ''' 287*387f9dfdSAndroid Build Coastguard Worker Prints the cycle in the mutex graph in the following format: 288*387f9dfdSAndroid Build Coastguard Worker 289*387f9dfdSAndroid Build Coastguard Worker Potential Deadlock Detected! 290*387f9dfdSAndroid Build Coastguard Worker 291*387f9dfdSAndroid Build Coastguard Worker Cycle in lock order graph: M0 => M1 => M2 => M0 292*387f9dfdSAndroid Build Coastguard Worker 293*387f9dfdSAndroid Build Coastguard Worker for (m, n) in cycle: 294*387f9dfdSAndroid Build Coastguard Worker Mutex n acquired here while holding Mutex m in thread T: 295*387f9dfdSAndroid Build Coastguard Worker [ stack trace ] 296*387f9dfdSAndroid Build Coastguard Worker 297*387f9dfdSAndroid Build Coastguard Worker Mutex m previously acquired by thread T here: 298*387f9dfdSAndroid Build Coastguard Worker [ stack trace ] 299*387f9dfdSAndroid Build Coastguard Worker 300*387f9dfdSAndroid Build Coastguard Worker for T in all threads: 301*387f9dfdSAndroid Build Coastguard Worker Thread T was created here: 302*387f9dfdSAndroid Build Coastguard Worker [ stack trace ] 303*387f9dfdSAndroid Build Coastguard Worker ''' 304*387f9dfdSAndroid Build Coastguard Worker 305*387f9dfdSAndroid Build Coastguard Worker # List of mutexes in the cycle, first and last repeated 306*387f9dfdSAndroid Build Coastguard Worker nodes_in_order = [] 307*387f9dfdSAndroid Build Coastguard Worker # Map mutex address -> readable alias 308*387f9dfdSAndroid Build Coastguard Worker node_addr_to_name = {} 309*387f9dfdSAndroid Build Coastguard Worker for counter, (m, n) in enumerate(edges): 310*387f9dfdSAndroid Build Coastguard Worker nodes_in_order.append(m) 311*387f9dfdSAndroid Build Coastguard Worker # For global or static variables, try to symbolize the mutex address. 312*387f9dfdSAndroid Build Coastguard Worker symbol = symbolize_with_objdump(binary, m) 313*387f9dfdSAndroid Build Coastguard Worker if symbol: 314*387f9dfdSAndroid Build Coastguard Worker symbol += ' ' 315*387f9dfdSAndroid Build Coastguard Worker node_addr_to_name[m] = 'Mutex M%d (%s0x%016x)' % (counter, symbol, m) 316*387f9dfdSAndroid Build Coastguard Worker nodes_in_order.append(nodes_in_order[0]) 317*387f9dfdSAndroid Build Coastguard Worker 318*387f9dfdSAndroid Build Coastguard Worker print('----------------\nPotential Deadlock Detected!\n') 319*387f9dfdSAndroid Build Coastguard Worker print( 320*387f9dfdSAndroid Build Coastguard Worker 'Cycle in lock order graph: %s\n' % 321*387f9dfdSAndroid Build Coastguard Worker (' => '.join([node_addr_to_name[n] for n in nodes_in_order])) 322*387f9dfdSAndroid Build Coastguard Worker ) 323*387f9dfdSAndroid Build Coastguard Worker 324*387f9dfdSAndroid Build Coastguard Worker # Set of threads involved in the lock inversion 325*387f9dfdSAndroid Build Coastguard Worker thread_pids = set() 326*387f9dfdSAndroid Build Coastguard Worker 327*387f9dfdSAndroid Build Coastguard Worker # For each edge in the cycle, print where the two mutexes were held 328*387f9dfdSAndroid Build Coastguard Worker for (m, n) in edges: 329*387f9dfdSAndroid Build Coastguard Worker thread_pid = graph.attributes(m, n)['thread_pid'] 330*387f9dfdSAndroid Build Coastguard Worker thread_comm = graph.attributes(m, n)['thread_comm'] 331*387f9dfdSAndroid Build Coastguard Worker first_mutex_stack_id = graph.attributes(m, n)['first_mutex_stack_id'] 332*387f9dfdSAndroid Build Coastguard Worker second_mutex_stack_id = graph.attributes(m, n)['second_mutex_stack_id'] 333*387f9dfdSAndroid Build Coastguard Worker thread_pids.add(thread_pid) 334*387f9dfdSAndroid Build Coastguard Worker print( 335*387f9dfdSAndroid Build Coastguard Worker '%s acquired here while holding %s in Thread %d (%s):' % ( 336*387f9dfdSAndroid Build Coastguard Worker node_addr_to_name[n], node_addr_to_name[m], thread_pid, 337*387f9dfdSAndroid Build Coastguard Worker thread_comm 338*387f9dfdSAndroid Build Coastguard Worker ) 339*387f9dfdSAndroid Build Coastguard Worker ) 340*387f9dfdSAndroid Build Coastguard Worker print_stack_trace_fn(second_mutex_stack_id) 341*387f9dfdSAndroid Build Coastguard Worker print('') 342*387f9dfdSAndroid Build Coastguard Worker print( 343*387f9dfdSAndroid Build Coastguard Worker '%s previously acquired by the same Thread %d (%s) here:' % 344*387f9dfdSAndroid Build Coastguard Worker (node_addr_to_name[m], thread_pid, thread_comm) 345*387f9dfdSAndroid Build Coastguard Worker ) 346*387f9dfdSAndroid Build Coastguard Worker print_stack_trace_fn(first_mutex_stack_id) 347*387f9dfdSAndroid Build Coastguard Worker print('') 348*387f9dfdSAndroid Build Coastguard Worker 349*387f9dfdSAndroid Build Coastguard Worker # Print where the threads were created, if available 350*387f9dfdSAndroid Build Coastguard Worker for thread_pid in thread_pids: 351*387f9dfdSAndroid Build Coastguard Worker parent_pid, stack_id, parent_comm = thread_info.get( 352*387f9dfdSAndroid Build Coastguard Worker thread_pid, (None, None, None) 353*387f9dfdSAndroid Build Coastguard Worker ) 354*387f9dfdSAndroid Build Coastguard Worker if parent_pid: 355*387f9dfdSAndroid Build Coastguard Worker print( 356*387f9dfdSAndroid Build Coastguard Worker 'Thread %d created by Thread %d (%s) here: ' % 357*387f9dfdSAndroid Build Coastguard Worker (thread_pid, parent_pid, parent_comm) 358*387f9dfdSAndroid Build Coastguard Worker ) 359*387f9dfdSAndroid Build Coastguard Worker print_stack_trace_fn(stack_id) 360*387f9dfdSAndroid Build Coastguard Worker else: 361*387f9dfdSAndroid Build Coastguard Worker print( 362*387f9dfdSAndroid Build Coastguard Worker 'Could not find stack trace where Thread %d was created' % 363*387f9dfdSAndroid Build Coastguard Worker thread_pid 364*387f9dfdSAndroid Build Coastguard Worker ) 365*387f9dfdSAndroid Build Coastguard Worker print('') 366*387f9dfdSAndroid Build Coastguard Worker 367*387f9dfdSAndroid Build Coastguard Worker 368*387f9dfdSAndroid Build Coastguard Workerdef symbolize_with_objdump(binary, addr): 369*387f9dfdSAndroid Build Coastguard Worker ''' 370*387f9dfdSAndroid Build Coastguard Worker Searches the binary for the address using objdump. Returns the symbol if 371*387f9dfdSAndroid Build Coastguard Worker it is found, otherwise returns empty string. 372*387f9dfdSAndroid Build Coastguard Worker ''' 373*387f9dfdSAndroid Build Coastguard Worker try: 374*387f9dfdSAndroid Build Coastguard Worker command = ( 375*387f9dfdSAndroid Build Coastguard Worker 'objdump -tT %s | grep %x | awk {\'print $NF\'} | c++filt' % 376*387f9dfdSAndroid Build Coastguard Worker (binary, addr) 377*387f9dfdSAndroid Build Coastguard Worker ) 378*387f9dfdSAndroid Build Coastguard Worker output = subprocess.check_output(command, shell=True) 379*387f9dfdSAndroid Build Coastguard Worker return output.decode('utf-8').strip() 380*387f9dfdSAndroid Build Coastguard Worker except subprocess.CalledProcessError: 381*387f9dfdSAndroid Build Coastguard Worker return '' 382*387f9dfdSAndroid Build Coastguard Worker 383*387f9dfdSAndroid Build Coastguard Worker 384*387f9dfdSAndroid Build Coastguard Workerdef strlist(s): 385*387f9dfdSAndroid Build Coastguard Worker '''Given a comma-separated string, returns a list of substrings''' 386*387f9dfdSAndroid Build Coastguard Worker return s.strip().split(',') 387*387f9dfdSAndroid Build Coastguard Worker 388*387f9dfdSAndroid Build Coastguard Worker 389*387f9dfdSAndroid Build Coastguard Workerdef main(): 390*387f9dfdSAndroid Build Coastguard Worker examples = '''Examples: 391*387f9dfdSAndroid Build Coastguard Worker deadlock 181 # Analyze PID 181 392*387f9dfdSAndroid Build Coastguard Worker 393*387f9dfdSAndroid Build Coastguard Worker deadlock 181 --binary /lib/x86_64-linux-gnu/libpthread.so.0 394*387f9dfdSAndroid Build Coastguard Worker # Analyze PID 181 and locks from this binary. 395*387f9dfdSAndroid Build Coastguard Worker # If tracing a process that is running from 396*387f9dfdSAndroid Build Coastguard Worker # a dynamically-linked binary, this argument 397*387f9dfdSAndroid Build Coastguard Worker # is required and should be the path to the 398*387f9dfdSAndroid Build Coastguard Worker # pthread library. 399*387f9dfdSAndroid Build Coastguard Worker 400*387f9dfdSAndroid Build Coastguard Worker deadlock 181 --verbose 401*387f9dfdSAndroid Build Coastguard Worker # Analyze PID 181 and print statistics about 402*387f9dfdSAndroid Build Coastguard Worker # the mutex wait graph. 403*387f9dfdSAndroid Build Coastguard Worker 404*387f9dfdSAndroid Build Coastguard Worker deadlock 181 --lock-symbols my_mutex_lock1,my_mutex_lock2 \\ 405*387f9dfdSAndroid Build Coastguard Worker --unlock-symbols my_mutex_unlock1,my_mutex_unlock2 406*387f9dfdSAndroid Build Coastguard Worker # Analyze PID 181 and trace custom mutex 407*387f9dfdSAndroid Build Coastguard Worker # symbols instead of pthread mutexes. 408*387f9dfdSAndroid Build Coastguard Worker 409*387f9dfdSAndroid Build Coastguard Worker deadlock 181 --dump-graph graph.json 410*387f9dfdSAndroid Build Coastguard Worker # Analyze PID 181 and dump the mutex wait 411*387f9dfdSAndroid Build Coastguard Worker # graph to graph.json. 412*387f9dfdSAndroid Build Coastguard Worker ''' 413*387f9dfdSAndroid Build Coastguard Worker parser = argparse.ArgumentParser( 414*387f9dfdSAndroid Build Coastguard Worker description=( 415*387f9dfdSAndroid Build Coastguard Worker 'Detect potential deadlocks (lock inversions) in a running binary.' 416*387f9dfdSAndroid Build Coastguard Worker '\nMust be run as root.' 417*387f9dfdSAndroid Build Coastguard Worker ), 418*387f9dfdSAndroid Build Coastguard Worker formatter_class=argparse.RawDescriptionHelpFormatter, 419*387f9dfdSAndroid Build Coastguard Worker epilog=examples, 420*387f9dfdSAndroid Build Coastguard Worker ) 421*387f9dfdSAndroid Build Coastguard Worker parser.add_argument('pid', type=int, help='Pid to trace') 422*387f9dfdSAndroid Build Coastguard Worker # Binaries with `:` in the path will fail to attach uprobes on kernels 423*387f9dfdSAndroid Build Coastguard Worker # running without this patch: https://lkml.org/lkml/2017/1/13/585. 424*387f9dfdSAndroid Build Coastguard Worker # Symlinks to the binary without `:` in the path can get around this issue. 425*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 426*387f9dfdSAndroid Build Coastguard Worker '--binary', 427*387f9dfdSAndroid Build Coastguard Worker type=str, 428*387f9dfdSAndroid Build Coastguard Worker default='', 429*387f9dfdSAndroid Build Coastguard Worker help='If set, trace the mutexes from the binary at this path. ' 430*387f9dfdSAndroid Build Coastguard Worker 'For statically-linked binaries, this argument is not required. ' 431*387f9dfdSAndroid Build Coastguard Worker 'For dynamically-linked binaries, this argument is required and ' 432*387f9dfdSAndroid Build Coastguard Worker 'should be the path of the pthread library the binary is using. ' 433*387f9dfdSAndroid Build Coastguard Worker 'Example: /lib/x86_64-linux-gnu/libpthread.so.0', 434*387f9dfdSAndroid Build Coastguard Worker ) 435*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 436*387f9dfdSAndroid Build Coastguard Worker '--dump-graph', 437*387f9dfdSAndroid Build Coastguard Worker type=str, 438*387f9dfdSAndroid Build Coastguard Worker default='', 439*387f9dfdSAndroid Build Coastguard Worker help='If set, this will dump the mutex graph to the specified file.', 440*387f9dfdSAndroid Build Coastguard Worker ) 441*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 442*387f9dfdSAndroid Build Coastguard Worker '--verbose', 443*387f9dfdSAndroid Build Coastguard Worker action='store_true', 444*387f9dfdSAndroid Build Coastguard Worker help='Print statistics about the mutex wait graph.', 445*387f9dfdSAndroid Build Coastguard Worker ) 446*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 447*387f9dfdSAndroid Build Coastguard Worker '--lock-symbols', 448*387f9dfdSAndroid Build Coastguard Worker type=strlist, 449*387f9dfdSAndroid Build Coastguard Worker default=['pthread_mutex_lock'], 450*387f9dfdSAndroid Build Coastguard Worker help='Comma-separated list of lock symbols to trace. Default is ' 451*387f9dfdSAndroid Build Coastguard Worker 'pthread_mutex_lock. These symbols cannot be inlined in the binary.', 452*387f9dfdSAndroid Build Coastguard Worker ) 453*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 454*387f9dfdSAndroid Build Coastguard Worker '--unlock-symbols', 455*387f9dfdSAndroid Build Coastguard Worker type=strlist, 456*387f9dfdSAndroid Build Coastguard Worker default=['pthread_mutex_unlock'], 457*387f9dfdSAndroid Build Coastguard Worker help='Comma-separated list of unlock symbols to trace. Default is ' 458*387f9dfdSAndroid Build Coastguard Worker 'pthread_mutex_unlock. These symbols cannot be inlined in the binary.', 459*387f9dfdSAndroid Build Coastguard Worker ) 460*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 461*387f9dfdSAndroid Build Coastguard Worker '-t', '--threads', type=int, default=65536, 462*387f9dfdSAndroid Build Coastguard Worker help='Specifies the maximum number of threads to trace. default 65536. ' 463*387f9dfdSAndroid Build Coastguard Worker 'Note. 40 bytes per thread.' 464*387f9dfdSAndroid Build Coastguard Worker ) 465*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 466*387f9dfdSAndroid Build Coastguard Worker '-e', '--edges', type=int, default=65536, 467*387f9dfdSAndroid Build Coastguard Worker help='Specifies the maximum number of edge cases that can be recorded. ' 468*387f9dfdSAndroid Build Coastguard Worker 'default 65536. Note. 88 bytes per edge case.' 469*387f9dfdSAndroid Build Coastguard Worker ) 470*387f9dfdSAndroid Build Coastguard Worker parser.add_argument( 471*387f9dfdSAndroid Build Coastguard Worker '-s', '--stacktraces', type=int, default=65536, 472*387f9dfdSAndroid Build Coastguard Worker help='Specifies the maximum number of stack traces that can be recorded. ' 473*387f9dfdSAndroid Build Coastguard Worker 'This number is rounded up to the next power of two.' 474*387f9dfdSAndroid Build Coastguard Worker 'default 65536. Note. 1 kbytes vmalloced per stack trace.' 475*387f9dfdSAndroid Build Coastguard Worker ) 476*387f9dfdSAndroid Build Coastguard Worker 477*387f9dfdSAndroid Build Coastguard Worker args = parser.parse_args() 478*387f9dfdSAndroid Build Coastguard Worker if not args.binary: 479*387f9dfdSAndroid Build Coastguard Worker try: 480*387f9dfdSAndroid Build Coastguard Worker args.binary = os.readlink('/proc/%d/exe' % args.pid) 481*387f9dfdSAndroid Build Coastguard Worker except OSError as e: 482*387f9dfdSAndroid Build Coastguard Worker print('%s. Is the process (pid=%d) running?' % (str(e), args.pid)) 483*387f9dfdSAndroid Build Coastguard Worker sys.exit(1) 484*387f9dfdSAndroid Build Coastguard Worker 485*387f9dfdSAndroid Build Coastguard Worker with open('deadlock.c') as f: 486*387f9dfdSAndroid Build Coastguard Worker text = f.read() 487*387f9dfdSAndroid Build Coastguard Worker text = text.replace('MAX_THREADS', str(args.threads)); 488*387f9dfdSAndroid Build Coastguard Worker text = text.replace('MAX_EDGES', str(args.edges)); 489*387f9dfdSAndroid Build Coastguard Worker text = text.replace('MAX_TRACES', str(args.stacktraces)); 490*387f9dfdSAndroid Build Coastguard Worker bpf = BPF(text=text) 491*387f9dfdSAndroid Build Coastguard Worker 492*387f9dfdSAndroid Build Coastguard Worker # Trace where threads are created 493*387f9dfdSAndroid Build Coastguard Worker bpf.attach_kretprobe(event=bpf.get_syscall_fnname('clone'), fn_name='trace_clone') 494*387f9dfdSAndroid Build Coastguard Worker 495*387f9dfdSAndroid Build Coastguard Worker # We must trace unlock first, otherwise in the time we attached the probe 496*387f9dfdSAndroid Build Coastguard Worker # on lock() and have not yet attached the probe on unlock(), a thread can 497*387f9dfdSAndroid Build Coastguard Worker # acquire mutexes and release them, but the release events will not be 498*387f9dfdSAndroid Build Coastguard Worker # traced, resulting in noisy reports. 499*387f9dfdSAndroid Build Coastguard Worker for symbol in args.unlock_symbols: 500*387f9dfdSAndroid Build Coastguard Worker try: 501*387f9dfdSAndroid Build Coastguard Worker bpf.attach_uprobe( 502*387f9dfdSAndroid Build Coastguard Worker name=args.binary, 503*387f9dfdSAndroid Build Coastguard Worker sym=symbol, 504*387f9dfdSAndroid Build Coastguard Worker fn_name='trace_mutex_release', 505*387f9dfdSAndroid Build Coastguard Worker pid=args.pid, 506*387f9dfdSAndroid Build Coastguard Worker ) 507*387f9dfdSAndroid Build Coastguard Worker except Exception as e: 508*387f9dfdSAndroid Build Coastguard Worker print('%s. Failed to attach to symbol: %s\nIs --binary argument missing?' % (str(e), symbol)) 509*387f9dfdSAndroid Build Coastguard Worker sys.exit(1) 510*387f9dfdSAndroid Build Coastguard Worker for symbol in args.lock_symbols: 511*387f9dfdSAndroid Build Coastguard Worker try: 512*387f9dfdSAndroid Build Coastguard Worker bpf.attach_uprobe( 513*387f9dfdSAndroid Build Coastguard Worker name=args.binary, 514*387f9dfdSAndroid Build Coastguard Worker sym=symbol, 515*387f9dfdSAndroid Build Coastguard Worker fn_name='trace_mutex_acquire', 516*387f9dfdSAndroid Build Coastguard Worker pid=args.pid, 517*387f9dfdSAndroid Build Coastguard Worker ) 518*387f9dfdSAndroid Build Coastguard Worker except Exception as e: 519*387f9dfdSAndroid Build Coastguard Worker print('%s. Failed to attach to symbol: %s' % (str(e), symbol)) 520*387f9dfdSAndroid Build Coastguard Worker sys.exit(1) 521*387f9dfdSAndroid Build Coastguard Worker 522*387f9dfdSAndroid Build Coastguard Worker def print_stack_trace(stack_id): 523*387f9dfdSAndroid Build Coastguard Worker '''Closure that prints the symbolized stack trace.''' 524*387f9dfdSAndroid Build Coastguard Worker for addr in bpf.get_table('stack_traces').walk(stack_id): 525*387f9dfdSAndroid Build Coastguard Worker line = bpf.sym(addr, args.pid) 526*387f9dfdSAndroid Build Coastguard Worker # Try to symbolize with objdump if we cannot with bpf. 527*387f9dfdSAndroid Build Coastguard Worker if line == '[unknown]': 528*387f9dfdSAndroid Build Coastguard Worker symbol = symbolize_with_objdump(args.binary, addr) 529*387f9dfdSAndroid Build Coastguard Worker if symbol: 530*387f9dfdSAndroid Build Coastguard Worker line = symbol 531*387f9dfdSAndroid Build Coastguard Worker print('@ %016x %s' % (addr, line)) 532*387f9dfdSAndroid Build Coastguard Worker 533*387f9dfdSAndroid Build Coastguard Worker print('Tracing... Hit Ctrl-C to end.') 534*387f9dfdSAndroid Build Coastguard Worker while True: 535*387f9dfdSAndroid Build Coastguard Worker try: 536*387f9dfdSAndroid Build Coastguard Worker # Map of child thread pid -> parent info 537*387f9dfdSAndroid Build Coastguard Worker thread_info = { 538*387f9dfdSAndroid Build Coastguard Worker child.value: (parent.parent_pid, parent.stack_id, parent.comm) 539*387f9dfdSAndroid Build Coastguard Worker for child, parent in bpf.get_table('thread_to_parent').items() 540*387f9dfdSAndroid Build Coastguard Worker } 541*387f9dfdSAndroid Build Coastguard Worker 542*387f9dfdSAndroid Build Coastguard Worker # Mutex wait directed graph. Nodes are mutexes. Edge (A,B) exists 543*387f9dfdSAndroid Build Coastguard Worker # if there exists some thread T where lock(A) was called and 544*387f9dfdSAndroid Build Coastguard Worker # lock(B) was called before unlock(A) was called. 545*387f9dfdSAndroid Build Coastguard Worker graph = DiGraph() 546*387f9dfdSAndroid Build Coastguard Worker for key, leaf in bpf.get_table('edges').items(): 547*387f9dfdSAndroid Build Coastguard Worker graph.add_edge( 548*387f9dfdSAndroid Build Coastguard Worker key.mutex1, 549*387f9dfdSAndroid Build Coastguard Worker key.mutex2, 550*387f9dfdSAndroid Build Coastguard Worker thread_pid=leaf.thread_pid, 551*387f9dfdSAndroid Build Coastguard Worker thread_comm=leaf.comm.decode('utf-8'), 552*387f9dfdSAndroid Build Coastguard Worker first_mutex_stack_id=leaf.mutex1_stack_id, 553*387f9dfdSAndroid Build Coastguard Worker second_mutex_stack_id=leaf.mutex2_stack_id, 554*387f9dfdSAndroid Build Coastguard Worker ) 555*387f9dfdSAndroid Build Coastguard Worker if args.verbose: 556*387f9dfdSAndroid Build Coastguard Worker print( 557*387f9dfdSAndroid Build Coastguard Worker 'Mutexes: %d, Edges: %d' % 558*387f9dfdSAndroid Build Coastguard Worker (len(graph.nodes()), len(graph.edges())) 559*387f9dfdSAndroid Build Coastguard Worker ) 560*387f9dfdSAndroid Build Coastguard Worker if args.dump_graph: 561*387f9dfdSAndroid Build Coastguard Worker with open(args.dump_graph, 'w') as f: 562*387f9dfdSAndroid Build Coastguard Worker data = graph.node_link_data() 563*387f9dfdSAndroid Build Coastguard Worker f.write(json.dumps(data, indent=2)) 564*387f9dfdSAndroid Build Coastguard Worker 565*387f9dfdSAndroid Build Coastguard Worker cycle = find_cycle(graph) 566*387f9dfdSAndroid Build Coastguard Worker if cycle: 567*387f9dfdSAndroid Build Coastguard Worker print_cycle( 568*387f9dfdSAndroid Build Coastguard Worker args.binary, graph, cycle, thread_info, print_stack_trace 569*387f9dfdSAndroid Build Coastguard Worker ) 570*387f9dfdSAndroid Build Coastguard Worker sys.exit(1) 571*387f9dfdSAndroid Build Coastguard Worker 572*387f9dfdSAndroid Build Coastguard Worker time.sleep(1) 573*387f9dfdSAndroid Build Coastguard Worker except KeyboardInterrupt: 574*387f9dfdSAndroid Build Coastguard Worker break 575*387f9dfdSAndroid Build Coastguard Worker 576*387f9dfdSAndroid Build Coastguard Worker 577*387f9dfdSAndroid Build Coastguard Workerif __name__ == '__main__': 578*387f9dfdSAndroid Build Coastguard Worker main() 579