README.md
1# Trace Redaction
2
3## Timeline
4
5### Intro
6
7The timeline is at the center of the redaction system. It provides an
8efficient method to find which package a thread/process belongs to.
9
10The timeline allows queries to be connected to time. Without this, there's a
11significant privacy conern because a pid can be recycled. Just because the pid
12is excluded from redaction before time T, doesn't mean it should be redacted
13after time T.
14
15### General Structure
16
17The timeline uses an event-based pattern using two events:
18
19- __Open Event:__ Marks the begining of a pid's new lifespan.
20- __Close Event:__ Marks the end of a pids's lifespan.
21
22An event-based structure (compared to a span-based structure) is used as it is
23better suited to handle errors/issues in the underlying data. For example, if a
24pid doesn't explictly ends before being reused (e.g. two back-to-back open
25events), the event-based structure "just works".
26
27Open events contain the thread's full state. The close event only contains the
28information needed to reference the thread's previous event.
29
30```c++
31struct Open {
32 uint64_t ts;
33 int32_t pid;
34 int32_t ppid;
35 uint64_t uid;
36};
37
38struct Close {
39 uint64_t ts;
40 int32_t pid;
41};
42```
43
44The vast majory of threads will have one event, an open event provided by the
45`ProcessTree`. For some threads, they will have multiple open (`ProcessTree`,
46`NewTask`) and close events (`ProcFree`) in alternating order.
47
48### Query
49
50```c++
51struct Slice {
52 int32_t pid;
53 uint64_t uid;
54};
55
56class Timeline {
57 Slice Query(uint64_t ts, int32_t pid) const;
58};
59
60```
61
62Events, regardless of type, are stored in contiguous memory and are ordered
63first by pid and second by time. This is done to allow events to be found
64via a binary search.
65
66The vast majory of threads will have one event, the open event. Some threads
67may have close and re-open events.
68
69To handle a query,
70
711. Use a binary search to find the lower bound for `pid` (the first instance of
72 `pid`)
731. Scan forward to find the last event before `ts` (for `pid`)
74
75If an event was found:
76
77```c++
78if (e.type == kOpen && uid != 0)
79 return Slice(pid, e.uid);
80
81// The pid is active, check the parent for a uid.
82if (e.type == kOpen && uid == 0)
83 return Query(ts, e.ppid);
84
85return Slice(pid, kNoPackage);
86```
87
88If `pid` does not have an immediate package (`uid`), the parent must be
89searched. The parent-child tree is short, so the recursive search will be
90relatively short. To minimize this even more, a union-find operation is applied
91because any queries can be made.
92
93__Simple runtime overview:__
94
95Initialization:
96
97- $sort + union\ find$
98
99- $nlogn + mlogn$
100 - where $n=events$
101 - and $m=approx\ average\ depth$
102
103Query:
104
105- $P(p) = m_p * (logn + e_p)$
106 - where $m_p=\ distance\ from\ pid\ to\ uid$
107 - and $n=events$
108 - and $e_p=number\ of\ events\ for\ process\ p$
109
110- Because of the union-find in initialization, $m_p \to 0$
111
112To further reduce the runtime, the search domain is reduces by remove all open
113events for $pids$ that don't connect to a target $uid$. By removing open events,
114and close events, there are two advantages:
115
1161. Removing open events are safe and simple. By removing open events, those pids
117can never be marked by active. Keeping the close events effectively reminds the
118system that the pid is not active.
119
1201. The number of open events exceeds the number of close events. Removing open
121events will have a greater effect on the number of events.
122
123__Example:__
124
125|Name|Value|Notes|
126|-|-|-|
127|tids|3666|Total number of threads.|
128|freed threads|5|Number of threads that were freed.|
129|reused threads|0|No threads were used more than one time.|
130|process tids|64|Total number of threads connected to the target process.|
131
132After initialization, there would only be 64 open events and 5 close events.
133This means that every uid lookup would be $logn\ |\ n=64 = 6$. Finding the uid
134given a pid is one of the most common operations during redaction because uid
135determines if something needs to be redacted.
136
137## Scrub Task Rename Spec
138
139### Background
140
141`task_rename` are generated when a thread renames itself. This often happens
142after (but not limited to) a `task_newtask` event. The `task_rename` event
143exposes the threads old name and the threads new name.
144
145### Protobuf Message(s)
146
147__New task event:__
148
149```javascript
150event {
151 timestamp: 6702094133317685
152 pid: 6167
153 task_newtask {
154 pid: 7972
155 comm: "adbd"
156 clone_flags: 4001536
157 oom_score_adj: -1000
158 }
159}
160```
161
162__Task rename event:__
163
164```javascript
165event {
166 timestamp: 6702094133665498
167 pid: 7972
168 task_rename {
169 pid: 7972
170 oldcomm: "adbd"
171 newcomm: "shell svc 7971"
172 oom_score_adj: -1000
173 }
174}
175```
176
177### Method
178
179A `task_rename` should be redacted when `event.pid` does not belong to that
180target package (`context.package_uid`). Since the pid's naming information will
181be removed everywhere, and naming information is effectively metadata, the whole
182event can be dropped without effecting the integrity of the trace.
183