Name Date Size #Lines LOC

..--

BUILD.gnH A D25-Apr-20255 KiB158152

README.mdH A D25-Apr-20254.9 KiB183134

boardphase_packet_filter_integrationtest.ccH A D25-Apr-20253.9 KiB13681

broadphase_packet_filter.ccH A D25-Apr-20253.4 KiB10565

broadphase_packet_filter.hH A D25-Apr-20252.1 KiB5620

broadphase_packet_filter_unittest.ccH A D25-Apr-20256.7 KiB208123

collect_frame_cookies.ccH A D25-Apr-20257 KiB236155

collect_frame_cookies.hH A D25-Apr-20252 KiB6129

collect_frame_cookies_integrationtest.ccH A D25-Apr-20253.4 KiB11672

collect_frame_cookies_unittest.ccH A D25-Apr-202513.6 KiB418299

collect_system_info.ccH A D25-Apr-20252.7 KiB9452

collect_system_info.hH A D25-Apr-20251.7 KiB5119

collect_system_info_unittest.ccH A D25-Apr-20253 KiB10059

collect_timeline_events.ccH A D25-Apr-20255.5 KiB154101

collect_timeline_events.hH A D25-Apr-20251.4 KiB4114

collect_timeline_events_unittest.ccH A D25-Apr-20255.2 KiB182115

filter_sched_waking_events_integrationtest.ccH A D25-Apr-20256.1 KiB18091

filter_sched_waking_events_unittest.ccH A D25-Apr-202510.4 KiB351190

filter_task_rename_integrationtest.ccH A D25-Apr-20253.7 KiB11371

filtering.ccH A D25-Apr-20251.2 KiB4017

filtering.hH A D25-Apr-20251.6 KiB5931

find_package_uid.ccH A D25-Apr-20252.8 KiB8849

find_package_uid.hH A D25-Apr-20251.5 KiB4314

find_package_uid_unittest.ccH A D25-Apr-20257.7 KiB279131

frame_cookie.hH A D25-Apr-20251.4 KiB439

main.ccH A D25-Apr-20251.8 KiB6133

merge_threads.ccH A D25-Apr-20251.5 KiB4920

merge_threads.hH A D25-Apr-20251.2 KiB3814

modify.ccH A D25-Apr-20251.9 KiB5931

modify.hH A D25-Apr-20252.3 KiB7445

populate_allow_lists.ccH A D25-Apr-20255.1 KiB10358

populate_allow_lists.hH A D25-Apr-20251.2 KiB3611

process_thread_timeline.ccH A D25-Apr-20255.1 KiB16581

process_thread_timeline.hH A D25-Apr-20254.3 KiB13164

process_thread_timeline_integrationtest.ccH A D25-Apr-20253.8 KiB10956

process_thread_timeline_unittest.ccH A D25-Apr-20258.5 KiB258172

proto_util.ccH A D25-Apr-20251.9 KiB6030

proto_util.hH A D25-Apr-20251.2 KiB3710

proto_util_unittest.ccH A D25-Apr-20259.7 KiB271204

prune_package_list.ccH A D25-Apr-20253 KiB9352

prune_package_list.hH A D25-Apr-20251.4 KiB4317

prune_package_list_integrationtest.ccH A D25-Apr-20255.7 KiB165103

prune_package_list_unittest.ccH A D25-Apr-20255.1 KiB18597

redact_ftrace_events.ccH A D25-Apr-20256.1 KiB185118

redact_ftrace_events.hH A D25-Apr-20252.8 KiB8041

redact_process_events.ccH A D25-Apr-202514.2 KiB453334

redact_process_events.hH A D25-Apr-20254 KiB12166

redact_process_events_unittest.ccH A D25-Apr-202525 KiB822519

redact_process_trees.ccH A D25-Apr-20255.7 KiB198136

redact_process_trees.hH A D25-Apr-20253 KiB9156

redact_process_trees_integrationtest.ccH A D25-Apr-20258.1 KiB246137

redact_sched_events.ccH A D25-Apr-202520.5 KiB627406

redact_sched_events.hH A D25-Apr-20254 KiB12278

redact_sched_events_integrationtest.ccH A D25-Apr-20256.4 KiB18893

redact_sched_events_unittest.ccH A D25-Apr-202525.7 KiB802505

scrub_process_stats.ccH A D25-Apr-20253.7 KiB11875

scrub_process_stats.hH A D25-Apr-20251.7 KiB5427

scrub_process_stats_integrationtest.ccH A D25-Apr-20255.2 KiB14687

trace_processor_integrationtest.ccH A D25-Apr-20259.9 KiB307184

trace_redaction_framework.ccH A D25-Apr-20251.1 KiB3612

trace_redaction_framework.hH A D25-Apr-202514.2 KiB40681

trace_redaction_integration_fixture.ccH A D25-Apr-20252.2 KiB7238

trace_redaction_integration_fixture.hH A D25-Apr-20251.9 KiB5923

trace_redactor.ccH A D25-Apr-20259.3 KiB275169

trace_redactor.hH A D25-Apr-20253.8 KiB12555

verify_integrity.ccH A D25-Apr-20255.6 KiB175110

verify_integrity.hH A D25-Apr-20251.6 KiB4717

verify_integrity_unittest.ccH A D25-Apr-20258.9 KiB348206

README.md

1# Trace Redaction
2
3## Timeline
4
5### Intro
6
7The timeline is at the center of the redaction system. It provides an
8efficient method to find which package a thread/process belongs to.
9
10The timeline allows queries to be connected to time. Without this, there's a
11significant privacy conern because a pid can be recycled. Just because the pid
12is excluded from redaction before time T, doesn't mean it should be redacted
13after time T.
14
15### General Structure
16
17The timeline uses an event-based pattern using two events:
18
19- __Open Event:__ Marks the begining of a pid's new lifespan.
20- __Close Event:__ Marks the end of a pids's lifespan.
21
22An event-based structure (compared to a span-based structure) is used as it is
23better suited to handle errors/issues in the underlying data. For example, if a
24pid doesn't explictly ends before being reused (e.g. two back-to-back open
25events), the event-based structure "just works".
26
27Open events contain the thread's full state. The close event only contains the
28information needed to reference the thread's previous event.
29
30```c++
31struct Open {
32    uint64_t ts;
33    int32_t  pid;
34    int32_t  ppid;
35    uint64_t uid;
36};
37
38struct Close {
39    uint64_t ts;
40    int32_t  pid;
41};
42```
43
44The vast majory of threads will have one event, an open event provided by the
45`ProcessTree`. For some threads, they will have multiple open (`ProcessTree`,
46`NewTask`) and close events (`ProcFree`) in alternating order.
47
48### Query
49
50```c++
51struct Slice {
52    int32_t  pid;
53    uint64_t uid;
54};
55
56class Timeline {
57  Slice Query(uint64_t ts, int32_t pid) const;
58};
59
60```
61
62Events, regardless of type, are stored in contiguous memory and are ordered
63first by pid and second by time. This is done to allow events to be found
64via a binary search.
65
66The vast majory of threads will have one event, the open event. Some threads
67may have close and re-open events.
68
69To handle a query,
70
711. Use a binary search to find the lower bound for `pid` (the first instance of
72 `pid`)
731. Scan forward to find the last event before `ts` (for `pid`)
74
75If an event was found:
76
77```c++
78if (e.type == kOpen && uid != 0)
79  return Slice(pid, e.uid);
80
81// The pid is active, check the parent for a uid.
82if (e.type == kOpen && uid == 0)
83  return Query(ts, e.ppid);
84
85return Slice(pid, kNoPackage);
86```
87
88If `pid` does not have an immediate package (`uid`), the parent must be
89searched. The parent-child tree is short, so the recursive search will be
90relatively short. To minimize this even more, a union-find operation is applied
91because any queries can be made.
92
93__Simple runtime overview:__
94
95Initialization:
96
97- $sort + union\ find$
98
99- $nlogn + mlogn$
100  - where $n=events$
101  - and $m=approx\ average\ depth$
102
103Query:
104
105- $P(p) = m_p * (logn + e_p)$
106  - where $m_p=\ distance\ from\ pid\ to\ uid$
107  - and $n=events$
108  - and $e_p=number\ of\ events\ for\ process\ p$
109
110- Because of the union-find in initialization, $m_p \to 0$
111
112To further reduce the runtime, the search domain is reduces by remove all open
113events for $pids$ that don't connect to a target $uid$. By removing open events,
114and close events, there are two advantages:
115
1161. Removing open events are safe and simple. By removing open events, those pids
117can never be marked by active. Keeping the close events effectively reminds the
118system that the pid is not active.
119
1201. The number of open events exceeds the number of close events. Removing open
121events will have a greater effect on the number of events.
122
123__Example:__
124
125|Name|Value|Notes|
126|-|-|-|
127|tids|3666|Total number of threads.|
128|freed threads|5|Number of threads that were freed.|
129|reused threads|0|No threads were used more than one time.|
130|process tids|64|Total number of threads connected to the target process.|
131
132After initialization, there would only be 64 open events and 5 close events.
133This means that every uid lookup would be $logn\ |\ n=64 = 6$. Finding the uid
134given a pid is one of the most common operations during redaction because uid
135determines if something needs to be redacted.
136
137## Scrub Task Rename Spec
138
139### Background
140
141`task_rename` are generated when a thread renames itself. This often happens
142after (but not limited to) a `task_newtask` event. The `task_rename` event
143exposes the threads old name and the threads new name.
144
145### Protobuf Message(s)
146
147__New task event:__
148
149```javascript
150event {
151  timestamp: 6702094133317685
152  pid: 6167
153  task_newtask {
154    pid: 7972
155    comm: "adbd"
156    clone_flags: 4001536
157    oom_score_adj: -1000
158  }
159}
160```
161
162__Task rename event:__
163
164```javascript
165event {
166  timestamp: 6702094133665498
167  pid: 7972
168  task_rename {
169    pid: 7972
170    oldcomm: "adbd"
171    newcomm: "shell svc 7971"
172    oom_score_adj: -1000
173  }
174}
175```
176
177### Method
178
179A `task_rename` should be redacted when `event.pid` does not belong to that
180target package (`context.package_uid`). Since the pid's naming information will
181be removed everywhere, and naming information is effectively metadata, the whole
182event can be dropped without effecting the integrity of the trace.
183