Command Line Tools
==================

This directory contains command line tools for RAPPOR analysis.

Analysis Tools
--------------

### decode-dist

Decode a distribution -- requires a "counts" file (summed bits from reports),
map file, and a params file.  See `test.sh decode-dist` in this dir for an
example.

### decode-assoc

Decode a joint distribution between 2 variables ("association analysis").  See
`test.sh decode-assoc-R` or `test.sh decode-assoc-cpp` in this dir for an
example.

Currently it only supports associating strings vs. booleans.

### Setup

Both of these tools are written in R, and require several R libraries to be
installed (see `../setup.sh r-packages`).

`decode-assoc` also shells out to a native binary written in C++ if
`--em-executable` is passed.  This requires a C++ compiler (see
`analysis/cpp/run.sh`).  You can run `test.sh decode-assoc-cpp` to test it.


Helper Tools
------------

These are simple Python implementations of tools needed for analysis.  At
Google, Chrome uses alternative C++/Go implementations of these tools.

### sum-bits

Given a CSV file with RAPPOR reports (IRRs), produce a "counts" CSV file on
stdout.  This is the `m x (k+1)` matrix that is used in the R analysis (where m
= #cohorts and k = report width in bits).
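
The shape of that matrix can be illustrated with a minimal Python sketch.  It
is not the code in this directory.  It assumes each report is a
`(cohort, irr)` pair where `irr` is a string of `0`/`1` characters, that the
extra column holds the number of reports per cohort, and that the leftmost
character is the highest-numbered bit.  The names `sum_bits` and
`write_counts` are hypothetical.

```python
import csv
import io


def sum_bits(reports, num_cohorts, num_bits):
    """Build an m x (k+1) counts matrix from (cohort, irr) pairs.

    Assumption: column 0 is the number of reports seen for the cohort and
    columns 1..k are per-bit sums.
    """
    counts = [[0] * (num_bits + 1) for _ in range(num_cohorts)]
    for cohort, irr in reports:
        cohort = int(cohort)
        if len(irr) != num_bits:
            raise ValueError("expected %d bits, got %r" % (num_bits, irr))
        counts[cohort][0] += 1
        # Assumption: the leftmost character is the highest-numbered bit.
        for i, ch in enumerate(irr):
            if ch == "1":
                counts[cohort][num_bits - i] += 1
    return counts


def write_counts(counts, out):
    """Write the matrix as CSV, one row per cohort."""
    csv.writer(out, lineterminator="\n").writerows(counts)


if __name__ == "__main__":
    example = [("0", "0011"), ("0", "0101"), ("1", "1000")]
    buf = io.StringIO()
    write_counts(sum_bits(example, num_cohorts=2, num_bits=4), buf)
    print(buf.getvalue(), end="")
```

With 2 cohorts and 4-bit reports, each output row has 5 cells: one report
count followed by 4 bit sums.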

### hash-candidates

Given a list of candidates on stdin, produce a CSV file of hashes (the "map
file").  Each row has `m x h` cells (where m = #cohorts and h = #hashes).
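
As a rough sketch of how such a row could be built, the Python below hashes
one candidate once per cohort and emits `h` cells for each.  Everything
specific here is an assumption made for illustration: the MD5-based hash, the
leading candidate label, and the 1-based `cohort * k + bit + 1` cell encoding
(with k = report width in bits).  The real tool and the clients that produce
reports must agree on the actual hash.

```python
import hashlib


def bits_for(candidate, cohort, num_hashes, num_bits):
    """Return h bit positions in [0, k) for one candidate in one cohort.

    Hypothetical hash: MD5 over "cohort:candidate", one digest byte per
    hash function.
    """
    if num_hashes > hashlib.md5().digest_size:
        raise ValueError("this sketch supports at most 16 hashes")
    data = ("%d:%s" % (cohort, candidate)).encode("utf-8")
    digest = hashlib.md5(data).digest()
    return [digest[i] % num_bits for i in range(num_hashes)]


def map_row(candidate, num_cohorts, num_hashes, num_bits):
    """Build one map-file row: a label followed by m x h cells."""
    row = [candidate]
    for cohort in range(num_cohorts):
        for bit in bits_for(candidate, cohort, num_hashes, num_bits):
            # Assumption: each cohort owns a block of k values, 1-based.
            row.append(cohort * num_bits + bit + 1)
    return row


if __name__ == "__main__":
    for word in ["apple", "banana"]:
        print(",".join(str(cell) for cell in map_row(word, 4, 2, 16)))
```

With 4 cohorts and 2 hashes, each row carries 8 hash cells after the label.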

See the `regtest.sh` script for examples of how these tools are invoked.