pipeline
========

This directory contains tools and scripts for running a cron job that does
RAPPOR analysis and generates an HTML dashboard.

It works like this:

1. `task_spec.py` generates a text file where each line corresponds to a process
   to be run (a "task").  The process is `bin/decode-dist` or
   `bin/decode-assoc`.  The line contains the task parameters.

2. `xargs -P` is used to run processes in parallel.  Our analysis is generally
   single-threaded (because R is single-threaded), so this helps utilize the
   machine fully.  Each task places its output in a different subdirectory.

3. `cook.sh` calls `combine_results.py` to combine analysis results into a time
   series.  It also calls `combine_status.py` to keep track of task data for
   "meta-analysis".  `metric_status.R` generates more summary CSV files.

4. `ui.sh` calls `csv_to_html.py` to generate HTML fragments from the CSV
   files.

5. The JavaScript in `ui/ui.js` is loaded from static HTML, and makes AJAX calls
   to retrieve the HTML fragments.  The page is made interactive with
   `ui/table-lib.js`.
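
As a rough illustration of steps 1 and 2, the sketch below writes a task spec
with one task per line and then runs the tasks with bounded parallelism, similar
in spirit to `xargs -P`.  It is not the code in `task_spec.py`: the
space-separated line format, the function names, and the stand-in child command
are all assumptions made for the example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def write_task_spec(path, tasks):
    """Write one task per line; space-separated fields (assumed format)."""
    with open(path, "w") as f:
        for fields in tasks:
            f.write(" ".join(fields) + "\n")


def read_task_spec(path):
    """Return the tasks as a list of lists of string fields."""
    with open(path) as f:
        return [line.split() for line in f if line.strip()]


def run_task(fields):
    """Run one task as a child process and return its exit code.

    The real pipeline would start bin/decode-dist or bin/decode-assoc here.
    This stand-in launches a Python child that only echoes its parameters.
    """
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1:])"] + fields
    return subprocess.run(cmd, stdout=subprocess.DEVNULL).returncode


def run_all(spec_path, max_procs=4):
    """Run every task in the spec, at most max_procs at a time."""
    tasks = read_task_spec(spec_path)
    # Threads are enough here: each one just waits on its child process.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(run_task, tasks))
```

Because each task is its own process, single-threaded analysis code can still
keep several cores busy at once.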

`dist.sh` and `assoc.sh` contain functions which coordinate this process.
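
The "combine results into a time series" step from the list above could look
roughly like the following.  This is only a sketch and not what
`combine_results.py` does: the directory layout (`<base_dir>/<date>/results.csv`),
the file name, and the added `date` column are hypothetical.

```python
import csv
import os


def combine_results(base_dir, out_path, result_name="results.csv"):
    """Concatenate per-task CSV files into a single time series CSV.

    Assumed (hypothetical) layout: base_dir/<date>/<result_name>, each file
    starting with a header row.  A 'date' column is prepended to every row.
    Returns the number of data rows written.
    """
    header_written = False
    rows_written = 0
    with open(out_path, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        for date in sorted(os.listdir(base_dir)):
            path = os.path.join(base_dir, date, result_name)
            if not os.path.isfile(path):
                continue  # the task failed or produced no output
            with open(path, newline="") as in_f:
                reader = csv.reader(in_f)
                header = next(reader, None)
                if header is None:
                    continue  # empty file
                if not header_written:
                    writer.writerow(["date"] + header)
                    header_written = True
                for row in reader:
                    writer.writerow([date] + row)
                    rows_written += 1
    return rows_written
```

Skipping missing files instead of failing means one bad task does not block the
rest of the time series.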

`alarm-lib.sh` is used to kill processes that have been running for too long.
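
The idea of killing a task that runs too long can be sketched in Python with
the standard `subprocess` timeout.  How `alarm-lib.sh` itself is implemented is
not described here, so treat this as an analogous technique rather than a
translation; `run_with_timeout` is a name made up for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_secs):
    """Run cmd and kill it if it runs longer than timeout_secs.

    Returns the exit code, or None if the process was killed on timeout.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_secs).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None


if __name__ == "__main__":
    # A child that would sleep for 30 seconds is stopped after 1 second.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow, 1))
```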

Testing
-------

`pipeline/regtest.sh` contains end-to-end demos of this process.  Right now it
depends on testdata from elsewhere in the tree:

    rappor$ ./demo.sh run   # prepare dist testdata
    rappor$ cd bin

    bin$ ./test.sh write-assoc-testdata  # prepare assoc testdata
    bin$ cd ../pipeline

    pipeline$ ./regtest.sh dist
    pipeline$ ./regtest.sh assoc

    pipeline$ python -m SimpleHTTPServer  # start a static web server

    http://localhost:8000/_tmp/
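
`SimpleHTTPServer` is a Python 2 module.  On Python 3 the same functionality
lives in `http.server`, so `python3 -m http.server` is the command-line
equivalent.  A minimal programmatic sketch, assuming Python 3.7 or later and
using the hypothetical helper name `make_server`:

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Create a static file server for directory, bound to localhost."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    return http.server.ThreadingHTTPServer(("localhost", port), handler)


if __name__ == "__main__":
    # Run from the pipeline directory, then browse to /_tmp/ as above.
    with make_server() as server:
        server.serve_forever()
```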