1pipeline 2======== 3 4This directory contains tools and scripts for running a cron job that does 5RAPPOR analysis and generates an HTML dashboard. 6 7It works like this: 8 91. `task_spec.py` generates a text file where each line corresponds to a process 10 to be run (a "task"). The process is `bin/decode-dist` or 11 `bin/decode-assoc`. The line contains the task parameters. 12 132. `xargs -P` is used to run processes in parallel. Our analysis is generally 14 single-threaded (i.e. because R is single-threaded), so this helps utilize 15 the machine fully. Each task places its output in a different subdirectory. 16 173. `cook.sh` calls `combine_results.py` to combine analysis results into a time 18 series. It also calls `combine_status.py` to keep track of task data for 19 "meta-analysis". `metric_status.R` generates more summary CSV files. 20 214. `ui.sh` calls `csv_to_html.py` to generate an HTML fragments from the CSV 22 files. 23 245. The JavaScript in `ui/ui.js` is loaded from static HTML, and makes AJAX calls 25 to retrieve the HTML fragments. The page is made interactive with 26 `ui/table-lib.js`. 27 28`dist.sh` and `assoc.sh` contain functions which coordinate this process. 29 30`alarm-lib.sh` is used to kill processes that have been running for too long. 31 32Testing 33------- 34 35`pipeline/regtest.sh` contains end-to-end demos of this process. Right now it 36depends on testdata from elsewhere in the tree: 37 38 39 rappor$ ./demo.sh run # prepare dist testdata 40 rappor$ cd bin 41 42 bin$ ./test.sh write-assoc-testdata # prepare assoc testdata 43 bin$ cd ../pipeline 44 45 pipeline$ ./regtest.sh dist 46 pipeline$ ./regtest.sh assoc 47 48 pipeline$ python -m SimpleHTTPServer # start a static web server 49 50 http://localhost:8000/_tmp/ 51 52 53