1*de1e4e89SAndroid Build Coastguard WorkerI. About the distribution tables 2*de1e4e89SAndroid Build Coastguard Worker 3*de1e4e89SAndroid Build Coastguard WorkerThe table used for "synthesizing" the distribution is essentially a scaled, 4*de1e4e89SAndroid Build Coastguard Workertranslated, inverse to the cumulative distribution function. 5*de1e4e89SAndroid Build Coastguard Worker 6*de1e4e89SAndroid Build Coastguard WorkerHere's how to think about it: Let F() be the cumulative distribution 7*de1e4e89SAndroid Build Coastguard Workerfunction for a probability distribution X. We'll assume we've scaled 8*de1e4e89SAndroid Build Coastguard Workerthings so that X has mean 0 and standard deviation 1, though that's not 9*de1e4e89SAndroid Build Coastguard Workerso important here. Then: 10*de1e4e89SAndroid Build Coastguard Worker 11*de1e4e89SAndroid Build Coastguard Worker F(x) = P(X <= x) = \int_{-inf}^x f 12*de1e4e89SAndroid Build Coastguard Worker 13*de1e4e89SAndroid Build Coastguard Workerwhere f is the probability density function. 14*de1e4e89SAndroid Build Coastguard Worker 15*de1e4e89SAndroid Build Coastguard WorkerF is monotonically increasing, so has an inverse function G, with range 16*de1e4e89SAndroid Build Coastguard Worker0 to 1. Here, G(t) = the x such that P(X <= x) = t. (In general, G may 17*de1e4e89SAndroid Build Coastguard Workerhave singularities if X has point masses, i.e., points x such that 18*de1e4e89SAndroid Build Coastguard WorkerP(X = x) > 0.) 19*de1e4e89SAndroid Build Coastguard Worker 20*de1e4e89SAndroid Build Coastguard WorkerNow we create a tabular representation of G as follows: Choose some table 21*de1e4e89SAndroid Build Coastguard Workersize N, and for the ith entry, put in G(i/N). Let's call this table T. 22*de1e4e89SAndroid Build Coastguard Worker 23*de1e4e89SAndroid Build Coastguard WorkerThe claim now is, I can create a (discrete) random variable Y whose 24*de1e4e89SAndroid Build Coastguard Workerdistribution has the same approximate "shape" as X, simply by letting 25*de1e4e89SAndroid Build Coastguard WorkerY = T(U), where U is a discrete uniform random variable with range 1 to N. 26*de1e4e89SAndroid Build Coastguard WorkerTo see this, it's enough to show that Y's cumulative distribution function, 27*de1e4e89SAndroid Build Coastguard Worker(let's call it H), is a discrete approximation to F. But 28*de1e4e89SAndroid Build Coastguard Worker 29*de1e4e89SAndroid Build Coastguard Worker H(x) = P(Y <= x) 30*de1e4e89SAndroid Build Coastguard Worker = (# of entries in T <= x) / N -- as Y chosen uniformly from T 31*de1e4e89SAndroid Build Coastguard Worker = i/N, where i is the largest integer such that G(i/N) <= x 32*de1e4e89SAndroid Build Coastguard Worker = i/N, where i is the largest integer such that i/N <= F(x) 33*de1e4e89SAndroid Build Coastguard Worker -- since G and F are inverse functions (and F is 34*de1e4e89SAndroid Build Coastguard Worker increasing) 35*de1e4e89SAndroid Build Coastguard Worker = floor(N*F(x))/N 36*de1e4e89SAndroid Build Coastguard Worker 37*de1e4e89SAndroid Build Coastguard Workeras desired. 38*de1e4e89SAndroid Build Coastguard Worker 39*de1e4e89SAndroid Build Coastguard WorkerII. How to create distribution tables (in theory) 40*de1e4e89SAndroid Build Coastguard Worker 41*de1e4e89SAndroid Build Coastguard WorkerHow can we create this table in practice? In some cases, F may have a 42*de1e4e89SAndroid Build Coastguard Workersimple expression which allows evaluating its inverse directly. The 43*de1e4e89SAndroid Build Coastguard WorkerPareto distribution is one example of this. In other cases, and 44*de1e4e89SAndroid Build Coastguard Workerespecially for matching an experimentally observed distribution, it's 45*de1e4e89SAndroid Build Coastguard Workereasiest simply to create a table for F and "invert" it. Here, we give 46*de1e4e89SAndroid Build Coastguard Workera concrete example, namely how the new "experimental" distribution was 47*de1e4e89SAndroid Build Coastguard Workercreated. 48*de1e4e89SAndroid Build Coastguard Worker 49*de1e4e89SAndroid Build Coastguard Worker1. Collect enough data points to characterize the distribution. Here, I 50*de1e4e89SAndroid Build Coastguard Workercollected 25,000 "ping" roundtrip times to a "distant" point (time.nist.gov). 51*de1e4e89SAndroid Build Coastguard WorkerThat's far more data than is really necessary, but it was fairly painless to 52*de1e4e89SAndroid Build Coastguard Workercollect it, so... 53*de1e4e89SAndroid Build Coastguard Worker 54*de1e4e89SAndroid Build Coastguard Worker2. Normalize the data so that it has mean 0 and standard deviation 1. 55*de1e4e89SAndroid Build Coastguard Worker 56*de1e4e89SAndroid Build Coastguard Worker3. Determine the cumulative distribution. The code I wrote creates a table 57*de1e4e89SAndroid Build Coastguard Workercovering the range -10 to +10, with granularity .00005. Obviously, this 58*de1e4e89SAndroid Build Coastguard Workeris absurdly over-precise, but since it's a one-time only computation, I 59*de1e4e89SAndroid Build Coastguard Workerfigured it hardly mattered. 60*de1e4e89SAndroid Build Coastguard Worker 61*de1e4e89SAndroid Build Coastguard Worker4. Invert the table: for each table entry F(x) = y, make the y*TABLESIZE 62*de1e4e89SAndroid Build Coastguard Worker(here, 4096) entry be x*TABLEFACTOR (here, 8192). This creates a table 63*de1e4e89SAndroid Build Coastguard Workerfor the ("normalized") inverse of size TABLESIZE, covering its domain 0 64*de1e4e89SAndroid Build Coastguard Workerto 1 with granularity 1/TABLESIZE. Note that even with the granularity 65*de1e4e89SAndroid Build Coastguard Workerused in creating the table for F, it's possible not all the entries in 66*de1e4e89SAndroid Build Coastguard Workerthe table for G will be filled in. So, make a pass through the 67*de1e4e89SAndroid Build Coastguard Workerinverse's table, filling in any missing entries by linear interpolation. 68*de1e4e89SAndroid Build Coastguard Worker 69*de1e4e89SAndroid Build Coastguard WorkerIII. How to create distribution tables (in practice) 70*de1e4e89SAndroid Build Coastguard Worker 71*de1e4e89SAndroid Build Coastguard WorkerIf you want to do all this yourself, I've provided several tools to help: 72*de1e4e89SAndroid Build Coastguard Worker 73*de1e4e89SAndroid Build Coastguard Worker1. maketable does the steps 2-4 above, and then generates the appropriate 74*de1e4e89SAndroid Build Coastguard Workerheader file. So if you have your own time distribution, you can generate 75*de1e4e89SAndroid Build Coastguard Workerthe header simply by: 76*de1e4e89SAndroid Build Coastguard Worker 77*de1e4e89SAndroid Build Coastguard Worker maketable < time.values > header.h 78*de1e4e89SAndroid Build Coastguard Worker 79*de1e4e89SAndroid Build Coastguard Worker2. As explained in the other README file, the somewhat sleazy way I have 80*de1e4e89SAndroid Build Coastguard Workerof generating correlated values needs correction. You can generate your 81*de1e4e89SAndroid Build Coastguard Workerown correction tables by compiling makesigtable and makemutable with 82*de1e4e89SAndroid Build Coastguard Workeryour header file. Check the Makefile to see how this is done. 83*de1e4e89SAndroid Build Coastguard Worker 84*de1e4e89SAndroid Build Coastguard Worker3. Warning: maketable, makesigtable and especially makemutable do 85*de1e4e89SAndroid Build Coastguard Workerenormous amounts of floating point arithmetic. Don't try running 86*de1e4e89SAndroid Build Coastguard Workerthese on an old 486. (NIST Net itself will run fine on such a 87*de1e4e89SAndroid Build Coastguard Workersystem, since in operation, it just needs to do a few simple integral 88*de1e4e89SAndroid Build Coastguard Workercalculations. But getting there takes some work.) 89*de1e4e89SAndroid Build Coastguard Worker 90*de1e4e89SAndroid Build Coastguard Worker4. The tables produced are all normalized for mean 0 and standard 91*de1e4e89SAndroid Build Coastguard Workerdeviation 1. How do you know what values to use for real? Here, I've 92*de1e4e89SAndroid Build Coastguard Workerprovided a simple "stats" utility. Give it a series of floating point 93*de1e4e89SAndroid Build Coastguard Workervalues, and it will return their mean (mu), standard deviation (sigma), 94*de1e4e89SAndroid Build Coastguard Workerand correlation coefficient (rho). You can then plug these values 95*de1e4e89SAndroid Build Coastguard Workerdirectly into NIST Net. 96