1*de1e4e89SAndroid Build Coastguard WorkerNotes about distribution tables from Nistnet 2*de1e4e89SAndroid Build Coastguard Worker------------------------------------------------------------------------------- 3*de1e4e89SAndroid Build Coastguard WorkerI. About the distribution tables 4*de1e4e89SAndroid Build Coastguard Worker 5*de1e4e89SAndroid Build Coastguard WorkerThe table used for "synthesizing" the distribution is essentially a scaled, 6*de1e4e89SAndroid Build Coastguard Workertranslated, inverse to the cumulative distribution function. 7*de1e4e89SAndroid Build Coastguard Worker 8*de1e4e89SAndroid Build Coastguard WorkerHere's how to think about it: Let F() be the cumulative distribution 9*de1e4e89SAndroid Build Coastguard Workerfunction for a probability distribution X. We'll assume we've scaled 10*de1e4e89SAndroid Build Coastguard Workerthings so that X has mean 0 and standard deviation 1, though that's not 11*de1e4e89SAndroid Build Coastguard Workerso important here. Then: 12*de1e4e89SAndroid Build Coastguard Worker 13*de1e4e89SAndroid Build Coastguard Worker F(x) = P(X <= x) = \int_{-inf}^x f 14*de1e4e89SAndroid Build Coastguard Worker 15*de1e4e89SAndroid Build Coastguard Workerwhere f is the probability density function. 16*de1e4e89SAndroid Build Coastguard Worker 17*de1e4e89SAndroid Build Coastguard WorkerF is monotonically increasing, so has an inverse function G, with range 18*de1e4e89SAndroid Build Coastguard Worker0 to 1. Here, G(t) = the x such that P(X <= x) = t. (In general, G may 19*de1e4e89SAndroid Build Coastguard Workerhave singularities if X has point masses, i.e., points x such that 20*de1e4e89SAndroid Build Coastguard WorkerP(X = x) > 0.) 21*de1e4e89SAndroid Build Coastguard Worker 22*de1e4e89SAndroid Build Coastguard WorkerNow we create a tabular representation of G as follows: Choose some table 23*de1e4e89SAndroid Build Coastguard Workersize N, and for the ith entry, put in G(i/N). Let's call this table T. 24*de1e4e89SAndroid Build Coastguard Worker 25*de1e4e89SAndroid Build Coastguard WorkerThe claim now is, I can create a (discrete) random variable Y whose 26*de1e4e89SAndroid Build Coastguard Workerdistribution has the same approximate "shape" as X, simply by letting 27*de1e4e89SAndroid Build Coastguard WorkerY = T(U), where U is a discrete uniform random variable with range 1 to N. 28*de1e4e89SAndroid Build Coastguard WorkerTo see this, it's enough to show that Y's cumulative distribution function, 29*de1e4e89SAndroid Build Coastguard Worker(let's call it H), is a discrete approximation to F. But 30*de1e4e89SAndroid Build Coastguard Worker 31*de1e4e89SAndroid Build Coastguard Worker H(x) = P(Y <= x) 32*de1e4e89SAndroid Build Coastguard Worker = (# of entries in T <= x) / N -- as Y chosen uniformly from T 33*de1e4e89SAndroid Build Coastguard Worker = i/N, where i is the largest integer such that G(i/N) <= x 34*de1e4e89SAndroid Build Coastguard Worker = i/N, where i is the largest integer such that i/N <= F(x) 35*de1e4e89SAndroid Build Coastguard Worker -- since G and F are inverse functions (and F is 36*de1e4e89SAndroid Build Coastguard Worker increasing) 37*de1e4e89SAndroid Build Coastguard Worker = floor(N*F(x))/N 38*de1e4e89SAndroid Build Coastguard Worker 39*de1e4e89SAndroid Build Coastguard Workeras desired. 40*de1e4e89SAndroid Build Coastguard Worker 41*de1e4e89SAndroid Build Coastguard WorkerII. How to create distribution tables (in theory) 42*de1e4e89SAndroid Build Coastguard Worker 43*de1e4e89SAndroid Build Coastguard WorkerHow can we create this table in practice? In some cases, F may have a 44*de1e4e89SAndroid Build Coastguard Workersimple expression which allows evaluating its inverse directly. The 45*de1e4e89SAndroid Build Coastguard Workerpareto distribution is one example of this. In other cases, and 46*de1e4e89SAndroid Build Coastguard Workerespecially for matching an experimentally observed distribution, it's 47*de1e4e89SAndroid Build Coastguard Workereasiest simply to create a table for F and "invert" it. Here, we give 48*de1e4e89SAndroid Build Coastguard Workera concrete example, namely how the new "experimental" distribution was 49*de1e4e89SAndroid Build Coastguard Workercreated. 50*de1e4e89SAndroid Build Coastguard Worker 51*de1e4e89SAndroid Build Coastguard Worker1. Collect enough data points to characterize the distribution. Here, I 52*de1e4e89SAndroid Build Coastguard Workercollected 25,000 "ping" roundtrip times to a "distant" point (time.nist.gov). 53*de1e4e89SAndroid Build Coastguard WorkerThat's far more data than is really necessary, but it was fairly painless to 54*de1e4e89SAndroid Build Coastguard Workercollect it, so... 55*de1e4e89SAndroid Build Coastguard Worker 56*de1e4e89SAndroid Build Coastguard Worker2. Normalize the data so that it has mean 0 and standard deviation 1. 57*de1e4e89SAndroid Build Coastguard Worker 58*de1e4e89SAndroid Build Coastguard Worker3. Determine the cumulative distribution. The code I wrote creates a table 59*de1e4e89SAndroid Build Coastguard Workercovering the range -10 to +10, with granularity .00005. Obviously, this 60*de1e4e89SAndroid Build Coastguard Workeris absurdly over-precise, but since it's a one-time only computation, I 61*de1e4e89SAndroid Build Coastguard Workerfigured it hardly mattered. 62*de1e4e89SAndroid Build Coastguard Worker 63*de1e4e89SAndroid Build Coastguard Worker4. Invert the table: for each table entry F(x) = y, make the y*TABLESIZE 64*de1e4e89SAndroid Build Coastguard Worker(here, 4096) entry be x*TABLEFACTOR (here, 8192). This creates a table 65*de1e4e89SAndroid Build Coastguard Workerfor the ("normalized") inverse of size TABLESIZE, covering its domain 0 66*de1e4e89SAndroid Build Coastguard Workerto 1 with granularity 1/TABLESIZE. Note that even with the granularity 67*de1e4e89SAndroid Build Coastguard Workerused in creating the table for F, it's possible not all the entries in 68*de1e4e89SAndroid Build Coastguard Workerthe table for G will be filled in. So, make a pass through the 69*de1e4e89SAndroid Build Coastguard Workerinverse's table, filling in any missing entries by linear interpolation. 70*de1e4e89SAndroid Build Coastguard Worker 71*de1e4e89SAndroid Build Coastguard WorkerIII. How to create distribution tables (in practice) 72*de1e4e89SAndroid Build Coastguard Worker 73*de1e4e89SAndroid Build Coastguard WorkerIf you want to do all this yourself, I've provided several tools to help: 74*de1e4e89SAndroid Build Coastguard Worker 75*de1e4e89SAndroid Build Coastguard Worker1. maketable does the steps 2-4 above, and then generates the appropriate 76*de1e4e89SAndroid Build Coastguard Workerheader file. So if you have your own time distribution, you can generate 77*de1e4e89SAndroid Build Coastguard Workerthe header simply by: 78*de1e4e89SAndroid Build Coastguard Worker 79*de1e4e89SAndroid Build Coastguard Worker maketable < time.values > header.h 80*de1e4e89SAndroid Build Coastguard Worker 81*de1e4e89SAndroid Build Coastguard Worker2. As explained in the other README file, the somewhat sleazy way I have 82*de1e4e89SAndroid Build Coastguard Workerof generating correlated values needs correction. You can generate your 83*de1e4e89SAndroid Build Coastguard Workerown correction tables by compiling makesigtable and makemutable with 84*de1e4e89SAndroid Build Coastguard Workeryour header file. Check the Makefile to see how this is done. 85*de1e4e89SAndroid Build Coastguard Worker 86*de1e4e89SAndroid Build Coastguard Worker3. Warning: maketable, makesigtable and especially makemutable do 87*de1e4e89SAndroid Build Coastguard Workerenormous amounts of floating point arithmetic. Don't try running 88*de1e4e89SAndroid Build Coastguard Workerthese on an old 486. (NIST Net itself will run fine on such a 89*de1e4e89SAndroid Build Coastguard Workersystem, since in operation, it just needs to do a few simple integral 90*de1e4e89SAndroid Build Coastguard Workercalculations. But getting there takes some work.) 91*de1e4e89SAndroid Build Coastguard Worker 92*de1e4e89SAndroid Build Coastguard Worker4. The tables produced are all normalized for mean 0 and standard 93*de1e4e89SAndroid Build Coastguard Workerdeviation 1. How do you know what values to use for real? Here, I've 94*de1e4e89SAndroid Build Coastguard Workerprovided a simple "stats" utility. Give it a series of floating point 95*de1e4e89SAndroid Build Coastguard Workervalues, and it will return their mean (mu), standard deviation (sigma), 96*de1e4e89SAndroid Build Coastguard Workerand correlation coefficient (rho). You can then plug these values 97*de1e4e89SAndroid Build Coastguard Workerdirectly into NIST Net. 98