1:mod:`random` --- Generate pseudo-random numbers
2================================================
3
4.. module:: random
5   :synopsis: Generate pseudo-random numbers with various common distributions.
6
7**Source code:** :source:`Lib/random.py`
8
9--------------
10
11This module implements pseudo-random number generators for various
12distributions.
13
14For integers, there is uniform selection from a range. For sequences, there is
15uniform selection of a random element, a function to generate a random
16permutation of a list in-place, and a function for random sampling without
17replacement.
18
19On the real line, there are functions to compute uniform, normal (Gaussian),
20lognormal, negative exponential, gamma, and beta distributions. For generating
21distributions of angles, the von Mises distribution is available.
22
23Almost all module functions depend on the basic function :func:`.random`, which
24generates a random float uniformly in the half-open range ``0.0 <= X < 1.0``.
25Python uses the Mersenne Twister as the core generator.  It produces 53-bit precision
26floats and has a period of 2\*\*19937-1.  The underlying implementation in C is
27both fast and threadsafe.  The Mersenne Twister is one of the most extensively
28tested random number generators in existence.  However, being completely
29deterministic, it is not suitable for all purposes, and is completely unsuitable
30for cryptographic purposes.
31
32The functions supplied by this module are actually bound methods of a hidden
33instance of the :class:`random.Random` class.  You can instantiate your own
34instances of :class:`Random` to get generators that don't share state.
35
36Class :class:`Random` can also be subclassed if you want to use a different
37basic generator of your own devising: in that case, override the :meth:`~Random.random`,
38:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods.
39Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this
40allows :meth:`randrange` to produce selections over an arbitrarily large range.
41
42The :mod:`random` module also provides the :class:`SystemRandom` class which
43uses the system function :func:`os.urandom` to generate random numbers
44from sources provided by the operating system.
45
46.. warning::
47
48   The pseudo-random generators of this module should not be used for
49   security purposes.  For security or cryptographic uses, see the
50   :mod:`secrets` module.
51
52.. seealso::
53
54   M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally
55   equidistributed uniform pseudorandom number generator", ACM Transactions on
56   Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998.
57
58
59   `Complementary-Multiply-with-Carry recipe
60   <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative
61   random number generator with a long period and comparatively simple update
62   operations.
63
64
65Bookkeeping functions
66---------------------
67
68.. function:: seed(a=None, version=2)
69
70   Initialize the random number generator.
71
72   If *a* is omitted or ``None``, the current system time is used.  If
73   randomness sources are provided by the operating system, they are used
74   instead of the system time (see the :func:`os.urandom` function for details
75   on availability).
76
77   If *a* is an int, it is used directly.
78
79   With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray`
80   object gets converted to an :class:`int` and all of its bits are used.
81
82   With version 1 (provided for reproducing random sequences from older versions
83   of Python), the algorithm for :class:`str` and :class:`bytes` generates a
84   narrower range of seeds.
85
86   .. versionchanged:: 3.2
87      Moved to the version 2 scheme which uses all of the bits in a string seed.
88
89   .. versionchanged:: 3.11
90      The *seed* must be one of the following types:
91      *NoneType*, :class:`int`, :class:`float`, :class:`str`,
92      :class:`bytes`, or :class:`bytearray`.
93
94.. function:: getstate()
95
96   Return an object capturing the current internal state of the generator.  This
97   object can be passed to :func:`setstate` to restore the state.
98
99
100.. function:: setstate(state)
101
102   *state* should have been obtained from a previous call to :func:`getstate`, and
103   :func:`setstate` restores the internal state of the generator to what it was at
104   the time :func:`getstate` was called.
105
106
107Functions for bytes
108-------------------
109
110.. function:: randbytes(n)
111
112   Generate *n* random bytes.
113
114   This method should not be used for generating security tokens.
115   Use :func:`secrets.token_bytes` instead.
116
117   .. versionadded:: 3.9
118
119
120Functions for integers
121----------------------
122
123.. function:: randrange(stop)
124              randrange(start, stop[, step])
125
126   Return a randomly selected element from ``range(start, stop, step)``.  This is
127   equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a
128   range object.
129
130   The positional argument pattern matches that of :func:`range`.  Keyword arguments
131   should not be used because the function may use them in unexpected ways.
132
133   .. versionchanged:: 3.2
134      :meth:`randrange` is more sophisticated about producing equally distributed
135      values.  Formerly it used a style like ``int(random()*n)`` which could produce
136      slightly uneven distributions.
137
138   .. deprecated:: 3.10
139      The automatic conversion of non-integer types to equivalent integers is
140      deprecated.  Currently ``randrange(10.0)`` is losslessly converted to
141      ``randrange(10)``.  In the future, this will raise a :exc:`TypeError`.
142
143   .. deprecated:: 3.10
144      The exception raised for non-integral values such as ``randrange(10.5)``
145      or ``randrange('10')`` will be changed from :exc:`ValueError` to
146      :exc:`TypeError`.
147
148.. function:: randint(a, b)
149
150   Return a random integer *N* such that ``a <= N <= b``.  Alias for
151   ``randrange(a, b+1)``.
152
153.. function:: getrandbits(k)
154
155   Returns a non-negative Python integer with *k* random bits. This method
156   is supplied with the MersenneTwister generator and some other generators
157   may also provide it as an optional part of the API. When available,
158   :meth:`getrandbits` enables :meth:`randrange` to handle arbitrarily large
159   ranges.
160
161   .. versionchanged:: 3.9
162      This method now accepts zero for *k*.
163
164
165Functions for sequences
166-----------------------
167
168.. function:: choice(seq)
169
170   Return a random element from the non-empty sequence *seq*. If *seq* is empty,
171   raises :exc:`IndexError`.
172
173.. function:: choices(population, weights=None, *, cum_weights=None, k=1)
174
175   Return a *k* sized list of elements chosen from the *population* with replacement.
176   If the *population* is empty, raises :exc:`IndexError`.
177
178   If a *weights* sequence is specified, selections are made according to the
179   relative weights.  Alternatively, if a *cum_weights* sequence is given, the
180   selections are made according to the cumulative weights (perhaps computed
181   using :func:`itertools.accumulate`).  For example, the relative weights
182   ``[10, 5, 30, 5]`` are equivalent to the cumulative weights
183   ``[10, 15, 45, 50]``.  Internally, the relative weights are converted to
184   cumulative weights before making selections, so supplying the cumulative
185   weights saves work.
186
187   If neither *weights* nor *cum_weights* are specified, selections are made
188   with equal probability.  If a weights sequence is supplied, it must be
189   the same length as the *population* sequence.  It is a :exc:`TypeError`
190   to specify both *weights* and *cum_weights*.
191
192   The *weights* or *cum_weights* can use any numeric type that interoperates
193   with the :class:`float` values returned by :func:`random` (that includes
194   integers, floats, and fractions but excludes decimals).  Weights are assumed
195   to be non-negative and finite.  A :exc:`ValueError` is raised if all
196   weights are zero.
197
198   For a given seed, the :func:`choices` function with equal weighting
199   typically produces a different sequence than repeated calls to
200   :func:`choice`.  The algorithm used by :func:`choices` uses floating
201   point arithmetic for internal consistency and speed.  The algorithm used
202   by :func:`choice` defaults to integer arithmetic with repeated selections
203   to avoid small biases from round-off error.
204
205   .. versionadded:: 3.6
206
207   .. versionchanged:: 3.9
208      Raises a :exc:`ValueError` if all weights are zero.
209
210
211.. function:: shuffle(x)
212
213   Shuffle the sequence *x* in place.
214
215   To shuffle an immutable sequence and return a new shuffled list, use
216   ``sample(x, k=len(x))`` instead.
217
218   Note that even for small ``len(x)``, the total number of permutations of *x*
219   can quickly grow larger than the period of most random number generators.
220   This implies that most permutations of a long sequence can never be
221   generated.  For example, a sequence of length 2080 is the largest that
222   can fit within the period of the Mersenne Twister random number generator.
223
224   .. deprecated-removed:: 3.9 3.11
225      The optional parameter *random*.
226
227
228.. function:: sample(population, k, *, counts=None)
229
230   Return a *k* length list of unique elements chosen from the population
231   sequence.  Used for random sampling without replacement.
232
233   Returns a new list containing elements from the population while leaving the
234   original population unchanged.  The resulting list is in selection order so that
235   all sub-slices will also be valid random samples.  This allows raffle winners
236   (the sample) to be partitioned into grand prize and second place winners (the
237   subslices).
238
239   Members of the population need not be :term:`hashable` or unique.  If the population
240   contains repeats, then each occurrence is a possible selection in the sample.
241
242   Repeated elements can be specified one at a time or with the optional
243   keyword-only *counts* parameter.  For example, ``sample(['red', 'blue'],
244   counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red',
245   'blue', 'blue'], k=5)``.
246
247   To choose a sample from a range of integers, use a :func:`range` object as an
248   argument.  This is especially fast and space efficient for sampling from a large
249   population:  ``sample(range(10000000), k=60)``.
250
251   If the sample size is larger than the population size, a :exc:`ValueError`
252   is raised.
253
254   .. versionchanged:: 3.9
255      Added the *counts* parameter.
256
257   .. versionchanged:: 3.11
258
259      The *population* must be a sequence.  Automatic conversion of sets
260      to lists is no longer supported.
261
262
263.. _real-valued-distributions:
264
265Real-valued distributions
266-------------------------
267
268The following functions generate specific real-valued distributions. Function
269parameters are named after the corresponding variables in the distribution's
270equation, as used in common mathematical practice; most of these equations can
271be found in any statistics text.
272
273
274.. function:: random()
275
276   Return the next random floating point number in the range ``0.0 <= X < 1.0``
277
278
279.. function:: uniform(a, b)
280
281   Return a random floating point number *N* such that ``a <= N <= b`` for
282   ``a <= b`` and ``b <= N <= a`` for ``b < a``.
283
284   The end-point value ``b`` may or may not be included in the range
285   depending on floating-point rounding in the equation ``a + (b-a) * random()``.
286
287
288.. function:: triangular(low, high, mode)
289
290   Return a random floating point number *N* such that ``low <= N <= high`` and
291   with the specified *mode* between those bounds.  The *low* and *high* bounds
292   default to zero and one.  The *mode* argument defaults to the midpoint
293   between the bounds, giving a symmetric distribution.
294
295
296.. function:: betavariate(alpha, beta)
297
298   Beta distribution.  Conditions on the parameters are ``alpha > 0`` and
299   ``beta > 0``. Returned values range between 0 and 1.
300
301
302.. function:: expovariate(lambd)
303
304   Exponential distribution.  *lambd* is 1.0 divided by the desired
305   mean.  It should be nonzero.  (The parameter would be called
306   "lambda", but that is a reserved word in Python.)  Returned values
307   range from 0 to positive infinity if *lambd* is positive, and from
308   negative infinity to 0 if *lambd* is negative.
309
310
311.. function:: gammavariate(alpha, beta)
312
313   Gamma distribution.  (*Not* the gamma function!)  The shape and
314   scale parameters, *alpha* and *beta*, must have positive values.
315   (Calling conventions vary and some sources define 'beta'
316   as the inverse of the scale).
317
318   The probability distribution function is::
319
320                 x ** (alpha - 1) * math.exp(-x / beta)
321       pdf(x) =  --------------------------------------
322                   math.gamma(alpha) * beta ** alpha
323
324
325.. function:: gauss(mu=0.0, sigma=1.0)
326
327   Normal distribution, also called the Gaussian distribution.
328   *mu* is the mean,
329   and *sigma* is the standard deviation.  This is slightly faster than
330   the :func:`normalvariate` function defined below.
331
332   Multithreading note:  When two threads call this function
333   simultaneously, it is possible that they will receive the
334   same return value.  This can be avoided in three ways.
335   1) Have each thread use a different instance of the random
336   number generator. 2) Put locks around all calls. 3) Use the
337   slower, but thread-safe :func:`normalvariate` function instead.
338
339   .. versionchanged:: 3.11
340      *mu* and *sigma* now have default arguments.
341
342
343.. function:: lognormvariate(mu, sigma)
344
345   Log normal distribution.  If you take the natural logarithm of this
346   distribution, you'll get a normal distribution with mean *mu* and standard
347   deviation *sigma*.  *mu* can have any value, and *sigma* must be greater than
348   zero.
349
350
351.. function:: normalvariate(mu=0.0, sigma=1.0)
352
353   Normal distribution.  *mu* is the mean, and *sigma* is the standard deviation.
354
355   .. versionchanged:: 3.11
356      *mu* and *sigma* now have default arguments.
357
358
359.. function:: vonmisesvariate(mu, kappa)
360
361   *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa*
362   is the concentration parameter, which must be greater than or equal to zero.  If
363   *kappa* is equal to zero, this distribution reduces to a uniform random angle
364   over the range 0 to 2\*\ *pi*.
365
366
367.. function:: paretovariate(alpha)
368
369   Pareto distribution.  *alpha* is the shape parameter.
370
371
372.. function:: weibullvariate(alpha, beta)
373
374   Weibull distribution.  *alpha* is the scale parameter and *beta* is the shape
375   parameter.
376
377
378Alternative Generator
379---------------------
380
381.. class:: Random([seed])
382
383   Class that implements the default pseudo-random number generator used by the
384   :mod:`random` module.
385
386   .. deprecated:: 3.9
387      In the future, the *seed* must be one of the following types:
388      :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`,
389      :class:`bytes`, or :class:`bytearray`.
390
391.. class:: SystemRandom([seed])
392
393   Class that uses the :func:`os.urandom` function for generating random numbers
394   from sources provided by the operating system. Not available on all systems.
395   Does not rely on software state, and sequences are not reproducible. Accordingly,
396   the :meth:`seed` method has no effect and is ignored.
397   The :meth:`getstate` and :meth:`setstate` methods raise
398   :exc:`NotImplementedError` if called.
399
400
401Notes on Reproducibility
402------------------------
403
404Sometimes it is useful to be able to reproduce the sequences given by a
405pseudo-random number generator.  By re-using a seed value, the same sequence should be
406reproducible from run to run as long as multiple threads are not running.
407
408Most of the random module's algorithms and seeding functions are subject to
409change across Python versions, but two aspects are guaranteed not to change:
410
411* If a new seeding method is added, then a backward compatible seeder will be
412  offered.
413
414* The generator's :meth:`~Random.random` method will continue to produce the same
415  sequence when the compatible seeder is given the same seed.
416
417.. _random-examples:
418
419Examples
420--------
421
422Basic examples::
423
424   >>> random()                             # Random float:  0.0 <= x < 1.0
425   0.37444887175646646
426
427   >>> uniform(2.5, 10.0)                   # Random float:  2.5 <= x <= 10.0
428   3.1800146073117523
429
430   >>> expovariate(1 / 5)                   # Interval between arrivals averaging 5 seconds
431   5.148957571865031
432
433   >>> randrange(10)                        # Integer from 0 to 9 inclusive
434   7
435
436   >>> randrange(0, 101, 2)                 # Even integer from 0 to 100 inclusive
437   26
438
439   >>> choice(['win', 'lose', 'draw'])      # Single random element from a sequence
440   'draw'
441
442   >>> deck = 'ace two three four'.split()
443   >>> shuffle(deck)                        # Shuffle a list
444   >>> deck
445   ['four', 'two', 'ace', 'three']
446
447   >>> sample([10, 20, 30, 40, 50], k=4)    # Four samples without replacement
448   [40, 10, 50, 30]
449
450Simulations::
451
452   >>> # Six roulette wheel spins (weighted sampling with replacement)
453   >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
454   ['red', 'green', 'black', 'black', 'red', 'black']
455
456   >>> # Deal 20 cards without replacement from a deck
457   >>> # of 52 playing cards, and determine the proportion of cards
458   >>> # with a ten-value:  ten, jack, queen, or king.
459   >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20)
460   >>> dealt.count('tens') / 20
461   0.15
462
463   >>> # Estimate the probability of getting 5 or more heads from 7 spins
464   >>> # of a biased coin that settles on heads 60% of the time.
465   >>> def trial():
466   ...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
467   ...
468   >>> sum(trial() for i in range(10_000)) / 10_000
469   0.4169
470
471   >>> # Probability of the median of 5 samples being in middle two quartiles
472   >>> def trial():
473   ...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
474   ...
475   >>> sum(trial() for i in range(10_000)) / 10_000
476   0.7958
477
478Example of `statistical bootstrapping
479<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling
480with replacement to estimate a confidence interval for the mean of a sample::
481
482   # https://www.thoughtco.com/example-of-bootstrapping-3126155
483   from statistics import fmean as mean
484   from random import choices
485
486   data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
487   means = sorted(mean(choices(data, k=len(data))) for i in range(100))
488   print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
489         f'interval from {means[5]:.1f} to {means[94]:.1f}')
490
491Example of a `resampling permutation test
492<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_
493to determine the statistical significance or `p-value
494<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference
495between the effects of a drug versus a placebo::
496
497    # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
498    from statistics import fmean as mean
499    from random import shuffle
500
501    drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
502    placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
503    observed_diff = mean(drug) - mean(placebo)
504
505    n = 10_000
506    count = 0
507    combined = drug + placebo
508    for i in range(n):
509        shuffle(combined)
510        new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
511        count += (new_diff >= observed_diff)
512
513    print(f'{n} label reshufflings produced only {count} instances with a difference')
514    print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
515    print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
516    print(f'hypothesis that there is no difference between the drug and the placebo.')
517
518Simulation of arrival times and service deliveries for a multiserver queue::
519
520    from heapq import heapify, heapreplace
521    from random import expovariate, gauss
522    from statistics import mean, quantiles
523
524    average_arrival_interval = 5.6
525    average_service_time = 15.0
526    stdev_service_time = 3.5
527    num_servers = 3
528
529    waits = []
530    arrival_time = 0.0
531    servers = [0.0] * num_servers  # time when each server becomes available
532    heapify(servers)
533    for i in range(1_000_000):
534        arrival_time += expovariate(1.0 / average_arrival_interval)
535        next_server_available = servers[0]
536        wait = max(0.0, next_server_available - arrival_time)
537        waits.append(wait)
538        service_duration = max(0.0, gauss(average_service_time, stdev_service_time))
539        service_completed = arrival_time + wait + service_duration
540        heapreplace(servers, service_completed)
541
542    print(f'Mean wait: {mean(waits):.1f}   Max wait: {max(waits):.1f}')
543    print('Quartiles:', [round(q, 1) for q in quantiles(waits)])
544
545.. seealso::
546
547   `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_
548   a video tutorial by
549   `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_
550   on statistical analysis using just a few fundamental concepts
551   including simulation, sampling, shuffling, and cross-validation.
552
553   `Economics Simulation
554   <https://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_
555   a simulation of a marketplace by
556   `Peter Norvig <https://norvig.com/bio.html>`_ that shows effective
557   use of many of the tools and distributions provided by this module
558   (gauss, uniform, sample, betavariate, choice, triangular, and randrange).
559
560   `A Concrete Introduction to Probability (using Python)
561   <https://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_
562   a tutorial by `Peter Norvig <https://norvig.com/bio.html>`_ covering
563   the basics of probability theory, how to write simulations, and
564   how to perform data analysis using Python.
565
566
567Recipes
568-------
569
570These recipes show how to efficiently make random selections
571from the combinatoric iterators in the :mod:`itertools` module:
572
573.. testcode::
574   import random
575
576   def random_product(*args, repeat=1):
577       "Random selection from itertools.product(*args, **kwds)"
578       pools = [tuple(pool) for pool in args] * repeat
579       return tuple(map(random.choice, pools))
580
581   def random_permutation(iterable, r=None):
582       "Random selection from itertools.permutations(iterable, r)"
583       pool = tuple(iterable)
584       r = len(pool) if r is None else r
585       return tuple(random.sample(pool, r))
586
587   def random_combination(iterable, r):
588       "Random selection from itertools.combinations(iterable, r)"
589       pool = tuple(iterable)
590       n = len(pool)
591       indices = sorted(random.sample(range(n), r))
592       return tuple(pool[i] for i in indices)
593
594   def random_combination_with_replacement(iterable, r):
595       "Choose r elements with replacement.  Order the result to match the iterable."
596       # Result will be in set(itertools.combinations_with_replacement(iterable, r)).
597       pool = tuple(iterable)
598       n = len(pool)
599       indices = sorted(random.choices(range(n), k=r))
600       return tuple(pool[i] for i in indices)
601
602The default :func:`.random` returns multiples of 2⁻⁵³ in the range
603*0.0 ≤ x < 1.0*.  All such numbers are evenly spaced and are exactly
604representable as Python floats.  However, many other representable
605floats in that interval are not possible selections.  For example,
606``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³.
607
608The following recipe takes a different approach.  All floats in the
609interval are possible selections.  The mantissa comes from a uniform
610distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*.  The
611exponent comes from a geometric distribution where exponents smaller
612than *-53* occur half as often as the next larger exponent.
613
614::
615
616    from random import Random
617    from math import ldexp
618
619    class FullRandom(Random):
620
621        def random(self):
622            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
623            exponent = -53
624            x = 0
625            while not x:
626                x = self.getrandbits(32)
627                exponent += x.bit_length() - 32
628            return ldexp(mantissa, exponent)
629
630All :ref:`real valued distributions <real-valued-distributions>`
631in the class will use the new method::
632
633    >>> fr = FullRandom()
634    >>> fr.random()
635    0.05954861408025609
636    >>> fr.expovariate(0.25)
637    8.87925541791544
638
639The recipe is conceptually equivalent to an algorithm that chooses from
640all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*.  All such
641numbers are evenly spaced, but most have to be rounded down to the
642nearest representable Python float.  (The value 2⁻¹⁰⁷⁴ is the smallest
643positive unnormalized float and is equal to ``math.ulp(0.0)``.)
644
645
646.. seealso::
647
648   `Generating Pseudo-random Floating-Point Values
649   <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a
650   paper by Allen B. Downey describing ways to generate more
651   fine-grained floats than normally generated by :func:`.random`.
652