:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines
              compatible with gzip.

--------------

For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at https://www.zlib.net.  There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a
`security vulnerability <https://zlib.net/zlib_faq.html#faq33>`_, so we
recommend using 1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order.  This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at https://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.
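
As a minimal sketch of handling this exception, the snippet below passes bytes
that are not a valid compressed stream to :func:`decompress` and catches the
resulting :exc:`error`.  The input bytes and the variable names are
illustrative assumptions.

```python
import zlib

bogus = b"this is not a zlib stream"  # illustrative invalid input

try:
    zlib.decompress(bogus)
except zlib.error as exc:
    # The message text comes from the zlib library and may vary by version.
    failure = str(exc)
else:
    failure = None
```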


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*.  (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.)  The result
   is an unsigned 32-bit integer.  If *value* is present, it is used as
   the starting value of the checksum; otherwise, a default value of 1
   is used.  Passing in *value* allows computing a running checksum over the
   concatenation of several inputs.  The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures.  Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   .. versionchanged:: 3.0
      The result is always unsigned.
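
The running-checksum behaviour of :func:`adler32` can be sketched as follows.
The two input pieces are arbitrary example values; the point is only that
feeding the previous result back in as *value* gives the same checksum as
processing the concatenated input in one call.

```python
import zlib

part1 = b"Wiki"
part2 = b"pedia"

# Feed the pieces one at a time, passing the previous result as *value*.
running = zlib.adler32(part1)
running = zlib.adler32(part2, running)

# The running checksum matches the checksum of the concatenated input.
whole = zlib.adler32(part1 + part2)
```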

.. function:: compress(data, /, level=-1, wbits=MAX_WBITS)

   Compresses the bytes in *data*, returning a bytes object containing compressed data.
   *level* is an integer from ``0`` to ``9`` or ``-1`` controlling the level of compression;
   ``1`` (Z_BEST_SPEED) is fastest and produces the least compression, ``9`` (Z_BEST_COMPRESSION)
   is slowest and produces the most.  ``0`` (Z_NO_COMPRESSION) is no compression.
   The default value is ``-1`` (Z_DEFAULT_COMPRESSION).  Z_DEFAULT_COMPRESSION represents a default
   compromise between speed and compression (currently equivalent to level 6).

   .. _compress-wbits:

   The *wbits* argument controls the size of the history buffer (or the
   "window size") used when compressing data, and whether a header and
   trailer are included in the output.  It can take several ranges of values,
   defaulting to ``15`` (MAX_WBITS):

   * +9 to +15: The base-two logarithm of the window size, which
     therefore ranges between 512 and 32768.  Larger values produce
     better compression at the expense of greater memory usage.  The
     resulting output will include a zlib-specific header and trailer.

   * −9 to −15: Uses the absolute value of *wbits* as the
     window size logarithm, while producing a raw output stream with no
     header or trailing checksum.

   * +25 to +31 = 16 + (9 to 15): Uses the low 4 bits of the value as the
     window size logarithm, while including a basic :program:`gzip` header
     and trailing checksum in the output.

   Raises the :exc:`error` exception if any error occurs.

   .. versionchanged:: 3.6
      *level* can now be used as a keyword parameter.

   .. versionchanged:: 3.11
      The *wbits* parameter is now available to set window bits and
      compression type.
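
The effect of *level* on :func:`compress` can be explored with a sketch like
the one below.  The sample input is an assumption chosen to be highly
repetitive; actual sizes depend on the data and on the zlib version, so the
code records them rather than assuming particular numbers.

```python
import zlib

data = b"spam and eggs " * 200  # illustrative, highly repetitive input

stored = zlib.compress(data, 0)   # Z_NO_COMPRESSION: stored blocks plus framing
fast = zlib.compress(data, 1)     # Z_BEST_SPEED
default = zlib.compress(data)     # Z_DEFAULT_COMPRESSION (-1)
best = zlib.compress(data, 9)     # Z_BEST_COMPRESSION

sizes = {"stored": len(stored), "fast": len(fast),
         "default": len(default), "best": len(best)}

# Every level produces a stream that decompresses to the original bytes.
round_trips = all(zlib.decompress(blob) == data
                  for blob in (stored, fast, default, best))
```

At level ``0`` the output is slightly larger than the input, because the
header and trailer are still added around the uncompressed data.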

.. function:: compressobj(level=-1, method=DEFLATED, wbits=MAX_WBITS, memLevel=DEF_MEM_LEVEL, strategy=Z_DEFAULT_STRATEGY[, zdict])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once.

   *level* is the compression level -- an integer from ``0`` to ``9`` or ``-1``.
   A value of ``1`` (Z_BEST_SPEED) is fastest and produces the least compression,
   while a value of ``9`` (Z_BEST_COMPRESSION) is slowest and produces the most.
   ``0`` (Z_NO_COMPRESSION) is no compression.  The default value is ``-1`` (Z_DEFAULT_COMPRESSION).
   Z_DEFAULT_COMPRESSION represents a default compromise between speed and compression
   (currently equivalent to level 6).

   *method* is the compression algorithm. Currently, the only supported value is
   :const:`DEFLATED`.

   The *wbits* parameter controls the size of the history buffer (or the
   "window size"), and what header and trailer format will be used. It has
   the same meaning as `described for compress() <#compress-wbits>`__.

   The *memLevel* argument controls the amount of memory used for the
   internal compression state. Valid values range from ``1`` to ``9``.
   Higher values use more memory, but are faster and produce smaller output.

   *strategy* is used to tune the compression algorithm. Possible values are
   :const:`Z_DEFAULT_STRATEGY`, :const:`Z_FILTERED`, :const:`Z_HUFFMAN_ONLY`,
   :const:`Z_RLE` (zlib 1.2.0.1) and :const:`Z_FIXED` (zlib 1.2.2.2).

   *zdict* is a predefined compression dictionary. This is a sequence of bytes
   (such as a :class:`bytes` object) containing subsequences that are expected
   to occur frequently in the data that is to be compressed. Those subsequences
   that are expected to be most common should come at the end of the dictionary.

   .. versionchanged:: 3.3
      Added the *zdict* parameter and keyword argument support.
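
A possible way to use :func:`compressobj` for chunked input, and to select the
three output formats through *wbits*, is sketched below.  The helper
``compress_chunks`` and the sample chunks are hypothetical names introduced
for illustration only.

```python
import zlib

def compress_chunks(chunks, wbits=zlib.MAX_WBITS):
    """Compress an iterable of bytes chunks into one stream (hypothetical helper)."""
    co = zlib.compressobj(level=6, wbits=wbits)
    out = [co.compress(chunk) for chunk in chunks]
    out.append(co.flush())  # Z_FINISH: emit buffered data and the trailer
    return b"".join(out)

chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]

zlib_stream = compress_chunks(chunks)                        # zlib header and trailer
raw_stream = compress_chunks(chunks, -zlib.MAX_WBITS)        # raw stream, no framing
gzip_stream = compress_chunks(chunks, 16 + zlib.MAX_WBITS)   # gzip header and trailer
```

Each stream has to be decompressed with a matching *wbits* value, for example
``zlib.decompress(raw_stream, wbits=-zlib.MAX_WBITS)`` for the raw stream.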


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. The
   result is an unsigned 32-bit integer. If *value* is present, it is used
   as the starting value of the checksum; otherwise, a default value of 0
   is used.  Passing in *value* allows computing a running checksum over the
   concatenation of several inputs.  The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures.  Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   .. versionchanged:: 3.0
      The result is always unsigned.
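
The same running-checksum pattern applies to :func:`crc32`.  The sketch below
uses two arbitrary example chunks and starts from the documented default
value of 0.

```python
import zlib

chunks = [b"1234", b"56789"]

crc = 0  # the documented default starting value
for chunk in chunks:
    crc = zlib.crc32(chunk, crc)

# Processing the concatenated input in one call gives the same result.
one_shot = zlib.crc32(b"".join(chunks))
```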

.. function:: decompress(data, /, wbits=MAX_WBITS, bufsize=DEF_BUF_SIZE)

   Decompresses the bytes in *data*, returning a bytes object containing the
   uncompressed data.  The *wbits* parameter depends on
   the format of *data*, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer.  Raises the :exc:`error` exception if any error occurs.

   .. _decompress-wbits:

   The *wbits* parameter controls the size of the history buffer
   (or "window size"), and what header and trailer format is expected.
   It is similar to the parameter for :func:`compressobj`, but accepts
   more ranges of values:

   * +8 to +15: The base-two logarithm of the window size.  The input
     must include a zlib header and trailer.

   * 0: Automatically determine the window size from the zlib header.
     Only supported since zlib 1.2.3.5.

   * −8 to −15: Uses the absolute value of *wbits* as the window size
     logarithm.  The input must be a raw stream with no header or trailer.

   * +24 to +31 = 16 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm.  The input must include a gzip header and
     trailer.

   * +40 to +47 = 32 + (8 to 15): Uses the low 4 bits of the value as
     the window size logarithm, and automatically accepts either
     the zlib or gzip format.

   When decompressing a stream, the window size must not be smaller
   than the size originally used to compress the stream; using a too-small
   value may result in an :exc:`error` exception. The default *wbits* value
   corresponds to the largest window size and requires a zlib header and
   trailer to be included.

   *bufsize* is the initial size of the buffer used to hold decompressed data.  If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :c:func:`malloc`.

   .. versionchanged:: 3.6
      *wbits* and *bufsize* can be used as keyword arguments.
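
The *wbits* ranges accepted by :func:`decompress` can be illustrated with a
short sketch.  It assumes the :mod:`gzip` module is available to produce a
gzip-framed sample; the payload and the name ``AUTO`` are illustrative.

```python
import gzip
import zlib

payload = b"the same payload in two container formats"

zlib_bytes = zlib.compress(payload)   # zlib header and trailer
gzip_bytes = gzip.compress(payload)   # gzip header and trailer

AUTO = 32 + zlib.MAX_WBITS  # accept either zlib or gzip framing

from_zlib = zlib.decompress(zlib_bytes, wbits=AUTO)
from_gzip = zlib.decompress(gzip_bytes, wbits=AUTO)

# With the default wbits a gzip stream is rejected, since a zlib header is required.
try:
    zlib.decompress(gzip_bytes)
    rejected = False
except zlib.error:
    rejected = True
```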

.. function:: decompressobj(wbits=MAX_WBITS[, zdict])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once.

   The *wbits* parameter controls the size of the history buffer (or the
   "window size"), and what header and trailer format is expected.  It has
   the same meaning as `described for decompress() <#decompress-wbits>`__.

   The *zdict* parameter specifies a predefined compression dictionary. If
   provided, this must be the same dictionary as was used by the compressor that
   produced the data that is to be decompressed.

   .. note::

      If *zdict* is a mutable object (such as a :class:`bytearray`), you must not
      modify its contents between the call to :func:`decompressobj` and the first
      call to the decompressor's ``decompress()`` method.

   .. versionchanged:: 3.3
      Added the *zdict* parameter.
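
A minimal sketch of using the same *zdict* on both sides follows.  The
dictionary contents and the message are made-up examples; in practice the
dictionary would be built from byte sequences known to recur in the real data.

```python
import zlib

# Hypothetical dictionary: byte strings expected to recur in the messages,
# with the most common ones placed toward the end.
zdict = b'"status": "error", "status": "ok", '
message = b'{"status": "ok", "id": 17}'

co = zlib.compressobj(zdict=zdict)
compressed = co.compress(message) + co.flush()

# The decompressor must be given the same dictionary as the compressor.
do = zlib.decompressobj(zdict=zdict)
restored = do.decompress(compressed) + do.flush()
```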


Compression objects support the following methods:


.. method:: Compress.compress(data)

   Compress *data*, returning a bytes object containing compressed data for at least
   part of the data in *data*.  This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method.  Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a bytes object containing the remaining compressed
   output is returned.  *mode* can be selected from the constants
   :const:`Z_NO_FLUSH`, :const:`Z_PARTIAL_FLUSH`, :const:`Z_SYNC_FLUSH`,
   :const:`Z_FULL_FLUSH`, :const:`Z_BLOCK` (zlib 1.2.3.4), or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`.  All modes except :const:`Z_FINISH`
   allow further data to be compressed afterwards, while :const:`Z_FINISH` finishes the
   compressed stream and prevents compressing any more data.  After calling :meth:`flush`
   with *mode* set to :const:`Z_FINISH`, the :meth:`compress` method cannot be called again;
   the only realistic action is to delete the object.
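
The difference between an intermediate flush and the final one can be
sketched as below.  The message contents and variable names are illustrative;
the example pairs a compression object with a decompression object to show
that data flushed with :const:`Z_SYNC_FLUSH` is already decodable.

```python
import zlib

co = zlib.compressobj()
do = zlib.decompressobj()

# Z_SYNC_FLUSH emits everything compressed so far but keeps the stream open.
packet1 = co.compress(b"first message") + co.flush(zlib.Z_SYNC_FLUSH)
received1 = do.decompress(packet1)

# The default mode, Z_FINISH, ends the stream.
packet2 = co.compress(b"second message") + co.flush()
received2 = do.decompress(packet2)

stream_finished = do.eof
```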


.. method:: Compress.copy()

   Returns a copy of the compression object.  This can be used to efficiently
   compress a set of data that share a common initial prefix.
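
One way to apply :meth:`Compress.copy` to inputs with a shared prefix is
sketched here.  The prefix and the record suffixes are made-up sample data.

```python
import zlib

prefix = b"HEADER: shared by every record\n" * 50
suffixes = [b"record A", b"record B"]

base = zlib.compressobj()
head = base.compress(prefix)  # the prefix is compressed only once

streams = []
for suffix in suffixes:
    clone = base.copy()  # continues from the state reached after the prefix
    streams.append(head + clone.compress(suffix) + clone.flush())
```

Each entry in ``streams`` is a complete stream that decompresses to the
prefix followed by its own suffix.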


.. versionchanged:: 3.8
   Added :func:`copy.copy` and :func:`copy.deepcopy` support to compression
   objects.


Decompression objects support the following methods and attributes:


.. attribute:: Decompress.unused_data

   A bytes object which contains any bytes past the end of the compressed data. That is,
   this remains ``b""`` until the last byte that contains compression data is
   available.  If the whole bytestring turned out to contain compressed data, this is
   ``b""``, an empty bytes object.
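
The behaviour of :attr:`Decompress.unused_data` can be seen with input that
has extra bytes appended after a complete compressed stream.  The payload and
trailing bytes are illustrative.

```python
import zlib

blob = zlib.compress(b"payload") + b"TRAILING BYTES"

do = zlib.decompressobj()
payload = do.decompress(blob)

trailing = do.unused_data  # bytes found after the end of the compressed stream
```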


.. attribute:: Decompress.unconsumed_tail

   A bytes object that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer.  This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. attribute:: Decompress.eof

   A boolean indicating whether the end of the compressed data stream has been
   reached.

   This makes it possible to distinguish between a properly formed compressed
   stream, and an incomplete or truncated one.

   .. versionadded:: 3.3
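
A sketch of using :attr:`Decompress.eof` to detect truncation is shown below.
The helper ``stream_is_complete`` is a hypothetical name, and the truncated
sample is produced simply by dropping the last few bytes of a valid stream.

```python
import zlib

complete = zlib.compress(b"x" * 1000)
truncated = complete[:-5]  # drop the trailer and a little more

def stream_is_complete(blob):
    """Return True if *blob* holds a whole compressed stream (hypothetical helper)."""
    do = zlib.decompressobj()
    do.decompress(blob)
    return do.eof

complete_ok = stream_is_complete(complete)
truncated_ok = stream_is_complete(truncated)
```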


.. method:: Decompress.decompress(data, max_length=0)

   Decompress *data*, returning a bytes object containing the uncompressed data
   corresponding to at least part of the data in *data*.  This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method.  Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is non-zero then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This bytestring must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue.  If *max_length* is zero
   then the whole input is decompressed, and :attr:`unconsumed_tail` is empty.

   .. versionchanged:: 3.6
      *max_length* can be used as a keyword argument.
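
The interplay between *max_length* and :attr:`Decompress.unconsumed_tail` can
be sketched as a loop.  The generator ``decompress_bounded`` and its
``max_chunk`` parameter are hypothetical names, and the input is an
artificial, highly compressible sample.

```python
import zlib

def decompress_bounded(compressed, max_chunk=1024):
    """Yield decompressed pieces (hypothetical helper).

    Pieces returned by decompress() are at most *max_chunk* bytes long.
    """
    do = zlib.decompressobj()
    buf = compressed
    while buf:
        yield do.decompress(buf, max_chunk)
        buf = do.unconsumed_tail  # input not yet processed; feed it back
    yield do.flush()  # whatever remains; flush() takes no max_length

original = b"a" * 100_000
pieces = list(decompress_bounded(zlib.compress(original)))
```

Only the pieces coming from :meth:`decompress` are bounded by ``max_chunk``
in this sketch; the final :meth:`flush` call returns any remaining output
without a limit.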


.. method:: Decompress.flush([length])

   All pending input is processed, and a bytes object containing the remaining
   uncompressed output is returned.  After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object.  This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.
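
Saving a decompressor's state with :meth:`Decompress.copy` might look like the
following sketch.  The sample data and the split point are arbitrary; a real
application would also need to remember the input offset that belongs to each
checkpoint.

```python
import zlib

original = bytes(range(256)) * 64
compressed = zlib.compress(original)
half = len(compressed) // 2

do = zlib.decompressobj()
first_part = do.decompress(compressed[:half])

# Snapshot the decompressor state at this offset in the compressed stream.
checkpoint = do.copy()

# Either object can now continue independently from the same position.
rest = do.decompress(compressed[half:]) + do.flush()
rest_again = checkpoint.decompress(compressed[half:]) + checkpoint.flush()
```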


.. versionchanged:: 3.8
   Added :func:`copy.copy` and :func:`copy.deepcopy` support to decompression
   objects.


Information about the version of the zlib library in use is available through
the following constants:


.. data:: ZLIB_VERSION

   The version string of the zlib library that was used for building the module.
   This may be different from the zlib library actually used at runtime, which
   is available as :const:`ZLIB_RUNTIME_VERSION`.


.. data:: ZLIB_RUNTIME_VERSION

   The version string of the zlib library actually loaded by the interpreter.

   .. versionadded:: 3.3
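
A short sketch of comparing the two version constants follows.  The variable
names are illustrative, and the actual version strings depend on the
installation.

```python
import zlib

build_version = zlib.ZLIB_VERSION            # zlib the module was built against
runtime_version = zlib.ZLIB_RUNTIME_VERSION  # zlib actually loaded

# True when the interpreter loaded a different zlib than it was built with.
versions_differ = build_version != runtime_version
```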


.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   https://www.zlib.net
      The zlib library home page.

   https://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.