:mod:`lzma` --- Compression using the LZMA algorithm
====================================================

.. module:: lzma
   :synopsis: A Python wrapper for the liblzma compression library.

.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>

.. versionadded:: 3.3

**Source code:** :source:`Lib/lzma.py`

--------------

This module provides classes and convenience functions for compressing and
decompressing data using the LZMA compression algorithm. Also included is a file
interface supporting the ``.xz`` and legacy ``.lzma`` file formats used by the
:program:`xz` utility, as well as raw compressed streams.

The interface provided by this module is very similar to that of the :mod:`bz2`
module. Note that :class:`LZMAFile` and :class:`bz2.BZ2File` are *not*
thread-safe, so if you need to use a single :class:`LZMAFile` instance
from multiple threads, it is necessary to protect it with a lock.


.. exception:: LZMAError

   This exception is raised when an error occurs during compression or
   decompression, or while initializing the compressor/decompressor state.
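
   A minimal sketch of handling the exception. The junk input below is purely
   illustrative::

      import lzma

      # Bytes that are not a valid .xz or .lzma stream (illustrative only).
      bogus = b"this is not a valid lzma stream"

      try:
          lzma.decompress(bogus)
      except lzma.LZMAError as exc:
          error_message = str(exc)   # human-readable reason for the failure
      else:
          error_message = None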


Reading and writing compressed files
------------------------------------

.. function:: open(filename, mode="rb", *, format=None, check=-1, preset=None, filters=None, encoding=None, errors=None, newline=None)

   Open an LZMA-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be either an actual file name (given as a
   :class:`str`, :class:`bytes` or :term:`path-like <path-like object>` object), in
   which case the named file is opened, or it can be an existing file object
   to read from or write to.

   The *mode* argument can be any of ``"r"``, ``"rb"``, ``"w"``, ``"wb"``,
   ``"x"``, ``"xb"``, ``"a"`` or ``"ab"`` for binary mode, or ``"rt"``,
   ``"wt"``, ``"xt"``, or ``"at"`` for text mode. The default is ``"rb"``.

   When opening a file for reading, the *format* and *filters* arguments have
   the same meanings as for :class:`LZMADecompressor`. In this case, the *check*
   and *preset* arguments should not be used.

   When opening a file for writing, the *format*, *check*, *preset* and
   *filters* arguments have the same meanings as for :class:`LZMACompressor`.

   For binary mode, this function is equivalent to the :class:`LZMAFile`
   constructor: ``LZMAFile(filename, mode, ...)``. In this case, the *encoding*,
   *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`LZMAFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

   .. versionchanged:: 3.4
      Added support for the ``"x"``, ``"xb"`` and ``"xt"`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.
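
   As a rough sketch of text mode, the following writes and re-reads a small
   text file. The file name ``notes.txt.xz`` and the temporary directory are
   illustrative assumptions, not part of the module::

      import lzma
      import os
      import tempfile

      with tempfile.TemporaryDirectory() as tmpdir:
          # Hypothetical file name inside a throwaway directory.
          path = os.path.join(tmpdir, "notes.txt.xz")

          # "wt" wraps the LZMAFile in an io.TextIOWrapper, so str is written.
          with lzma.open(path, "wt", encoding="utf-8") as f:
              f.write("first line\nsecond line\n")

          # "rt" decodes the decompressed bytes back into str.
          with lzma.open(path, "rt", encoding="utf-8") as f:
              lines = f.readlines()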


.. class:: LZMAFile(filename=None, mode="r", *, format=None, check=-1, preset=None, filters=None)

   Open an LZMA-compressed file in binary mode.

   An :class:`LZMAFile` can wrap an already-open :term:`file object`, or operate
   directly on a named file. The *filename* argument specifies either the file
   object to wrap, or the name of the file to open (as a :class:`str`,
   :class:`bytes` or :term:`path-like <path-like object>` object). When wrapping an
   existing file object, the wrapped file will not be closed when the
   :class:`LZMAFile` is closed.

   The *mode* argument can be either ``"r"`` for reading (default), ``"w"`` for
   overwriting, ``"x"`` for exclusive creation, or ``"a"`` for appending. These
   can equivalently be given as ``"rb"``, ``"wb"``, ``"xb"`` and ``"ab"``
   respectively.

   If *filename* is a file object (rather than an actual file name), a mode of
   ``"w"`` does not truncate the file, and is instead equivalent to ``"a"``.

   When opening a file for reading, the input file may be the concatenation of
   multiple separate compressed streams. These are transparently decoded as a
   single logical stream.
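
   The two paragraphs above can be illustrated together with an in-memory
   file object (a sketch; the ``io.BytesIO`` buffer stands in for a real
   file)::

      import io
      import lzma

      buffer = io.BytesIO()

      # Each LZMAFile writes one complete stream. Because a file object is
      # passed, mode "w" continues at the current position instead of
      # truncating.
      for chunk in (b"first stream\n", b"second stream\n"):
          with lzma.LZMAFile(buffer, "w") as f:
              f.write(chunk)

      # Closing the LZMAFile left the wrapped buffer open; rewind and read.
      buffer.seek(0)
      with lzma.LZMAFile(buffer) as f:
          combined = f.read()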

   When opening a file for reading, the *format* and *filters* arguments have
   the same meanings as for :class:`LZMADecompressor`. In this case, the *check*
   and *preset* arguments should not be used.

   When opening a file for writing, the *format*, *check*, *preset* and
   *filters* arguments have the same meanings as for :class:`LZMACompressor`.

   :class:`LZMAFile` supports all the members specified by
   :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`.
   Iteration and the :keyword:`with` statement are supported.

   The following method is also provided:

   .. method:: peek(size=-1)

      Return buffered data without advancing the file position. At least one
      byte of data will be returned, unless EOF has been reached. The exact
      number of bytes returned is unspecified (the *size* argument is ignored).

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`LZMAFile`, it may change the position of the underlying
         file object (e.g. if the :class:`LZMAFile` was constructed by passing a
         file object for *filename*).
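
      A small sketch of peeking at the start of a stream before reading it.
      The ``io.BytesIO`` source and the sample payload are illustrative::

         import io
         import lzma

         source = io.BytesIO(lzma.compress(b"header:payload"))
         with lzma.LZMAFile(source) as f:
             preview = f.peek()     # some non-empty prefix of the data
             position = f.tell()    # peek() did not advance the position
             data = f.read()        # still returns everything from the start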

   .. versionchanged:: 3.4
      Added support for the ``"x"`` and ``"xb"`` modes.

   .. versionchanged:: 3.5
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.


Compressing and decompressing data in memory
--------------------------------------------

.. class:: LZMACompressor(format=FORMAT_XZ, check=-1, preset=None, filters=None)

   Create a compressor object, which can be used to compress data incrementally.

   For a more convenient way of compressing a single chunk of data, see
   :func:`compress`.

   The *format* argument specifies what container format should be used.
   Possible values are:

   * :const:`FORMAT_XZ`: The ``.xz`` container format.
     This is the default format.

   * :const:`FORMAT_ALONE`: The legacy ``.lzma`` container format.
     This format is more limited than ``.xz`` -- it does not support integrity
     checks or multiple filters.

   * :const:`FORMAT_RAW`: A raw data stream, not using any container format.
     This format specifier does not support integrity checks, and requires that
     you always specify a custom filter chain (for both compression and
     decompression). Additionally, data compressed in this manner cannot be
     decompressed using :const:`FORMAT_AUTO` (see :class:`LZMADecompressor`).

   The *check* argument specifies the type of integrity check to include in the
   compressed data. This check is used when decompressing, to ensure that the
   data has not been corrupted. Possible values are:

   * :const:`CHECK_NONE`: No integrity check.
     This is the default (and the only acceptable value) for
     :const:`FORMAT_ALONE` and :const:`FORMAT_RAW`.

   * :const:`CHECK_CRC32`: 32-bit Cyclic Redundancy Check.

   * :const:`CHECK_CRC64`: 64-bit Cyclic Redundancy Check.
     This is the default for :const:`FORMAT_XZ`.

   * :const:`CHECK_SHA256`: 256-bit Secure Hash Algorithm.

   If the specified check is not supported, an :class:`LZMAError` is raised.

   The compression settings can be specified either as a preset compression
   level (with the *preset* argument), or in detail as a custom filter chain
   (with the *filters* argument).

   The *preset* argument (if provided) should be an integer between ``0`` and
   ``9`` (inclusive), optionally OR-ed with the constant
   :const:`PRESET_EXTREME`. If neither *preset* nor *filters* are given, the
   default behavior is to use :const:`PRESET_DEFAULT` (preset level ``6``).
   Higher presets produce smaller output, but make the compression process
   slower.

   .. note::

      In addition to being more CPU-intensive, compression with higher presets
      also requires much more memory (and produces output that needs more memory
      to decompress). With preset ``9`` for example, the overhead for an
      :class:`LZMACompressor` object can be as high as 800 MiB. For this reason,
      it is generally best to stick with the default preset.

   The *filters* argument (if provided) should be a filter chain specifier.
   See :ref:`filter-chain-specs` for details.
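
   For instance, the following sketch selects the container format, integrity
   check and preset explicitly. The particular values are arbitrary choices
   made for illustration::

      import lzma

      compressor = lzma.LZMACompressor(
          format=lzma.FORMAT_XZ,
          check=lzma.CHECK_CRC32,            # always supported
          preset=1 | lzma.PRESET_EXTREME,    # preset level OR-ed with the flag
      )
      payload = b"spam and eggs " * 1000
      compressed = compressor.compress(payload) + compressor.flush()

      # The decompressor reports which integrity check the stream carries.
      decompressor = lzma.LZMADecompressor()
      restored = decompressor.decompress(compressed)
      stream_check = decompressor.check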

   .. method:: compress(data)

      Compress *data* (a :class:`bytes` object), returning a :class:`bytes`
      object containing compressed data for at least part of the input. Some of
      *data* may be buffered internally, for use in later calls to
      :meth:`compress` and :meth:`flush`. The returned data should be
      concatenated with the output of any previous calls to :meth:`compress`.

   .. method:: flush()

      Finish the compression process, returning a :class:`bytes` object
      containing any data stored in the compressor's internal buffers.

      The compressor cannot be used after this method has been called.
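
   A sketch of a streaming helper built on these two methods.
   ``compress_chunks`` is a hypothetical name used only for this example::

      import lzma

      def compress_chunks(chunks):
          """Yield compressed pieces for an iterable of bytes objects."""
          compressor = lzma.LZMACompressor()
          for chunk in chunks:
              piece = compressor.compress(chunk)
              if piece:             # compress() may buffer and return b""
                  yield piece
          yield compressor.flush()  # emits whatever is still buffered

      parts = [b"Some data\n", b"Another piece of data\n"]
      result = b"".join(compress_chunks(parts))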


.. class:: LZMADecompressor(format=FORMAT_AUTO, memlimit=None, filters=None)

   Create a decompressor object, which can be used to decompress data
   incrementally.

   For a more convenient way of decompressing an entire compressed stream at
   once, see :func:`decompress`.

   The *format* argument specifies the container format that should be used. The
   default is :const:`FORMAT_AUTO`, which can decompress both ``.xz`` and
   ``.lzma`` files. Other possible values are :const:`FORMAT_XZ`,
   :const:`FORMAT_ALONE`, and :const:`FORMAT_RAW`.

   The *memlimit* argument specifies a limit (in bytes) on the amount of memory
   that the decompressor can use. When this argument is used, decompression will
   fail with an :class:`LZMAError` if it is not possible to decompress the input
   within the given memory limit.
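
   A sketch of guarding against inputs that need too much memory. The 1 KiB
   limit is deliberately unrealistic so that the failure is easy to
   reproduce::

      import lzma

      compressed = lzma.compress(b"example data" * 100)

      decompressor = lzma.LZMADecompressor(memlimit=1024)  # limit in bytes
      try:
          decompressor.decompress(compressed)
      except lzma.LZMAError:
          limit_exceeded = True
      else:
          limit_exceeded = False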

   The *filters* argument specifies the filter chain that was used to create
   the stream being decompressed. This argument is required if *format* is
   :const:`FORMAT_RAW`, but should not be used for other formats.
   See :ref:`filter-chain-specs` for more information about filter chains.

   .. note::
      This class does not transparently handle inputs containing multiple
      compressed streams, unlike :func:`decompress` and :class:`LZMAFile`. To
      decompress a multi-stream input with :class:`LZMADecompressor`, you must
      create a new decompressor for each stream.
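
   One way to handle a multi-stream input by hand is sketched below.
   ``decompress_all_streams`` is a hypothetical helper name, and the loop
   relies only on the :attr:`eof` and :attr:`unused_data` attributes of this
   class::

      import lzma

      def decompress_all_streams(data):
          """Decompress the concatenation of one or more compressed streams."""
          results = []
          while data:
              decompressor = lzma.LZMADecompressor()  # fresh object per stream
              results.append(decompressor.decompress(data))
              if not decompressor.eof:
                  raise lzma.LZMAError("input ended before the end of a stream")
              data = decompressor.unused_data         # bytes after this stream
          return b"".join(results)

      two_streams = lzma.compress(b"first ") + lzma.compress(b"second")
      combined = decompress_all_streams(two_streams)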

   .. method:: decompress(data, max_length=-1)

      Decompress *data* (a :term:`bytes-like object`), returning
      uncompressed data as bytes. Some of *data* may be buffered
      internally, for use in later calls to :meth:`decompress`. The
      returned data should be concatenated with the output of any
      previous calls to :meth:`decompress`.

      If *max_length* is nonnegative, returns at most *max_length*
      bytes of decompressed data. If this limit is reached and further
      output can be produced, the :attr:`~.needs_input` attribute will
      be set to ``False``. In this case, the next call to
      :meth:`~.decompress` may provide *data* as ``b''`` to obtain
      more of the output.

      If all of the input data was decompressed and returned (either
      because this was less than *max_length* bytes, or because
      *max_length* was negative), the :attr:`~.needs_input` attribute
      will be set to ``True``.

      Attempting to decompress data after the end of stream is reached
      raises an :exc:`EOFError`.  Any data found after the end of the
      stream is ignored and saved in the :attr:`~.unused_data` attribute.
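
      The interplay of *max_length* and :attr:`~.needs_input` can be sketched
      as a generator that never holds more than a fixed amount of output at
      once. ``decompress_bounded`` is a hypothetical name, and all compressed
      input is assumed to be available up front::

         import lzma

         def decompress_bounded(compressed, max_length=4096):
             """Yield decompressed pieces of at most max_length bytes."""
             decompressor = lzma.LZMADecompressor()
             data = compressed
             while not decompressor.eof:
                 piece = decompressor.decompress(data, max_length=max_length)
                 data = b""   # the input was handed over in the first call
                 if piece:
                     yield piece
                 elif decompressor.needs_input:
                     # No output and no buffered input left: truncated stream.
                     raise EOFError("compressed stream is truncated")

         original = b"x" * 100_000
         pieces = list(decompress_bounded(lzma.compress(original)))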

      .. versionchanged:: 3.5
         Added the *max_length* parameter.

   .. attribute:: check

      The ID of the integrity check used by the input stream. This may be
      :const:`CHECK_UNKNOWN` until enough of the input has been decoded to
      determine what integrity check it uses.

   .. attribute:: eof

      ``True`` if the end-of-stream marker has been reached.

   .. attribute:: unused_data

      Data found after the end of the compressed stream.

      Before the end of the stream is reached, this will be ``b""``.

   .. attribute:: needs_input

      ``False`` if the :meth:`.decompress` method can provide more
      decompressed data before requiring new compressed input.

      .. versionadded:: 3.5

.. function:: compress(data, format=FORMAT_XZ, check=-1, preset=None, filters=None)

   Compress *data* (a :class:`bytes` object), returning the compressed data as a
   :class:`bytes` object.

   See :class:`LZMACompressor` above for a description of the *format*, *check*,
   *preset* and *filters* arguments.


.. function:: decompress(data, format=FORMAT_AUTO, memlimit=None, filters=None)

   Decompress *data* (a :class:`bytes` object), returning the uncompressed data
   as a :class:`bytes` object.

   If *data* is the concatenation of multiple distinct compressed streams,
   decompress all of these streams, and return the concatenation of the results.

   See :class:`LZMADecompressor` above for a description of the *format*,
   *memlimit* and *filters* arguments.
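
   A short round trip through both functions, including the multi-stream
   behavior of :func:`decompress`. The sample data and preset are arbitrary::

      import lzma

      original = b"Insert Data Here" * 10
      compressed = lzma.compress(original, preset=3)
      restored = lzma.decompress(compressed)

      # Two independent streams, concatenated, decompress to the joined data.
      joined = lzma.decompress(lzma.compress(b"alpha ") + lzma.compress(b"beta"))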


Miscellaneous
-------------

.. function:: is_check_supported(check)

   Return ``True`` if the given integrity check is supported on this system.

   :const:`CHECK_NONE` and :const:`CHECK_CRC32` are always supported.
   :const:`CHECK_CRC64` and :const:`CHECK_SHA256` may be unavailable if you are
   using a version of :program:`liblzma` that was compiled with a limited
   feature set.
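
   A sketch of picking the first supported check from a preference list. The
   preference order is an assumption made for this example, not a
   recommendation from the module::

      import lzma

      preferred = (lzma.CHECK_SHA256, lzma.CHECK_CRC64, lzma.CHECK_CRC32)
      # CHECK_CRC32 is always supported, so next() always finds a match.
      check = next(c for c in preferred if lzma.is_check_supported(c))

      compressed = lzma.compress(b"payload", check=check)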


.. _filter-chain-specs:

Specifying custom filter chains
-------------------------------

A filter chain specifier is a sequence of dictionaries, where each dictionary
contains the ID and options for a single filter. Each dictionary must contain
the key ``"id"``, and may contain additional keys to specify filter-dependent
options. Valid filter IDs are as follows:

* Compression filters:

  * :const:`FILTER_LZMA1` (for use with :const:`FORMAT_ALONE`)
  * :const:`FILTER_LZMA2` (for use with :const:`FORMAT_XZ` and :const:`FORMAT_RAW`)

* Delta filter:

  * :const:`FILTER_DELTA`

* Branch-Call-Jump (BCJ) filters:

  * :const:`FILTER_X86`
  * :const:`FILTER_IA64`
  * :const:`FILTER_ARM`
  * :const:`FILTER_ARMTHUMB`
  * :const:`FILTER_POWERPC`
  * :const:`FILTER_SPARC`

A filter chain can consist of up to 4 filters, and cannot be empty. The last
filter in the chain must be a compression filter, and any other filters must be
delta or BCJ filters.

Compression filters support the following options (specified as additional
entries in the dictionary representing the filter):

* ``preset``: A compression preset to use as a source of default values for
  options that are not specified explicitly.
* ``dict_size``: Dictionary size in bytes. This should be between 4 KiB and
  1.5 GiB (inclusive).
* ``lc``: Number of literal context bits.
* ``lp``: Number of literal position bits. The sum ``lc + lp`` must be at
  most 4.
* ``pb``: Number of position bits; must be at most 4.
* ``mode``: :const:`MODE_FAST` or :const:`MODE_NORMAL`.
* ``nice_len``: What should be considered a "nice length" for a match.
  This should be 273 or less.
* ``mf``: What match finder to use -- :const:`MF_HC3`, :const:`MF_HC4`,
  :const:`MF_BT2`, :const:`MF_BT3`, or :const:`MF_BT4`.
* ``depth``: Maximum search depth used by match finder. 0 (default) means to
  select automatically based on other filter options.
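
As a sketch of these options, the following chain sets a few of them explicitly
and uses it with :const:`FORMAT_RAW`, where the same chain must be supplied for
both compression and decompression. The option values are arbitrary::

   import lzma

   raw_filters = [
       {
           "id": lzma.FILTER_LZMA2,
           "preset": 6,              # source of defaults for unset options
           "dict_size": 1 << 20,     # 1 MiB
           "lc": 3,
           "lp": 0,
           "pb": 2,
           "mode": lzma.MODE_NORMAL,
           "mf": lzma.MF_BT4,
       },
   ]

   data = b"raw stream example " * 200
   compressed = lzma.compress(data, format=lzma.FORMAT_RAW, filters=raw_filters)
   restored = lzma.decompress(compressed, format=lzma.FORMAT_RAW,
                              filters=raw_filters)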

The delta filter stores the differences between bytes, producing more repetitive
input for the compressor in certain circumstances. It supports one option,
``dist``. This indicates the distance between bytes to be subtracted. The
default is 1, i.e. take the differences between adjacent bytes.
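
A sketch of the delta filter applied to 16-bit samples. The synthetic ramp
below and the choice of ``dist`` equal to the sample width in bytes are
assumptions chosen for illustration::

   import lzma
   import struct

   # 16-bit little-endian samples of a slowly rising ramp.
   samples = b"".join(struct.pack("<H", i) for i in range(20000))

   delta_filters = [
       {"id": lzma.FILTER_DELTA, "dist": 2},  # subtract the byte 2 positions back
       {"id": lzma.FILTER_LZMA2},             # the compression filter goes last
   ]

   plain = lzma.compress(samples)
   with_delta = lzma.compress(samples, filters=delta_filters)
   restored = lzma.decompress(with_delta)

Comparing ``len(plain)`` with ``len(with_delta)`` shows whether the filter helps
on a given input; an improvement is not guaranteed.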

The BCJ filters are intended to be applied to machine code. They convert
relative branches, calls and jumps in the code to use absolute addressing, with
the aim of increasing the redundancy that can be exploited by the compressor.
These filters support one option, ``start_offset``. This specifies the address
that should be mapped to the beginning of the input data. The default is 0.
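
A sketch of a chain that places a BCJ filter in front of the compressor. The
input here is placeholder bytes rather than real x86 machine code, so the
example only demonstrates that the chain round-trips::

   import lzma

   bcj_filters = [
       {"id": lzma.FILTER_X86, "start_offset": 0},
       {"id": lzma.FILTER_LZMA2, "preset": 6},
   ]

   code_like = bytes(range(256)) * 16       # stand-in for an executable image
   compressed = lzma.compress(code_like, filters=bcj_filters)
   restored = lzma.decompress(compressed)   # the .xz header records the chain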


Examples
--------

Reading in a compressed file::

   import lzma
   with lzma.open("file.xz") as f:
       file_content = f.read()

Creating a compressed file::

   import lzma
   data = b"Insert Data Here"
   with lzma.open("file.xz", "w") as f:
       f.write(data)

Compressing data in memory::

   import lzma
   data_in = b"Insert Data Here"
   data_out = lzma.compress(data_in)

Incremental compression::

   import lzma
   lzc = lzma.LZMACompressor()
   out1 = lzc.compress(b"Some data\n")
   out2 = lzc.compress(b"Another piece of data\n")
   out3 = lzc.compress(b"Even more data\n")
   out4 = lzc.flush()
   # Concatenate all the partial results:
   result = b"".join([out1, out2, out3, out4])

Writing compressed data to an already-open file::

   import lzma
   with open("file.xz", "wb") as f:
       f.write(b"This data will not be compressed\n")
       with lzma.open(f, "w") as lzf:
           lzf.write(b"This *will* be compressed\n")
       f.write(b"Not compressed\n")

Creating a compressed file using a custom filter chain::

   import lzma
   my_filters = [
       {"id": lzma.FILTER_DELTA, "dist": 5},
       {"id": lzma.FILTER_LZMA2, "preset": 7 | lzma.PRESET_EXTREME},
   ]
   with lzma.open("file.xz", "w", filters=my_filters) as f:
       f.write(b"blah blah blah")