:mod:`lzma` --- Compression using the LZMA algorithm
====================================================

.. module:: lzma
   :synopsis: A Python wrapper for the liblzma compression library.

.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>
.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com>

.. versionadded:: 3.3

**Source code:** :source:`Lib/lzma.py`

--------------

This module provides classes and convenience functions for compressing and
decompressing data using the LZMA compression algorithm. Also included is a file
interface supporting the ``.xz`` and legacy ``.lzma`` file formats used by the
:program:`xz` utility, as well as raw compressed streams.

The interface provided by this module is very similar to that of the :mod:`bz2`
module. Note that :class:`LZMAFile` and :class:`bz2.BZ2File` are *not*
thread-safe, so if you need to use a single :class:`LZMAFile` instance
from multiple threads, it is necessary to protect it with a lock.


.. exception:: LZMAError

   This exception is raised when an error occurs during compression or
   decompression, or while initializing the compressor/decompressor state.
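
As a rough illustration of the exception described above, the following sketch
catches :class:`LZMAError` when the input is not valid compressed data. The
helper name ``try_decompress`` is hypothetical and not part of the module; it
only shows one possible way to handle the error:

```python
import lzma


def try_decompress(blob):
    """Return the decompressed bytes, or None if blob is not valid LZMA data.

    Hypothetical helper for illustration; not part of the lzma module.
    """
    try:
        return lzma.decompress(blob)
    except lzma.LZMAError:
        # Raised for corrupt input or an unrecognized container format.
        return None


ok = try_decompress(lzma.compress(b"hello"))
bad = try_decompress(b"this is not an xz stream")
```

Returning ``None`` is just one design choice; callers that need the reason for
the failure may prefer to let the exception propagate.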


Reading and writing compressed files
------------------------------------

.. function:: open(filename, mode="rb", *, format=None, check=-1, preset=None, filters=None, encoding=None, errors=None, newline=None)

   Open an LZMA-compressed file in binary or text mode, returning a :term:`file
   object`.

   The *filename* argument can be either an actual file name (given as a
   :class:`str`, :class:`bytes` or :term:`path-like <path-like object>` object), in
   which case the named file is opened, or it can be an existing file object
   to read from or write to.

   The *mode* argument can be any of ``"r"``, ``"rb"``, ``"w"``, ``"wb"``,
   ``"x"``, ``"xb"``, ``"a"`` or ``"ab"`` for binary mode, or ``"rt"``,
   ``"wt"``, ``"xt"``, or ``"at"`` for text mode. The default is ``"rb"``.

   When opening a file for reading, the *format* and *filters* arguments have
   the same meanings as for :class:`LZMADecompressor`. In this case, the *check*
   and *preset* arguments should not be used.

   When opening a file for writing, the *format*, *check*, *preset* and
   *filters* arguments have the same meanings as for :class:`LZMACompressor`.

   For binary mode, this function is equivalent to the :class:`LZMAFile`
   constructor: ``LZMAFile(filename, mode, ...)``. In this case, the *encoding*,
   *errors* and *newline* arguments must not be provided.

   For text mode, a :class:`LZMAFile` object is created, and wrapped in an
   :class:`io.TextIOWrapper` instance with the specified encoding, error
   handling behavior, and line ending(s).

   .. versionchanged:: 3.4
      Added support for the ``"x"``, ``"xb"`` and ``"xt"`` modes.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.


.. class:: LZMAFile(filename=None, mode="r", *, format=None, check=-1, preset=None, filters=None)

   Open an LZMA-compressed file in binary mode.

   An :class:`LZMAFile` can wrap an already-open :term:`file object`, or operate
   directly on a named file. The *filename* argument specifies either the file
   object to wrap, or the name of the file to open (as a :class:`str`,
   :class:`bytes` or :term:`path-like <path-like object>` object). When wrapping an
   existing file object, the wrapped file will not be closed when the
   :class:`LZMAFile` is closed.

   The *mode* argument can be either ``"r"`` for reading (default), ``"w"`` for
   overwriting, ``"x"`` for exclusive creation, or ``"a"`` for appending. These
   can equivalently be given as ``"rb"``, ``"wb"``, ``"xb"`` and ``"ab"``
   respectively.

   If *filename* is a file object (rather than an actual file name), a mode of
   ``"w"`` does not truncate the file, and is instead equivalent to ``"a"``.

   When opening a file for reading, the input file may be the concatenation of
   multiple separate compressed streams. These are transparently decoded as a
   single logical stream.

   When opening a file for reading, the *format* and *filters* arguments have
   the same meanings as for :class:`LZMADecompressor`. In this case, the *check*
   and *preset* arguments should not be used.

   When opening a file for writing, the *format*, *check*, *preset* and
   *filters* arguments have the same meanings as for :class:`LZMACompressor`.

   :class:`LZMAFile` supports all the members specified by
   :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`.
   Iteration and the :keyword:`with` statement are supported.

   The following method is also provided:

   .. method:: peek(size=-1)

      Return buffered data without advancing the file position. At least one
      byte of data will be returned, unless EOF has been reached. The exact
      number of bytes returned is unspecified (the *size* argument is ignored).

      .. note:: While calling :meth:`peek` does not change the file position of
         the :class:`LZMAFile`, it may change the position of the underlying
         file object (e.g. if the :class:`LZMAFile` was constructed by passing a
         file object for *filename*).

   .. versionchanged:: 3.4
      Added support for the ``"x"`` and ``"xb"`` modes.

   .. versionchanged:: 3.5
      The :meth:`~io.BufferedIOBase.read` method now accepts an argument of
      ``None``.

   .. versionchanged:: 3.6
      Accepts a :term:`path-like object`.


Compressing and decompressing data in memory
--------------------------------------------

.. class:: LZMACompressor(format=FORMAT_XZ, check=-1, preset=None, filters=None)

   Create a compressor object, which can be used to compress data incrementally.

   For a more convenient way of compressing a single chunk of data, see
   :func:`compress`.

   The *format* argument specifies what container format should be used.
   Possible values are:

   * :const:`FORMAT_XZ`: The ``.xz`` container format.
     This is the default format.

   * :const:`FORMAT_ALONE`: The legacy ``.lzma`` container format.
     This format is more limited than ``.xz`` -- it does not support integrity
     checks or multiple filters.

   * :const:`FORMAT_RAW`: A raw data stream, not using any container format.
     This format specifier does not support integrity checks, and requires that
     you always specify a custom filter chain (for both compression and
     decompression). Additionally, data compressed in this manner cannot be
     decompressed using :const:`FORMAT_AUTO` (see :class:`LZMADecompressor`).

   The *check* argument specifies the type of integrity check to include in the
   compressed data. This check is used when decompressing, to ensure that the
   data has not been corrupted. Possible values are:

   * :const:`CHECK_NONE`: No integrity check.
     This is the default (and the only acceptable value) for
     :const:`FORMAT_ALONE` and :const:`FORMAT_RAW`.

   * :const:`CHECK_CRC32`: 32-bit Cyclic Redundancy Check.

   * :const:`CHECK_CRC64`: 64-bit Cyclic Redundancy Check.
     This is the default for :const:`FORMAT_XZ`.

   * :const:`CHECK_SHA256`: 256-bit Secure Hash Algorithm.

   If the specified check is not supported, an :class:`LZMAError` is raised.

   The compression settings can be specified either as a preset compression
   level (with the *preset* argument), or in detail as a custom filter chain
   (with the *filters* argument).

   The *preset* argument (if provided) should be an integer between ``0`` and
   ``9`` (inclusive), optionally OR-ed with the constant
   :const:`PRESET_EXTREME`. If neither *preset* nor *filters* are given, the
   default behavior is to use :const:`PRESET_DEFAULT` (preset level ``6``).
   Higher presets produce smaller output, but make the compression process
   slower.

   .. note::

      In addition to being more CPU-intensive, compression with higher presets
      also requires much more memory (and produces output that needs more memory
      to decompress). With preset ``9`` for example, the overhead for an
      :class:`LZMACompressor` object can be as high as 800 MiB. For this reason,
      it is generally best to stick with the default preset.

   The *filters* argument (if provided) should be a filter chain specifier.
   See :ref:`filter-chain-specs` for details.

   .. method:: compress(data)

      Compress *data* (a :class:`bytes` object), returning a :class:`bytes`
      object containing compressed data for at least part of the input. Some of
      *data* may be buffered internally, for use in later calls to
      :meth:`compress` and :meth:`flush`. The returned data should be
      concatenated with the output of any previous calls to :meth:`compress`.

   .. method:: flush()

      Finish the compression process, returning a :class:`bytes` object
      containing any data stored in the compressor's internal buffers.

      The compressor cannot be used after this method has been called.


.. class:: LZMADecompressor(format=FORMAT_AUTO, memlimit=None, filters=None)

   Create a decompressor object, which can be used to decompress data
   incrementally.

   For a more convenient way of decompressing an entire compressed stream at
   once, see :func:`decompress`.

   The *format* argument specifies the container format that should be used. The
   default is :const:`FORMAT_AUTO`, which can decompress both ``.xz`` and
   ``.lzma`` files. Other possible values are :const:`FORMAT_XZ`,
   :const:`FORMAT_ALONE`, and :const:`FORMAT_RAW`.

   The *memlimit* argument specifies a limit (in bytes) on the amount of memory
   that the decompressor can use. When this argument is used, decompression will
   fail with an :class:`LZMAError` if it is not possible to decompress the input
   within the given memory limit.

   The *filters* argument specifies the filter chain that was used to create
   the stream being decompressed. This argument is required if *format* is
   :const:`FORMAT_RAW`, but should not be used for other formats.
   See :ref:`filter-chain-specs` for more information about filter chains.

   .. note::
      This class does not transparently handle inputs containing multiple
      compressed streams, unlike :func:`decompress` and :class:`LZMAFile`. To
      decompress a multi-stream input with :class:`LZMADecompressor`, you must
      create a new decompressor for each stream.

   .. method:: decompress(data, max_length=-1)

      Decompress *data* (a :term:`bytes-like object`), returning
      uncompressed data as bytes. Some of *data* may be buffered
      internally, for use in later calls to :meth:`decompress`. The
      returned data should be concatenated with the output of any
      previous calls to :meth:`decompress`.

      If *max_length* is nonnegative, returns at most *max_length*
      bytes of decompressed data. If this limit is reached and further
      output can be produced, the :attr:`~.needs_input` attribute will
      be set to ``False``. In this case, the next call to
      :meth:`~.decompress` may provide *data* as ``b''`` to obtain
      more of the output.

      If all of the input data was decompressed and returned (either
      because this was less than *max_length* bytes, or because
      *max_length* was negative), the :attr:`~.needs_input` attribute
      will be set to ``True``.

      Attempting to decompress data after the end of stream is reached
      raises an :exc:`EOFError`. Any data found after the end of the
      stream is ignored and saved in the :attr:`~.unused_data` attribute.

      .. versionchanged:: 3.5
         Added the *max_length* parameter.

   .. attribute:: check

      The ID of the integrity check used by the input stream. This may be
      :const:`CHECK_UNKNOWN` until enough of the input has been decoded to
      determine what integrity check it uses.

   .. attribute:: eof

      ``True`` if the end-of-stream marker has been reached.

   .. attribute:: unused_data

      Data found after the end of the compressed stream.

      Before the end of the stream is reached, this will be ``b""``.

   .. attribute:: needs_input

      ``False`` if the :meth:`.decompress` method can provide more
      decompressed data before requiring new compressed input.

      .. versionadded:: 3.5

.. function:: compress(data, format=FORMAT_XZ, check=-1, preset=None, filters=None)

   Compress *data* (a :class:`bytes` object), returning the compressed data as a
   :class:`bytes` object.

   See :class:`LZMACompressor` above for a description of the *format*, *check*,
   *preset* and *filters* arguments.


.. function:: decompress(data, format=FORMAT_AUTO, memlimit=None, filters=None)

   Decompress *data* (a :class:`bytes` object), returning the uncompressed data
   as a :class:`bytes` object.

   If *data* is the concatenation of multiple distinct compressed streams,
   decompress all of these streams, and return the concatenation of the results.

   See :class:`LZMADecompressor` above for a description of the *format*,
   *memlimit* and *filters* arguments.


Miscellaneous
-------------

.. function:: is_check_supported(check)

   Return ``True`` if the given integrity check is supported on this system.

   :const:`CHECK_NONE` and :const:`CHECK_CRC32` are always supported.
   :const:`CHECK_CRC64` and :const:`CHECK_SHA256` may be unavailable if you are
   using a version of :program:`liblzma` that was compiled with a limited
   feature set.


.. _filter-chain-specs:

Specifying custom filter chains
-------------------------------

A filter chain specifier is a sequence of dictionaries, where each dictionary
contains the ID and options for a single filter. Each dictionary must contain
the key ``"id"``, and may contain additional keys to specify filter-dependent
options. Valid filter IDs are as follows:

* Compression filters:

  * :const:`FILTER_LZMA1` (for use with :const:`FORMAT_ALONE`)
  * :const:`FILTER_LZMA2` (for use with :const:`FORMAT_XZ` and :const:`FORMAT_RAW`)

* Delta filter:

  * :const:`FILTER_DELTA`

* Branch-Call-Jump (BCJ) filters:

  * :const:`FILTER_X86`
  * :const:`FILTER_IA64`
  * :const:`FILTER_ARM`
  * :const:`FILTER_ARMTHUMB`
  * :const:`FILTER_POWERPC`
  * :const:`FILTER_SPARC`

A filter chain can consist of up to 4 filters, and cannot be empty. The last
filter in the chain must be a compression filter, and any other filters must be
delta or BCJ filters.
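
The rules above can be sketched with a two-filter chain: a delta filter
followed by a compression filter in the last position. The example uses
:const:`FORMAT_RAW`, which requires the same chain for both compression and
decompression. The ``dist`` and ``preset`` values and the sample ``payload``
are arbitrary choices for illustration, not recommendations:

```python
import lzma

# A chain of two filters: the last one must be a compression filter,
# and any filter before it must be a delta or BCJ filter.
filters = [
    {"id": lzma.FILTER_DELTA, "dist": 4},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

payload = bytes(range(256)) * 16  # arbitrary sample data

# FORMAT_RAW has no container header, so the chain must be passed
# explicitly when compressing and again when decompressing.
raw = lzma.compress(payload, format=lzma.FORMAT_RAW, filters=filters)
restored = lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=filters)
```

Because a raw stream does not record its filter chain, the chain has to be
stored or agreed on separately by whoever decompresses the data.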

Compression filters support the following options (specified as additional
entries in the dictionary representing the filter):

   * ``preset``: A compression preset to use as a source of default values for
     options that are not specified explicitly.
   * ``dict_size``: Dictionary size in bytes. This should be between 4 KiB and
     1.5 GiB (inclusive).
   * ``lc``: Number of literal context bits.
   * ``lp``: Number of literal position bits. The sum ``lc + lp`` must be at
     most 4.
   * ``pb``: Number of position bits; must be at most 4.
   * ``mode``: :const:`MODE_FAST` or :const:`MODE_NORMAL`.
   * ``nice_len``: What should be considered a "nice length" for a match.
     This should be 273 or less.
   * ``mf``: What match finder to use -- :const:`MF_HC3`, :const:`MF_HC4`,
     :const:`MF_BT2`, :const:`MF_BT3`, or :const:`MF_BT4`.
   * ``depth``: Maximum search depth used by match finder. 0 (default) means to
     select automatically based on other filter options.

The delta filter stores the differences between bytes, producing more repetitive
input for the compressor in certain circumstances. It supports one option,
``dist``. This indicates the distance between bytes to be subtracted. The
default is 1, i.e. take the differences between adjacent bytes.

The BCJ filters are intended to be applied to machine code. They convert
relative branches, calls and jumps in the code to use absolute addressing, with
the aim of increasing the redundancy that can be exploited by the compressor.
These filters support one option, ``start_offset``. This specifies the address
that should be mapped to the beginning of the input data. The default is 0.


Examples
--------

Reading in a compressed file::

    import lzma
    with lzma.open("file.xz") as f:
        file_content = f.read()

Creating a compressed file::

    import lzma
    data = b"Insert Data Here"
    with lzma.open("file.xz", "w") as f:
        f.write(data)

Compressing data in memory::

    import lzma
    data_in = b"Insert Data Here"
    data_out = lzma.compress(data_in)

Incremental compression::

    import lzma
    lzc = lzma.LZMACompressor()
    out1 = lzc.compress(b"Some data\n")
    out2 = lzc.compress(b"Another piece of data\n")
    out3 = lzc.compress(b"Even more data\n")
    out4 = lzc.flush()
    # Concatenate all the partial results:
    result = b"".join([out1, out2, out3, out4])
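
Incremental decompression can be sketched in a similar way. This is a minimal
example, assuming the compressed input is fed in small pieces and the output is
capped per call with *max_length*; the sizes ``64`` and ``256`` and the sample
data are arbitrary choices for illustration:

```python
import lzma

original = b"Some data\n" * 1000
compressed = lzma.compress(original)

lzd = lzma.LZMADecompressor()
pieces = []
pos = 0
while not lzd.eof:
    if lzd.needs_input:
        # The decompressor has drained its buffer: feed the next slice.
        if pos >= len(compressed):
            raise EOFError("compressed data ended before end-of-stream marker")
        feed = compressed[pos:pos + 64]
        pos += 64
    else:
        # More output is pending from earlier input: pass empty data.
        feed = b""
    # Return at most 256 bytes of decompressed data per call.
    pieces.append(lzd.decompress(feed, max_length=256))

result = b"".join(pieces)
```

Capping the output per call keeps memory use bounded even when a small input
expands to a very large output.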

Writing compressed data to an already-open file::

    import lzma
    with open("file.xz", "wb") as f:
        f.write(b"This data will not be compressed\n")
        with lzma.open(f, "w") as lzf:
            lzf.write(b"This *will* be compressed\n")
        f.write(b"Not compressed\n")

Creating a compressed file using a custom filter chain::

    import lzma
    my_filters = [
        {"id": lzma.FILTER_DELTA, "dist": 5},
        {"id": lzma.FILTER_LZMA2, "preset": 7 | lzma.PRESET_EXTREME},
    ]
    with lzma.open("file.xz", "w", filters=my_filters) as f:
        f.write(b"blah blah blah")
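
Decompressing a multi-stream input with :class:`LZMADecompressor` requires a
new decompressor for each stream. The following is a minimal sketch of that
approach; the helper name ``decompress_all`` is hypothetical and not part of
the module, and :func:`decompress` already does this work for you:

```python
import lzma


def decompress_all(data):
    """Decompress every concatenated stream in data (hypothetical helper)."""
    results = []
    while data:
        # Each stream needs its own decompressor object.
        lzd = lzma.LZMADecompressor()
        results.append(lzd.decompress(data))
        if not lzd.eof:
            raise lzma.LZMAError("input ended before the end-of-stream marker")
        # Whatever follows the finished stream starts the next one.
        data = lzd.unused_data
    return b"".join(results)


two_streams = lzma.compress(b"first,") + lzma.compress(b"second")
combined = decompress_all(two_streams)
```

The loop relies on :attr:`~LZMADecompressor.unused_data` to hand the leftover
bytes to the next decompressor.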