:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>

**Source code:** :source:`Lib/tarfile.py`

--------------

The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip, bz2 and lzma compression.
Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the
higher-level functions in :ref:`shutil <archiving-operations>`.

Some facts and figures:

* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
  if the respective modules are available.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for all variants of the *sparse* extension
  including restoration of sparse files.

* read/write support for the POSIX.1-2001 (pax) format.

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

.. versionchanged:: 3.3
   Added support for :mod:`lzma` compression.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* has to be a string of the form ``'filemode[:compression]'``, it defaults
   to ``'r'``. Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'r:xz'``       | Open for reading with lzma compression.     |
   +------------------+---------------------------------------------+
   | ``'x'`` or       | Create a tarfile exclusively without        |
   | ``'x:'``         | compression.                                |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:gz'``       | Create a tarfile with gzip compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:bz2'``      | Create a tarfile with bzip2 compression.    |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:xz'``       | Create a tarfile with lzma compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+
   | ``'w:xz'``       | Open for lzma compressed writing.           |
   +------------------+---------------------------------------------+

   Note that ``'a:gz'``, ``'a:bz2'`` or ``'a:xz'`` is not possible. If *mode*
   is not suitable to open a certain (compressed) file for reading,
   :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this.  If a
   compression method is not supported, :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a :term:`file object`
   opened in binary mode for *name*. It is supposed to be at position 0.

   For modes ``'w:gz'``, ``'r:gz'``, ``'w:bz2'``, ``'r:bz2'``, ``'x:gz'``,
   ``'x:bz2'``, :func:`tarfile.open` accepts the keyword argument
   *compresslevel* (default ``9``) to specify the compression level of the file.

   For modes ``'w:xz'`` and ``'x:xz'``, :func:`tarfile.open` accepts the
   keyword argument *preset* to specify the compression level of the file.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``.  :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks.  No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket :term:`file object` or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access, see :ref:`tar-examples`.  The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|xz'``  | Open an lzma compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|xz'``  | Open an lzma compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. class:: TarFile
   :noindex:

   Class for reading and writing tar archives. Do not use this class directly:
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.
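
The difference between the ``':'`` and ``'|'`` mode families can be seen in a
small in-memory round trip. The following is only a sketch: the member name
``greeting.txt`` and the use of :class:`io.BytesIO` as *fileobj* are
illustrative assumptions, not something this section prescribes.

```python
import io
import tarfile

# Build a small gzip-compressed archive entirely in memory.
payload = b"hello tar\n"
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="greeting.txt")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Mode 'r' (equivalent to 'r:*') detects the compression transparently.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    random_access_names = tar.getnames()

# Mode 'r|*' reads the same data as a forward-only stream of blocks.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r|*") as tar:
    stream_names = [member.name for member in tar]
```

Both reads should see the same single member; the stream variant is the one
to reach for when the source cannot seek, such as a pipe or a socket.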


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read. *name* may be a :class:`str`, file, or file-like object.

   .. versionchanged:: 3.9
      Support for file and file-like objects.


The :mod:`tarfile` module defines the following exceptions:


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.


.. exception:: FilterError

   Base class for members :ref:`refused <tarfile-extraction-refuse>` by
   filters.

   .. attribute:: tarinfo

      Information about the member that the filter refused to extract,
      as :ref:`TarInfo <tarinfo-objects>`.

.. exception:: AbsolutePathError

   Raised to refuse extracting a member with an absolute path.

.. exception:: OutsideDestinationError

   Raised to refuse extracting a member outside the destination directory.

.. exception:: SpecialFileError

   Raised to refuse extracting a special file (e.g. a device or pipe).

.. exception:: AbsoluteLinkError

   Raised to refuse extracting a symbolic link with an absolute path.

.. exception:: LinkOutsideDestinationError

   Raised to refuse extracting a symbolic link pointing outside the destination
   directory.


The following constants are available at the module level:

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows, the value returned by
   :func:`sys.getfilesystemencoding` otherwise.


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`PAX_FORMAT`.

   .. versionchanged:: 3.8
      The default format for new archives was changed to
      :const:`PAX_FORMAT` from :const:`GNU_FORMAT`.
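
As a rough illustration of :func:`is_tarfile` and the exception hierarchy
described earlier in this section, the sketch below writes a file that is
deliberately not a tar archive and then tries to open it. The file name
``bogus.tar`` and its contents are made up for the example.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    bogus_path = os.path.join(workdir, "bogus.tar")  # hypothetical file name
    with open(bogus_path, "wb") as handle:
        handle.write(b"this is plainly not a tar archive")

    # is_tarfile() reports whether the module can read the file at all.
    looks_like_tar = tarfile.is_tarfile(bogus_path)

    # Opening the same file for reading fails with ReadError,
    # which derives from the common base class TarError.
    try:
        tarfile.open(bogus_path, mode="r")
    except tarfile.ReadError as exc:
        caught = exc
    else:
        caught = None
```

Catching :exc:`TarError` instead of :exc:`ReadError` would also work here and
is a reasonable choice when the caller does not care which tar-specific
failure occurred.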


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   :ref:`archiving-operations`
      Documentation of the higher-level archiving facilities provided by the
      standard :mod:`shutil` module.

   `GNU tar manual, Basic Tar Format <https://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.

A :class:`TarFile` object can be used as a context manager in a :keyword:`with`
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
:ref:`tar-examples` section for a use case.
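
A minimal sketch of the context-manager usage described above, assuming an
in-memory :class:`io.BytesIO` buffer as *fileobj* and a made-up member name
``notes.txt``:

```python
import io
import tarfile

archive = io.BytesIO()
payload = b"context managed\n"

# Leaving the with block closes the TarFile and finalizes the archive.
with tarfile.open(fileobj=archive, mode="w") as tar:
    info = tarfile.TarInfo(name="notes.txt")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# A file object passed in as fileobj stays open after the TarFile is closed.
archive_still_open = not archive.closed

archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    extracted = tar.extractfile("notes.txt").read()
```

Because the buffer was supplied by the caller, it remains the caller's job to
close it once it is no longer needed.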

.. versionadded:: 3.2
   Added support for the context management protocol.

.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=1)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. *name* may be a :term:`path-like object`.
   It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file, ``'w'`` to create a new file overwriting an existing
   one, or ``'x'`` to create a new file only if it does not already exist.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format for writing. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level. When reading, the format will be automatically detected, even
   if different formats are present in a single archive.

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   *errorlevel* controls how extraction errors are handled,
   see :attr:`the corresponding attribute <~TarFile.errorlevel>`.

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. classmethod:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.
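
The lookup rules for :meth:`~TarFile.getmember` and
:meth:`~TarFile.getmembers` can be sketched with an archive that stores the
same name twice. The member names and payloads below are invented for the
illustration.

```python
import io
import tarfile

# Store "a.txt" twice so that the archive contains a duplicate member.
entries = [("a.txt", b"first"), ("b.txt", b"second"), ("a.txt", b"third")]
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    for name, data in entries:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    # getmembers() keeps archive order, duplicates included.
    ordered_names = [member.name for member in tar.getmembers()]

    # getmember() resolves a duplicated name to its last occurrence.
    latest_content = tar.extractfile(tar.getmember("a.txt")).read()

    # An unknown name raises KeyError.
    try:
        tar.getmember("missing.txt")
    except KeyError:
        missing_raises = True
    else:
        missing_raises = False
```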


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True, *, members=None)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced. If optional *members* is
   given, it must be a subset of the list returned by :meth:`getmembers`.

   .. versionchanged:: 3.5
      Added the *members* parameter.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if there is no more
   available.


.. method:: TarFile.extractall(path=".", members=None, *, numeric_owner=False, filter=None)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   The *filter* argument, which was added in Python 3.11.4, specifies how
   ``members`` are modified or rejected before extraction.
   See :ref:`tarfile-extraction-filter` for details.
   It is recommended to set this explicitly depending on which *tar* features
   you need to support.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

      Set ``filter='data'`` to prevent the most dangerous security issues,
      and read the :ref:`tarfile-extraction-filter` section for details.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.11.4
      Added the *filter* parameter.


.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False, filter=None)

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*. *path* may be a :term:`path-like object`.
   File attributes (owner, mtime, mode) are set unless *set_attrs* is false.

   The *numeric_owner* and *filter* arguments are the same as
   for :meth:`extractall`.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.

      Set ``filter='data'`` to prevent the most dangerous security issues,
      and read the :ref:`tarfile-extraction-filter` section for details.

   .. versionchanged:: 3.2
      Added the *set_attrs* parameter.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.11.4
      Added the *filter* parameter.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be
   a filename or a :class:`TarInfo` object. If *member* is a regular file or
   a link, an :class:`io.BufferedReader` object is returned. For all other
   existing members, :const:`None` is returned. If *member* does not appear
   in the archive, :exc:`KeyError` is raised.

   .. versionchanged:: 3.3
      Return an :class:`io.BufferedReader` object.

.. attribute:: TarFile.errorlevel
   :type: int

   If *errorlevel* is ``0``, errors are ignored when using :meth:`TarFile.extract`
   and :meth:`TarFile.extractall`.
   Nevertheless, they appear as error messages in the debug output when
   *debug* is greater than 0.
   If ``1`` (the default), all *fatal* errors are raised as :exc:`OSError` or
   :exc:`FilterError` exceptions. If ``2``, all *non-fatal* errors are raised
   as :exc:`TarError` exceptions as well.

   Some exceptions, e.g. ones caused by wrong argument types or data
   corruption, are always raised.

   Custom :ref:`extraction filters <tarfile-extraction-filter>`
   should raise :exc:`FilterError` for *fatal* errors
   and :exc:`ExtractError` for *non-fatal* ones.

   Note that when an exception is raised, the archive may be partially
   extracted. It is the user’s responsibility to clean up.

.. attribute:: TarFile.extraction_filter

   .. versionadded:: 3.11.4

   The :ref:`extraction filter <tarfile-extraction-filter>` used
   as a default for the *filter* argument of :meth:`~TarFile.extract`
   and :meth:`~TarFile.extractall`.

   The attribute may be ``None`` or a callable.
   String names are not allowed for this attribute, unlike the *filter*
   argument to :meth:`~TarFile.extract`.

   If ``extraction_filter`` is ``None`` (the default),
   calling an extraction method without a *filter* argument will
   use the :func:`fully_trusted <fully_trusted_filter>` filter for
   compatibility with previous Python versions.

   In Python 3.12+, leaving ``extraction_filter=None`` will emit a
   ``DeprecationWarning``.

   In Python 3.14+, leaving ``extraction_filter=None`` will cause
   extraction methods to use the :func:`data <data_filter>` filter by default.

   The attribute may be set on instances or overridden in subclasses.
   It is also possible to set it on the ``TarFile`` class itself to set a
   global default. Because this affects all uses of *tarfile*,
   it is best practice to do so only in top-level applications or
   :mod:`site configuration <site>`.
   To set a global default this way, a filter function needs to be wrapped in
   :func:`staticmethod()` to prevent injection of a ``self`` argument.

.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)

   Add the file *name* to the archive. *name* may be any type of file
   (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an
   alternative name for the file in the archive. Directories are added
   recursively by default. This can be avoided by setting *recursive* to
   :const:`False`. Recursion adds entries in sorted order.
   If *filter* is given, it
   should be a function that takes a :class:`TarInfo` object argument and
   returns the changed :class:`TarInfo` object. If it instead returns
   :const:`None` the :class:`TarInfo` object will be excluded from the
   archive. See :ref:`tar-examples` for an example.

   .. versionchanged:: 3.2
      Added the *filter* parameter.

   .. versionchanged:: 3.7
      Recursion adds entries in sorted order.
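
   The *filter* behaviour of :meth:`TarFile.add` described above can be
   sketched as follows. This is a minimal illustration, not the example from
   :ref:`tar-examples`: the helper name ``reset_owner``, the directory layout
   and the file names are assumptions made up for this sketch. It normalizes
   ownership on every member and excludes ``.pyc`` files by returning ``None``:

   .. code-block:: python

      import io
      import tarfile
      import tempfile
      from pathlib import Path


      def reset_owner(tarinfo):
          """Hypothetical add() filter: drop bytecode, normalize ownership."""
          if tarinfo.name.endswith(".pyc"):
              return None  # returning None excludes the member
          tarinfo.uid = tarinfo.gid = 0
          tarinfo.uname = tarinfo.gname = "root"
          return tarinfo


      with tempfile.TemporaryDirectory() as tmp:
          # Build a small throwaway tree to archive.
          src = Path(tmp) / "project"
          src.mkdir()
          (src / "main.py").write_text("print('hello')\n")
          (src / "main.pyc").write_bytes(b"\x00")

          # Write the archive into memory; the directory is added recursively.
          buf = io.BytesIO()
          with tarfile.open(fileobj=buf, mode="w") as tar:
              tar.add(str(src), arcname="project", filter=reset_owner)

          # Read it back to see which members survived the filter.
          buf.seek(0)
          with tarfile.open(fileobj=buf, mode="r") as tar:
              names = tar.getnames()
              owners = {member.uname for member in tar.getmembers()}

   In this sketch the filter is also applied to the directory entry itself, so
   ``names`` should contain ``project`` and ``project/main.py`` only, and every
   stored member should carry the user name ``root``.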


.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   it should be a :term:`binary file`, and
   ``tarinfo.size`` bytes are read from it and added to the archive. You can
   create :class:`TarInfo` objects directly, or by using :meth:`gettarinfo`.


.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object from the result of :func:`os.stat` or
   equivalent on an existing file. The file is either named by *name*, or
   specified as a :term:`file object` *fileobj* with a file descriptor.
   *name* may be a :term:`path-like object`. If
   given, *arcname* specifies an alternative name for the file in the
   archive, otherwise, the name is taken from *fileobj*’s
   :attr:`~io.FileIO.name` attribute, or the *name* argument. The name
   should be a text string.

   You can modify
   some of the :class:`TarInfo`’s attributes before you add it using :meth:`addfile`.
   If the file object is not an ordinary file object positioned at the
   beginning of the file, attributes such as :attr:`~TarInfo.size` may need
   modifying. This is the case for objects such as :class:`~gzip.GzipFile`.
   The :attr:`~TarInfo.name` may also be modified, in which case *arcname*
   could be a dummy string.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.



.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`~TarFile.getmember`, :meth:`~TarFile.getmembers` and
:meth:`~TarFile.gettarinfo`.

Modifying the objects returned by :meth:`~!TarFile.getmember` or
:meth:`~!TarFile.getmembers` will affect all subsequent
operations on the archive.
For cases where this is unwanted, you can use :mod:`copy.copy() <copy>` or
call the :meth:`~TarInfo.replace` method to create a modified copy in one step.

Several attributes can be set to ``None`` to indicate that a piece of metadata
is unused or unknown.
Different :class:`TarFile` methods handle ``None`` differently:

- The :meth:`~TarFile.extract` or :meth:`~TarFile.extractall` methods will
  ignore the corresponding metadata, leaving it set to a default.
- :meth:`~TarFile.addfile` will fail.
- :meth:`~TarFile.list` will print a placeholder string.


.. versionchanged:: 3.11.4
   Added :meth:`~TarInfo.replace` and handling of ``None``.


.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. classmethod:: TarInfo.frombuf(buf, encoding, errors)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   Raises :exc:`HeaderError` if the buffer is invalid.


.. classmethod:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.


A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name
   :type: str

   Name of the archive member.


.. attribute:: TarInfo.size
   :type: int

   Size in bytes.


.. attribute:: TarInfo.mtime
   :type: int | float

   Time of last modification in seconds since the :ref:`epoch <epoch>`,
   as in :attr:`os.stat_result.st_mtime`.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.mode
   :type: int

   Permission bits, as for :func:`os.chmod`.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.type

   File type. *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is*()`` methods below.


.. attribute:: TarInfo.linkname
   :type: str

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid
   :type: int

   User ID of the user who originally stored this member.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.gid
   :type: int

   Group ID of the user who originally stored this member.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.uname
   :type: str

   User name.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.gname
   :type: str

   Group name.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.pax_headers
   :type: dict

   A dictionary containing key-value pairs of an associated pax extended header.

.. method:: TarInfo.replace(name=..., mtime=..., mode=..., linkname=...,
                            uid=..., gid=..., uname=..., gname=...,
                            deep=True)

   .. versionadded:: 3.11.4

   Return a *new* copy of the :class:`!TarInfo` object with the given attributes
   changed. For example, to return a ``TarInfo`` with the group name set to
   ``'staff'``, use::

       new_tarinfo = old_tarinfo.replace(gname='staff')

   By default, a deep copy is made.
   If *deep* is false, the copy is shallow, i.e. ``pax_headers``
   and any custom attributes are shared with the original ``TarInfo`` object.

A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.


.. _tarfile-extraction-filter:

Extraction filters
------------------

.. versionadded:: 3.11.4

The *tar* format is designed to capture all details of a UNIX-like filesystem,
which makes it very powerful.
Unfortunately, the features make it easy to create tar files that have
unintended -- and possibly malicious -- effects when extracted.
For example, extracting a tar file can overwrite arbitrary files in various
ways (e.g. by using absolute paths, ``..`` path components, or symlinks that
affect later members).

In most cases, the full functionality is not needed.
Therefore, *tarfile* supports extraction filters: a mechanism to limit
functionality, and thus mitigate some of the security issues.

.. seealso::

   :pep:`706`
      Contains further motivation and rationale behind the design.

The *filter* argument to :meth:`TarFile.extract` or :meth:`~TarFile.extractall`
can be:

* the string ``'fully_trusted'``: Honor all metadata as specified in the
  archive.
  Should be used if the user trusts the archive completely, or implements
  their own complex verification.

* the string ``'tar'``: Honor most *tar*-specific features (i.e. features of
  UNIX-like filesystems), but block features that are very likely to be
  surprising or malicious. See :func:`tar_filter` for details.

* the string ``'data'``: Ignore or block most features specific to UNIX-like
  filesystems. Intended for extracting cross-platform data archives.
  See :func:`data_filter` for details.

* ``None`` (default): Use :attr:`TarFile.extraction_filter`.

  If that is also ``None`` (the default), the ``'fully_trusted'``
  filter will be used (for compatibility with earlier versions of Python).

  In Python 3.12, the default will emit a ``DeprecationWarning``.

  In Python 3.14, the ``'data'`` filter will become the default instead.
  It's possible to switch earlier; see :attr:`TarFile.extraction_filter`.

* A callable which will be called for each extracted member with a
  :ref:`TarInfo <tarinfo-objects>` describing the member and the destination
  path to where the archive is extracted (i.e. the same path is used for all
  members)::

      filter(member: TarInfo, path: str, /) -> TarInfo | None

  The callable is called just before each member is extracted, so it can
  take the current state of the disk into account.
  It can:

  - return a :class:`TarInfo` object which will be used instead of the metadata
    in the archive, or
  - return ``None``, in which case the member will be skipped, or
  - raise an exception to abort the operation or skip the member,
    depending on :attr:`~TarFile.errorlevel`.
    Note that when extraction is aborted, :meth:`~TarFile.extractall` may leave
    the archive partially extracted. It does not attempt to clean up.
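
The callable form of *filter* can be sketched as below. This is a hedged
illustration rather than a recommended policy: the name
``size_limited_filter`` and the ``MAX_SIZE`` limit are hypothetical, and the
sketch assumes a Python version in which :func:`data_filter` is available.
It first delegates to the ``'data'`` filter and then skips large regular files
by returning ``None``:

.. code-block:: python

   import io
   import tarfile
   import tempfile
   from pathlib import Path

   MAX_SIZE = 1024  # hypothetical per-file limit in bytes


   def size_limited_filter(member, path):
       """Apply the 'data' filter, then skip regular files above MAX_SIZE."""
       member = tarfile.data_filter(member, path)
       if member.isfile() and member.size > MAX_SIZE:
           return None  # returning None skips this member
       return member


   # Build a small in-memory archive with one small and one large file.
   buf = io.BytesIO()
   with tarfile.open(fileobj=buf, mode="w") as tar:
       for name, payload in [("small.txt", b"ok"), ("big.bin", b"x" * 4096)]:
           info = tarfile.TarInfo(name)
           info.size = len(payload)
           tar.addfile(info, io.BytesIO(payload))

   # Extract into a fresh temporary directory using the custom filter.
   buf.seek(0)
   with tempfile.TemporaryDirectory() as dest:
       with tarfile.open(fileobj=buf, mode="r") as tar:
           tar.extractall(dest, filter=size_limited_filter)
       extracted = sorted(p.name for p in Path(dest).iterdir())

Under these assumptions only ``small.txt`` should end up in the destination
directory. Wrapping a pre-defined filter this way keeps its checks in place
and only adds a restriction on top.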

Default named filters
~~~~~~~~~~~~~~~~~~~~~

The pre-defined, named filters are available as functions, so they can be
reused in custom filters:

.. function:: fully_trusted_filter(member, path, /)

   Return *member* unchanged.

   This implements the ``'fully_trusted'`` filter.

.. function:: tar_filter(member, path, /)

   Implements the ``'tar'`` filter.

   - Strip leading slashes (``/`` and :attr:`os.sep`) from filenames.
   - :ref:`Refuse <tarfile-extraction-refuse>` to extract files with absolute
     paths (in case the name is absolute
     even after stripping slashes, e.g. ``C:/foo`` on Windows).
     This raises :class:`~tarfile.AbsolutePathError`.
   - :ref:`Refuse <tarfile-extraction-refuse>` to extract files whose absolute
     path (after following symlinks) would end up outside the destination.
     This raises :class:`~tarfile.OutsideDestinationError`.
   - Clear high mode bits (setuid, setgid, sticky) and group/other write bits
     (:attr:`~stat.S_IWGRP` | :attr:`~stat.S_IWOTH`).

   Return the modified ``TarInfo`` member.

.. function:: data_filter(member, path, /)

   Implements the ``'data'`` filter.
   In addition to what ``tar_filter`` does:

   - :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft)
     that link to absolute paths, or ones that link outside the destination.

     This raises :class:`~tarfile.AbsoluteLinkError` or
     :class:`~tarfile.LinkOutsideDestinationError`.

     Note that such files are refused even on platforms that do not support
     symbolic links.

   - :ref:`Refuse <tarfile-extraction-refuse>` to extract device files
     (including pipes).
     This raises :class:`~tarfile.SpecialFileError`.

   - For regular files, including hard links:

     - Set the owner read and write permissions
       (:attr:`~stat.S_IRUSR` | :attr:`~stat.S_IWUSR`).
     - Remove the group & other executable permission
       (:attr:`~stat.S_IXGRP` | :attr:`~stat.S_IXOTH`)
       if the owner doesn’t have it (:attr:`~stat.S_IXUSR`).

   - For other files (directories), set ``mode`` to ``None``, so
     that extraction methods skip applying permission bits.
   - Set user and group info (``uid``, ``gid``, ``uname``, ``gname``)
     to ``None``, so that extraction methods skip setting it.

   Return the modified ``TarInfo`` member.


.. _tarfile-extraction-refuse:

Filter errors
~~~~~~~~~~~~~

When a filter refuses to extract a file, it will raise an appropriate exception,
a subclass of :class:`~tarfile.FilterError`.
This will abort the extraction if :attr:`TarFile.errorlevel` is 1 or more.
With ``errorlevel=0`` the error will be logged and the member will be skipped,
but extraction will continue.


Hints for further verification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Even with ``filter='data'``, *tarfile* is not suited for extracting untrusted
files without prior inspection.
Among other issues, the pre-defined filters do not prevent denial-of-service
attacks. Users should do additional checks.

Here is an incomplete list of things to consider:

* Extract to a :func:`new temporary directory <tempfile.mkdtemp>`
  to prevent e.g. exploiting pre-existing links, and to make it easier to
  clean up after a failed extraction.
* When working with untrusted data, use external (e.g. OS-level) limits on
  disk, memory and CPU usage.
* Check filenames against an allow-list of characters
  (to filter out control characters, confusables, foreign path separators,
  etc.).
* Check that filenames have expected extensions (discouraging files that
  execute when you “click on them”, or extension-less files like Windows
  special device names).
* Limit the number of extracted files, total size of extracted data,
  filename length (including symlink length), and size of individual files.
* Check for files that would be shadowed on case-insensitive filesystems.

Also note that:

* Tar files may contain multiple versions of the same file.
  Later ones are expected to overwrite any earlier ones.
  This feature is crucial to allow updating tape archives, but can be abused
  maliciously.
* *tarfile* does not protect against issues with “live” data,
  e.g. an attacker tinkering with the destination (or source) directory while
  extraction (or archiving) is in progress.


Supporting older Python versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Extraction filters were added to Python 3.12, and are backported to older
versions as security updates.
To check whether the feature is available, use e.g.
``hasattr(tarfile, 'data_filter')`` rather than checking the Python version.

The following examples show how to support Python versions with and without
the feature.
Note that setting ``extraction_filter`` will affect any subsequent operations.

* Fully trusted archive::

    my_tarfile.extraction_filter = (lambda member, path: member)
    my_tarfile.extractall()

* Use the ``'data'`` filter if available, but revert to Python 3.11 behavior
  (``'fully_trusted'``) if this feature is not available::

    my_tarfile.extraction_filter = getattr(tarfile, 'data_filter',
                                           (lambda member, path: member))
    my_tarfile.extractall()

* Use the ``'data'`` filter; *fail* if it is not available::

    my_tarfile.extractall(filter=tarfile.data_filter)

  or::

    my_tarfile.extraction_filter = tarfile.data_filter
    my_tarfile.extractall()

* Use the ``'data'`` filter; *warn* if it is not available::

    if hasattr(tarfile, 'data_filter'):
        my_tarfile.extractall(filter='data')
    else:
        # remove this when no longer needed
        warn_the_user('Extracting may be unsafe; consider updating Python')
        my_tarfile.extractall()


Stateful extraction filter example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While *tarfile*'s extraction methods take a simple *filter* callable,
custom filters may be more complex objects with an internal state.
It may be useful to write these as context managers, to be used like this::

    with StatefulFilter() as filter_func:
        tar.extractall(path, filter=filter_func)

Such a filter can be written as, for example::

    class StatefulFilter:
        def __init__(self):
            self.file_count = 0

        def __enter__(self):
            return self

        def __call__(self, member, path):
            self.file_count += 1
            return member

        def __exit__(self, *exc_info):
            print(f'{self.file_count} files extracted')


.. _tarfile-commandline:
.. program:: tarfile


Command-Line Interface
----------------------

.. versionadded:: 3.4

The :mod:`tarfile` module provides a simple command-line interface to interact
with tar archives.

If you want to create a new tar archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar life-of-brian_1979/

If you want to extract a tar archive into the current directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar

You can also extract a tar archive into a different directory by passing the
directory's name:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar other-dir/

For a list of the files in a tar archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m tarfile -l monty.tar


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <tarfile>
               --list <tarfile>

   List files in a tarfile.

.. cmdoption:: -c <tarfile> <source1> ... <sourceN>
               --create <tarfile> <source1> ... <sourceN>

   Create tarfile from source files.

.. cmdoption:: -e <tarfile> [<output_dir>]
               --extract <tarfile> [<output_dir>]

   Extract tarfile into the current directory if *output_dir* is not specified.

.. cmdoption:: -t <tarfile>
               --test <tarfile>

   Test whether the tarfile is valid or not.

.. cmdoption:: -v, --verbose

   Verbose output.

.. cmdoption:: --filter <filtername>

   Specifies the *filter* for ``--extract``.
   See :ref:`tarfile-extraction-filter` for details.
   Only string names are accepted (that is, ``fully_trusted``, ``tar``,
   and ``data``).

   .. versionadded:: 3.11.4

.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

    import tarfile
    tar = tarfile.open("sample.tar.gz")
    tar.extractall()
    tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

    import os
    import tarfile

    def py_files(members):
        for tarinfo in members:
            if os.path.splitext(tarinfo.name)[1] == ".py":
                yield tarinfo

    tar = tarfile.open("sample.tar.gz")
    tar.extractall(members=py_files(tar))
    tar.close()

How to create an uncompressed tar archive from a list of filenames::

    import tarfile
    tar = tarfile.open("sample.tar", "w")
    for name in ["foo", "bar", "quux"]:
        tar.add(name)
    tar.close()

The same example using the :keyword:`with` statement::

    import tarfile
    with tarfile.open("sample.tar", "w") as tar:
        for name in ["foo", "bar", "quux"]:
            tar.add(name)
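
The creation examples above assume that ``foo``, ``bar`` and ``quux`` already
exist in the current directory. The following is a minimal, self-contained
sketch of the same idea that writes its own placeholder files into a temporary
directory, archives them, and reads the member names back. The helper name
``make_and_list`` is hypothetical and not part of :mod:`tarfile`; *arcname* is
used so that members are stored under their bare names rather than the
temporary path.

```python
import os
import tarfile
import tempfile


def make_and_list(names):
    """Archive small placeholder files and return the archive's member names.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            # Create a small placeholder file to archive.
            with open(os.path.join(tmp, name), "w") as f:
                f.write(name)

        archive = os.path.join(tmp, "sample.tar")
        with tarfile.open(archive, "w") as tar:
            for name in names:
                # arcname stores the bare name instead of the temporary path.
                tar.add(os.path.join(tmp, name), arcname=name)

        # Reopen for reading; members are listed in the order they were added.
        with tarfile.open(archive) as tar:
            return tar.getnames()


print(make_and_list(["foo", "bar", "quux"]))
```

Using the :keyword:`with` statement for both the write and the read step
ensures the archive is closed before the temporary directory is removed.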

How to read a gzip compressed tar archive and display some member information::

    import tarfile
    tar = tarfile.open("sample.tar.gz", "r:gz")
    for tarinfo in tar:
        print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
        if tarinfo.isreg():
            print("a regular file.")
        elif tarinfo.isdir():
            print("a directory.")
        else:
            print("something else.")
    tar.close()

How to create an archive and reset the user information using the *filter*
parameter in :meth:`TarFile.add`::

    import tarfile
    def reset(tarinfo):
        tarinfo.uid = tarinfo.gid = 0
        tarinfo.uname = tarinfo.gname = "root"
        return tarinfo
    tar = tarfile.open("sample.tar.gz", "w:gz")
    tar.add("foo", filter=reset)
    tar.close()


.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters.
  The maximum file size is 8 GiB. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 GiB and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names; sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames,
  large files, and stores pathnames in a portable way. Modern tar
  implementations, including GNU tar, bsdtar/libarchive and star, fully support
  extended *pax* features; some old or unmaintained libraries may not, but
  should treat *pax* archives as if they were in the universally supported
  *ustar* format. It is the current default format for new archives.

  It extends the existing *ustar* format with extra headers for information
  that cannot be stored otherwise. There are two flavours of pax headers:
  extended headers only affect the subsequent file header, while global
  headers are valid for the complete archive and affect all following files.
  All the data in a pax header is encoded in *UTF-8* for portability reasons.
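
As a rough illustration of the filename limits described above, the following
sketch writes a single empty member with a 300-character name into an
in-memory archive using each of the three formats. It is a minimal example
under stated assumptions, not a compatibility test: the helper name
``roundtrip_name`` is hypothetical, and the archive is kept in an
:class:`io.BytesIO` buffer so that nothing is written to disk.

```python
import io
import tarfile


def roundtrip_name(name, fmt):
    """Write one empty member called *name* using format *fmt*.

    Returns the member name as read back from the archive.
    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        # A TarInfo with the default size of 0 needs no file object.
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()[0]


# A single path component longer than the ustar limits described above.
long_name = "x" * 300

# GNU and pax both carry the long name through their extensions.
for fmt in (tarfile.GNU_FORMAT, tarfile.PAX_FORMAT):
    assert roundtrip_name(long_name, fmt) == long_name

# ustar cannot represent the name, so writing the header fails.
try:
    roundtrip_name(long_name, tarfile.USTAR_FORMAT)
except ValueError as exc:
    print("ustar rejected the name:", exc)
```

The *format* keyword is passed through :func:`tarfile.open` to the
:class:`TarFile` constructor; when it is omitted, the default format is used.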

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, and there is no user/group name information. Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.

.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (which is the basis of all other formats) is
that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
metadata (like filenames, linknames, user/group names) will appear damaged.
Unfortunately, there is no way to autodetect the encoding of an archive. The
pax format was designed to solve this problem. It stores non-ASCII metadata
using the universal character encoding *UTF-8*.

The details of character conversion in :mod:`tarfile` are controlled by the
*encoding* and *errors* keyword arguments of the :class:`TarFile` class.

*encoding* defines the character encoding to use for the metadata in the
archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
as a fallback. Depending on whether the archive is read or written, the
metadata must be either decoded or encoded. If *encoding* is not set
appropriately, this conversion may fail.

The *errors* argument defines how characters that cannot be converted are
treated. Possible values are listed in section :ref:`error-handlers`.
The default scheme is ``'surrogateescape'``, which Python also uses for its
file system calls; see :ref:`os-filenames`.

For :const:`PAX_FORMAT` archives (the default), *encoding* is generally not needed
because all the metadata is stored using *UTF-8*. *encoding* is only used in
the rare cases when binary pax headers are decoded or when strings with
surrogate characters are stored.
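
The effect of *encoding* can be seen with a small in-memory experiment. The
sketch below is illustrative only: the helper name ``write_then_read`` is
hypothetical, and it stores a single empty member whose name contains one
non-ASCII character, then reads the archive back with a possibly different
*encoding*. Results are printed with :func:`ascii` because a name decoded with
``'surrogateescape'`` may contain lone surrogates that cannot be printed
directly.

```python
import io
import tarfile


def write_then_read(name, fmt, write_encoding, read_encoding):
    """Store *name* with one encoding and read it back with another.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt,
                      encoding=write_encoding) as tar:
        tar.addfile(tarfile.TarInfo(name))  # zero-length regular file
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()[0]


name = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# GNU format, matching encodings: the name survives unchanged.
print(ascii(write_then_read(name, tarfile.GNU_FORMAT, "latin-1", "latin-1")))

# GNU format, mismatched encodings: the undecodable byte becomes a
# surrogate escape instead of the original character.
print(ascii(write_then_read(name, tarfile.GNU_FORMAT, "latin-1", "utf-8")))

# pax format: the name is stored as UTF-8 in a pax header, so the
# mismatch between the two encodings does not matter here.
print(ascii(write_then_read(name, tarfile.PAX_FORMAT, "latin-1", "utf-8")))
```

With the default ``errors='surrogateescape'``, the mismatched GNU case does not
raise an exception; the damaged name can still be written back out byte for
byte, which is why this scheme is the default.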