:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>

**Source code:** :source:`Lib/tarfile.py`

--------------

The :mod:`tarfile` module makes it possible to read and write tar
archives, including those using gzip, bz2 and lzma compression.
Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the
higher-level functions in :ref:`shutil <archiving-operations>`.

Some facts and figures:

* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives
  if the respective modules are available.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for all variants of the *sparse* extension
  including restoration of sparse files.

* read/write support for the POSIX.1-2001 (pax) format.

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

.. versionchanged:: 3.3
   Added support for :mod:`lzma` compression.


.. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, **kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* must be a string of the form ``'filemode[:compression]'``; it defaults
   to ``'r'``. Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'r:xz'``       | Open for reading with lzma compression.     |
   +------------------+---------------------------------------------+
   | ``'x'`` or       | Create a tarfile exclusively without        |
   | ``'x:'``         | compression.                                |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:gz'``       | Create a tarfile with gzip compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:bz2'``      | Create a tarfile with bzip2 compression.    |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'x:xz'``       | Create a tarfile with lzma compression.     |
   |                  | Raise a :exc:`FileExistsError` exception    |
   |                  | if it already exists.                       |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+
   | ``'w:xz'``       | Open for lzma compressed writing.           |
   +------------------+---------------------------------------------+

   Note that appending with ``'a:gz'``, ``'a:bz2'`` or ``'a:xz'`` is not
   possible. If *mode* is not suitable to open a certain (compressed) file for
   reading, :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this. If a
   compression method is not supported, :exc:`CompressionError` is raised.

   If *fileobj* is specified, it is used as an alternative to a :term:`file object`
   opened in binary mode for *name*. It is expected to be at position 0.

   For modes ``'w:gz'``, ``'r:gz'``, ``'w:bz2'``, ``'r:bz2'``, ``'x:gz'``,
   ``'x:bz2'``, :func:`tarfile.open` accepts the keyword argument
   *compresslevel* (default ``9``) to specify the compression level of the file.

   For modes ``'w:xz'`` and ``'x:xz'``, :func:`tarfile.open` accepts the
   keyword argument *preset* to specify the compression level of the file.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``.  :func:`tarfile.open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks.  No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket :term:`file object` or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access; see :ref:`tar-examples`.  The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|xz'``  | Open an lzma compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|xz'``  | Open an lzma compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.

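The mode strings accepted by :func:`tarfile.open` can be tried without touching
the file system. The following is a minimal sketch rather than part of the
module: the helper ``build_archive`` and the member name ``hello.txt`` are
made up for illustration. It writes a gzip-compressed archive into an
in-memory buffer with ``'w:gz'``, reads it back with the transparent ``'r'``
mode, and shows that ``'r:'`` (reading exclusively without compression)
rejects the same data with :exc:`ReadError`.

```python
import io
import tarfile


def build_archive(mode="w:gz"):
    """Return the bytes of a one-member archive written with *mode*."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        payload = b"hello world\n"
        info = tarfile.TarInfo(name="hello.txt")  # illustrative member name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


data = build_archive()

# 'r' (same as 'r:*') detects the compression transparently.
with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
    names = tar.getnames()

# 'r:' insists on an uncompressed archive, so gzip data is refused.
try:
    tarfile.open(fileobj=io.BytesIO(data), mode="r:")
    plain_mode_rejected = False
except tarfile.ReadError:
    plain_mode_rejected = True
```

Using ``'r'`` for reading, as the table recommends, avoids having to know the
compression in advance.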

.. class:: TarFile
   :noindex:

   Class for reading and writing tar archives. Do not use this class directly:
   use :func:`tarfile.open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read. *name* may be a :class:`str`, file, or file-like object.

   .. versionchanged:: 3.9
      Support for file and file-like objects.

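As a small illustration of :func:`is_tarfile`, the sketch below checks two
in-memory file objects, one holding a real archive and one holding arbitrary
bytes. Passing file-like objects assumes Python 3.9 or later, as noted above;
the member name ``empty.txt`` is only an example.

```python
import io
import tarfile

# Build a tiny uncompressed archive in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(tarfile.TarInfo("empty.txt"))  # zero-length member

buf.seek(0)
looks_like_tar = tarfile.is_tarfile(buf)
not_a_tar = tarfile.is_tarfile(io.BytesIO(b"plain text, not an archive"))
```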

The :mod:`tarfile` module defines the following exceptions:


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.

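One such limitation is that a stream cannot be rewound. The sketch below,
which uses made-up member names, opens an in-memory archive in the stream mode
``'r|'``, reads each member while the stream is positioned on it, and then
shows that returning to an earlier member raises :exc:`StreamError`.

```python
import io
import tarfile

# Build an uncompressed archive with two small members.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, payload in [("a.txt", b"first"), ("b.txt", b"second")]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

contents = {}
with tarfile.open(fileobj=buf, mode="r|") as tar:
    members = []
    for member in tar:
        members.append(member)
        # Reading the member the stream is currently positioned on works.
        contents[member.name] = tar.extractfile(member).read()

    # Going back to the first member would require seeking backwards.
    try:
        tar.extractfile(members[0]).read()
        backward_read_failed = False
    except tarfile.StreamError:
        backward_read_failed = True
```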

.. exception:: ExtractError

   Raised for *non-fatal* errors when using :meth:`TarFile.extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Raised by :meth:`TarInfo.frombuf` if the buffer it gets is invalid.


.. exception:: FilterError

   Base class for exceptions raised for members
   :ref:`refused <tarfile-extraction-refuse>` by filters.

   .. attribute:: tarinfo

      Information about the member that the filter refused to extract,
      as :ref:`TarInfo <tarinfo-objects>`.

.. exception:: AbsolutePathError

   Raised to refuse extracting a member with an absolute path.

.. exception:: OutsideDestinationError

   Raised to refuse extracting a member outside the destination directory.

.. exception:: SpecialFileError

   Raised to refuse extracting a special file (e.g. a device or pipe).

.. exception:: AbsoluteLinkError

   Raised to refuse extracting a symbolic link with an absolute path.

.. exception:: LinkOutsideDestinationError

   Raised to refuse extracting a symbolic link pointing outside the destination
   directory.

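The filter exceptions can be observed with a deliberately hostile archive.
The following sketch assumes a Python version that provides extraction
filters (it checks for ``tarfile.data_filter`` and does nothing otherwise)
and uses a made-up member name, ``../escape.txt``. With the default
*errorlevel*, the refusal surfaces as a :exc:`FilterError` subclass whose
``tarinfo`` attribute identifies the offending member.

```python
import io
import tarfile
import tempfile

# An archive whose only member tries to land outside the destination.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"oops"
    info = tarfile.TarInfo("../escape.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

refused = None
if hasattr(tarfile, "data_filter"):  # extraction filters are available
    with tempfile.TemporaryDirectory() as dest, tarfile.open(fileobj=buf) as tar:
        try:
            tar.extractall(dest, filter="data")
        except tarfile.FilterError as exc:
            # exc is an OutsideDestinationError here; catching the base
            # class covers every kind of refusal.
            refused = exc.tarinfo.name
```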

The following constants are available at the module level:

.. data:: ENCODING

   The default character encoding: ``'utf-8'`` on Windows, the value returned by
   :func:`sys.getfilesystemencoding` otherwise.


Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`PAX_FORMAT`.

   .. versionchanged:: 3.8
      The default format for new archives was changed to
      :const:`PAX_FORMAT` from :const:`GNU_FORMAT`.

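The choice of format matters mostly for metadata that the older formats cannot
represent. As a hedged illustration, the sketch below tries to store a member
whose (made-up) path is 308 characters long. The helper ``try_write`` is
hypothetical; it passes one of the format constants through the *format*
keyword argument and reports whether the header could be written.

```python
import io
import tarfile

long_name = "d/" * 150 + "file.txt"  # 308 characters, too long for ustar


def try_write(fmt):
    """Return the archive bytes, or None if *fmt* cannot store long_name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        try:
            tar.addfile(tarfile.TarInfo(long_name))  # zero-length member
        except ValueError:
            return None
    return buf.getvalue()


ustar_bytes = try_write(tarfile.USTAR_FORMAT)  # name does not fit
pax_bytes = try_write(tarfile.PAX_FORMAT)      # stored in a pax header

with tarfile.open(fileobj=io.BytesIO(pax_bytes)) as tar:
    pax_names = tar.getnames()
```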

.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   :ref:`archiving-operations`
      Documentation of the higher-level archiving facilities provided by the
      standard :mod:`shutil` module.

   `GNU tar manual, Basic Tar Format <https://www.gnu.org/software/tar/manual/html_node/Standard.html>`_
      Documentation for tar archive files, including GNU tar extensions.


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object; see :ref:`tarinfo-objects` for details.

A :class:`TarFile` object can be used as a context manager in a :keyword:`with`
statement. It will automatically be closed when the block is completed. Please
note that in the event of an exception an archive opened for writing will not
be finalized; only the internally used file object will be closed. See the
:ref:`tar-examples` section for a use case.

.. versionadded:: 3.2
   Added support for the context management protocol.

.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=1)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. *name* may be a :term:`path-like object`.
   It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file, ``'w'`` to create a new file overwriting an existing
   one, or ``'x'`` to create a new file only if it does not already exist.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format for writing. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level. When reading, the format is detected automatically, even
   if different formats are present in a single archive.

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   If *dereference* is :const:`False`, add symbolic and hard links to the archive. If it
   is :const:`True`, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is :const:`False`, treat an empty block as the end of the archive.
   If it is :const:`True`, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   *errorlevel* controls how extraction errors are handled,
   see :attr:`the corresponding attribute <~TarFile.errorlevel>`.

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.

   .. versionchanged:: 3.5
      The ``'x'`` (exclusive creation) mode was added.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.

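The effect of *ignore_zeros* is easiest to see on concatenated archives. The
sketch below is illustrative only: the helper ``single_member_archive`` and
the member names are invented, and the keyword argument is passed through
:func:`tarfile.open` rather than by instantiating :class:`TarFile` directly.

```python
import io
import tarfile


def single_member_archive(name, payload):
    """Return the bytes of an uncompressed archive holding one member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Two complete archives glued together, one after the other.
joined = single_member_archive("a.txt", b"A") + single_member_archive("b.txt", b"B")

# Default: the empty blocks ending the first archive stop the reader.
with tarfile.open(fileobj=io.BytesIO(joined)) as tar:
    default_names = tar.getnames()

# ignore_zeros=True: empty blocks are skipped and reading continues.
with tarfile.open(fileobj=io.BytesIO(joined), ignore_zeros=True) as tar:
    all_names = tar.getnames()
```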

.. classmethod:: TarFile.open(...)

   Alternative constructor. The :func:`tarfile.open` function is actually a
   shortcut to this classmethod.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.

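The note about repeated members can be checked with a short experiment. In
this sketch the name ``config.txt`` is stored twice (the names and payloads
are made up); :meth:`getmember` then returns the later copy, and a name that
is absent raises :exc:`KeyError`.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    # Store the same name twice; the second copy is the newer one.
    for payload in (b"v1", b"v2"):
        info = tarfile.TarInfo("config.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

with tarfile.open(fileobj=buf) as tar:
    names = tar.getnames()                  # both occurrences, archive order
    latest = tar.getmember("config.txt")    # the last occurrence
    latest_data = tar.extractfile(latest).read()
    try:
        tar.getmember("missing.txt")
        missing_raises = False
    except KeyError:
        missing_raises = True
```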

.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True, *, members=None)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced. If optional *members* is
   given, it must be a subset of the list returned by :meth:`getmembers`.

   .. versionchanged:: 3.5
      Added the *members* parameter.

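Because :meth:`list` prints rather than returns its result, capturing the
output requires redirecting ``sys.stdout``. The sketch below does so with
:func:`contextlib.redirect_stdout`; the member names are illustrative, and the
second call restricts the listing through the *members* parameter.

```python
import contextlib
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("a.txt", "b.txt"):
        tar.addfile(tarfile.TarInfo(name))  # zero-length members
buf.seek(0)

captured = io.StringIO()
with tarfile.open(fileobj=buf) as tar, contextlib.redirect_stdout(captured):
    only_b = [m for m in tar.getmembers() if m.name == "b.txt"]
    tar.list(verbose=False)                  # every member, names only
    tar.list(verbose=False, members=only_b)  # restricted to a subset

listed = captured.getvalue().split()
```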

.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return :const:`None` if no more
   members are available.

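A manual loop over :meth:`next` might look like the following sketch, which
uses invented member names. Iterating over the :class:`TarFile` object itself
yields the same members; the explicit form merely makes the :const:`None`
sentinel visible.

```python
import io
import tarfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name in ("one.txt", "two.txt"):
        tar.addfile(tarfile.TarInfo(name))  # zero-length members
buf.seek(0)

seen = []
with tarfile.open(fileobj=buf) as tar:
    member = tar.next()
    while member is not None:  # None signals the end of the archive
        seen.append(member.name)
        member = tar.next()
```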

.. method:: TarFile.extractall(path=".", members=None, *, numeric_owner=False, filter=None)

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions is set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   If *numeric_owner* is :const:`True`, the uid and gid numbers from the tarfile
   are used to set the owner/group for the extracted files. Otherwise, the named
   values from the tarfile are used.

   The *filter* argument, which was added in Python 3.11.4, specifies how
   ``members`` are modified or rejected before extraction.
   See :ref:`tarfile-extraction-filter` for details.
   It is recommended to set this explicitly depending on which *tar* features
   you need to support.

   .. warning::

      Never extract archives from untrusted sources without prior inspection.
      It is possible that files are created outside of *path*, e.g. members
      that have absolute filenames starting with ``"/"`` or filenames with two
      dots ``".."``.

      Set ``filter='data'`` to prevent the most dangerous security issues,
      and read the :ref:`tarfile-extraction-filter` section for details.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.11.4
      Added the *filter* parameter.

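A cautious call to :meth:`extractall` could be written as in the sketch below.
It is only an illustration: the member name ``docs/readme.txt`` is invented,
and the ``hasattr`` check is one possible way to pass ``filter='data'`` only
on Python versions that accept the *filter* parameter.

```python
import io
import os
import tarfile
import tempfile

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"hello\n"
    info = tarfile.TarInfo("docs/readme.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
buf.seek(0)

# Pass filter="data" only where the running Python supports it.
extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

with tempfile.TemporaryDirectory() as dest:
    with tarfile.open(fileobj=buf) as tar:
        tar.extractall(dest, **extra)
    # The missing parent directory "docs" was created during extraction.
    with open(os.path.join(dest, "docs", "readme.txt"), "rb") as f:
        extracted = f.read()
```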
4587db96d56Sopenharmony_ci
.. method:: TarFile.extract(member, path="", set_attrs=True, *, numeric_owner=False, filter=None)

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*. *path* may be a :term:`path-like object`.
   File attributes (owner, mtime, mode) are set unless *set_attrs* is false.

   The *numeric_owner* and *filter* arguments are the same as
   for :meth:`extractall`.

   .. note::

      The :meth:`extract` method does not take care of several extraction issues.
      In most cases you should consider using the :meth:`extractall` method.

   .. warning::

      See the warning for :meth:`extractall`.

      Set ``filter='data'`` to prevent the most dangerous security issues,
      and read the :ref:`tarfile-extraction-filter` section for details.

   .. versionchanged:: 3.2
      Added the *set_attrs* parameter.

   .. versionchanged:: 3.5
      Added the *numeric_owner* parameter.

   .. versionchanged:: 3.6
      The *path* parameter accepts a :term:`path-like object`.

   .. versionchanged:: 3.11.4
      Added the *filter* parameter.

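   As a minimal sketch of extracting a single member, the following builds a
   small throwaway archive and extracts one file from it.
   The file names and the temporary directory are illustrative assumptions, and
   ``filter='data'`` is passed only where extraction filters are available::

      import io
      import os
      import tarfile
      import tempfile

      workdir = tempfile.mkdtemp()
      archive = os.path.join(workdir, "sample.tar")

      # Build a small archive holding one regular file (illustrative data).
      with tarfile.open(archive, "w") as tar:
          payload = b"hello\n"
          info = tarfile.TarInfo("docs/readme.txt")
          info.size = len(payload)
          tar.addfile(info, io.BytesIO(payload))

      dest = os.path.join(workdir, "out")
      with tarfile.open(archive) as tar:
          # Only pass *filter* on Python versions that know about it.
          kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
          tar.extract("docs/readme.txt", path=dest, **kwargs)

      # Missing parent directories below *path* are created as needed.
      extracted = os.path.join(dest, "docs", "readme.txt")
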

.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be
   a filename or a :class:`TarInfo` object. If *member* is a regular file or
   a link, an :class:`io.BufferedReader` object is returned. For all other
   existing members, :const:`None` is returned. If *member* does not appear
   in the archive, :exc:`KeyError` is raised.

   .. versionchanged:: 3.3
      Return an :class:`io.BufferedReader` object.

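   The three outcomes described above can be sketched with an in-memory
   archive. The member names are made up for illustration::

      import io
      import tarfile

      # Build an in-memory archive with one file and one directory.
      buffer = io.BytesIO()
      with tarfile.open(fileobj=buffer, mode="w") as tar:
          payload = b"line one\nline two\n"
          info = tarfile.TarInfo("notes.txt")
          info.size = len(payload)
          tar.addfile(info, io.BytesIO(payload))

          folder = tarfile.TarInfo("empty_dir")
          folder.type = tarfile.DIRTYPE
          tar.addfile(folder)

      buffer.seek(0)
      with tarfile.open(fileobj=buffer, mode="r") as tar:
          # Regular file: a readable binary file object is returned.
          reader = tar.extractfile("notes.txt")
          first_line = reader.readline()

          # Existing member that is neither a file nor a link: None.
          dir_result = tar.extractfile("empty_dir")

          # Name that is not in the archive: KeyError.
          try:
              tar.extractfile("missing.txt")
              missing_raises = False
          except KeyError:
              missing_raises = True

   The returned object reads from the archive itself, so it should be used
   before the :class:`TarFile` is closed.
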

.. attribute:: TarFile.errorlevel
   :type: int

   If *errorlevel* is ``0``, errors are ignored when using :meth:`TarFile.extract`
   and :meth:`TarFile.extractall`.
   Nevertheless, they appear as error messages in the debug output when
   *debug* is greater than 0.
   If ``1`` (the default), all *fatal* errors are raised as :exc:`OSError` or
   :exc:`FilterError` exceptions. If ``2``, all *non-fatal* errors are raised
   as :exc:`TarError` exceptions as well.

   Some exceptions, e.g. ones caused by wrong argument types or data
   corruption, are always raised.

   Custom :ref:`extraction filters <tarfile-extraction-filter>`
   should raise :exc:`FilterError` for *fatal* errors
   and :exc:`ExtractError` for *non-fatal* ones.

   Note that when an exception is raised, the archive may be partially
   extracted. It is the user’s responsibility to clean up.

.. attribute:: TarFile.extraction_filter

   .. versionadded:: 3.11.4

   The :ref:`extraction filter <tarfile-extraction-filter>` used
   as a default for the *filter* argument of :meth:`~TarFile.extract`
   and :meth:`~TarFile.extractall`.

   The attribute may be ``None`` or a callable.
   String names are not allowed for this attribute, unlike the *filter*
   argument to :meth:`~TarFile.extract`.

   If ``extraction_filter`` is ``None`` (the default),
   calling an extraction method without a *filter* argument will
   use the :func:`fully_trusted <fully_trusted_filter>` filter for
   compatibility with previous Python versions.

   In Python 3.12+, leaving ``extraction_filter=None`` will emit a
   ``DeprecationWarning``.

   In Python 3.14+, leaving ``extraction_filter=None`` will cause
   extraction methods to use the :func:`data <data_filter>` filter by default.

   The attribute may be set on instances or overridden in subclasses.
   It is also possible to set it on the ``TarFile`` class itself to set a
   global default, although, since it affects all uses of *tarfile*,
   it is best practice to only do so in top-level applications or
   :mod:`site configuration <site>`.
   To set a global default this way, a filter function needs to be wrapped in
   :func:`staticmethod()` to prevent injection of a ``self`` argument.

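   A minimal sketch of the subclass approach, assuming a hypothetical class
   name ``DataOnlyTarFile`` and a made-up archive containing a member that
   tries to escape the destination.
   The attribute is only defined where :func:`data_filter` exists::

      import io
      import os
      import tarfile
      import tempfile

      HAVE_FILTERS = hasattr(tarfile, "data_filter")

      class DataOnlyTarFile(tarfile.TarFile):
          """Hypothetical subclass that defaults to the 'data' filter."""
          if HAVE_FILTERS:
              # staticmethod() keeps the function from being bound as a
              # method, so no ``self`` argument is injected.
              extraction_filter = staticmethod(tarfile.data_filter)

      # Illustrative archive with a member that points outside the target.
      buffer = io.BytesIO()
      with tarfile.open(fileobj=buffer, mode="w") as tar:
          payload = b"x"
          info = tarfile.TarInfo("../escape.txt")
          info.size = len(payload)
          tar.addfile(info, io.BytesIO(payload))

      buffer.seek(0)
      dest = os.path.join(tempfile.mkdtemp(), "out")
      refused = False
      if HAVE_FILTERS:
          with DataOnlyTarFile.open(fileobj=buffer) as tar:
              try:
                  tar.extractall(dest)      # no *filter* argument given
              except tarfile.FilterError:
                  refused = True

   Because the default lives on the subclass, other users of
   :class:`TarFile` in the same process are not affected.
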

.. method:: TarFile.add(name, arcname=None, recursive=True, *, filter=None)

   Add the file *name* to the archive. *name* may be any type of file
   (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an
   alternative name for the file in the archive. Directories are added
   recursively by default. This can be avoided by setting *recursive* to
   :const:`False`. Recursion adds entries in sorted order.
   If *filter* is given, it
   should be a function that takes a :class:`TarInfo` object argument and
   returns the changed :class:`TarInfo` object. If it instead returns
   :const:`None`, the :class:`TarInfo` object will be excluded from the
   archive. See :ref:`tar-examples` for an example.

   .. versionchanged:: 3.2
      Added the *filter* parameter.

   .. versionchanged:: 3.7
      Recursion adds entries in sorted order.

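   As a sketch of both uses of *filter* (changing metadata and excluding
   members), the hypothetical function below drops ``*.log`` files and resets
   ownership. The directory layout is made up for illustration::

      import os
      import tarfile
      import tempfile

      def skip_logs_and_anonymize(tarinfo):
          """Hypothetical filter: drop ``*.log`` files, reset ownership."""
          if tarinfo.name.endswith(".log"):
              return None                  # excluded from the archive
          tarinfo.uid = tarinfo.gid = 0
          tarinfo.uname = tarinfo.gname = ""
          return tarinfo

      root = tempfile.mkdtemp()
      project = os.path.join(root, "project")
      os.mkdir(project)
      for filename in ("b.txt", "a.txt", "debug.log"):
          with open(os.path.join(project, filename), "w") as f:
              f.write(filename)

      archive = os.path.join(root, "project.tar")
      with tarfile.open(archive, "w") as tar:
          # The directory is added recursively; each entry passes
          # through the filter before it is written.
          tar.add(project, arcname="project", filter=skip_logs_and_anonymize)

      with tarfile.open(archive) as tar:
          names = tar.getnames()
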

.. method:: TarFile.addfile(tarinfo, fileobj=None)

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   it should be a :term:`binary file`, and
   ``tarinfo.size`` bytes are read from it and added to the archive.  You can
   create :class:`TarInfo` objects directly, or by using :meth:`gettarinfo`.

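   A short sketch of adding data that exists only in memory, using a
   directly created :class:`TarInfo`. The member name and payload are
   illustrative::

      import io
      import tarfile
      import time

      payload = b"generated at run time\n"

      info = tarfile.TarInfo(name="generated/report.txt")
      info.size = len(payload)         # addfile() reads this many bytes
      info.mtime = int(time.time())
      info.mode = 0o644

      buffer = io.BytesIO()
      with tarfile.open(fileobj=buffer, mode="w") as tar:
          tar.addfile(info, io.BytesIO(payload))

      # Read the member back to confirm what was stored.
      buffer.seek(0)
      with tarfile.open(fileobj=buffer) as tar:
          member = tar.getmember("generated/report.txt")
          stored = tar.extractfile(member).read()
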

.. method:: TarFile.gettarinfo(name=None, arcname=None, fileobj=None)

   Create a :class:`TarInfo` object from the result of :func:`os.stat` or
   equivalent on an existing file.  The file is either named by *name*, or
   specified as a :term:`file object` *fileobj* with a file descriptor.
   *name* may be a :term:`path-like object`.  If
   given, *arcname* specifies an alternative name for the file in the
   archive, otherwise, the name is taken from *fileobj*’s
   :attr:`~io.FileIO.name` attribute, or the *name* argument.  The name
   should be a text string.

   You can modify
   some of the :class:`TarInfo`’s attributes before you add it using :meth:`addfile`.
   If the file object is not an ordinary file object positioned at the
   beginning of the file, attributes such as :attr:`~TarInfo.size` may need
   modifying.  This is the case for objects such as :class:`~gzip.GzipFile`.
   The :attr:`~TarInfo.name` may also be modified, in which case *arcname*
   could be a dummy string.

   .. versionchanged:: 3.6
      The *name* parameter accepts a :term:`path-like object`.

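   The workflow described above (build a :class:`TarInfo` from a file on
   disk, adjust it, then add it) might look like the following sketch.
   The paths and the replacement metadata are illustrative assumptions::

      import os
      import tarfile
      import tempfile

      root = tempfile.mkdtemp()
      source = os.path.join(root, "settings.ini")
      with open(source, "wb") as f:
          f.write(b"[main]\nverbose = yes\n")

      archive = os.path.join(root, "config.tar")
      with tarfile.open(archive, "w") as tar:
          info = tar.gettarinfo(source, arcname="etc/settings.ini")
          # Adjust metadata before the member is written.
          info.uname = info.gname = "nobody"
          info.mode = 0o600
          with open(source, "rb") as f:
              tar.addfile(info, f)

      with tarfile.open(archive) as tar:
          member = tar.getmember("etc/settings.ini")
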

.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.



.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`~TarFile.getmember`, :meth:`~TarFile.getmembers` and
:meth:`~TarFile.gettarinfo`.

Modifying the objects returned by :meth:`~!TarFile.getmember` or
:meth:`~!TarFile.getmembers` will affect all subsequent
operations on the archive.
For cases where this is unwanted, you can use :mod:`copy.copy() <copy>` or
call the :meth:`~TarInfo.replace` method to create a modified copy in one step.

Several attributes can be set to ``None`` to indicate that a piece of metadata
is unused or unknown.
Different :class:`TarFile` methods handle ``None`` differently:

- The :meth:`~TarFile.extract` or :meth:`~TarFile.extractall` methods will
  ignore the corresponding metadata, leaving it set to a default.
- :meth:`~TarFile.addfile` will fail.
- :meth:`~TarFile.list` will print a placeholder string.


.. versionchanged:: 3.11.4
   Added :meth:`~TarInfo.replace` and handling of ``None``.

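The difference between editing a returned object and editing a copy can be
sketched as follows. The member names are made up for illustration::

    import copy
    import io
    import tarfile

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        tar.addfile(tarfile.TarInfo("data.bin"))    # empty member

    buffer.seek(0)
    with tarfile.open(fileobj=buffer) as tar:
        member = tar.getmember("data.bin")

        # Work on a copy: the archive's own object keeps its name.
        private = copy.copy(member)
        private.name = "renamed.bin"
        name_after_copy = tar.getmember("data.bin").name

        # Modify the returned object itself: later calls see the change.
        member.name = "changed.bin"
        names_after_edit = tar.getnames()
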

.. class:: TarInfo(name="")

   Create a :class:`TarInfo` object.


.. classmethod:: TarInfo.frombuf(buf, encoding, errors)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   Raises :exc:`HeaderError` if the buffer is invalid.


.. classmethod:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.


.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape')

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 3.2
      Use ``'surrogateescape'`` as the default for the *errors* argument.


A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name
   :type: str

   Name of the archive member.


.. attribute:: TarInfo.size
   :type: int

   Size in bytes.


.. attribute:: TarInfo.mtime
   :type: int | float

   Time of last modification in seconds since the :ref:`epoch <epoch>`,
   as in :attr:`os.stat_result.st_mtime`.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.mode
   :type: int

   Permission bits, as for :func:`os.chmod`.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.type

   File type.  *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`.  To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is*()`` methods below.


.. attribute:: TarInfo.linkname
   :type: str

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid
   :type: int

   User ID of the user who originally stored this member.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.gid
   :type: int

   Group ID of the user who originally stored this member.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.uname
   :type: str

   User name.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.gname
   :type: str

   Group name.

   .. versionchanged:: 3.11.4

      Can be set to ``None`` for :meth:`~TarFile.extract` and
      :meth:`~TarFile.extractall`, causing extraction to skip applying this
      attribute.

.. attribute:: TarInfo.pax_headers
   :type: dict

   A dictionary containing key-value pairs of an associated pax extended header.

.. method:: TarInfo.replace(name=..., mtime=..., mode=..., linkname=...,
                            uid=..., gid=..., uname=..., gname=...,
                            deep=True)

   .. versionadded:: 3.11.4

   Return a *new* copy of the :class:`!TarInfo` object with the given attributes
   changed. For example, to return a ``TarInfo`` with the group name set to
   ``'staff'``, use::

       new_tarinfo = old_tarinfo.replace(gname='staff')

   By default, a deep copy is made.
   If *deep* is false, the copy is shallow, i.e. ``pax_headers``
   and any custom attributes are shared with the original ``TarInfo`` object.

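   The effect of *deep* can be sketched by checking whether ``pax_headers``
   is shared with the original. The header content is illustrative, and the
   ``hasattr`` guard is an assumption made so the snippet also runs on
   versions without this method::

       import tarfile

       original = tarfile.TarInfo("report.txt")
       original.pax_headers = {"comment": "draft"}

       if hasattr(tarfile.TarInfo, "replace"):
           deep = original.replace(gname="staff")
           shallow = original.replace(gname="staff", deep=False)

           # A deep copy gets its own dictionary; a shallow copy shares it.
           deep_shares = deep.pax_headers is original.pax_headers
           shallow_shares = shallow.pax_headers is original.pax_headers
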

A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.

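As a sketch of how the query methods can be combined, the hypothetical
helper ``describe()`` below labels each member of a small, made-up archive::

    import io
    import tarfile

    def describe(member):
        """Return a short label for a TarInfo (hypothetical helper)."""
        if member.isfile():
            return "file"
        if member.isdir():
            return "directory"
        if member.issym():
            return "symlink"
        if member.islnk():
            return "hard link"
        if member.isdev():
            return "device or FIFO"
        return "other"

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        tar.addfile(tarfile.TarInfo("app.cfg"))     # regular, empty file

        folder = tarfile.TarInfo("logs")
        folder.type = tarfile.DIRTYPE
        tar.addfile(folder)

        link = tarfile.TarInfo("current.cfg")
        link.type = tarfile.SYMTYPE
        link.linkname = "app.cfg"
        tar.addfile(link)

    buffer.seek(0)
    with tarfile.open(fileobj=buffer) as tar:
        kinds = {member.name: describe(member) for member in tar}
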

.. _tarfile-extraction-filter:

Extraction filters
------------------

.. versionadded:: 3.11.4

The *tar* format is designed to capture all details of a UNIX-like filesystem,
which makes it very powerful.
Unfortunately, the features make it easy to create tar files that have
unintended -- and possibly malicious -- effects when extracted.
For example, extracting a tar file can overwrite arbitrary files in various
ways (e.g.  by using absolute paths, ``..`` path components, or symlinks that
affect later members).

In most cases, the full functionality is not needed.
Therefore, *tarfile* supports extraction filters: a mechanism to limit
functionality, and thus mitigate some of the security issues.

.. seealso::

   :pep:`706`
      Contains further motivation and rationale behind the design.

The *filter* argument to :meth:`TarFile.extract` or :meth:`~TarFile.extractall`
can be:

* the string ``'fully_trusted'``: Honor all metadata as specified in the
  archive.
  Should be used if the user trusts the archive completely, or implements
  their own complex verification.

* the string ``'tar'``: Honor most *tar*-specific features (i.e. features of
  UNIX-like filesystems), but block features that are very likely to be
  surprising or malicious. See :func:`tar_filter` for details.

* the string ``'data'``: Ignore or block most features specific to UNIX-like
  filesystems. Intended for extracting cross-platform data archives.
  See :func:`data_filter` for details.

* ``None`` (default): Use :attr:`TarFile.extraction_filter`.

  If that is also ``None`` (the default), the ``'fully_trusted'``
  filter will be used (for compatibility with earlier versions of Python).

  In Python 3.12, the default will emit a ``DeprecationWarning``.

  In Python 3.14, the ``'data'`` filter will become the default instead.
  It's possible to switch earlier; see :attr:`TarFile.extraction_filter`.

* A callable which will be called for each extracted member with a
  :ref:`TarInfo <tarinfo-objects>` describing the member and the destination
  path to where the archive is extracted (i.e. the same path is used for all
  members)::

      filter(member: TarInfo, path: str, /) -> TarInfo | None

  The callable is called just before each member is extracted, so it can
  take the current state of the disk into account.
  It can:

  - return a :class:`TarInfo` object which will be used instead of the metadata
    in the archive, or
  - return ``None``, in which case the member will be skipped, or
  - raise an exception to abort the operation or skip the member,
    depending on :attr:`~TarFile.errorlevel`.
    Note that when extraction is aborted, :meth:`~TarFile.extractall` may leave
    the archive partially extracted. It does not attempt to clean up.

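A minimal sketch of a callable filter, assuming a hypothetical policy of
keeping only ``*.txt`` members and otherwise deferring to
:func:`data_filter`. The archive contents are made up, and extraction is
skipped on versions without filter support::

    import io
    import os
    import tarfile
    import tempfile

    def text_only_filter(member, path):
        """Hypothetical filter: keep ``*.txt`` members, apply 'data' rules."""
        if not member.name.endswith(".txt"):
            return None                             # skip this member
        return tarfile.data_filter(member, path)    # may raise FilterError

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in ("keep.txt", "skip.bin"):
            payload = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

    buffer.seek(0)
    dest = tempfile.mkdtemp()
    if hasattr(tarfile, "data_filter"):
        with tarfile.open(fileobj=buffer) as tar:
            tar.extractall(dest, filter=text_only_filter)

    extracted = sorted(os.listdir(dest))

Delegating to a named filter at the end keeps its checks in place, so the
custom function only has to express the extra policy.
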

Default named filters
~~~~~~~~~~~~~~~~~~~~~

The pre-defined, named filters are available as functions, so they can be
reused in custom filters:

.. function:: fully_trusted_filter(member, path, /)

   Return *member* unchanged.

   This implements the ``'fully_trusted'`` filter.

.. function:: tar_filter(member, path, /)

  Implements the ``'tar'`` filter.

  - Strip leading slashes (``/`` and :attr:`os.sep`) from filenames.
  - :ref:`Refuse <tarfile-extraction-refuse>` to extract files with absolute
    paths (in case the name is absolute
    even after stripping slashes, e.g. ``C:/foo`` on Windows).
    This raises :class:`~tarfile.AbsolutePathError`.
  - :ref:`Refuse <tarfile-extraction-refuse>` to extract files whose absolute
    path (after following symlinks) would end up outside the destination.
    This raises :class:`~tarfile.OutsideDestinationError`.
  - Clear high mode bits (setuid, setgid, sticky) and group/other write bits
    (:attr:`~stat.S_IWGRP` | :attr:`~stat.S_IWOTH`).

  Return the modified ``TarInfo`` member.

.. function:: data_filter(member, path, /)

  Implements the ``'data'`` filter.
  In addition to what ``tar_filter`` does:

  - :ref:`Refuse <tarfile-extraction-refuse>` to extract links (hard or soft)
    that link to absolute paths, or ones that link outside the destination.

    This raises :class:`~tarfile.AbsoluteLinkError` or
    :class:`~tarfile.LinkOutsideDestinationError`.

    Note that such files are refused even on platforms that do not support
    symbolic links.

  - :ref:`Refuse <tarfile-extraction-refuse>` to extract device files
    (including pipes).
    This raises :class:`~tarfile.SpecialFileError`.

  - For regular files, including hard links:

    - Set the owner read and write permissions
      (:attr:`~stat.S_IRUSR` | :attr:`~stat.S_IWUSR`).
    - Remove the group and other executable permissions
      (:attr:`~stat.S_IXGRP` | :attr:`~stat.S_IXOTH`)
      if the owner doesn’t have it (:attr:`~stat.S_IXUSR`).

  - For other files (directories), set ``mode`` to ``None``, so
    that extraction methods skip applying permission bits.
  - Set user and group info (``uid``, ``gid``, ``uname``, ``gname``)
    to ``None``, so that extraction methods skip setting it.

  Return the modified ``TarInfo`` member.

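Because the named filters are plain functions, their effect on a single
:class:`TarInfo` can be inspected without extracting anything.
The following sketch uses a made-up member and a hypothetical destination
directory name::

    import tarfile

    member = tarfile.TarInfo("/srv/app/run.sh")
    member.mode = 0o4777              # setuid plus rwx for everyone
    member.uid, member.uname = 1000, "alice"

    dest = "extract-here"             # hypothetical; need not exist

    if hasattr(tarfile, "data_filter"):
        as_tar = tarfile.tar_filter(member, dest)
        as_data = tarfile.data_filter(member, dest)

        # Both strip the leading slash and clear setuid and go-w bits;
        # only 'data' also drops the ownership information.
        summary = {
            "tar": (as_tar.name, oct(as_tar.mode), as_tar.uid),
            "data": (as_data.name, oct(as_data.mode), as_data.uid),
        }

The original ``member`` is left untouched; each filter returns a modified
copy.
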

.. _tarfile-extraction-refuse:

Filter errors
~~~~~~~~~~~~~

When a filter refuses to extract a file, it will raise an appropriate exception,
a subclass of :class:`~tarfile.FilterError`.
This will abort the extraction if :attr:`TarFile.errorlevel` is 1 or more.
With ``errorlevel=0`` the error will be logged and the member will be skipped,
but extraction will continue.

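A sketch of catching a refusal with the default ``errorlevel`` of 1, using
a made-up archive that contains a regular file followed by a FIFO.
The snippet does nothing on versions without filter support::

    import io
    import os
    import tarfile
    import tempfile

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        payload = b"fine"
        regular = tarfile.TarInfo("ok.txt")
        regular.size = len(payload)
        tar.addfile(regular, io.BytesIO(payload))

        pipe = tarfile.TarInfo("pipe")
        pipe.type = tarfile.FIFOTYPE    # refused by the 'data' filter
        tar.addfile(pipe)

    buffer.seek(0)
    dest = tempfile.mkdtemp()
    refusal = None
    if hasattr(tarfile, "data_filter"):
        with tarfile.open(fileobj=buffer) as tar:
            try:
                tar.extractall(dest, filter="data")
            except tarfile.FilterError as exc:
                refusal = exc           # exc.tarinfo is the refused member

    # Members extracted before the refusal stay on disk.
    left_behind = os.listdir(dest)
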
9927db96d56Sopenharmony_ci
9937db96d56Sopenharmony_ciHints for further verification
9947db96d56Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
9957db96d56Sopenharmony_ci
9967db96d56Sopenharmony_ciEven with ``filter='data'``, *tarfile* is not suited for extracting untrusted
9977db96d56Sopenharmony_cifiles without prior inspection.
9987db96d56Sopenharmony_ciAmong other issues, the pre-defined filters do not prevent denial-of-service
9997db96d56Sopenharmony_ciattacks. Users should do additional checks.
10007db96d56Sopenharmony_ci
10017db96d56Sopenharmony_ciHere is an incomplete list of things to consider:
10027db96d56Sopenharmony_ci
10037db96d56Sopenharmony_ci* Extract to a :func:`new temporary directory <tempfile.mkdtemp>`
10047db96d56Sopenharmony_ci  to prevent e.g. exploiting pre-existing links, and to make it easier to
10057db96d56Sopenharmony_ci  clean up after a failed extraction.
10067db96d56Sopenharmony_ci* When working with untrusted data, use external (e.g. OS-level) limits on
10077db96d56Sopenharmony_ci  disk, memory and CPU usage.
10087db96d56Sopenharmony_ci* Check filenames against an allow-list of characters
10097db96d56Sopenharmony_ci  (to filter out control characters, confusables, foreign path separators,
10107db96d56Sopenharmony_ci  etc.).
10117db96d56Sopenharmony_ci* Check that filenames have expected extensions (discouraging files that
10127db96d56Sopenharmony_ci  execute when you “click on them”, or extension-less files like Windows special device names).
10137db96d56Sopenharmony_ci* Limit the number of extracted files, total size of extracted data,
10147db96d56Sopenharmony_ci  filename length (including symlink length), and size of individual files.
10157db96d56Sopenharmony_ci* Check for files that would be shadowed on case-insensitive filesystems.

Also note that:

* Tar files may contain multiple versions of the same file.
  Later ones are expected to overwrite any earlier ones.
  This feature is crucial to allow updating tape archives, but can be abused
  maliciously.
* *tarfile* does not protect against issues with “live” data,
  e.g. an attacker tinkering with the destination (or source) directory while
  extraction (or archiving) is in progress.

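As an illustration of the first point, the sketch below builds an in-memory
archive that contains two members with the same name and then counts the
repeated names. ``add_bytes`` is a hypothetical helper introduced only for this
example.

```python
import collections
import io
import tarfile


def add_bytes(tar, name, payload):
    """Add an in-memory payload to the archive under the given name."""
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    add_bytes(tar, "config.txt", b"first version")
    add_bytes(tar, "notes.txt", b"unrelated")
    add_bytes(tar, "config.txt", b"second version")
buffer.seek(0)

with tarfile.open(fileobj=buffer) as tar:
    # Count how often each member name occurs in the archive.
    counts = collections.Counter(member.name for member in tar.getmembers())
    duplicates = {name: n for name, n in counts.items() if n > 1}
    # Looking a member up by name yields its last occurrence.
    latest = tar.extractfile("config.txt").read()

print(duplicates)
print(latest)
```

Whether duplicates should be rejected or merely reported depends on the
application; archives that were legitimately updated in place contain them too.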

Supporting older Python versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Extraction filters were added to Python 3.12, and are backported to older
versions as security updates.
To check whether the feature is available, use e.g.
``hasattr(tarfile, 'data_filter')`` rather than checking the Python version.

The following examples show how to support Python versions with and without
the feature.
Note that setting ``extraction_filter`` will affect any subsequent operations.

* Fully trusted archive::

    my_tarfile.extraction_filter = (lambda member, path: member)
    my_tarfile.extractall()

* Use the ``'data'`` filter if available, but revert to Python 3.11 behavior
  (``'fully_trusted'``) if this feature is not available::

    my_tarfile.extraction_filter = getattr(tarfile, 'data_filter',
                                           (lambda member, path: member))
    my_tarfile.extractall()

* Use the ``'data'`` filter; *fail* if it is not available::

    my_tarfile.extractall(filter=tarfile.data_filter)

  or::

    my_tarfile.extraction_filter = tarfile.data_filter
    my_tarfile.extractall()

* Use the ``'data'`` filter; *warn* if it is not available::

    if hasattr(tarfile, 'data_filter'):
        my_tarfile.extractall(filter='data')
    else:
        # remove this when no longer needed
        warn_the_user('Extracting may be unsafe; consider updating Python')
        my_tarfile.extractall()


Stateful extraction filter example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While *tarfile*'s extraction methods take a simple *filter* callable,
custom filters may be more complex objects with an internal state.
It may be useful to write these as context managers, to be used like this::

    with StatefulFilter() as filter_func:
        tar.extractall(path, filter=filter_func)

Such a filter can be written as, for example::

    class StatefulFilter:
        def __init__(self):
            self.file_count = 0

        def __enter__(self):
            return self

        def __call__(self, member, path):
            self.file_count += 1
            return member

        def __exit__(self, *exc_info):
            print(f'{self.file_count} files extracted')


.. _tarfile-commandline:
.. program:: tarfile


Command-Line Interface
----------------------

.. versionadded:: 3.4

The :mod:`tarfile` module provides a simple command-line interface to interact
with tar archives.

If you want to create a new tar archive, specify its name after the :option:`-c`
option and then list the filename(s) that should be included:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar spam.txt eggs.txt

Passing a directory is also acceptable:

.. code-block:: shell-session

    $ python -m tarfile -c monty.tar life-of-brian_1979/

If you want to extract a tar archive into the current directory, use
the :option:`-e` option:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar

You can also extract a tar archive into a different directory by passing the
directory's name:

.. code-block:: shell-session

    $ python -m tarfile -e monty.tar other-dir/

For a list of the files in a tar archive, use the :option:`-l` option:

.. code-block:: shell-session

    $ python -m tarfile -l monty.tar


Command-line options
~~~~~~~~~~~~~~~~~~~~

.. cmdoption:: -l <tarfile>
               --list <tarfile>

   List files in a tarfile.

.. cmdoption:: -c <tarfile> <source1> ... <sourceN>
               --create <tarfile> <source1> ... <sourceN>

   Create tarfile from source files.

.. cmdoption:: -e <tarfile> [<output_dir>]
               --extract <tarfile> [<output_dir>]

   Extract tarfile into the current directory if *output_dir* is not specified.

.. cmdoption:: -t <tarfile>
               --test <tarfile>

   Test whether the tarfile is valid or not.

.. cmdoption:: -v, --verbose

   Verbose output.

.. cmdoption:: --filter <filtername>

   Specifies the *filter* for ``--extract``.
   See :ref:`tarfile-extraction-filter` for details.
   Only string names are accepted (that is, ``fully_trusted``, ``tar``,
   and ``data``).

   .. versionadded:: 3.11.4
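
The command-line interface can also be driven from a script. The sketch below
runs ``-c``, ``-l`` and ``-t`` through :mod:`subprocess` in a temporary
directory; ``run_tarfile_cli`` is a hypothetical helper, and using
``sys.executable`` assumes that the interpreter running the script is the one
whose :mod:`tarfile` module should be used.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_tarfile_cli(*arguments, cwd):
    """Run ``python -m tarfile`` with the current interpreter."""
    return subprocess.run(
        [sys.executable, "-m", "tarfile", *arguments],
        cwd=cwd, capture_output=True, text=True, check=True,
    )


with tempfile.TemporaryDirectory() as workdir:
    for name in ("spam.txt", "eggs.txt"):
        pathlib.Path(workdir, name).write_text("placeholder\n")

    # Create the archive, list its members, then test its validity.
    run_tarfile_cli("-c", "monty.tar", "spam.txt", "eggs.txt", cwd=workdir)
    listing = run_tarfile_cli("-l", "monty.tar", cwd=workdir).stdout
    listed_names = listing.split()
    test_result = run_tarfile_cli("-t", "monty.tar", cwd=workdir)

print(listed_names)
```

With ``check=True``, a non-zero exit status (for example from ``-t`` on a file
that is not a tar archive) raises :exc:`subprocess.CalledProcessError`.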

.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to extract a subset of a tar archive with :meth:`TarFile.extractall` using
a generator function instead of a list::

   import os
   import tarfile

   def py_files(members):
       for tarinfo in members:
           if os.path.splitext(tarinfo.name)[1] == ".py":
               yield tarinfo

   tar = tarfile.open("sample.tar.gz")
   tar.extractall(members=py_files(tar))
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

The same example using the :keyword:`with` statement::

    import tarfile
    with tarfile.open("sample.tar", "w") as tar:
        for name in ["foo", "bar", "quux"]:
            tar.add(name)

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       print(tarinfo.name, "is", tarinfo.size, "bytes in size and is ", end="")
       if tarinfo.isreg():
           print("a regular file.")
       elif tarinfo.isdir():
           print("a directory.")
       else:
           print("something else.")
   tar.close()

How to create an archive and reset the user information using the *filter*
parameter in :meth:`TarFile.add`::

    import tarfile
    def reset(tarinfo):
        tarinfo.uid = tarinfo.gid = 0
        tarinfo.uname = tarinfo.gname = "root"
        return tarinfo
    tar = tarfile.open("sample.tar.gz", "w:gz")
    tar.add("foo", filter=reset)
    tar.close()

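To confirm that such a filter took effect, the archive can be read back and the
stored metadata inspected. This sketch repeats the ``reset`` function from the
example above, writes the archive into an in-memory buffer instead of
``sample.tar.gz``, and creates a throwaway ``foo`` file in a temporary
directory; those choices are assumptions made to keep the example
self-contained.

```python
import io
import pathlib
import tarfile
import tempfile


def reset(tarinfo):
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo


with tempfile.TemporaryDirectory() as workdir:
    source = pathlib.Path(workdir, "foo")
    source.write_text("example payload\n")

    # Write a gzip-compressed archive into memory, applying the filter.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        tar.add(str(source), arcname="foo", filter=reset)

# Read the archive back and look at the stored ownership information.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
    member = tar.getmember("foo")
    owner = (member.uid, member.gid, member.uname, member.gname)

print(owner)
```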

.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters.
  The maximum file size is 8 GiB. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 GiB and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names; sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames, large
  files and stores pathnames in a portable way. Modern tar implementations,
  including GNU tar, bsdtar/libarchive and star, fully support extended *pax*
  features; some old or unmaintained libraries may not, but should treat
  *pax* archives as if they were in the universally supported *ustar* format.
  It is the current default format for new archives.

  It extends the existing *ustar* format with extra headers for information
  that cannot be stored otherwise. There are two flavours of pax headers:
  extended headers only affect the subsequent file header, whereas global
  headers are valid for the complete archive and affect all following files.
  All the data in a pax header is encoded in *UTF-8* for portability reasons.
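
The name-length limits described above can be observed by writing the same
member in each format. In the sketch below, ``LONG_NAME`` and ``round_trip``
are hypothetical names for this illustration; the member name is a single
300-character path component, which exceeds what the ustar header can hold,
so the ustar attempt is expected to fail while the GNU and pax attempts
succeed.

```python
import io
import tarfile

LONG_NAME = "x" * 300  # single path component, too long for ustar


def round_trip(archive_format):
    """Write one member in the given format and read the names back."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=archive_format) as tar:
        info = tarfile.TarInfo(LONG_NAME)
        info.size = 5
        tar.addfile(info, io.BytesIO(b"hello"))
    buffer.seek(0)
    with tarfile.open(fileobj=buffer) as tar:
        return tar.getnames()


results = {}
for label, archive_format in [
    ("ustar", tarfile.USTAR_FORMAT),
    ("gnu", tarfile.GNU_FORMAT),
    ("pax", tarfile.PAX_FORMAT),
]:
    try:
        results[label] = round_trip(archive_format)
    except ValueError as error:
        # The ustar header has no room for a name this long.
        results[label] = f"failed: {error}"
```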

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, and there is no user/group name information. Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.

.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem with the original format (which is the basis of all other formats) is
that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
metadata (like filenames, linknames, user/group names) will appear damaged.
Unfortunately, there is no way to autodetect the encoding of an archive. The
pax format was designed to solve this problem. It stores non-ASCII metadata
using the universal character encoding *UTF-8*.

The details of character conversion in :mod:`tarfile` are controlled by the
*encoding* and *errors* keyword arguments of the :class:`TarFile` class.

*encoding* defines the character encoding to use for the metadata in the
archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
as a fallback. Depending on whether the archive is read or written, the
metadata must be either decoded or encoded. If *encoding* is not set
appropriately, this conversion may fail.

The *errors* argument defines how characters that cannot be converted are
treated. Possible values are listed in section :ref:`error-handlers`.
The default scheme is ``'surrogateescape'`` which Python also uses for its
file system calls, see :ref:`os-filenames`.

For :const:`PAX_FORMAT` archives (the default), *encoding* is generally not needed
because all the metadata is stored using *UTF-8*. *encoding* is only used in
the rare cases when binary pax headers are decoded or when strings with
surrogate characters are stored.
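
The effect of *encoding* can be seen by writing a non-ASCII member name with
one encoding and reading it back with another. In this sketch,
``write_archive`` and ``read_names`` are hypothetical helpers, and the member
name and encodings are arbitrary choices. The GNU-format archive only yields
the original name when it is read with the encoding it was written with,
whereas the pax archive stores the name as *UTF-8* and is read correctly either
way.

```python
import io
import tarfile

NAME = "café.txt"


def write_archive(archive_format, encoding):
    """Return the bytes of an archive holding one empty member named NAME."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=archive_format,
                      encoding=encoding) as tar:
        tar.addfile(tarfile.TarInfo(NAME))  # empty member, size 0
    return buffer.getvalue()


def read_names(data, encoding):
    """Return the member names, decoding metadata with ``encoding``."""
    with tarfile.open(fileobj=io.BytesIO(data), encoding=encoding) as tar:
        return tar.getnames()


gnu_data = write_archive(tarfile.GNU_FORMAT, "iso8859-1")
pax_data = write_archive(tarfile.PAX_FORMAT, "iso8859-1")

gnu_matching = read_names(gnu_data, "iso8859-1")   # same encoding as written
gnu_mismatched = read_names(gnu_data, "utf-8")     # wrong encoding
pax_mismatched = read_names(pax_data, "utf-8")     # pax header wins

print(gnu_matching, gnu_mismatched == gnu_matching, pax_mismatched)
```

In the mismatched GNU case the undecodable byte is kept as a lone surrogate by
the default ``'surrogateescape'`` error handler, so the name is damaged but no
exception is raised.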