:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/__init__.py`

--------------

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python.  The package itself provides the
SAX exceptions and the convenience functions that users of the SAX API will
need most often.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data.  If you need to parse untrusted or unauthenticated data, see
   :ref:`xml-vulnerabilities`.

.. versionchanged:: 3.7.1

   The SAX parser no longer processes general external entities by default,
   to increase security.  Previously, the parser created network connections
   to fetch remote files, or loaded local files from the file system, for DTDs
   and entities.  The feature can be enabled again by calling the
   :meth:`~xml.sax.xmlreader.XMLReader.setFeature` method on the parser object
   with the argument :data:`~xml.sax.handler.feature_external_ges`.

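As a minimal sketch of the opt-in described above, the following reads the
feature's default state and then enables it.  It assumes the reader returned by
:func:`make_parser` is the bundled expat-based one, and it should only be used
for documents from a trusted source.

```python
import xml.sax
from xml.sax.handler import feature_external_ges

parser = xml.sax.make_parser()

# With the bundled expat reader (an assumption of this sketch), the feature
# is reported as disabled on a freshly created parser.
default_state = parser.getFeature(feature_external_ges)

# Opt back in; do this only for input you trust.
parser.setFeature(feature_external_ges, True)
enabled_state = parser.getFeature(feature_external_ges)

print(bool(default_state), bool(enabled_state))
```
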
The convenience functions are:


.. function:: make_parser(parser_list=[])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object.  The
   first parser found will be used.  If *parser_list* is provided, it must be an
   iterable of strings which name modules that have a function named
   :func:`create_parser`.  Modules listed in *parser_list* will be used before
   modules in the default list of parsers.

   .. versionchanged:: 3.8
      The *parser_list* argument can be any iterable, not just a list.

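A short sketch of both call forms follows.  Naming ``xml.sax.expatreader``
explicitly is an illustrative choice; it is the driver module shipped with the
standard library, and any module exposing a ``create_parser`` function could be
listed instead.

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# Let the package pick the first available parser from its default list.
default_reader = xml.sax.make_parser()

# Ask for a specific driver module first; it is tried before the defaults.
explicit_reader = xml.sax.make_parser(["xml.sax.expatreader"])

print(isinstance(default_reader, XMLReader))
print(isinstance(explicit_reader, XMLReader))
```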

.. function:: parse(filename_or_stream, handler, error_handler=handler.ErrorHandler())

   Create a SAX parser and use it to parse a document.  The document, passed in as
   *filename_or_stream*, can be a filename or a file object.  The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance.  If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.

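The following sketch shows :func:`parse` reading from a file object.  The
``ElementCounter`` class and the sample document are hypothetical; the point is
that results are collected on the handler because :func:`parse` itself returns
nothing.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that tallies how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


# Any binary file object works; an in-memory stream keeps the sketch runnable.
document = io.BytesIO(b"<catalog><book/><book/><cd/></catalog>")
counter = ElementCounter()
result = xml.sax.parse(document, counter)

print(counter.counts)
```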

.. function:: parseString(string, handler, error_handler=handler.ErrorHandler())

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.  *string* must be a :class:`str` instance or a
   :term:`bytes-like object`.

   .. versionchanged:: 3.5
      Added support for :class:`str` instances.

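As an illustration, the sketch below feeds the same document to
:func:`parseString` once as :class:`str` and once as :class:`bytes`.
``TextCollector`` and ``extract_text`` are hypothetical helper names.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # Character data may arrive in several pieces, so accumulate it.
        self.chunks.append(content)


def extract_text(xml_data):
    collector = TextCollector()
    xml.sax.parseString(xml_data, collector)
    return "".join(collector.chunks)


from_str = extract_text("<greeting>hello</greeting>")
from_bytes = extract_text(b"<greeting>hello</greeting>")
print(from_str == from_bytes)
```
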
A typical SAX application uses three kinds of objects: readers, handlers and
input sources.  "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source, and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler.  A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect these objects all together.  As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.

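The steps above can be sketched as follows: obtain a reader, create a handler,
wrap the input in an input source, connect them, and start parsing.  The
``EventLogger`` class and the tiny document are hypothetical and only serve to
make the event sequence visible.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLogger(ContentHandler):
    """Hypothetical handler that records the events the reader delivers."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the handler and connect it to the reader.
handler = EventLogger()
reader.setContentHandler(handler)

# 3. Create the input source.
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 4. Ask the reader to parse; it calls back into the handler as it goes.
reader.parse(source)
print(handler.events)
```
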
For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself.  Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes.  The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`.  The handler interfaces are defined in
:mod:`xml.sax.handler`.  For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`.  These interfaces are described below.

In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg, exception=None)

   Encapsulate an XML error or warning.  This class can contain basic error or
   warning information from either the XML parser or the application: it can be
   subclassed to provide additional functionality or to add localization.  Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception --- it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.

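One way an application might use :exc:`SAXException` for its own errors is to
raise it from a handler to stop parsing early.  The sketch below assumes a
hypothetical ``DepthLimiter`` handler and ``check_depth`` helper; neither is
part of the package.

```python
import xml.sax
from xml.sax import SAXException
from xml.sax.handler import ContentHandler


class DepthLimiter(ContentHandler):
    """Hypothetical handler that rejects documents nested too deeply."""

    def __init__(self, max_depth):
        super().__init__()
        self.max_depth = max_depth
        self.depth = 0

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth > self.max_depth:
            # Application-level error reported through the SAX exception type.
            raise SAXException(f"element <{name}> exceeds depth {self.max_depth}")

    def endElement(self, name):
        self.depth -= 1


def check_depth(xml_text, max_depth):
    """Return None if the document is acceptable, else the error message."""
    try:
        xml.sax.parseString(xml_text, DepthLimiter(max_depth))
    except SAXException as exc:
        return exc.getMessage()
    return None


print(check_depth("<a><b/></a>", 2))
print(check_depth("<a><b><c/></b></a>", 2))
```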

.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors. Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error.  This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.

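Because the exception supports the locator interface, the position of a parse
error can be read from it directly.  The following is a sketch only;
``describe_error`` is a hypothetical helper, and the exact message text depends
on the underlying parser.

```python
import xml.sax
from xml.sax import SAXParseException
from xml.sax.handler import ContentHandler


def describe_error(xml_text):
    """Return (line, column, message) for the first parse error, or None."""
    try:
        xml.sax.parseString(xml_text, ContentHandler())
    except SAXParseException as exc:
        # Locator methods give the position; getMessage() gives the text.
        return exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage()
    return None


# The closing tag on the third line does not match the open <unclosed> element.
broken = describe_error("<root>\n  <unclosed>\n</root>")
well_formed = describe_error("<root/>")
print(broken)
print(well_formed)
```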

.. exception:: SAXNotRecognizedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property.  SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support.  SAX applications and extensions may use this
   class for similar purposes.

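The difference between the two exceptions can be seen by setting features on a
reader.  This sketch assumes the bundled expat-based reader, which does not
support validation; the feature URI under ``example.org`` is made up, and
``try_feature`` is a hypothetical helper.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_validation


def try_feature(reader, name, state):
    """Attempt to set a feature and report how the reader responded."""
    try:
        reader.setFeature(name, state)
    except SAXNotRecognizedException:
        return "not recognized"
    except SAXNotSupportedException:
        return "not supported"
    return "ok"


reader = xml.sax.make_parser()

# A feature name the reader has never heard of.
unknown = try_feature(reader, "http://example.org/features/made-up", True)

# A known feature that the expat-based reader cannot turn on.
validation = try_feature(reader, feature_validation, True)

print(unknown, validation)
```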

.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API.  It provides a
      Java implementation and online documentation.  Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.

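A brief sketch of both methods, using a :exc:`SAXException` that wraps a caught
:exc:`ValueError` and one that wraps nothing.  The messages are illustrative.

```python
from xml.sax import SAXException

# Wrap an exception caught elsewhere so it travels along as information.
try:
    int("not a number")
except ValueError as original:
    wrapped = SAXException("could not convert attribute value", original)

# Without an underlying exception, getException() has nothing to return.
plain = SAXException("no underlying exception")

print(wrapped.getMessage())
print(type(wrapped.getException()).__name__)
print(plain.getException())
```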