****************************
  What's New In Python 3.1
****************************

:Author: Raymond Hettinger

.. $Id$
   Rules for maintenance:

   * Anyone can add text to this document.  Do not spend very much time
   on the wording of your changes, because your text will probably
   get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
   changes; it's therefore more important to add your changes to
   Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
   is the purpose of Misc/NEWS.  Some changes I consider too small
   or esoteric to include.  If such a change is added to the text,
   I'll just remove it.  (This is another reason you shouldn't spend
   too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
   maintainer, add 'XXX' to the beginning of the paragraph or
   section.

   * It's OK to just add a fragmentary note about a change.  For
   example: "XXX Describe the transmogrify() function added to the
   socket module."  The maintainer will research the change and
   write the necessary text.

   * You can comment out your additions if you like, but it's not
   necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix.   Just the name is
   sufficient; the e-mail address isn't necessary.

   * It's helpful to add the bug/patch number as a comment:

   % Patch 12345
   XXX Describe the transmogrify() function added to the socket
   module.
   (Contributed by P.Y. Developer.)

   This saves the maintainer the effort of going through the SVN log
   when researching a change.

This article explains the new features in Python 3.1, compared to 3.0.
Python 3.1 was released on June 27, 2009.


PEP 372: Ordered Dictionaries
=============================

Regular Python dictionaries iterate over key/value pairs in arbitrary order.
Over the years, a number of authors have written alternative implementations
that remember the order in which the keys were originally inserted.  Based on
the experiences from those implementations, a new
:class:`collections.OrderedDict` class has been introduced.

The OrderedDict API is substantially the same as that of regular dictionaries
but will iterate over keys and values in a guaranteed order depending on
when a key was first inserted.  If a new entry overwrites an existing entry,
the original insertion position is left unchanged.  Deleting an entry and
reinserting it will move it to the end.
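
The ordering rules just described can be checked with a short sketch.  The
key names are arbitrary placeholders chosen for illustration::

    from collections import OrderedDict

    d = OrderedDict()
    d['first'] = 1
    d['second'] = 2
    d['third'] = 3

    # Overwriting an existing key keeps its original position.
    d['first'] = 10
    order_after_overwrite = list(d)   # ['first', 'second', 'third']

    # Deleting a key and reinserting it moves it to the end.
    del d['second']
    d['second'] = 2
    order_after_reinsert = list(d)    # ['first', 'third', 'second']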

The standard library now supports use of ordered dictionaries in several
modules.  The :mod:`configparser` module uses them by default.  This lets
configuration files be read, modified, and then written back in their original
order.  The *_asdict()* method for :func:`collections.namedtuple` now
returns an ordered dictionary with the values appearing in the same order as
the underlying tuple indices.  The :mod:`json` module has gained
an *object_pairs_hook* to allow OrderedDicts to be built by the decoder.
Support was also added for third-party tools like `PyYAML <https://pyyaml.org/>`_.
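
As a minimal sketch of the decoder hook, the snippet below passes
:class:`collections.OrderedDict` as the *object_pairs_hook* so that decoded
objects keep the key order of the JSON text.  The sample document and
variable names are made up for illustration::

    import json
    from collections import OrderedDict

    text = '{"zeta": 1, "alpha": 2, "mid": 3}'

    # The hook receives each JSON object as a list of (key, value) pairs
    # in document order and returns the object to use in its place.
    config = json.loads(text, object_pairs_hook=OrderedDict)
    keys_in_document_order = list(config)   # ['zeta', 'alpha', 'mid']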

.. seealso::

   :pep:`372` - Ordered Dictionaries
      PEP written by Armin Ronacher and Raymond Hettinger.  Implementation
      written by Raymond Hettinger.


PEP 378: Format Specifier for Thousands Separator
=================================================

The built-in :func:`format` function and the :meth:`str.format` method use
a mini-language that now includes a simple, non-locale-aware way to format
a number with a thousands separator.  This provides a way to humanize a
program's output, improving its professional appearance and readability::

    >>> from decimal import Decimal
    >>> format(1234567, ',d')
    '1,234,567'
    >>> format(1234567.89, ',.2f')
    '1,234,567.89'
    >>> format(12345.6 + 8901234.12j, ',f')
    '12,345.600000+8,901,234.120000j'
    >>> format(Decimal('1234567.89'), ',f')
    '1,234,567.89'

The supported types are :class:`int`, :class:`float`, :class:`complex`
and :class:`decimal.Decimal`.

Discussions are underway about how to specify alternative separators
like dots, spaces, apostrophes, or underscores.  Locale-aware applications
should use the existing *n* format specifier which already has some support
for thousands separators.
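
Where a fixed alternative separator is needed, one possible workaround is to
format with the comma and then substitute characters afterwards.  This is only
a sketch of that idea and is not something :pep:`378` itself specifies; the
variable names and the ``'X'`` placeholder are arbitrary::

    amount = 1234567.891

    # The ',' option also works inside str.format() replacement fields.
    with_commas = '{:,.2f}'.format(amount)          # '1,234,567.89'

    # Swap the separator for a space.
    with_spaces = with_commas.replace(',', ' ')     # '1 234 567.89'

    # Exchange commas and dots by going through a temporary placeholder.
    european = (with_commas.replace(',', 'X')
                           .replace('.', ',')
                           .replace('X', '.'))      # '1.234.567,89'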

.. seealso::

   :pep:`378` - Format Specifier for Thousands Separator
      PEP written by Raymond Hettinger and implemented by Eric Smith and
      Mark Dickinson.


Other Language Changes
======================

Some smaller changes made to the core Python language are:

* Directories and zip archives containing a :file:`__main__.py`
  file can now be executed directly by passing their name to the
  interpreter. The directory/zipfile is automatically inserted as the
  first entry in :data:`sys.path`.  (Suggestion and initial patch by Andy Chu;
  revised patch by Phillip J. Eby and Nick Coghlan; :issue:`1739468`.)

* The :class:`int` type gained a ``bit_length`` method that returns the
  number of bits necessary to represent its value in binary::

      >>> n = 37
      >>> bin(n)
      '0b100101'
      >>> n.bit_length()
      6
      >>> n = 2**123-1
      >>> n.bit_length()
      123
      >>> (n+1).bit_length()
      124

  (Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
  and Mark Dickinson; :issue:`3439`.)

* The fields in :meth:`str.format` strings can now be automatically
  numbered::

    >>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
    'Sir Gallahad of Camelot'

  Formerly, the string would have required numbered fields such as:
  ``'Sir {0} of {1}'``.

  (Contributed by Eric Smith; :issue:`5237`.)

* The :func:`string.maketrans` function is deprecated and is replaced by new
  static methods, :meth:`bytes.maketrans` and :meth:`bytearray.maketrans`.
  This change solves the confusion around which types were supported by the
  :mod:`string` module. Now, :class:`str`, :class:`bytes`, and
  :class:`bytearray` each have their own **maketrans** and **translate**
  methods with intermediate translation tables of the appropriate type.

  (Contributed by Georg Brandl; :issue:`5675`.)

* The syntax of the :keyword:`with` statement now allows multiple context
  managers in a single statement::

    >>> with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
    ...     for line in infile:
    ...         if '<critical>' in line:
    ...             outfile.write(line)

  With the new syntax, the :func:`contextlib.nested` function is no longer
  needed and is now deprecated.

  (Contributed by Georg Brandl and Mattias Brändström;
  `appspot issue 53094 <https://codereview.appspot.com/53094>`_.)

* ``round(x, n)`` now returns an integer if *x* is an integer.
  Previously it returned a float::

    >>> round(1123, -2)
    1100

  (Contributed by Mark Dickinson; :issue:`4707`.)

* Python now uses David Gay's algorithm for finding the shortest floating
  point representation that doesn't change its value.  This should help
  mitigate some of the confusion surrounding binary floating point
  numbers.

  The significance is easily seen with a number like ``1.1`` which does not
  have an exact equivalent in binary floating point.  Since there is no exact
  equivalent, an expression like ``float('1.1')`` evaluates to the nearest
  representable value which is ``0x1.199999999999ap+0`` in hex or
  ``1.100000000000000088817841970012523233890533447265625`` in decimal. That
  nearest value was and still is used in subsequent floating point
  calculations.

  What is new is how the number gets displayed.  Formerly, Python used a
  simple approach.  The value of ``repr(1.1)`` was computed as ``format(1.1,
  '.17g')`` which evaluated to ``'1.1000000000000001'``. The advantage of
  using 17 digits was that it relied on IEEE-754 guarantees to assure that
  ``eval(repr(1.1))`` would round-trip exactly to its original value.  The
  disadvantage is that many people found the output to be confusing (mistaking
  intrinsic limitations of binary floating point representation for a
  problem with Python itself).

  The new algorithm for ``repr(1.1)`` is smarter and returns ``'1.1'``.
  Effectively, it searches all equivalent string representations (ones that
  get stored with the same underlying float value) and returns the shortest
  representation.

  The new algorithm tends to emit cleaner representations when possible, but
  it does not change the underlying values.  So, it is still the case that
  ``1.1 + 2.2 != 3.3`` even though the representations may suggest otherwise.

  The new algorithm depends on certain features in the underlying floating
  point implementation.  If the required features are not found, the old
  algorithm will continue to be used.  Also, the text pickle protocols
  assure cross-platform portability by using the old algorithm.

  (Contributed by Eric Smith and Mark Dickinson; :issue:`1580`.)
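
The floating point display change described in the last item can be explored
with a short sketch.  It assumes a platform where the new shortest-repr
algorithm is in use; the variable names are illustrative only::

    x = 1.1

    short_form = repr(x)              # shortest string that round-trips
    long_form = format(x, '.17g')     # the 17-digit form used previously
    stored_value = x.hex()            # the binary value actually stored

    # Both strings convert back to exactly the same float.
    round_trips = float(short_form) == x and float(long_form) == x

    # The display is cleaner, but the arithmetic is unchanged.
    still_unequal = (1.1 + 2.2 != 3.3)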

New, Improved, and Deprecated Modules
=====================================

* Added a :class:`collections.Counter` class to support convenient
  counting of unique items in a sequence or iterable::

      >>> from collections import Counter
      >>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
      Counter({'blue': 3, 'red': 2, 'green': 1})

  (Contributed by Raymond Hettinger; :issue:`1696199`.)

* Added a new module, :mod:`tkinter.ttk` for access to the Tk themed widget set.
  The basic idea of ttk is to separate, to the extent possible, the code
  implementing a widget's behavior from the code implementing its appearance.

  (Contributed by Guilherme Polo; :issue:`2983`.)

* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
  the context management protocol::

        >>> # Automatically close file after writing
        >>> with gzip.GzipFile(filename, "wb") as f:
        ...     f.write(b"xxx")

  (Contributed by Antoine Pitrou.)

* The :mod:`decimal` module now supports methods for creating a
  decimal object from a binary :class:`float`.  The conversion is
  exact but can sometimes be surprising::

      >>> Decimal.from_float(1.1)
      Decimal('1.100000000000000088817841970012523233890533447265625')

  The long decimal result shows the actual binary fraction being
  stored for *1.1*.  The fraction has many digits because *1.1* cannot
  be exactly represented in binary.

  (Contributed by Raymond Hettinger and Mark Dickinson.)

* The :mod:`itertools` module grew two new functions.  The
  :func:`itertools.combinations_with_replacement` function is one of
  four functions for generating combinatorics, including permutations and
  Cartesian products.  The :func:`itertools.compress` function mimics its
  namesake from APL.  Also, the existing :func:`itertools.count` function now
  has an optional *step* argument and can accept any type of counting
  sequence including :class:`fractions.Fraction` and
  :class:`decimal.Decimal`::

    >>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
    ['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']

    >>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
    [2, 3, 5, 7]

    >>> c = count(start=Fraction(1,2), step=Fraction(1,6))
    >>> [next(c), next(c), next(c), next(c)]
    [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1)]

  (Contributed by Raymond Hettinger.)

* :func:`collections.namedtuple` now supports a keyword argument
  *rename* which lets invalid fieldnames be automatically converted to
  positional names in the form _0, _1, etc.  This is useful when
  the field names are being created by an external source such as a
  CSV header, SQL field list, or user input::

    >>> query = input()
    SELECT region, dept, count(*) FROM main GROUP BY region, dept

    >>> cursor.execute(query)
    >>> query_fields = [desc[0] for desc in cursor.description]
    >>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
    >>> pprint.pprint([UserQuery(*row) for row in cursor])
    [UserQuery(region='South', dept='Shipping', _2=185),
     UserQuery(region='North', dept='Accounting', _2=37),
     UserQuery(region='West', dept='Sales', _2=419)]

  (Contributed by Raymond Hettinger; :issue:`1818`.)

* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
  accept a *flags* parameter.

  (Contributed by Gregory Smith.)

* The :mod:`logging` module now implements a simple :class:`logging.NullHandler`
  class for applications that are not using logging but are calling
  library code that does.  Setting up a null handler will suppress
  spurious warnings such as "No handlers could be found for logger foo"::

    >>> h = logging.NullHandler()
    >>> logging.getLogger("foo").addHandler(h)

  (Contributed by Vinay Sajip; :issue:`4384`.)

* The :mod:`runpy` module which supports the ``-m`` command line switch
  now supports the execution of packages by looking for and executing
  a ``__main__`` submodule when a package name is supplied.

  (Contributed by Andi Vajda; :issue:`4195`.)

* The :mod:`pdb` module can now access and display source code loaded via
  :mod:`zipimport` (or any other conformant :pep:`302` loader).

  (Contributed by Alexander Belopolsky; :issue:`4201`.)

* :class:`functools.partial` objects can now be pickled.

  (Suggested by Antoine Pitrou and Jesse Noller.  Implemented by
  Jack Diederich; :issue:`5228`.)

* Added :mod:`pydoc` help topics for symbols so that ``help('@')``
  works as expected in the interactive environment.

  (Contributed by David Laban; :issue:`4739`.)

* The :mod:`unittest` module now supports skipping individual tests or classes
  of tests. It also supports marking a test as an expected failure: a test
  that is known to be broken but shouldn't be counted as a failure on a
  TestResult::

    class TestGizmo(unittest.TestCase):

        @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
        def test_gizmo_on_windows(self):
            ...

        @unittest.expectedFailure
        def test_gizmo_without_required_library(self):
            ...

  Also, tests for exceptions have been extended to work with context managers
  using the :keyword:`with` statement::

      def test_division_by_zero(self):
          with self.assertRaises(ZeroDivisionError):
              1 / 0

  In addition, several new assertion methods were added including
  :func:`assertSetEqual`, :func:`assertDictEqual`,
  :func:`assertDictContainsSubset`, :func:`assertListEqual`,
  :func:`assertTupleEqual`, :func:`assertSequenceEqual`,
  :func:`assertRaisesRegexp`, :func:`assertIsNone`,
  and :func:`assertIsNotNone`.

  (Contributed by Benjamin Peterson and Antoine Pitrou.)

* The :mod:`io` module has three new constants for the :meth:`seek`
  method: :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.

* The :attr:`sys.version_info` tuple is now a named tuple::

    >>> sys.version_info
    sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)

  (Contributed by Ross Light; :issue:`4285`.)

* The :mod:`nntplib` and :mod:`imaplib` modules now support IPv6.

  (Contributed by Derek Morr; :issue:`1655` and :issue:`1664`.)

* The :mod:`pickle` module has been adapted for better interoperability with
  Python 2.x when used with protocol 2 or lower.  The reorganization of the
  standard library changed the formal reference for many objects.  For
  example, ``__builtin__.set`` in Python 2 is called ``builtins.set`` in Python
  3. This change confounded efforts to share data between different versions of
  Python.  But now when protocol 2 or lower is selected, the pickler will
  automatically use the old Python 2 names for both loading and dumping. This
  remapping is turned on by default but can be disabled with the *fix_imports*
  option::

    >>> s = {1, 2, 3}
    >>> pickle.dumps(s, protocol=0)
    b'c__builtin__\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
    >>> pickle.dumps(s, protocol=0, fix_imports=False)
    b'cbuiltins\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'

  An unfortunate but unavoidable side-effect of this change is that protocol 2
  pickles produced by Python 3.1 won't be readable with Python 3.0. The latest
  pickle protocol, protocol 3, should be used when migrating data between
  Python 3.x implementations, as it doesn't attempt to remain compatible with
  Python 2.x.

  (Contributed by Alexandre Vassalotti and Antoine Pitrou, :issue:`6137`.)

* A new module, :mod:`importlib` was added.  It provides a complete, portable,
  pure Python reference implementation of the :keyword:`import` statement and its
  counterpart, the :func:`__import__` function.  It represents a substantial
  step forward in documenting and defining the actions that take place during
  imports.

  (Contributed by Brett Cannon.)
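
As a brief illustration of the :mod:`importlib` entry above, the sketch below
uses :func:`importlib.import_module` to load modules whose names are only
known at run time.  The module name ``'json'`` is merely a stand-in for a
value that might come from configuration::

    import importlib

    module_name = 'json'   # stand-in for a name chosen at run time

    # Absolute import by name, equivalent to "import json".
    json_module = importlib.import_module(module_name)
    decoded = json_module.loads('[1, 2, 3]')

    # Relative import, resolved against the anchoring package.
    encoder = importlib.import_module('.encoder', package='json')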

Optimizations
=============

Major performance enhancements have been added:

* The new I/O library (as defined in :pep:`3116`) was mostly written in
  Python and quickly proved to be a problematic bottleneck in Python 3.0.
  In Python 3.1, the I/O library has been entirely rewritten in C and is
  2 to 20 times faster depending on the task at hand. The pure Python
  version is still available for experimentation purposes through
  the ``_pyio`` module.

  (Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)

* Added a heuristic so that tuples and dicts containing only untrackable objects
  are not tracked by the garbage collector. This can reduce the size of
  collections and therefore the garbage collection overhead on long-running
  programs, depending on their particular use of datatypes.

  (Contributed by Antoine Pitrou, :issue:`4688`.)

* When the ``--with-computed-gotos`` configure option is enabled
  on compilers that support it (notably: gcc, SunPro, icc), the bytecode
  evaluation loop is compiled with a new dispatch mechanism which gives
  speedups of up to 20%, depending on the system, the compiler, and
  the benchmark.

  (Contributed by Antoine Pitrou along with a number of other participants,
  :issue:`4753`.)

* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
  faster.

  (Contributed by Antoine Pitrou and Amaury Forgeot d'Arc, :issue:`4868`.)

* The :mod:`json` module now has a C extension to substantially improve
  its performance.  In addition, the API was modified so that json works
  only with :class:`str`, not with :class:`bytes`.  That change makes the
  module closely match the `JSON specification <https://json.org/>`_
  which is defined in terms of Unicode.

  (Contributed by Bob Ippolito and converted to Py3.1 by Antoine Pitrou
  and Benjamin Peterson; :issue:`4136`.)

* Unpickling now interns the attribute names of pickled objects.  This saves
  memory and allows pickles to be smaller.

  (Contributed by Jake McGuire and Antoine Pitrou; :issue:`5084`.)

IDLE
====

* IDLE's format menu now provides an option to strip trailing whitespace
  from a source file.

  (Contributed by Roger D. Serwy; :issue:`5150`.)

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* Integers are now stored internally either in base ``2**15`` or in base
  ``2**30``, the base being determined at build time.  Previously, they
  were always stored in base ``2**15``.  Using base ``2**30`` gives
  significant performance improvements on 64-bit machines, but
  benchmark results on 32-bit machines have been mixed.  Therefore,
  the default is to use base ``2**30`` on 64-bit machines and base ``2**15``
  on 32-bit machines; on Unix, there's a new configure option
  ``--enable-big-digits`` that can be used to override this default.

  Apart from the performance improvements this change should be invisible to
  end users, with one exception: for testing and debugging purposes there's a
  new :attr:`sys.int_info` that provides information about the
  internal format, giving the number of bits per digit and the size in bytes
  of the C type used to store each digit::

     >>> import sys
     >>> sys.int_info
     sys.int_info(bits_per_digit=30, sizeof_digit=4)

  (Contributed by Mark Dickinson; :issue:`4258`.)

* The :c:func:`PyLong_AsUnsignedLongLong()` function now handles a negative
  *pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.

  (Contributed by Mark Dickinson and Lisandro Dalcrin; :issue:`5175`.)

* Deprecated :c:func:`PyNumber_Int`.  Use :c:func:`PyNumber_Long` instead.

  (Contributed by Mark Dickinson; :issue:`4910`.)

* Added a new :c:func:`PyOS_string_to_double` function to replace the
  deprecated functions :c:func:`PyOS_ascii_strtod` and :c:func:`PyOS_ascii_atof`.

  (Contributed by Mark Dickinson; :issue:`5914`.)

* Added :c:type:`PyCapsule` as a replacement for the :c:type:`PyCObject` API.
  The principal difference is that the new type has a well-defined interface
  for passing type-safety information and a less complicated signature
  for calling a destructor.  The old type had a problematic API and is now
  deprecated.

  (Contributed by Larry Hastings; :issue:`5630`.)

Porting to Python 3.1
=====================

This section lists previously described changes and other bugfixes
that may require changes to your code:

* The new floating point string representations can break existing doctests.
  For example::

    import doctest
    import math

    def e():
        '''Compute the base of natural logarithms.

        >>> e()
        2.7182818284590451

        '''
        return sum(1/math.factorial(x) for x in reversed(range(30)))

    doctest.testmod()

    **********************************************************************
    Failed example:
        e()
    Expected:
        2.7182818284590451
    Got:
        2.718281828459045
    **********************************************************************

* The automatic name remapping in the pickle module for protocol 2 or lower can
  make Python 3.1 pickles unreadable in Python 3.0.  One solution is to use
  protocol 3.  Another solution is to set the *fix_imports* option to ``False``.
  See the discussion above for more details.
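
The two pickle workarounds in the last item can be sketched as follows.  The
sample data and variable names are illustrative, and the byte-string checks
simply look for the module name embedded in each pickle::

    import pickle

    data = {1, 2, 3}

    # Default for protocol 2: names are remapped to their Python 2 spelling.
    remapped = pickle.dumps(data, protocol=2)
    uses_python2_name = b'__builtin__' in remapped

    # Option 1: keep protocol 2 but turn the remapping off.
    unmapped = pickle.dumps(data, protocol=2, fix_imports=False)
    uses_python3_name = b'builtins' in unmapped

    # Option 2: use protocol 3, which makes no attempt at 2.x compatibility.
    py3_only = pickle.dumps(data, protocol=3)
    restored = pickle.loads(py3_only)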
554