****************************
  What's New In Python 3.1
****************************

:Author: Raymond Hettinger

.. $Id$
   Rules for maintenance:

   * Anyone can add text to this document. Do not spend very much time
   on the wording of your changes, because your text will probably
   get rewritten to some degree.

   * The maintainer will go through Misc/NEWS periodically and add
   changes; it's therefore more important to add your changes to
   Misc/NEWS than to this file.

   * This is not a complete list of every single change; completeness
   is the purpose of Misc/NEWS. Some changes I consider too small
   or esoteric to include. If such a change is added to the text,
   I'll just remove it. (This is another reason you shouldn't spend
   too much time on writing your addition.)

   * If you want to draw your new text to the attention of the
   maintainer, add 'XXX' to the beginning of the paragraph or
   section.

   * It's OK to just add a fragmentary note about a change. For
   example: "XXX Describe the transmogrify() function added to the
   socket module." The maintainer will research the change and
   write the necessary text.

   * You can comment out your additions if you like, but it's not
   necessary (especially when a final release is some months away).

   * Credit the author of a patch or bugfix. Just the name is
   sufficient; the e-mail address isn't necessary.

   * It's helpful to add the bug/patch number as a comment:

   % Patch 12345
   XXX Describe the transmogrify() function added to the socket
   module.
   (Contributed by P.Y. Developer.)

   This saves the maintainer the effort of going through the SVN log
   when researching a change.

This article explains the new features in Python 3.1, compared to 3.0.
Python 3.1 was released on June 27, 2009.


PEP 372: Ordered Dictionaries
=============================

Regular Python dictionaries iterate over key/value pairs in arbitrary order.
Over the years, a number of authors have written alternative implementations
that remember the order in which the keys were originally inserted. Based on
the experiences from those implementations, a new
:class:`collections.OrderedDict` class has been introduced.

The OrderedDict API is substantially the same as regular dictionaries
but will iterate over keys and values in a guaranteed order depending on
when a key was first inserted. If a new entry overwrites an existing entry,
the original insertion position is left unchanged. Deleting an entry and
reinserting it will move it to the end.

The standard library now supports use of ordered dictionaries in several
modules. The :mod:`configparser` module uses them by default. This lets
configuration files be read, modified, and then written back in their original
order. The *_asdict()* method for :func:`collections.namedtuple` now
returns an ordered dictionary with the values appearing in the same order as
the underlying tuple indices. The :mod:`json` module has been extended with
an *object_pairs_hook* to allow OrderedDicts to be built by the decoder.
Support was also added for third-party tools like `PyYAML <https://pyyaml.org/>`_.

.. seealso::

   :pep:`372` - Ordered Dictionaries
      PEP written by Armin Ronacher and Raymond Hettinger. Implementation
      written by Raymond Hettinger.


PEP 378: Format Specifier for Thousands Separator
=================================================

The built-in :func:`format` function and the :meth:`str.format` method use
a mini-language that now includes a simple, non-locale-aware way to format
a number with a thousands separator.
That provides a way to humanize a
program's output, improving its professional appearance and readability::

    >>> from decimal import Decimal
    >>> format(1234567, ',d')
    '1,234,567'
    >>> format(1234567.89, ',.2f')
    '1,234,567.89'
    >>> format(12345.6 + 8901234.12j, ',f')
    '12,345.600000+8,901,234.120000j'
    >>> format(Decimal('1234567.89'), ',f')
    '1,234,567.89'

The supported types are :class:`int`, :class:`float`, :class:`complex`
and :class:`decimal.Decimal`.

Discussions are underway about how to specify alternative separators
like dots, spaces, apostrophes, or underscores. Locale-aware applications
should use the existing *n* format specifier which already has some support
for thousands separators.

.. seealso::

   :pep:`378` - Format Specifier for Thousands Separator
      PEP written by Raymond Hettinger and implemented by Eric Smith and
      Mark Dickinson.


Other Language Changes
======================

Some smaller changes made to the core Python language are:

* Directories and zip archives containing a :file:`__main__.py`
  file can now be executed directly by passing their name to the
  interpreter. The directory/zipfile is automatically inserted as the
  first entry in ``sys.path``. (Suggestion and initial patch by Andy Chu;
  revised patch by Phillip J. Eby and Nick Coghlan; :issue:`1739468`.)

* The :class:`int` type gained a ``bit_length`` method that returns the
  number of bits necessary to represent its argument in binary::

      >>> n = 37
      >>> bin(n)
      '0b100101'
      >>> n.bit_length()
      6
      >>> n = 2**123-1
      >>> n.bit_length()
      123
      >>> (n+1).bit_length()
      124

  (Contributed by Fredrik Johansson, Victor Stinner, Raymond Hettinger,
  and Mark Dickinson; :issue:`3439`.)

* The fields in :meth:`str.format` strings can now be automatically
  numbered::

    >>> 'Sir {} of {}'.format('Gallahad', 'Camelot')
    'Sir Gallahad of Camelot'

  Formerly, the string would have required numbered fields such as:
  ``'Sir {0} of {1}'``.

  (Contributed by Eric Smith; :issue:`5237`.)

* The :func:`string.maketrans` function is deprecated and is replaced by new
  static methods, :meth:`bytes.maketrans` and :meth:`bytearray.maketrans`.
  This change solves the confusion around which types were supported by the
  :mod:`string` module. Now, :class:`str`, :class:`bytes`, and
  :class:`bytearray` each have their own **maketrans** and **translate**
  methods with intermediate translation tables of the appropriate type.

  (Contributed by Georg Brandl; :issue:`5675`.)

* The syntax of the :keyword:`with` statement now allows multiple context
  managers in a single statement::

    >>> with open('mylog.txt') as infile, open('a.out', 'w') as outfile:
    ...     for line in infile:
    ...         if '<critical>' in line:
    ...             outfile.write(line)

  With the new syntax, the :func:`contextlib.nested` function is no longer
  needed and is now deprecated.

  (Contributed by Georg Brandl and Mattias Brändström;
  `appspot issue 53094 <https://codereview.appspot.com/53094>`_.)

* ``round(x, n)`` now returns an integer if *x* is an integer.
  Previously it returned a float::

    >>> round(1123, -2)
    1100

  (Contributed by Mark Dickinson; :issue:`4707`.)

* Python now uses David Gay's algorithm for finding the shortest floating
  point representation that doesn't change its value. This should help
  mitigate some of the confusion surrounding binary floating point
  numbers.

  The significance is easily seen with a number like ``1.1`` which does not
  have an exact equivalent in binary floating point.
  Since there is no exact
  equivalent, an expression like ``float('1.1')`` evaluates to the nearest
  representable value which is ``0x1.199999999999ap+0`` in hex or
  ``1.100000000000000088817841970012523233890533447265625`` in decimal. That
  nearest value was and still is used in subsequent floating point
  calculations.

  What is new is how the number gets displayed. Formerly, Python used a
  simple approach. The value of ``repr(1.1)`` was computed as
  ``format(1.1, '.17g')`` which evaluated to ``'1.1000000000000001'``. The
  advantage of using 17 digits was that it relied on IEEE-754 guarantees to
  assure that ``eval(repr(1.1))`` would round-trip exactly to its original
  value. The disadvantage is that many people found the output to be confusing
  (mistaking intrinsic limitations of binary floating point representation as
  being a problem with Python itself).

  The new algorithm for ``repr(1.1)`` is smarter and returns ``'1.1'``.
  Effectively, it searches all equivalent string representations (ones that
  get stored with the same underlying float value) and returns the shortest
  representation.

  The new algorithm tends to emit cleaner representations when possible, but
  it does not change the underlying values. So, it is still the case that
  ``1.1 + 2.2 != 3.3`` even though the representations may suggest otherwise.

  The new algorithm depends on certain features in the underlying floating
  point implementation. If the required features are not found, the old
  algorithm will continue to be used. Also, the text pickle protocols
  assure cross-platform portability by using the old algorithm.

  (Contributed by Eric Smith and Mark Dickinson; :issue:`1580`.)

New, Improved, and Deprecated Modules
=====================================

* Added a :class:`collections.Counter` class to support convenient
  counting of unique items in a sequence or iterable::

      >>> Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
      Counter({'blue': 3, 'red': 2, 'green': 1})

  (Contributed by Raymond Hettinger; :issue:`1696199`.)

* Added a new module, :mod:`tkinter.ttk`, for access to the Tk themed widget set.
  The basic idea of ttk is to separate, to the extent possible, the code
  implementing a widget's behavior from the code implementing its appearance.

  (Contributed by Guilherme Polo; :issue:`2983`.)

* The :class:`gzip.GzipFile` and :class:`bz2.BZ2File` classes now support
  the context management protocol::

      >>> # Automatically close file after writing
      >>> with gzip.GzipFile(filename, "wb") as f:
      ...     f.write(b"xxx")

  (Contributed by Antoine Pitrou.)

* The :mod:`decimal` module now supports methods for creating a
  decimal object from a binary :class:`float`. The conversion is
  exact but can sometimes be surprising::

      >>> Decimal.from_float(1.1)
      Decimal('1.100000000000000088817841970012523233890533447265625')

  The long decimal result shows the actual binary fraction being
  stored for *1.1*. The fraction has many digits because *1.1* cannot
  be exactly represented in binary.

  (Contributed by Raymond Hettinger and Mark Dickinson.)

* The :mod:`itertools` module grew two new functions. The
  :func:`itertools.combinations_with_replacement` function is one of
  four combinatoric generators, which also cover permutations and Cartesian
  products. The :func:`itertools.compress` function mimics its namesake
  from APL.
  Also, the existing :func:`itertools.count` function now has
  an optional *step* argument and can count with any numeric type,
  including :class:`fractions.Fraction` and
  :class:`decimal.Decimal`::

    >>> [p+q for p,q in combinations_with_replacement('LOVE', 2)]
    ['LL', 'LO', 'LV', 'LE', 'OO', 'OV', 'OE', 'VV', 'VE', 'EE']

    >>> list(compress(data=range(10), selectors=[0,0,1,1,0,1,0,1,0,0]))
    [2, 3, 5, 7]

    >>> c = count(start=Fraction(1,2), step=Fraction(1,6))
    >>> [next(c), next(c), next(c), next(c)]
    [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6), Fraction(1, 1)]

  (Contributed by Raymond Hettinger.)

* :func:`collections.namedtuple` now supports a keyword argument
  *rename* which lets invalid fieldnames be automatically converted to
  positional names in the form ``_0``, ``_1``, etc. This is useful when
  the field names are being created by an external source such as a
  CSV header, SQL field list, or user input::

    >>> query = input()
    SELECT region, dept, count(*) FROM main GROUP BY region, dept

    >>> cursor.execute(query)
    >>> query_fields = [desc[0] for desc in cursor.description]
    >>> UserQuery = namedtuple('UserQuery', query_fields, rename=True)
    >>> pprint.pprint([UserQuery(*row) for row in cursor])
    [UserQuery(region='South', dept='Shipping', _2=185),
     UserQuery(region='North', dept='Accounting', _2=37),
     UserQuery(region='West', dept='Sales', _2=419)]

  (Contributed by Raymond Hettinger; :issue:`1818`.)

* The :func:`re.sub`, :func:`re.subn` and :func:`re.split` functions now
  accept a *flags* parameter.

  (Contributed by Gregory Smith.)

* The :mod:`logging` module now implements a simple :class:`logging.NullHandler`
  class for applications that are not using logging but are calling
  library code that does.
  Setting up a null handler will suppress
  spurious warnings such as "No handlers could be found for logger foo"::

    >>> h = logging.NullHandler()
    >>> logging.getLogger("foo").addHandler(h)

  (Contributed by Vinay Sajip; :issue:`4384`.)

* The :mod:`runpy` module, which supports the ``-m`` command line switch,
  now supports the execution of packages by looking for and executing
  a ``__main__`` submodule when a package name is supplied.

  (Contributed by Andi Vajda; :issue:`4195`.)

* The :mod:`pdb` module can now access and display source code loaded via
  :mod:`zipimport` (or any other conformant :pep:`302` loader).

  (Contributed by Alexander Belopolsky; :issue:`4201`.)

* :class:`functools.partial` objects can now be pickled.

  (Suggested by Antoine Pitrou and Jesse Noller. Implemented by
  Jack Diederich; :issue:`5228`.)

* Added :mod:`pydoc` help topics for symbols so that ``help('@')``
  works as expected in the interactive environment.

  (Contributed by David Laban; :issue:`4739`.)

* The :mod:`unittest` module now supports skipping individual tests or classes
  of tests. It also supports marking a test as an expected failure: a test that
  is known to be broken, but shouldn't be counted as a failure on a
  TestResult::

    class TestGizmo(unittest.TestCase):

        @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
        def test_gizmo_on_windows(self):
            ...

        @unittest.expectedFailure
        def test_gizmo_without_required_library(self):
            ...

  Also, tests for exceptions have been extended to work with context managers
  using the :keyword:`with` statement::

      def test_division_by_zero(self):
          with self.assertRaises(ZeroDivisionError):
              1 / 0

  In addition, several new assertion methods were added including
  :func:`assertSetEqual`, :func:`assertDictEqual`,
  :func:`assertDictContainsSubset`, :func:`assertListEqual`,
  :func:`assertTupleEqual`, :func:`assertSequenceEqual`,
  :func:`assertRaisesRegexp`, :func:`assertIsNone`,
  and :func:`assertIsNotNone`.

  (Contributed by Benjamin Peterson and Antoine Pitrou.)

* The :mod:`io` module has three new constants for the :meth:`seek`
  method: :data:`SEEK_SET`, :data:`SEEK_CUR`, and :data:`SEEK_END`.

* The :attr:`sys.version_info` tuple is now a named tuple::

    >>> sys.version_info
    sys.version_info(major=3, minor=1, micro=0, releaselevel='alpha', serial=2)

  (Contributed by Ross Light; :issue:`4285`.)

* The :mod:`nntplib` and :mod:`imaplib` modules now support IPv6.

  (Contributed by Derek Morr; :issue:`1655` and :issue:`1664`.)

* The :mod:`pickle` module has been adapted for better interoperability with
  Python 2.x when used with protocol 2 or lower. The reorganization of the
  standard library changed the formal reference for many objects. For
  example, ``__builtin__.set`` in Python 2 is called ``builtins.set`` in
  Python 3. This change confounded efforts to share data between different
  versions of Python. But now when protocol 2 or lower is selected, the pickler
  will automatically use the old Python 2 names for both loading and dumping.
  This remapping is turned on by default but can be disabled with the
  *fix_imports* option::

    >>> s = {1, 2, 3}
    >>> pickle.dumps(s, protocol=0)
    b'c__builtin__\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'
    >>> pickle.dumps(s, protocol=0, fix_imports=False)
    b'cbuiltins\nset\np0\n((lp1\nL1L\naL2L\naL3L\natp2\nRp3\n.'

  An unfortunate but unavoidable side-effect of this change is that protocol 2
  pickles produced by Python 3.1 won't be readable with Python 3.0. The latest
  pickle protocol, protocol 3, should be used when migrating data between
  Python 3.x implementations, as it doesn't attempt to remain compatible with
  Python 2.x.

  (Contributed by Alexandre Vassalotti and Antoine Pitrou, :issue:`6137`.)

* A new module, :mod:`importlib`, was added. It provides a complete, portable,
  pure Python reference implementation of the :keyword:`import` statement and its
  counterpart, the :func:`__import__` function. It represents a substantial
  step forward in documenting and defining the actions that take place during
  imports.

  (Contributed by Brett Cannon.)

Optimizations
=============

Major performance enhancements have been added:

* The new I/O library (as defined in :pep:`3116`) was mostly written in
  Python and quickly proved to be a problematic bottleneck in Python 3.0.
  In Python 3.1, the I/O library has been entirely rewritten in C and is
  2 to 20 times faster depending on the task at hand. The pure Python
  version is still available for experimentation purposes through
  the ``_pyio`` module.

  (Contributed by Amaury Forgeot d'Arc and Antoine Pitrou.)

* Added a heuristic so that tuples and dicts containing only untrackable objects
  are not tracked by the garbage collector. This can reduce the size of
  collections and therefore the garbage collection overhead on long-running
  programs, depending on their particular use of datatypes.

  (Contributed by Antoine Pitrou, :issue:`4688`.)

* When the ``--with-computed-gotos`` configure option is enabled
  on compilers that support it (notably gcc, SunPro, icc), the bytecode
  evaluation loop is compiled with a new dispatch mechanism which gives
  speedups of up to 20%, depending on the system, the compiler, and
  the benchmark.

  (Contributed by Antoine Pitrou along with a number of other participants,
  :issue:`4753`.)

* The decoding of UTF-8, UTF-16 and LATIN-1 is now two to four times
  faster.

  (Contributed by Antoine Pitrou and Amaury Forgeot d'Arc, :issue:`4868`.)

* The :mod:`json` module now has a C extension to substantially improve
  its performance. In addition, the API was modified so that json works
  only with :class:`str`, not with :class:`bytes`. That change makes the
  module closely match the `JSON specification <https://json.org/>`_
  which is defined in terms of Unicode.

  (Contributed by Bob Ippolito and converted to Py3.1 by Antoine Pitrou
  and Benjamin Peterson; :issue:`4136`.)

* Unpickling now interns the attribute names of pickled objects. This saves
  memory and allows pickles to be smaller.

  (Contributed by Jake McGuire and Antoine Pitrou; :issue:`5084`.)

IDLE
====

* IDLE's format menu now provides an option to strip trailing whitespace
  from a source file.

  (Contributed by Roger D. Serwy; :issue:`5150`.)

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* Integers are now stored internally either in base ``2**15`` or in base
  ``2**30``, the base being determined at build time. Previously, they
  were always stored in base ``2**15``. Using base ``2**30`` gives
  significant performance improvements on 64-bit machines, but
  benchmark results on 32-bit machines have been mixed.
  Therefore,
  the default is to use base ``2**30`` on 64-bit machines and base ``2**15``
  on 32-bit machines; on Unix, there's a new configure option
  ``--enable-big-digits`` that can be used to override this default.

  Apart from the performance improvements this change should be invisible to
  end users, with one exception: for testing and debugging purposes there's a
  new :attr:`sys.int_info` that provides information about the
  internal format, giving the number of bits per digit and the size in bytes
  of the C type used to store each digit::

     >>> import sys
     >>> sys.int_info
     sys.int_info(bits_per_digit=30, sizeof_digit=4)

  (Contributed by Mark Dickinson; :issue:`4258`.)

* The :c:func:`PyLong_AsUnsignedLongLong()` function now handles a negative
  *pylong* by raising :exc:`OverflowError` instead of :exc:`TypeError`.

  (Contributed by Mark Dickinson and Lisandro Dalcin; :issue:`5175`.)

* Deprecated :c:func:`PyNumber_Int`. Use :c:func:`PyNumber_Long` instead.

  (Contributed by Mark Dickinson; :issue:`4910`.)

* Added a new :c:func:`PyOS_string_to_double` function to replace the
  deprecated functions :c:func:`PyOS_ascii_strtod` and :c:func:`PyOS_ascii_atof`.

  (Contributed by Mark Dickinson; :issue:`5914`.)

* Added :c:type:`PyCapsule` as a replacement for the :c:type:`PyCObject` API.
  The principal difference is that the new type has a well-defined interface
  for passing type safety information and a less complicated signature
  for calling a destructor. The old type had a problematic API and is now
  deprecated.

  (Contributed by Larry Hastings; :issue:`5630`.)

Porting to Python 3.1
=====================

This section lists previously described changes and other bugfixes
that may require changes to your code:

* The new floating point string representations can break existing doctests.
  For example::

    import doctest
    import math

    def e():
        '''Compute the base of natural logarithms.

        >>> e()
        2.7182818284590451

        '''
        return sum(1/math.factorial(x) for x in reversed(range(30)))

    doctest.testmod()

    **********************************************************************
    Failed example:
        e()
    Expected:
        2.7182818284590451
    Got:
        2.718281828459045
    **********************************************************************

* The automatic name remapping in the pickle module for protocol 2 or lower can
  make Python 3.1 pickles unreadable in Python 3.0. One solution is to use
  protocol 3. Another solution is to set the *fix_imports* option to ``False``.
  See the discussion above for more details.
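
Both porting issues listed above can be checked with a short script. The
following is a minimal sketch rather than part of the original release notes:
the variable names are illustrative, and it assumes an interpreter on which the
shortest-repr algorithm is active.

```python
import pickle

# Float repr: only the displayed string changed, not the stored value.
x = 1.1
short_repr = repr(x)            # shortest string that round-trips
long_repr = format(x, '.17g')   # the pre-3.1 style: 17 significant digits

# Both spellings denote the same float, so a doctest can compare values
# instead of depending on one particular spelling.
same_value = float(short_repr) == float(long_repr) == x

# Pickle: three ways to write the same set.
data = {1, 2, 3}
legacy = pickle.dumps(data, protocol=2)                       # Python 2 names
modern = pickle.dumps(data, protocol=3)                       # Python 3 only
unmapped = pickle.dumps(data, protocol=2, fix_imports=False)  # no remapping

print(short_repr, long_repr, same_value)
print(b'__builtin__' in legacy, b'builtins' in modern, b'builtins' in unmapped)
```

Only ``legacy`` contains the Python 2 module name, which is what made such
pickles unreadable in Python 3.0; the other two payloads use the Python 3 name.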