What's New In Python 3.0

Author: Guido van Rossum

This article explains the new features in Python 3.0, compared to 2.6. Python 3.0, also known as "Python 3000" or "Py3K", is the first-ever intentionally backwards-incompatible Python release. Python 3.0 was released on December 3, 2008. There are more changes than in a typical release, and more that are important for all Python users. Nevertheless, after digesting the changes, you'll find that Python really hasn't changed all that much -- by and large, we're mostly fixing well-known annoyances and warts, and removing a lot of old cruft.

This article doesn't attempt to provide a complete specification of all new features, but instead tries to give a convenient overview. For full details, you should refer to the documentation for Python 3.0, and/or the many PEPs referenced in the text. If you want to understand the complete implementation and design rationale for a particular feature, PEPs usually have more details than the regular documentation; but note that PEPs usually are not kept up-to-date once a feature has been fully implemented.

Due to time constraints this document is not as complete as it should have been. As always for a new release, the Misc/NEWS file in the source distribution contains a wealth of detailed information about every small thing that was changed.

Common Stumbling Blocks

This section lists those few changes that are most likely to trip you up if you're used to Python 2.5.

Print Is A Function

The print statement has been replaced with a :func:`print` function, with keyword arguments to replace most of the special syntax of the old print statement (PEP 3105). Examples:

Old: print "The answer is", 2*2
New: print("The answer is", 2*2)

Old: print x,           # Trailing comma suppresses newline
New: print(x, end=" ")  # Appends a space instead of a newline

Old: print              # Prints a newline
New: print()            # You must call the function!

Old: print >>sys.stderr, "fatal error"
New: print("fatal error", file=sys.stderr)

Old: print (x, y)       # prints repr((x, y))
New: print((x, y))      # Not the same as print(x, y)!

You can also customize the separator between items, e.g.:

print("There are <", 2**32, "> possibilities!", sep="")

which produces:

There are <4294967296> possibilities!

Note:

  • The :func:`print` function doesn't support the "softspace" feature of the old print statement. For example, in Python 2.x, print "A\n", "B" would write "A\nB\n"; but in Python 3.0, print("A\n", "B") writes "A\n B\n".
  • Initially, you'll find yourself typing the old print x a lot in interactive mode. Time to retrain your fingers to type print(x) instead!
  • When using the 2to3 source-to-source conversion tool, all print statements are automatically converted to :func:`print` function calls, so this is mostly a non-issue for larger projects.
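
The keyword arguments described above can be tried together in a small sketch. The ``buffer`` object is a hypothetical stand-in for a real file or ``sys.stderr``, used here only so the output can be inspected:

```python
import io

buffer = io.StringIO()  # hypothetical stand-in for a file or sys.stderr

# sep="" removes the default single space between items.
print("There are <", 2**32, "> possibilities!", sep="", file=buffer)

# end=" " replaces the trailing newline with a space.
print("x", end=" ", file=buffer)
print("y", file=buffer)

# No "softspace" logic: the separator is always written between items.
print("A\n", "B", file=buffer)

captured = buffer.getvalue()
print(captured, end="")
```

Writing to an in-memory stream is only a convenience for the example; any object with a ``write()`` method can be passed as ``file``.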

Views And Iterators Instead Of Lists

Some well-known APIs no longer return lists:
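
For instance, dictionary methods such as ``dict.keys()`` return views and :func:`map` returns an iterator in Python 3. The following is a minimal sketch with made-up sample data, not an exhaustive list of the affected APIs:

```python
inventory = {"apples": 3, "pears": 5}  # hypothetical sample data

keys = inventory.keys()            # a dynamic view, not a list
inventory["plums"] = 7
keys_after_insert = sorted(keys)   # the view reflects the new key

doubled = map(lambda n: n * 2, [1, 2, 3])  # a lazy iterator
doubled_list = list(doubled)       # materialize it when a list is needed
leftover = list(doubled)           # the iterator is now exhausted
```

Wrapping the result in ``list()`` is the usual fix when old code relied on indexing or on iterating twice.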

Ordering Comparisons

Python 3.0 has simplified the rules for ordering comparisons:

  • The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don't have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises :exc:`TypeError` instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense -- all the elements must be comparable to each other. Note that this does not apply to the == and != operators: objects of different incomparable types always compare unequal to each other.
  • :func:`sorted` and :meth:`list.sort` no longer accept the cmp argument providing a comparison function. Use the key argument instead. N.B. the key and reverse arguments are now "keyword-only".
  • The :func:`!cmp` function should be treated as gone, and the :meth:`!__cmp__` special method is no longer supported. Use :meth:`~object.__lt__` for sorting, :meth:`~object.__eq__` with :meth:`~object.__hash__`, and other rich comparisons as needed. (If you really need the :func:`!cmp` functionality, you could use the expression (a > b) - (a < b) as the equivalent for cmp(a, b).)
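
The rules above can be illustrated with a short sketch. The ``cmp`` helper below is a local stand-in built from the expression given in the text, and the sample lists are made up:

```python
def cmp(a, b):
    """Local stand-in for the removed built-in, using the expression above."""
    return (a > b) - (a < b)

words = ["pear", "Apple", "fig"]  # hypothetical sample data

# key= replaces the old cmp= argument; key and reverse are keyword-only.
by_length = sorted(words, key=len)
case_insensitive = sorted(words, key=str.lower, reverse=True)

# Sorting values with no natural ordering raises TypeError.
try:
    sorted([1, "one", None])
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

# Equality across unrelated types is still allowed and is simply False.
unequal = (1 == "1")
```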

Integers

  • PEP 237: Essentially, :class:`!long` renamed to :class:`int`. That is, there is only one built-in integral type, named :class:`int`; but it behaves mostly like the old :class:`!long` type.
  • PEP 238: An expression like 1/2 returns a float. Use 1//2 to get the old integer (floor) division behavior. (The latter syntax has existed for years, at least since Python 2.2.)
  • The :data:`!sys.maxint` constant was removed, since there is no longer a limit to the value of integers. However, :data:`sys.maxsize` can be used as an integer larger than any practical list or string index. It conforms to the implementation's "natural" integer size and is typically the same as :data:`!sys.maxint` in previous releases on the same platform (assuming the same build options).
  • The :func:`repr` of a long integer doesn't include the trailing L anymore, so code that unconditionally strips that character will chop off the last digit instead. (Use :func:`str` instead.)
  • Octal literals are no longer of the form 0720; use 0o720 instead.
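
A brief sketch of the integer changes listed above; the variable names are illustrative only. Note that ``//`` rounds toward negative infinity, which matters for negative operands:

```python
import sys

true_quotient = 1 / 2      # always a float
floor_quotient = 1 // 2    # integer result
negative_floor = -7 // 2   # rounds toward negative infinity

big = 2 ** 100             # no overflow; still a plain int
text = repr(big)           # no trailing "L" to strip

octal = 0o720              # the only accepted octal spelling
index_bound = sys.maxsize  # replaces sys.maxint as a practical upper bound
```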

Text Vs. Data Instead Of Unicode Vs. 8-bit

Everything you thought you knew about binary data and Unicode has changed.

  • Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however, encoded Unicode is represented as binary data. The type used to hold text is :class:`str`, the type used to hold data is :class:`bytes`. The biggest difference with the 2.x situation is that any attempt to mix text and data in Python 3.0 raises :exc:`TypeError`, whereas if you were to mix Unicode and 8-bit strings in Python 2.x, it would work if the 8-bit string happened to contain only 7-bit (ASCII) bytes, but you would get :exc:`UnicodeDecodeError` if it contained non-ASCII values. This value-specific behavior has caused numerous sad faces over the years.
  • As a consequence of this change in philosophy, pretty much all code that uses Unicode, encodings or binary data most likely has to change. The change is for the better, as in the 2.x world there were numerous bugs having to do with mixing encoded and unencoded text. To be prepared in Python 2.x, start using :class:`!unicode` for all unencoded text, and :class:`str` for binary or encoded data only. Then the 2to3 tool will do most of the work for you.
  • You can no longer use u"..." literals for Unicode text; plain "..." literals are already Unicode. However, you must use b"..." literals for binary data.
  • As the :class:`str` and :class:`bytes` types cannot be mixed, you must always explicitly convert between them. Use :meth:`str.encode` to go from :class:`str` to :class:`bytes`, and :meth:`bytes.decode` to go from :class:`bytes` to :class:`str`. You can also use bytes(s, encoding=...) and str(b, encoding=...), respectively.
  • Like :class:`str`, the :class:`bytes` type is immutable. There is a separate mutable type to hold buffered binary data, :class:`bytearray`. Nearly all APIs that accept :class:`bytes` also accept :class:`bytearray`. The mutable API is based on :class:`collections.MutableSequence <collections.abc.MutableSequence>`.
  • All backslashes in raw string literals are interpreted literally. This means that '\U' and '\u' escapes in raw strings are not treated specially. For example, r'\u20ac' is a string of 6 characters in Python 3.0, whereas in 2.6, ur'\u20ac' was the single "euro" character. (Of course, this change only affects raw string literals; the euro character is '\u20ac' in Python 3.0.)
  • The built-in :class:`!basestring` abstract type was removed. Use :class:`str` instead. The :class:`str` and :class:`bytes` types don't have enough functionality in common to warrant a shared base class. The 2to3 tool (see below) replaces every occurrence of :class:`!basestring` with :class:`str`.
  • Files opened as text files (still the default mode for :func:`open`) always use an encoding to map between strings (in memory) and bytes (on disk). Binary files (opened with a b in the mode argument) always use bytes in memory. This means that if a file is opened using an incorrect mode or encoding, I/O will likely fail loudly, instead of silently producing incorrect data. It also means that even Unix users will have to specify the correct mode (text or binary) when opening a file. There is a platform-dependent default encoding, which on Unixy platforms can be set with the LANG environment variable (and sometimes also with some other platform-specific locale-related environment variables). In many cases, but not all, the system default is UTF-8; you should never count on this default. Any application reading or writing more than pure ASCII text should probably have a way to override the encoding. There is no longer any need for using the encoding-aware streams in the :mod:`codecs` module.
  • The initial values of :data:`sys.stdin`, :data:`sys.stdout` and :data:`sys.stderr` are now unicode-only text files (i.e., they are instances of :class:`io.TextIOBase`). To read and write bytes data with these streams, you need to use their :data:`io.TextIOBase.buffer` attribute.
  • Filenames are passed to and returned from APIs as (Unicode) strings. This can present platform-specific problems because on some platforms filenames are arbitrary byte strings. (On the other hand, on Windows filenames are natively stored as Unicode.) As a work-around, most APIs (e.g. :func:`open` and many functions in the :mod:`os` module) that take filenames accept :class:`bytes` objects as well as strings, and a few APIs have a way to ask for a :class:`bytes` return value. Thus, :func:`os.listdir` returns a list of :class:`bytes` instances if the argument is a :class:`bytes` instance, and :func:`os.getcwdb` returns the current working directory as a :class:`bytes` instance. Note that when :func:`os.listdir` returns a list of strings, filenames that cannot be decoded properly are omitted rather than raising :exc:`UnicodeError`.
  • Some system APIs like :data:`os.environ` and :data:`sys.argv` can also present problems when the bytes made available by the system are not interpretable using the default encoding. Setting the LANG variable and rerunning the program is probably the best approach.
  • PEP 3138: The :func:`repr` of a string no longer escapes non-ASCII characters. It still escapes control characters and code points with non-printable status in the Unicode standard, however.
  • PEP 3120: The default source encoding is now UTF-8.
  • PEP 3131: Non-ASCII letters are now allowed in identifiers. (However, the standard library remains ASCII-only with the exception of contributor names in comments.)
  • The :mod:`!StringIO` and :mod:`!cStringIO` modules are gone. Instead, import the :mod:`io` module and use :class:`io.StringIO` or :class:`io.BytesIO` for text and data respectively.
  • See also the :ref:`unicode-howto`, which was updated for Python 3.0.
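
The text/data split described above can be sketched as follows. The sample string and the choice of UTF-8 are assumptions made for illustration; a real application should pick the encoding its data actually uses:

```python
import io

text = "price: \u20ac5"                     # str holds Unicode text
data = text.encode("utf-8")                 # bytes holds encoded data
round_trip = data.decode("utf-8")
also_bytes = bytes(text, encoding="utf-8")  # same as text.encode(...)
also_text = str(data, encoding="utf-8")     # same as data.decode(...)

# Mixing the two types raises TypeError regardless of the contents.
try:
    text + data
    mixing_failed = False
except TypeError:
    mixing_failed = True

# bytearray is the mutable counterpart of bytes.
mutable = bytearray(b"abc")
mutable[0] = ord("A")

# Raw strings no longer interpret \u escapes.
raw_length = len(r"\u20ac")
euro_length = len("\u20ac")

# io.StringIO and io.BytesIO replace the old StringIO/cStringIO modules.
text_stream = io.StringIO()
text_stream.write(text)
byte_stream = io.BytesIO()
byte_stream.write(data)
```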

Overview Of Syntax Changes

This section gives a brief overview of every syntactic change in Python 3.0.

New Syntax

  • PEP 3107: Function argument and return value annotations. This provides a standardized way of annotating a function's parameters and return value. There are no semantics attached to such annotations except that they can be introspected at runtime using the :attr:`~object.__annotations__` attribute. The intent is to encourage experimentation through metaclasses, decorators or frameworks.

  • PEP 3102: Keyword-only arguments. Named parameters occurring after *args in the parameter list must be specified using keyword syntax in the call. You can also use a bare * in the parameter list to indicate that you don't accept a variable-length argument list, but you do have keyword-only arguments.

  • Keyword arguments are allowed after the list of base classes in a class definition. This is used by the new convention for specifying a metaclass (see next section), but can be used for other purposes as well, as long as the metaclass supports it.

  • PEP 3104: :keyword:`nonlocal` statement. Using nonlocal x you can now assign directly to a variable in an outer (but non-global) scope. :keyword:`!nonlocal` is a new reserved word.

  • PEP 3132: Extended Iterable Unpacking. You can now write things like a, b, *rest = some_sequence. And even *rest, a = stuff. The rest object is always a (possibly empty) list; the right-hand side may be any iterable. Example:

    (a, *rest, b) = range(5)
    

    This sets a to 0, b to 4, and rest to [1, 2, 3].

  • Dictionary comprehensions: {k: v for k, v in stuff} means the same thing as dict(stuff) but is more flexible. (This is PEP 274 vindicated. :-)

  • Set literals, e.g. {1, 2}. Note that {} is an empty dictionary; use set() for an empty set. Set comprehensions are also supported; e.g., {x for x in stuff} means the same thing as set(stuff) but is more flexible.

  • New octal literals, e.g. 0o720 (already in 2.6). The old octal literals (0720) are gone.

  • New binary literals, e.g. 0b1010 (already in 2.6), and there is a new corresponding built-in function, :func:`bin`.

  • Bytes literals are introduced with a leading b or B, and there is a new corresponding built-in function, :func:`bytes`.
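
Several of the additions above fit in one short sketch. The function and variable names (``scale``, ``make_counter`` and so on) are hypothetical and chosen only for illustration:

```python
# PEP 3107 annotations plus a PEP 3102 keyword-only argument after a bare *.
def scale(value: float, *, factor: float = 2.0) -> float:
    return value * factor

# PEP 3104: nonlocal lets the inner function rebind the outer variable.
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
counter()
second = counter()

# PEP 3132: the starred target always receives a list.
first, *middle, last = range(5)
*head, tail = "abc"

# Dictionary and set comprehensions; {} is still an empty dict.
squares = {n: n * n for n in range(3)}
evens = {n for n in range(6) if n % 2 == 0}
empty_set = set()

# Binary literals, bin(), and bytes literals.
binary = bin(0b1010)
payload = b"raw"
```

Calling ``scale(3, 4)`` raises :exc:`TypeError`, because ``factor`` can only be passed by keyword.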

Changed Syntax

Removed Syntax

  • PEP 3113: Tuple parameter unpacking removed. You can no longer write def foo(a, (b, c)): .... Use def foo(a, b_c): b, c = b_c instead.
  • Removed backticks (use :func:`repr` instead).
  • Removed <> (use != instead).
  • Removed keyword: :func:`exec` is no longer a keyword; it remains as a function. (Fortunately the function syntax was also accepted in 2.x.) Also note that :func:`exec` no longer takes a stream argument; instead of exec(f) you can use exec(f.read()).
  • Integer literals no longer support a trailing l or L.
  • String literals no longer support a leading u or U.
  • The :keyword:`from` module :keyword:`import` * syntax is only allowed at the module level, no longer inside functions.
  • The only acceptable syntax for relative imports is :samp:`from .[{module}] import {name}`. All :keyword:`import` forms not starting with . are interpreted as absolute imports. (PEP 328)
  • Classic classes are gone.
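
The replacements suggested above look like this in practice. This is a minimal sketch; ``midpoint`` and ``namespace`` are hypothetical names:

```python
# PEP 3113: accept the pair as one parameter and unpack it in the body.
def midpoint(label, point):
    x, y = point
    return label, (x + y) / 2

# exec is an ordinary function; pass source text, not a stream.
namespace = {}
exec("answer = 6 * 7", namespace)

shown = repr("spam")  # replaces the old backtick syntax
differ = 1 != 2       # replaces the old <> operator
```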

Changes Already Present In Python 2.6

Since many users presumably make the jump straight from Python 2.5 to Python 3.0, this section reminds the reader of new features that were originally designed for Python 3.0 but that were back-ported to Python 2.6. The corresponding sections in :ref:`whats-new-in-2.6` should be consulted for longer descriptions.

Library Changes

Due to time constraints, this document does not exhaustively cover the very extensive changes to the standard library. PEP 3108 is the reference for the major changes to the library. Here's a capsule review:

Some other changes to standard library modules, not covered by PEP 3108:

PEP 3101: A New Approach To String Formatting

  • A new system for built-in string formatting operations replaces the % string formatting operator. (However, the % operator is still supported; it will be deprecated in Python 3.1 and removed from the language at some later time.) Read PEP 3101 for the full scoop.
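
A small sketch of the :meth:`str.format` style next to the older % operator; the sample values are made up, and PEP 3101 remains the reference for the full format-specification syntax:

```python
# Explicitly numbered positional fields.
positional = "{0} scored {1}".format("Ada", 95)

# Named fields with a format specification after the colon.
named = "{name} scored {score:.1f}".format(name="Ada", score=95)

# Right-align within a width of six characters.
aligned = "[{0:>6}]".format("ok")

# The % operator still works.
legacy = "%s scored %d" % ("Ada", 95)
```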

Changes To Exceptions

The APIs for raising and catching exceptions have been cleaned up, and powerful new features have been added:
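
As one illustration, and not a complete account of the changes, the sketch below uses two forms that are well known from Python 3: ``except ... as ...`` (PEP 3110) and ``raise ... from ...`` (PEP 3134). ``ConfigError`` and ``parse_port`` are hypothetical names:

```python
class ConfigError(Exception):
    """Hypothetical application-level exception."""

def parse_port(raw):
    try:
        return int(raw)
    except ValueError as exc:  # "as" replaces the old comma form
        # "from" records the original exception as __cause__.
        raise ConfigError("bad port: {0!r}".format(raw)) from exc

try:
    parse_port("eighty")
except ConfigError as err:
    # The name err is unbound after this block, so copy what is needed.
    cause = err.__cause__
    message = str(err)
```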

Miscellaneous Other Changes

Operators And Special Methods

Builtins

Build and C API Changes

Due to time constraints, here is a very incomplete list of changes to the C API.

Performance

The net result of the 3.0 generalizations is that Python 3.0 runs the pystone benchmark around 10% slower than Python 2.5. Most likely the biggest cause is the removal of special-casing for small integers. There's room for improvement, but it will happen after 3.0 is released!

Porting To Python 3.0

For porting existing Python 2.5 or 2.6 source code to Python 3.0, the best strategy is the following:

  1. (Prerequisite:) Start with excellent test coverage.
  2. Port to Python 2.6. This should be no more work than the average port from Python 2.x to Python 2.(x+1). Make sure all your tests pass.
  3. (Still using 2.6:) Turn on the :option:`!-3` command line switch. This enables warnings about features that will be removed (or change) in 3.0. Run your test suite again, and fix code that you get warnings about until there are no warnings left, and all your tests still pass.
  4. Run the 2to3 source-to-source translator over your source code tree. Run the result of the translation under Python 3.0. Manually fix up any remaining issues, fixing problems until all tests pass again.

It is not recommended to try to write source code that runs unchanged under both Python 2.6 and 3.0; you'd have to use a very contorted coding style, e.g. avoiding print statements, metaclasses, and much more. If you are maintaining a library that needs to support both Python 2.6 and Python 3.0, the best approach is to modify step 4 above by editing the 2.6 version of the source code and running the 2to3 translator again, rather than editing the 3.0 version of the source code.

For porting C extensions to Python 3.0, please see :ref:`cporting-howto`.