Commit 791c4d5

Add document with tips for writing portable 2-3 code
1 parent f4adec7 commit 791c4d5

2 files changed, +120 -0 lines changed

doc/devel/index.rst

+1
@@ -13,6 +13,7 @@
    :maxdepth: 2

    coding_guide.rst
+   portable_code.rst
    license.rst
    gitwash/index.rst
    testing.rst

doc/devel/portable_code.rst

+119
@@ -0,0 +1,119 @@
Writing code for Python 2 and 3
-------------------------------

As of matplotlib 1.4, the `six <http://pythonhosted.org/six/>`_
library is used to support Python 2 and 3 from a single code base.
The `2to3` tool is no longer used.

This document describes some of the issues with that approach and some
recommended solutions.  It is not a complete guide to Python 2 and 3
compatibility.

Welcome to the ``__future__``
-----------------------------

The top of every `.py` file should include the following::

    from __future__ import absolute_import, division, print_function, unicode_literals

This will make the Python 2 interpreter behave as much like Python 3 as
possible.
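
As a rough illustration of what those imports change, the sketch below
exercises three of them.  The variable names are made up for this example,
and it is meant to be run as a file of its own, since the ``__future__``
line has to come first in a module.

```python
from __future__ import absolute_import, division, print_function, unicode_literals

# With `division`, `/` is true division on both versions; `//` still floors.
half = 1 / 2
floored = 1 // 2

# With `unicode_literals`, a bare literal is text and a b'' literal is bytes,
# so the two are different types on Python 2 as well as on Python 3.
text = 'matplotlib'
data = b'matplotlib'

# With `print_function`, print is an ordinary function that accepts keywords.
print(half, floored, sep=', ')
print(type(text) is type(data))
```

Without the ``division`` import, ``1 / 2`` would floor to ``0`` on Python 2,
which is the kind of silent difference the imports are there to remove.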

All matplotlib files should also import `six`, whether they are using
it or not, just to make moving code between modules easier, as `six`
gets used *a lot*::

    import six

Finding places to use six
-------------------------

The only way to be sure code works on both Python 2 and 3 is to cover
it with unit tests that are run under both.

However, the `2to3` command-line tool can also be used to locate places
that require special handling with `six`.

(The `modernize <https://pypi.python.org/pypi/modernize>`_ tool may
also be handy, though I've never used it personally.)

The `six <http://pythonhosted.org/six/>`_ documentation serves as a
good reference for the sorts of things that need to be updated.

The dreaded ``\u`` escapes
--------------------------

When `from __future__ import unicode_literals` is used, all string
literals (not prefixed with a `b`) will become unicode literals.

Normally, one would use "raw" string literals to encode strings that
contain a lot of backslashes that we don't want Python to interpret as
special characters.  A common example in matplotlib is when it deals
with TeX and has to represent things like ``r"\usepackage{foo}"``.
Unfortunately, on Python 2 there is no way to represent ``\u`` in a raw
unicode string literal, since it will always be interpreted as the
start of a unicode character escape, such as ``\u20af``.  The only
solution is to use a regular (non-raw) string literal and repeat all
backslashes, e.g. ``"\\usepackage{foo}"``.

The following shows the problem on Python 2::

    >>> ur'\u'
      File "<stdin>", line 1
    SyntaxError: (unicode error) 'rawunicodeescape' codec can't decode bytes in
    position 0-1: truncated \uXXXX
    >>> ur'\\u'
    u'\\\\u'
    >>> u'\u'
      File "<stdin>", line 1
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
    position 0-1: truncated \uXXXX escape
    >>> u'\\u'
    u'\\u'

This bug has been fixed in Python 3; however, we can't take advantage
of that and still support Python 2::

    >>> r'\u'
    '\\u'
    >>> r'\\u'
    '\\\\u'
    >>> '\u'
      File "<stdin>", line 1
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
    position 0-1: truncated \uXXXX escape
    >>> '\\u'
    '\\u'
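
A minimal sketch of the portable spelling, using a regular literal with
doubled backslashes.  The ``tex_command`` helper is hypothetical (it is not
part of matplotlib) and is only one possible way to keep the doubling in a
single place.

```python
# A regular (non-raw) literal with the backslash doubled; this spelling is
# accepted by Python 2 (with unicode_literals) and by Python 3.
usepackage = "\\usepackage{foo}"

# The resulting string holds one backslash followed by "usepackage{foo}".
starts_with_backslash = usepackage[0] == "\\"
length = len(usepackage)


def tex_command(name, argument):
    """Hypothetical helper: build a TeX command such as \\usepackage{foo}."""
    return "\\" + name + "{" + argument + "}"


print(tex_command("usepackage", "foo"))
```

Either form avoids the raw-literal ``\u`` problem, because the backslash
that precedes ``u`` is always written as an escaped pair.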

Iteration
---------

The behavior of the methods for iterating over the items, values and
keys of a dictionary has changed in Python 3.  Additionally, other
built-in functions such as `zip`, `range` and `map` have changed to
return iterators (or, for `range`, a lazy sequence) rather than
temporary lists.

In many cases, the performance implications of iterating vs. creating
a temporary list won't matter, so it's tempting to use the form that
is simplest to read.  However, that results in code that behaves
differently on Python 2 and 3, leading to subtle bugs that may not be
detected by the regression tests.  Therefore, unless the loop in
question is provably simple and doesn't call into other code, the
`six` versions that ensure the same behavior on both Python 2 and 3
should be used.  The following table shows the mapping of equivalent
semantics between Python 2, 3 and six for `dict.items()`:

============================== ============================== ==============================
Python 2                       Python 3                       six
============================== ============================== ==============================
``d.items()``                  ``list(d.items())``            ``list(six.iteritems(d))``
``d.iteritems()``              ``d.items()``                  ``six.iteritems(d)``
============================== ============================== ==============================
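
The two rows of the table could be used along these lines.  This is only a
sketch: the ``limits`` dictionary is made up, and the ``except ImportError``
branch is a stand-in added so that the example also runs on Python 3 when
``six`` is not installed; matplotlib code would simply ``import six``.

```python
try:
    import six
    iteritems = six.iteritems
except ImportError:
    # Fallback for this sketch only (Python 3 without six installed).
    def iteritems(d):
        return iter(d.items())

limits = {'xmin': 0.0, 'xmax': 1.0, 'ymin': None}

# Lazy iteration: fine when the loop body does not modify the dictionary.
set_keys = sorted(key for key, value in iteritems(limits) if value is not None)

# Take a snapshot first when the loop body changes the dictionary; deleting
# entries while iterating lazily risks a RuntimeError.
for key, value in list(iteritems(limits)):
    if value is None:
        del limits[key]

print(set_keys, sorted(limits))
```

Wrapping the call in ``list()`` gives the Python 2 ``d.items()`` behavior on
both versions, at the cost of building a temporary list.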

Numpy-specific things
---------------------

When specifying dtypes, all strings must be byte strings on Python 2
and unicode strings on Python 3.  The best way to handle this is to
cast them explicitly with `str()`, which yields the native string type
on either version.  The same is true of format strings passed to the
standard-library `struct` module.
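
A small sketch of the ``str()`` cast, shown with ``struct`` so that it needs
nothing beyond the standard library.  The format string and values are
arbitrary; the same wrapping would presumably apply to a dtype string handed
to numpy, which is not exercised here.

```python
import struct

# A text (unicode) format string, which is what a bare literal becomes on
# Python 2 once unicode_literals is in effect.
text_format = u'<HI'

# str() converts it to the native string type of the running interpreter:
# a byte string on Python 2, a unicode string on Python 3.
native_format = str(text_format)

# Little-endian unsigned short followed by unsigned int, no padding.
packed = struct.pack(native_format, 1, 2)
print(struct.unpack(native_format, packed))
```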
