.. _text-bytes:
+ .. _bytes_mode:

Bytes/text management
=====================

- Python 3 introduces a hard distinction between *text* (``str``) – sequences of
- characters (formally, *Unicode codepoints*) – and ``bytes`` – sequences of
- 8-bit values used to encode *any* kind of data for storage or transmission.
-
- Python 2 has the same distinction between ``str`` (bytes) and
- ``unicode`` (text).
- However, values can be implicitly converted between these types as needed,
- e.g. when comparing or writing to disk or the network.
- The implicit encoding and decoding can be a source of subtle bugs when not
- designed and tested adequately.
-
- In python-ldap 2.x (for Python 2), bytes were used for all fields,
- including those guaranteed to be text.
-
- From version 3.0, python-ldap uses text where appropriate.
- On Python 2, the :ref:`bytes mode <bytes_mode>` setting influences how text is
- handled.
-
-
- What's text, and what's bytes
- -----------------------------
-
The LDAP protocol states that some fields (distinguished names, relative
distinguished names, attribute names, queries) be encoded in UTF-8.
- In python-ldap, these are represented as text (``str`` on Python 3,
- ``unicode`` on Python 2).
+ In python-ldap, these are represented as text (``str`` on Python 3).

Attribute *values*, on the other hand, **MAY**
contain any type of data, including text.
@@ -38,102 +16,26 @@ Thus, attribute values are *always* treated as ``bytes``.
Encoding/decoding to other formats – text, images, etc. – is left to the caller.


- .. _bytes_mode:
-
- The bytes mode
- --------------
-
- In Python 3, text values are represented as ``str``, the Unicode text type.
-
- In Python 2, the behavior of python-ldap 3.0 is influenced by a ``bytes_mode``
- argument to :func:`ldap.initialize`:
-
- ``bytes_mode=True`` (backwards compatible):
-     Text values are represented as bytes (``str``) encoded using UTF-8.
-
- ``bytes_mode=False`` (future compatible):
-     Text values are represented as ``unicode``.
-
- If not given explicitly, python-ldap will default to ``bytes_mode=True``,
- but if a ``unicode`` value is supplied to it, it will warn and use that value.
-
- Backwards-compatible behavior is not scheduled for removal until Python 2
- itself reaches end of life.
-
-
- Errors, warnings, and automatic encoding
- ----------------------------------------
-
- While the type of values *returned* from python-ldap is always given by
- ``bytes_mode``, for Python 2 the behavior for “wrong-type” values *passed in*
- can be controlled by the ``bytes_strictness`` argument to
- :func:`ldap.initialize`:
+ Historical note
+ ---------------

- ``bytes_strictness='error'`` (default if ``bytes_mode`` is specified):
-     A ``TypeError`` is raised.
-
- ``bytes_strictness='warn'`` (default when ``bytes_mode`` is not given explicitly):
-     A warning is raised, and the value is encoded/decoded
-     using the UTF-8 encoding.
-
-     The warnings are of type :class:`~ldap.LDAPBytesWarning`, which
-     is a subclass of :class:`BytesWarning` designed to be easily
-     :ref:`filtered out <filter-bytes-warning>` if needed.
-
- ``bytes_strictness='silent'``:
-     The value is automatically encoded/decoded using the UTF-8 encoding.
-
- On Python 3, ``bytes_strictness`` is ignored and a ``TypeError`` is always
- raised.
-
- When setting ``bytes_strictness``, an explicit value for ``bytes_mode`` needs
- to be given as well.
-
-
- Porting recommendations
- -----------------------
-
- Since end of life of Python 2 is coming in a few years, projects are strongly
- urged to make their code compatible with Python 3. General instructions for
- this are provided :ref:`in Python documentation <pyporting-howto>` and in the
- `Conservative porting guide`_.
-
- .. _Conservative porting guide: https://portingguide.readthedocs.io/en/latest/
-
-
- When porting from python-ldap 2.x, users are advised to update their code
- to set ``bytes_mode=False``, and fix any resulting failures.
-
- The typical usage is as follows.
- Note that only the result's *values* are of the ``bytes`` type:
-
- .. code-block:: pycon
-
-     >>> import ldap
-     >>> con = ldap.initialize('ldap://localhost:389', bytes_mode=False)
-     >>> con.simple_bind_s(u'login', u'secret_password')
-     >>> results = con.search_s(u'ou=people,dc=example,dc=org', ldap.SCOPE_SUBTREE, u"(cn=Raphaël)")
-     >>> results
-     [
-         ("cn=Raphaël,ou=people,dc=example,dc=org", {
-             'cn': [b'Rapha\xc3\xabl'],
-             'sn': [b'Barrois'],
-         }),
-     ]
-
-
- .. _filter-bytes-warning:
-
- Filtering warnings
- ------------------
+ Python 3 introduced a hard distinction between *text* (``str``) – sequences of
+ characters (formally, *Unicode codepoints*) – and ``bytes`` – sequences of
+ 8-bit values used to encode *any* kind of data for storage or transmission.

- The bytes mode warnings can be filtered out and ignored with a
- simple filter.
+ Python 2 had the same distinction between ``str`` (bytes) and
+ ``unicode`` (text).
+ However, values could be implicitly converted between these types as needed,
+ e.g. when comparing or writing to disk or the network.
+ The implicit encoding and decoding can be a source of subtle bugs when not
+ designed and tested adequately.

- .. code-block:: python
+ In python-ldap 2.x (for Python 2), bytes were used for all fields,
+ including those guaranteed to be text.

-     import warnings
-     import ldap
+ From version 3.0 to 3.3, python-ldap uses text where appropriate.
+ On Python 2, special ``bytes_mode`` and ``bytes_strictness`` settings
+ influenced how text was handled.

-     if hasattr(ldap, 'LDAPBytesWarning'):
-         warnings.simplefilter('ignore', ldap.LDAPBytesWarning)
+ From version 3.3 on, only Python 3 is supported. The “bytes mode” settings
+ are deprecated and do nothing.