Commit 37bb289

bpo-40943: PY_SSIZE_T_CLEAN required for '#' formats (GH-20784)
The PY_SSIZE_T_CLEAN macro must now be defined to use PyArg_ParseTuple() and Py_BuildValue() "#" formats: "es#", "et#", "s#", "u#", "y#", "z#", "U#" and "Z#". See PEP 353.

Update _testcapi.test_buildvalue_issue38913().
1 parent 01ece63 commit 37bb289

6 files changed: +87 -102 lines changed

Doc/c-api/arg.rst

Lines changed: 17 additions & 19 deletions
@@ -55,13 +55,11 @@ which disallows mutable objects such as :class:`bytearray`.
 
 .. note::
 
-   For all ``#`` variants of formats (``s#``, ``y#``, etc.), the type of
-   the length argument (int or :c:type:`Py_ssize_t`) is controlled by
-   defining the macro :c:macro:`PY_SSIZE_T_CLEAN` before including
-   :file:`Python.h`. If the macro was defined, length is a
-   :c:type:`Py_ssize_t` rather than an :c:type:`int`. This behavior will change
-   in a future Python version to only support :c:type:`Py_ssize_t` and
-   drop :c:type:`int` support. It is best to always define :c:macro:`PY_SSIZE_T_CLEAN`.
+   For all ``#`` variants of formats (``s#``, ``y#``, etc.), the macro
+   :c:macro:`PY_SSIZE_T_CLEAN` must be defined before including
+   :file:`Python.h`. On Python 3.9 and older, the type of the length argument
+   is :c:type:`Py_ssize_t` if the :c:macro:`PY_SSIZE_T_CLEAN` macro is defined,
+   or int otherwise.
 
 
 ``s`` (:class:`str`) [const char \*]
@@ -90,7 +88,7 @@ which disallows mutable objects such as :class:`bytearray`.
    In this case the resulting C string may contain embedded NUL bytes.
    Unicode objects are converted to C strings using ``'utf-8'`` encoding.
 
-``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, int or :c:type:`Py_ssize_t`]
+``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
    Like ``s*``, except that it doesn't accept mutable objects.
    The result is stored into two C variables,
    the first one a pointer to a C string, the second one its length.
@@ -105,7 +103,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Like ``s*``, but the Python object may also be ``None``, in which case the
    ``buf`` member of the :c:type:`Py_buffer` structure is set to ``NULL``.
 
-``z#`` (:class:`str`, read-only :term:`bytes-like object` or ``None``) [const char \*, int or :c:type:`Py_ssize_t`]
+``z#`` (:class:`str`, read-only :term:`bytes-like object` or ``None``) [const char \*, :c:type:`Py_ssize_t`]
    Like ``s#``, but the Python object may also be ``None``, in which case the C
    pointer is set to ``NULL``.
 
@@ -124,7 +122,7 @@ which disallows mutable objects such as :class:`bytearray`.
    bytes-like objects. **This is the recommended way to accept
    binary data.**
 
-``y#`` (read-only :term:`bytes-like object`) [const char \*, int or :c:type:`Py_ssize_t`]
+``y#`` (read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
    This variant on ``s#`` doesn't accept Unicode objects, only bytes-like
    objects.
 
@@ -155,7 +153,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
    :c:func:`PyUnicode_AsWideCharString`.
 
-``u#`` (:class:`str`) [const Py_UNICODE \*, int or :c:type:`Py_ssize_t`]
+``u#`` (:class:`str`) [const Py_UNICODE \*, :c:type:`Py_ssize_t`]
    This variant on ``u`` stores into two C variables, the first one a pointer to a
    Unicode data buffer, the second one its length. This variant allows
    null code points.
@@ -172,7 +170,7 @@ which disallows mutable objects such as :class:`bytearray`.
    Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
    :c:func:`PyUnicode_AsWideCharString`.
 
-``Z#`` (:class:`str` or ``None``) [const Py_UNICODE \*, int or :c:type:`Py_ssize_t`]
+``Z#`` (:class:`str` or ``None``) [const Py_UNICODE \*, :c:type:`Py_ssize_t`]
    Like ``u#``, but the Python object may also be ``None``, in which case the
    :c:type:`Py_UNICODE` pointer is set to ``NULL``.
 
@@ -213,7 +211,7 @@ which disallows mutable objects such as :class:`bytearray`.
    recoding them. Instead, the implementation assumes that the byte string object uses
    the encoding passed in as parameter.
 
-``es#`` (:class:`str`) [const char \*encoding, char \*\*buffer, int or :c:type:`Py_ssize_t` \*buffer_length]
+``es#`` (:class:`str`) [const char \*encoding, char \*\*buffer, :c:type:`Py_ssize_t` \*buffer_length]
    This variant on ``s#`` is used for encoding Unicode into a character buffer.
    Unlike the ``es`` format, this variant allows input data which contains NUL
    characters.
@@ -244,7 +242,7 @@ which disallows mutable objects such as :class:`bytearray`.
    In both cases, *\*buffer_length* is set to the length of the encoded data
    without the trailing NUL byte.
 
-``et#`` (:class:`str`, :class:`bytes` or :class:`bytearray`) [const char \*encoding, char \*\*buffer, int or :c:type:`Py_ssize_t` \*buffer_length]
+``et#`` (:class:`str`, :class:`bytes` or :class:`bytearray`) [const char \*encoding, char \*\*buffer, :c:type:`Py_ssize_t` \*buffer_length]
    Same as ``es#`` except that byte string objects are passed through without recoding
    them. Instead, the implementation assumes that the byte string object uses the
    encoding passed in as parameter.
@@ -549,7 +547,7 @@ Building values
       Convert a null-terminated C string to a Python :class:`str` object using ``'utf-8'``
       encoding. If the C string pointer is ``NULL``, ``None`` is used.
 
-   ``s#`` (:class:`str` or ``None``) [const char \*, int or :c:type:`Py_ssize_t`]
+   ``s#`` (:class:`str` or ``None``) [const char \*, :c:type:`Py_ssize_t`]
       Convert a C string and its length to a Python :class:`str` object using ``'utf-8'``
       encoding. If the C string pointer is ``NULL``, the length is ignored and
       ``None`` is returned.
@@ -558,30 +556,30 @@ Building values
       This converts a C string to a Python :class:`bytes` object. If the C
       string pointer is ``NULL``, ``None`` is returned.
 
-   ``y#`` (:class:`bytes`) [const char \*, int or :c:type:`Py_ssize_t`]
+   ``y#`` (:class:`bytes`) [const char \*, :c:type:`Py_ssize_t`]
       This converts a C string and its lengths to a Python object. If the C
       string pointer is ``NULL``, ``None`` is returned.
 
    ``z`` (:class:`str` or ``None``) [const char \*]
       Same as ``s``.
 
-   ``z#`` (:class:`str` or ``None``) [const char \*, int or :c:type:`Py_ssize_t`]
+   ``z#`` (:class:`str` or ``None``) [const char \*, :c:type:`Py_ssize_t`]
       Same as ``s#``.
 
    ``u`` (:class:`str`) [const wchar_t \*]
       Convert a null-terminated :c:type:`wchar_t` buffer of Unicode (UTF-16 or UCS-4)
       data to a Python Unicode object. If the Unicode buffer pointer is ``NULL``,
       ``None`` is returned.
 
-   ``u#`` (:class:`str`) [const wchar_t \*, int or :c:type:`Py_ssize_t`]
+   ``u#`` (:class:`str`) [const wchar_t \*, :c:type:`Py_ssize_t`]
       Convert a Unicode (UTF-16 or UCS-4) data buffer and its length to a Python
       Unicode object. If the Unicode buffer pointer is ``NULL``, the length is ignored
       and ``None`` is returned.
 
    ``U`` (:class:`str` or ``None``) [const char \*]
       Same as ``s``.
 
-   ``U#`` (:class:`str` or ``None``) [const char \*, int or :c:type:`Py_ssize_t`]
+   ``U#`` (:class:`str` or ``None``) [const char \*, :c:type:`Py_ssize_t`]
       Same as ``s#``.
 
    ``i`` (:class:`int`) [int]

Doc/whatsnew/3.10.rst

Lines changed: 7 additions & 0 deletions
@@ -155,6 +155,13 @@ New Features
 Porting to Python 3.10
 ----------------------
 
+* The ``PY_SSIZE_T_CLEAN`` macro must now be defined to use
+  :c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` formats which use
+  ``#``: ``es#``, ``et#``, ``s#``, ``u#``, ``y#``, ``z#``, ``U#`` and ``Z#``.
+  See :ref:`Parsing arguments and building values
+  <arg-parsing>` and the :pep:`353`.
+  (Contributed by Victor Stinner in :issue:`40943`.)
+
 * Since :c:func:`Py_TYPE()` is changed to the inline static function,
   ``Py_TYPE(obj) = new_type`` must be replaced with ``Py_SET_TYPE(obj, new_type)``:
   see :c:func:`Py_SET_TYPE()` (available since Python 3.9). For backward
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+The ``PY_SSIZE_T_CLEAN`` macro must now be defined to use
+:c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` formats which use ``#``:
+``es#``, ``et#``, ``s#``, ``u#``, ``y#``, ``z#``, ``U#`` and ``Z#``.
+See :ref:`Parsing arguments and building values <arg-parsing>` and the
+:pep:`353`.
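
A note on why the macro has to appear before the include: this commit does not show the header side. As far as I understand CPython's headers, defining PY_SSIZE_T_CLEAN makes the preprocessor redirect the public parsing names to variants that pass an internal "size_t" flag. The sketch below models only that idea in plain C. Every `demo_`/`DEMO_` name is hypothetical, the flag value is arbitrary, and none of this is CPython code.

```c
/* Hypothetical stand-in for the user's "#define PY_SSIZE_T_CLEAN".
   It must come before the header-like section below to have any effect. */
#define DEMO_SSIZE_T_CLEAN

/* ---- sketch of what a header could do (assumption, not CPython code) ---- */
#define DEMO_FLAG_SIZE_T 2   /* plays the role of FLAG_SIZE_T */

/* Shared implementation: 0 if '#' formats would be accepted, -1 otherwise. */
static int demo_parse_impl(int flags)
{
    return (flags & DEMO_FLAG_SIZE_T) ? 0 : -1;
}

/* Two entry points that differ only in the flag they pass along. */
static int demo_parse_plain(void) { return demo_parse_impl(0); }
static int demo_parse_sizet(void) { return demo_parse_impl(DEMO_FLAG_SIZE_T); }

/* The macro decides, at preprocessing time, which entry point callers get. */
#ifdef DEMO_SSIZE_T_CLEAN
#  define demo_parse demo_parse_sizet
#else
#  define demo_parse demo_parse_plain
#endif
```

Because the choice is made when the header is preprocessed, a define placed after the include would come too late, which matches the documented requirement.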

Modules/_testcapimodule.c

Lines changed: 13 additions & 6 deletions
@@ -6868,29 +6868,36 @@ test_buildvalue_issue38913(PyObject *self, PyObject *Py_UNUSED(ignored))
     PyObject *res;
     const char str[] = "string";
     const Py_UNICODE unicode[] = L"unicode";
-    PyErr_SetNone(PyExc_ZeroDivisionError);
+    assert(!PyErr_Occurred());
 
     res = Py_BuildValue("(s#O)", str, 1, Py_None);
     assert(res == NULL);
-    if (!PyErr_ExceptionMatches(PyExc_ZeroDivisionError)) {
+    if (!PyErr_ExceptionMatches(PyExc_SystemError)) {
         return NULL;
     }
+    PyErr_Clear();
+
     res = Py_BuildValue("(z#O)", str, 1, Py_None);
     assert(res == NULL);
-    if (!PyErr_ExceptionMatches(PyExc_ZeroDivisionError)) {
+    if (!PyErr_ExceptionMatches(PyExc_SystemError)) {
         return NULL;
     }
+    PyErr_Clear();
+
     res = Py_BuildValue("(y#O)", str, 1, Py_None);
     assert(res == NULL);
-    if (!PyErr_ExceptionMatches(PyExc_ZeroDivisionError)) {
+    if (!PyErr_ExceptionMatches(PyExc_SystemError)) {
         return NULL;
     }
+    PyErr_Clear();
+
     res = Py_BuildValue("(u#O)", unicode, 1, Py_None);
     assert(res == NULL);
-    if (!PyErr_ExceptionMatches(PyExc_ZeroDivisionError)) {
+    if (!PyErr_ExceptionMatches(PyExc_SystemError)) {
         return NULL;
     }
-
     PyErr_Clear();
+
+
     Py_RETURN_NONE;
 }

Python/getargs.c

Lines changed: 27 additions & 62 deletions
@@ -656,27 +656,12 @@ convertsimple(PyObject *arg, const char **p_format, va_list *p_va, int flags,
               char *msgbuf, size_t bufsize, freelist_t *freelist)
 {
     /* For # codes */
-#define FETCH_SIZE int *q=NULL;Py_ssize_t *q2=NULL;\
-    if (flags & FLAG_SIZE_T) q2=va_arg(*p_va, Py_ssize_t*); \
-    else { \
-        if (PyErr_WarnEx(PyExc_DeprecationWarning, \
-                    "PY_SSIZE_T_CLEAN will be required for '#' formats", 1)) { \
-            return NULL; \
-        } \
-        q=va_arg(*p_va, int*); \
-    }
-#define STORE_SIZE(s) \
-    if (flags & FLAG_SIZE_T) \
-        *q2=s; \
-    else { \
-        if (INT_MAX < s) { \
-            PyErr_SetString(PyExc_OverflowError, \
-                "size does not fit in an int"); \
-            return converterr("", arg, msgbuf, bufsize); \
-        } \
-        *q = (int)s; \
-    }
-#define BUFFER_LEN ((flags & FLAG_SIZE_T) ? *q2:*q)
+#define REQUIRE_PY_SSIZE_T_CLEAN \
+    if (!(flags & FLAG_SIZE_T)) { \
+        PyErr_SetString(PyExc_SystemError, \
+                        "PY_SSIZE_T_CLEAN macro must be defined for '#' formats"); \
+        return NULL; \
+    }
 #define RETURN_ERR_OCCURRED return msgbuf
 
     const char *format = *p_format;
@@ -931,8 +916,9 @@ convertsimple(PyObject *arg, const char **p_format, va_list *p_va, int flags,
         if (count < 0)
             return converterr(buf, arg, msgbuf, bufsize);
         if (*format == '#') {
-            FETCH_SIZE;
-            STORE_SIZE(count);
+            REQUIRE_PY_SSIZE_T_CLEAN;
+            Py_ssize_t *psize = va_arg(*p_va, Py_ssize_t*);
+            *psize = count;
             format++;
         } else {
             if (strlen(*p) != (size_t)count) {
@@ -974,11 +960,12 @@ convertsimple(PyObject *arg, const char **p_format, va_list *p_va, int flags,
         } else if (*format == '#') { /* a string or read-only bytes-like object */
             /* "s#" or "z#" */
             const void **p = (const void **)va_arg(*p_va, const char **);
-            FETCH_SIZE;
+            REQUIRE_PY_SSIZE_T_CLEAN;
+            Py_ssize_t *psize = va_arg(*p_va, Py_ssize_t*);
 
             if (c == 'z' && arg == Py_None) {
                 *p = NULL;
-                STORE_SIZE(0);
+                *psize = 0;
             }
             else if (PyUnicode_Check(arg)) {
                 Py_ssize_t len;
@@ -987,15 +974,15 @@ convertsimple(PyObject *arg, const char **p_format, va_list *p_va, int flags,
                     return converterr(CONV_UNICODE,
                                       arg, msgbuf, bufsize);
                 *p = sarg;
-                STORE_SIZE(len);
+                *psize = len;
             }
             else { /* read-only bytes-like object */
                 /* XXX Really? */
                 const char *buf;
                 Py_ssize_t count = convertbuffer(arg, p, &buf);
                 if (count < 0)
                     return converterr(buf, arg, msgbuf, bufsize);
-                STORE_SIZE(count);
+                *psize = count;
             }
             format++;
         } else {
@@ -1034,18 +1021,19 @@ _Py_COMP_DIAG_IGNORE_DEPR_DECLS
 
         if (*format == '#') {
             /* "u#" or "Z#" */
-            FETCH_SIZE;
+            REQUIRE_PY_SSIZE_T_CLEAN;
+            Py_ssize_t *psize = va_arg(*p_va, Py_ssize_t*);
 
             if (c == 'Z' && arg == Py_None) {
                 *p = NULL;
-                STORE_SIZE(0);
+                *psize = 0;
             }
             else if (PyUnicode_Check(arg)) {
                 Py_ssize_t len;
                 *p = PyUnicode_AsUnicodeAndSize(arg, &len);
                 if (*p == NULL)
                     RETURN_ERR_OCCURRED;
-                STORE_SIZE(len);
+                *psize = len;
             }
             else
                 return converterr(c == 'Z' ? "str or None" : "str",
@@ -1160,22 +1148,11 @@ _Py_COMP_DIAG_POP
                    trailing 0-byte
 
             */
-            int *q = NULL; Py_ssize_t *q2 = NULL;
-            if (flags & FLAG_SIZE_T) {
-                q2 = va_arg(*p_va, Py_ssize_t*);
-            }
-            else {
-                if (PyErr_WarnEx(PyExc_DeprecationWarning,
-                            "PY_SSIZE_T_CLEAN will be required for '#' formats", 1))
-                {
-                    Py_DECREF(s);
-                    return NULL;
-                }
-                q = va_arg(*p_va, int*);
-            }
+            REQUIRE_PY_SSIZE_T_CLEAN;
+            Py_ssize_t *psize = va_arg(*p_va, Py_ssize_t*);
 
             format++;
-            if (q == NULL && q2 == NULL) {
+            if (psize == NULL) {
                 Py_DECREF(s);
                 return converterr(
                     "(buffer_len is NULL)",
@@ -1195,30 +1172,20 @@ _Py_COMP_DIAG_POP
                         arg, msgbuf, bufsize);
                 }
             } else {
-                if (size + 1 > BUFFER_LEN) {
+                if (size + 1 > *psize) {
                     Py_DECREF(s);
                     PyErr_Format(PyExc_ValueError,
                                  "encoded string too long "
                                  "(%zd, maximum length %zd)",
-                                 (Py_ssize_t)size, (Py_ssize_t)(BUFFER_LEN-1));
+                                 (Py_ssize_t)size, (Py_ssize_t)(*psize - 1));
                     RETURN_ERR_OCCURRED;
                 }
             }
             memcpy(*buffer, ptr, size+1);
 
-            if (flags & FLAG_SIZE_T) {
-                *q2 = size;
-            }
-            else {
-                if (INT_MAX < size) {
-                    Py_DECREF(s);
-                    PyErr_SetString(PyExc_OverflowError,
-                                    "size does not fit in an int");
-                    return converterr("", arg, msgbuf, bufsize);
-                }
-                *q = (int)size;
-            }
-        } else {
+            *psize = size;
+        }
+        else {
             /* Using a 0-terminated buffer:
 
                - the encoded string has to be 0-terminated
@@ -1356,9 +1323,7 @@ _Py_COMP_DIAG_POP
     *p_format = format;
     return NULL;
 
-#undef FETCH_SIZE
-#undef STORE_SIZE
-#undef BUFFER_LEN
+#undef REQUIRE_PY_SSIZE_T_CLEAN
 #undef RETURN_ERR_OCCURRED
 }
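
The getargs.c change replaces the dual int/Py_ssize_t bookkeeping with one pattern: check the flag, fail early if it is missing, then read a single `Py_ssize_t *` from the argument list. The following is a minimal stand-alone sketch of that pattern, not the CPython implementation. All `demo_`/`DEMO_` names are hypothetical: `demo_ssize_t` stands in for Py_ssize_t, `demo_last_error` for the exception state, and the flag value is arbitrary.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

typedef ptrdiff_t demo_ssize_t;            /* stand-in for Py_ssize_t */
#define DEMO_FLAG_SIZE_T 2                 /* stand-in for FLAG_SIZE_T */

static const char *demo_last_error = NULL; /* stand-in for the exception state */

/* Same shape as REQUIRE_PY_SSIZE_T_CLEAN: record an error and bail out
   of the enclosing function when the flag is not set. */
#define DEMO_REQUIRE_SIZE_T(flags) \
    if (!((flags) & DEMO_FLAG_SIZE_T)) { \
        demo_last_error = "size_t flag must be set for '#' formats"; \
        return -1; \
    }

/* Models the "s#" branch: store a pointer, then the length. */
static int demo_vconvert(int flags, const char *data, va_list *p_va)
{
    const char **p = va_arg(*p_va, const char **);
    DEMO_REQUIRE_SIZE_T(flags);
    /* Only one pointer type is ever fetched now; no int fallback. */
    demo_ssize_t *psize = va_arg(*p_va, demo_ssize_t *);
    *p = data;
    *psize = (demo_ssize_t)strlen(data);
    return 0;
}

/* Variadic front end: callers pass (const char **, demo_ssize_t *). */
static int demo_convert(int flags, const char *data, ...)
{
    va_list va;
    va_start(va, data);
    int rc = demo_vconvert(flags, data, &va);
    va_end(va);
    return rc;
}
```

With the flag set, `demo_convert(DEMO_FLAG_SIZE_T, "abc", &p, &n)` fills both outputs. Without it, the call fails before the length pointer is read, so the outputs are left untouched.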
