Generator expression behavior changed in 3.13.4 - it does not throw exception anymore #135171

Open
Roman513 opened this issue Jun 5, 2025 · 23 comments
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@Roman513

Roman513 commented Jun 5, 2025

Bug report

Bug description:

In 3.13.3 and before

>>> bool(_ for item in False)
Traceback (most recent call last):
  File "<python-input-14>", line 1, in <module>
    bool(_ for item in False)
                       ^^^^^
TypeError: 'bool' object is not iterable
>>> (_ for item in False)
Traceback (most recent call last):
  File "<python-input-15>", line 1, in <module>
    (_ for item in False)
                   ^^^^^
TypeError: 'bool' object is not iterable

Now in 3.13.4

>>> bool(_ for item in False)
True
>>> (_ for item in False)
<generator object <genexpr> at 0xffffa9c02200>

This is a very fundamental change. Is it expected?

CPython versions tested on:

3.13

Operating systems tested on:

Linux

@Roman513 Roman513 added the type-bug An unexpected behavior, bug, or error label Jun 5, 2025
@Roman513
Author

Roman513 commented Jun 5, 2025

Most probably related to fix #132384 of #127682

@serhiy-storchaka serhiy-storchaka added 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes labels Jun 5, 2025
@serhiy-storchaka
Member

We should restore the old behavior, which was intentional: GET_ITER should be called before passing the argument to the generator expression, not after.

@Yhg1s
Member

Yhg1s commented Jun 5, 2025

Sounds like the change should be reverted. We're doing an expedited 3.13.5 for other reasons, and the rollback can go in there.

@serhiy-storchaka
Member

We need two rollbacks: one for the change that removed GET_ITER before calling the generator expression, and another for the change that added GET_ITER inside the generator expression (that one may not be trivial).

@markshannon
Member

What is the bug here?

bool(_ for item in False) should be True. All generators are truthy.

If the intent was to write any(_ for item in False) or all(_ for item in False) then the expected TypeError is raised.

Consider the equivalent generator function:

def gen():
    for item in False:
        yield _

creating the generator does not raise an exception, but iterating over it does.

>>> bool(gen())
True

Iterating over an iterable is a two-step process:

  1. Convert the iterable into an iterator (using the GET_ITER instruction)
  2. Repeatedly get the next item from the iterator (using the FOR_ITER instruction)

For for loops and generators, the FOR_ITER and GET_ITER are together and everything works fine.
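
The two steps can be spelled out by hand with the built-ins iter() and next(), which roughly play the roles of GET_ITER and FOR_ITER. This is only a minimal sketch; the helper name manual_for is made up for illustration and is not part of CPython.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, one explicit step at a time."""
    iterator = iter(iterable)      # step 1: iterable -> iterator (GET_ITER)
    items = []
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next item (FOR_ITER)
        except StopIteration:
            break
        items.append(item)
    return items


result = manual_for(range(3))

# A non-iterable fails at step 1, before any item is requested.
try:
    manual_for(False)
    error = None
except TypeError as exc:
    error = str(exc)
```

The question in this thread is when step 1 runs for a generator expression: when the generator object is created, or when the first item is requested.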

Historically, for reasons I am unaware of, in generator expressions the conversion from iterable to iterator occurred when the generator was created, not when it was used. This left a gap between the conversion and the (unchecked) iteration, which meant that iteration could crash: #125038.

We attempted to fix that by putting a GET_ITER in the generator expression as well, but that breaks iterators without __iter__ methods, or where the __iter__ has side effects. #127682
It also introduced a crash, but in SWIG generated code: rpm-software-management/libdnf#1682
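
To illustrate the "iterators without __iter__" case, here is a small sketch. The class names Countdown and Box are hypothetical. It only relies on a plain for loop and on iter(), so it should behave the same on any recent Python version.

```python
class Countdown:
    """An iterator that implements __next__ but deliberately omits __iter__."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class Box:
    """An iterable whose __iter__ hands out a Countdown."""

    def __iter__(self):
        return Countdown(3)


# A plain for loop calls iter() once, on Box, and then only uses __next__.
collected = []
for value in Box():
    collected.append(value)

# Calling iter() a second time, on the Countdown itself, is what fails.
try:
    iter(Countdown(3))
    second_iter_ok = True
except TypeError:
    second_iter_ok = False
```

An extra GET_ITER inside the generator amounts to that second iter() call, which is why such iterators stop working.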

So we must do one of the following:

  1. Allow a crash. I'm strongly opposed to this option
  2. Break working iterators that do not implement __iter__, which have always worked.
  3. Allow the creation of some generator expressions that were previously not allowed, but will still fail as expected when iterated over.

Option 3 is what we now have and seems to me to be easily the best option. It breaks no working code, and produces the same exceptions for faulty code in a way that is largely indistinguishable from the behavior in 3.12.

That generator expressions and generator functions behaved differently could also be considered a bug.

See also #127682 (comment)

@serhiy-storchaka
Member

It should be a TypeError, because False is not iterable. It's been that way since the beginning, and I believe it was intentional, and that the existing code depends on it. It helps to detect bugs. Even if we want to change the behavior, it should be preceded by a general discussion, and we can only do that in a new Python version, not in a bugfix release.

We just have to restore the old behavior. The crash you mentioned is only possible after low-level hacking. The behavior change is visible to ordinary users.

@Roman513
Author

Roman513 commented Jun 5, 2025

@markshannon I do not insist that it must be fixed, but I thought it could be considered for a major release. I encountered this issue in legacy code, and there’s no question it should be refactored to avoid this. The exception is used there to exit recursion while parsing nested dict-like structures, if you need a real-world scenario.
Casting to bool is not important, a more generic example would be something like x = (_ for _ in None). If the examples in the bug description seem confusing, I can replace them.
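
A quick way to see which behavior a given interpreter has is a probe along these lines. The function name genexpr_checks_eagerly is invented for this sketch; the result depends on the Python version it runs on.

```python
def genexpr_checks_eagerly():
    """Return True if creating a generator expression over None raises TypeError."""
    try:
        gen = (item for item in None)
    except TypeError:
        return True   # error at creation time (the 3.13.3 behavior in the report)
    # Creation succeeded, so the error is deferred until the first item is requested.
    try:
        next(gen)
    except TypeError:
        return False  # error at iteration time (the 3.13.4 behavior in the report)
    raise AssertionError("iterating over None should always fail")


eager = genexpr_checks_eagerly()
```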

@markshannon
Member

@Roman513 the problem is not that your examples are hard to use, but they appear contrived. What was the original issue you found (in actual code)?

@markshannon
Member

@serhiy-storchaka What about the other issues I listed, do they not deserve to be fixed? Why this issue and not those?

@markshannon
Member

One other thing: what about generator expressions with multiple iterables, e.g. ((i,j) for i in range(3) for j in range(n))?

Python 3.12 and earlier were already inconsistent here:

Python 3.12
>>> ((i,j) for i in range(3) for j in False)
<generator object <genexpr> at 0x7447eaba43c0>
>>> ((i,j) for i in False for j in range(n))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'bool' object is not iterable
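
The first of these two cases can be followed through to the point where the error finally appears. This is a small sketch and should not depend on the Python version, since only the outermost iterable is ever handled at creation time.

```python
# Creating the generator succeeds even though the inner iterable is invalid,
# because only the outermost iterable (range(3)) is evaluated up front.
gen = ((i, j) for i in range(3) for j in False)

# The TypeError for the inner iterable shows up only when an item is requested.
try:
    next(gen)
    inner_error = None
except TypeError as exc:
    inner_error = str(exc)
```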

@markshannon
Member

Overall, I believe the current behavior is the best. It is robust, efficient, and consistent with generator functions. So I definitely want to keep it for 3.15+.

Regarding 3.13 and 3.14 there is one more thing we could attempt:

  • Move GET_ITER back to the creation of the generator expression for 3.13 and 3.14
  • Add a tp_iternext function to object so that FOR_ITER never crashes, just throws the expected exception.

The downside of this is that any C extensions doing Py_TYPE(op)->tp_iternext != NULL to determine if an object is an iterator will be broken. Hyrum's law states that someone will be doing this 🙁

@Yhg1s @hugovk It's your call.

https://xkcd.com/1172/

@serhiy-storchaka
Member

Historically, for reasons I am unaware of, in generator expressions the conversion from iterable to iterator occurred when the generator was created, not when used.

For history, see PEP 289, especially https://peps.python.org/pep-0289/#early-binding-versus-late-binding .

The reasoning does not directly address the question of where iter() should be called, but I think it applies there too. We want to get a TypeError earlier, before passing a generator expression to a consumer. This is mentioned in #39778 (comment):

In order to get an error instantly from an expression like
"g=(x for x in None)",

It is interesting that the crash in FOR_ITER was known from the beginning: #39778 (comment). I believe it can be fixed by adding an additional check in FOR_ITER. It seems that the same crash can be reproduced with the current code if you use a debugger (or just a trace function).

What about the other issues I listed, do they not deserve to be fixed?

Aren't the other issues caused by the attempt to fix the crash in FOR_ITER? If we restore the original behavior, they will be gone.

@Roman513
Author

Roman513 commented Jun 6, 2025

@markshannon OK, if it looks contrived, here’s the original issue, just with names changed. As I said, this is pretty lame code, but it worked for years, and I didn’t expect it to stop with a minor update.

def safe_obj(obj):
    try:
        if isinstance(obj, str):
            return obj.replace("$", "")  # Was another string sanitizing, does not matter for demonstration
        elif hasattr(obj, "items"):
            return type(obj)(safe_obj(item) for item in obj.items())
        else:
            return type(obj)(safe_obj(item) for item in obj)
    except TypeError:
        return obj

>>> safe_obj({"param": True})
{'param': True}
>>> safe_obj({"param": False})
{'param': True}
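
One possible way to make a helper like this independent of when the generator expression checks its iterable is to call iter() explicitly before building the result. This is only a sketch, not the original code. It assumes that the sole purpose of the except clause was to pass non-iterable leaves through unchanged, and unlike the original it does not swallow a TypeError raised by the constructor call.

```python
def safe_obj(obj):
    """Recursively sanitize strings inside nested containers."""
    if isinstance(obj, str):
        return obj.replace("$", "")
    if hasattr(obj, "items"):
        return type(obj)(safe_obj(item) for item in obj.items())
    try:
        iterator = iter(obj)  # explicit, eager check for iterability
    except TypeError:
        return obj            # non-iterable leaf (bool, int, None, ...)
    return type(obj)(safe_obj(item) for item in iterator)


fixed = safe_obj({"param": False})
nested = safe_obj({"a": "$x", "b": [1, "$y"]})
```

With the explicit iter() call, False is returned as-is instead of being passed through bool() on a generator object.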

I guess other users could find other unexpected issues - it's only been 1.5 days since the 3.13.4 images were pushed to DockerHub.

@markshannon
Member

@serhiy that's about binding, not calling iter(). We aren't changing the binding time here. I don't think anything about the timing of the call to iter() is implied at all.

Aren't other issues caused by attempt to fix the crash in FOR_ITER?

One of them is the crash in FOR_ITER.
If we fix the crash though, then the strange inconsistencies remain, but the other issues go away.

@markshannon
Member

It seems that the same crash can be reproduced with the current code if you use a debugger (or just a trace function).

@serhiy-storchaka how? I don't think it can.

@Yhg1s
Member

Yhg1s commented Jun 6, 2025

Overall, I believe the current behavior is the best. It is robust, efficient, and consistent with generator functions. So I definitely want to keep it for 3.15+.

It's debatable for 3.14/3.15. I think it's a really bad idea to do this without the usual deprecation, because it's clear that this behaviour is intentional and people are relying on it, but if the SC wants to do this, they can provide an exception to the backward compatibility policy. Doing this with a regular deprecation period is more palatable, but I still think it's a pointless change for no reason, and further erodes the trust in Python as a stable language.

It's absolutely unacceptable for 3.13. This is not the kind of change we can make in patch releases. It breaks real-world, functioning code. It has to be rolled back.

@markshannon
Member

@Yhg1s

It's absolutely unacceptable for 3.13. This is not the kind of change we can make in patch releases. It breaks real-world, functioning code. It has to be rolled back.

What isn't the kind of change we make? Bug fixes?
Can you please put the pitchfork away for a bit, and let's work out the best approach.

Exactly which PR do you want to revert?

If we revert #132384 that re-introduces #127682.
If we then revert #125178 to fix #127682, it then re-introduces #125038 which is a crash, so we don't want to do that.

We need to fix #125038 some other way first, and then do the reverts.
I think @serhiy-storchaka's suggestion of adding a tp_iternext == NULL check to FOR_ITER is probably the simplest to implement and least likely to introduce any new bugs.
@Yhg1s would that work for you?

For 3.15, we can discuss it later. If we want to keep the redundant check, we can do so by adding an additional instruction when creating the generator to check that the iterable is actually an iterable.

Unfortunately we can't add the additional instruction for 3.14 as it would involve changing the bytecodes, so we'll need to do the same change and reverts for 3.14 as for 3.13

@serhiy-storchaka
Member

If we want to keep the redundant check, we can do so by adding an additional instruction when creating the generator to check that the iterable is actually an iterable.

This may not be enough. We should call __iter__, not just check that it is defined, because the call can fail, and we want to get that error before passing the generator object to a consumer, which may raise a different error. The reason is the same as in PEP 289.
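
The difference between "defines __iter__" and "iter() succeeds" can be shown with a small sketch. The class name Broken is hypothetical.

```python
class Broken:
    """Defines __iter__, but the call itself fails."""

    def __iter__(self):
        raise RuntimeError("cannot start iteration")


obj = Broken()

# A presence check alone says the object looks iterable.
looks_iterable = hasattr(obj, "__iter__")

# Only actually calling iter() reveals the failure.
try:
    iter(obj)
    iter_error = None
except RuntimeError as exc:
    iter_error = str(exc)
```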

@maxnoe
Contributor

maxnoe commented Jun 6, 2025

Just to add another data point for effects on real-world code: this broke numba compilation of list comprehensions:
numba/numba#10101

@Yhg1s
Member

Yhg1s commented Jun 6, 2025

@Yhg1s
What isn't the kind of change we make? Bug fixes?

If the bugfix breaks intentional behaviour relied on by existing code, yes.

Exactly which PR do you want to revert?

If we revert #132384 that re-introduces #127682. If we then revert #125178 to fix #127682, it then re-introduces #125038 which is a crash, so we don't want to do that.

Not fixing an existing bug is preferable to introducing breaking behaviour.

Unfortunately we can't add the additional instruction for 3.14 as it would involve changing the bytecodes, so we'll need to do the same change and reverts for 3.14 as for 3.13

3.14 is up to @hugovk, but changing the bytecode is still allowed in the beta phase

@markshannon
Member

If the bugfix breaks intentional behaviour relied on by existing code, yes.

And how are we to know that it is intentional behavior if it isn't documented?

https://docs.python.org/3/reference/expressions.html#generator-expressions says nothing about checking that the first iterable supports __iter__ eagerly, merely that it is evaluated eagerly.
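
The part that is documented, eager evaluation of the first iterable expression, can be observed with a sketch like this. The function name source is made up for illustration.

```python
calls = []


def source():
    """Record that the iterable expression was evaluated."""
    calls.append("source evaluated")
    return [1, 2, 3]


# The outermost iterable expression runs when the generator is created...
gen = (x * 2 for x in source())
calls_after_creation = list(calls)

# ...while the element expression runs only as items are requested.
values = list(gen)
```

Whether iter() is also applied to the result at creation time is the part the thread disagrees about.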

@serhiy-storchaka
Member

serhiy-storchaka commented Jun 7, 2025

It was in "equivalent code" examples in PEP 289. There was an explicit test for such behavior. Look also at the history. Raymond asked the same question, but was convinced by Guido.
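
A sketch in the spirit of that equivalent code, emulating the eager iter() call with an ordinary generator function. The names _gen and squares are illustrative only, and the PEP's exact listing should be checked against the PEP itself.

```python
def _gen(iterator):
    """Generator function standing in for the body of the expression."""
    for x in iterator:
        yield x ** 2


def squares(iterable):
    # iter() is called here, at creation time, so a non-iterable argument
    # fails before the generator object is handed to any consumer.
    return _gen(iter(iterable))


values = list(squares(range(4)))

try:
    squares(None)
    eager_error = None
except TypeError as exc:
    eager_error = str(exc)
```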

I think that the old behavior should be restored in main before starting discussion. If we decide to change it, it may require a transitional period.

@justvanrossum

Another example of real breakage: fonttools/fonttools#3854

@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jun 7, 2025
7 participants