Document that cache() and lru_cache() do not have a "call once" guarantee #103475
Comments
From source code, it seems only …
The term threadsafe means different things to different people. Here, it is correctly used in the narrow sense to indicate that the underlying data structure updates take place with a lock (or the GIL) held. That makes it safe to use in a multithreaded environment without fear that the data structure will become incoherent. This stands in marked contrast to structures like the pure Python version of OrderedDict or random.gauss(), where the invariants get broken during multi-threaded updates.

We're mostly consistent about this in the docs, where we say that random.random, dict.setdefault, and deque.append are threadsafe. I'll add a clarifying note to the docs.
I'm curious about this statement. I understand that locks, in general, do not compose. But the current implementation uses a reentrant lock for the caching bits, and I can't come up with a reason we couldn't do an efficient double-checked lock with that reentrant lock, even in the case that the user code (perversely) chooses to recurse. Assuming I'm wrong about the above... would the CPython team be open to a PR that would include a conditional in the …
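For concreteness, here is a rough sketch of the double-checked locking idea being described, written as a standalone decorator rather than as a change to functools itself. The name call_once_cache and its internals are hypothetical assumptions for illustration; this is not how functools.lru_cache is implemented.

```python
import threading
from functools import wraps

_MISSING = object()

def call_once_cache(func):
    """Hypothetical sketch: memoize func so each argument tuple is computed
    at most once, using a double-checked lock (check without the lock,
    then re-check under it)."""
    lock = threading.RLock()   # reentrant, so same-thread recursion into the wrapper is fine
    cache = {}

    @wraps(func)
    def wrapper(*args):
        result = cache.get(args, _MISSING)      # first check, no lock: fast hit path
        if result is not _MISSING:
            return result
        with lock:
            result = cache.get(args, _MISSING)  # second check, under the lock
            if result is _MISSING:
                result = func(*args)            # user code runs while the lock is held
                cache[args] = result
        return result
    return wrapper
```

The catch, which the next comment illustrates, is that the user function runs while the lock is held; an RLock only helps when the re-entrant call happens in the same thread, so a call that re-enters the cache from a different thread can still deadlock.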
You can open a PR, but it is hard to handle things correctly. Note that you cannot simply use one lock to guard the whole function. Consider this:

```python
from functools import lru_cache
from concurrent.futures import ThreadPoolExecutor

def heavy_things(n: int):
    return f(n - 1) + f(n - 2)

@lru_cache
def f(n):
    if n > 1:
        with ThreadPoolExecutor() as p:
            result = p.map(heavy_things, [n])
            return sum(result)
    return 1

print(f(10))
```

It's a demo in which part of the calculation happens in another thread. A correct implementation must not deadlock in this case: if the cache held a single lock for the whole duration of f(10), the worker thread's calls to f(9) and f(8) would block on a lock owned by the main thread, which is itself waiting for the worker to finish. My recommendation would be to customize with your own implementation.
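To make that failure mode concrete, here is a sketch of the naive approach being warned against. The decorator naive_locked_cache is a hypothetical name used only for this illustration; run against the example above, the main thread holds the lock inside f(10) while the worker thread blocks trying to enter f(9), so the program hangs.

```python
import threading
from functools import wraps

def naive_locked_cache(func):
    """Hypothetical 'call once' cache that holds one lock across the user call.
    It deadlocks whenever the cached function re-enters itself from another thread."""
    lock = threading.Lock()
    cache = {}

    @wraps(func)
    def wrapper(*args):
        with lock:                       # lock is held for the entire computation
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]
    return wrapper
```

Switching to an RLock does not rescue this either, because the re-entrant call arrives from the ThreadPoolExecutor worker thread, not from the thread that owns the lock.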
Yes, this makes sense. For what it's worth, we do have our own implementation; it's just unfortunate that we have to have it.

An updated suggestion for a possible API that would be backward-compatible and put as much power as possible in the hands of the user: allow …

I feel confident this would not break existing tests, and it should not be terribly difficult to write a threaded test (although what would constitute sufficient 'negative proof' of race conditions I am not sure).
Actually, I'd revise my proposal. The parameter would be … Then, if this was non-nil, … This allows the user to opt into granular locking per unique call. There are various ways to implement this; our implementation uses …
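As a sketch of what "granular locking per unique call" could look like: the decorator name per_key_cache and its internals below are illustrative assumptions, not the commenter's actual implementation.

```python
import threading
from functools import wraps

def per_key_cache(func):
    """Illustrative sketch: one lock per distinct argument tuple, so concurrent
    first calls with the *same* arguments run the function only once, while
    calls with different arguments never block each other."""
    cache = {}
    key_locks = {}
    registry_lock = threading.Lock()   # guards the two dicts only, never the user call

    @wraps(func)
    def wrapper(*args):
        if args in cache:                        # fast path for already-computed keys
            return cache[args]
        with registry_lock:                      # short critical section to pick the per-key lock
            key_lock = key_locks.setdefault(args, threading.Lock())
        with key_lock:                           # only same-key callers serialize here
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]
    return wrapper
```

This avoids the deadlock in the ThreadPoolExecutor example above, because f(9) and f(8) use different locks than f(10); however, a call that re-enters the same key from another thread would still deadlock, which is one reason the stdlib avoids holding any lock across the user call.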
I don't think so. This would be feature creep for the cache to handle a niche use case. IIRC, other cache implementations I looked at did not have this feature. Also note that the C version doesn't even have its own lock, so there is nothing to attach this to. And for the Python version, I'm wary of ever leaving an open lock across an arbitrary user function call; that is a recipe for hard-to-find bugs.

Perhaps make your own package implementing call-once behavior and post it to the Python Package Index. There, it can get a thorough shake-down and we can see if there is any uptake by the user community. FWIW, …
I don't think so now. We are moving the other way. See #101890.
Documentation
The documentation for @functools.cache states that the cache is threadsafe, but it is not threadsafe before the value is cached.
In the following example the wrapped function will return two different instances of list.
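The original snippet did not survive this extract; below is a hedged reconstruction of the kind of example being described. The make_singleton name and the sleep used to widen the race window are illustrative assumptions, not the reporter's exact code.

```python
import threading
import time
from functools import cache

@cache
def make_singleton():
    time.sleep(0.1)   # simulate slow construction so both threads enter before either finishes
    return []         # a brand new list instance

results = []

def worker():
    results.append(make_singleton())

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads can miss the (still empty) cache and each build its own list,
# so this can print False even though the function is decorated with @cache.
print(results[0] is results[1])
```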
This is an issue, for example, when you use @functools.cache for the implementation of singletons (we can leave the debate about singletons not being a good practice out of this). The documentation should not claim the cache to be threadsafe, or there should be an explicit warning about this situation.