Skip to content

fix(ReconnectLDAPObject): reconnect race condition #507

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 8, 2023

Conversation

pmhahn
Copy link
Contributor

@pmhahn pmhahn commented Jan 13, 2023

There is a race condition in ReconnectLDAPObject._apply_method_s() when
multiple threads use a shared ReconnectLDAPObject instance:

Traceback (most recent call last):
...
File "ldap/ldapobject.py", line 1259, in search_ext_s
return self._apply_method_s(SimpleLDAPObject.search_ext_s,*args,**kwargs)
File "ldap/ldapobject.py", line 1201, in _apply_method_s
self.reconnect(self._uri,retry_max=self._retry_max,retry_delay=self._retry_delay)
File "ldap/ldapobject.py", line 1181, in reconnect
SimpleLDAPObject.unbind_s(self)
File "ldap/ldapobject.py", line 905, in unbind_s
return self.unbind_ext_s(None,None)
File "ldap/ldapobject.py", line 889, in unbind_ext_s
msgid = self.unbind_ext(serverctrls,clientctrls)
File "ldap/ldapobject.py", line 881, in unbind_ext
res = self._ldap_call(self._l.unbind_ext,RequestControlTuples(serverctrls),RequestControlTuples(clientctrls))
File "ldap/ldapobject.py", line 352, in getattr
self.class.name,repr(name)
AttributeError: ReconnectLDAPObject has no attribute '_l'

While the reconnect() is already protected by self._reconnect_lock,
there is a small windows between the unbind_s() and reconnect()
call, where CPython can switch threads: While the 1st threads is already
waiting inside reconnect() for its next try, the 2nd threads will call
unbind_s() outside that lock which will race with the 1st thread also
calling unbind_s(): Whoever comes first will do the del self._l
first, so the later threads will raise the above AttributeError.

Move calling unbind_s() inside the locked region so that self._l is
handled atomically.

Add a new parameter force to either forcefully close any previous
connection or keep re-use the previous connection if it still is
supposed to work.

PS: This does not fix #253 but addresses https://forge.univention.org/bugzilla/show_bug.cgi?id=43274

There is a race condition in ReconnectLDAPObject._apply_method_s() when
multiple threads use a shared ReconnectLDAPObject instance:

> Traceback (most recent call last):
...
>   File "ldap/ldapobject.py", line 1259, in search_ext_s
>     return self._apply_method_s(SimpleLDAPObject.search_ext_s,*args,**kwargs)
>   File "ldap/ldapobject.py", line 1201, in _apply_method_s
>     self.reconnect(self._uri,retry_max=self._retry_max,retry_delay=self._retry_delay)
>   File "ldap/ldapobject.py", line 1181, in reconnect
>     SimpleLDAPObject.unbind_s(self)
>   File "ldap/ldapobject.py", line 905, in unbind_s
>     return self.unbind_ext_s(None,None)
>   File "ldap/ldapobject.py", line 889, in unbind_ext_s
>     msgid = self.unbind_ext(serverctrls,clientctrls)
>   File "ldap/ldapobject.py", line 881, in unbind_ext
>     res = self._ldap_call(self._l.unbind_ext,RequestControlTuples(serverctrls),RequestControlTuples(clientctrls))
>   File "ldap/ldapobject.py", line 352, in __getattr__
>     self.__class__.__name__,repr(name)
> AttributeError: ReconnectLDAPObject has no attribute '_l'

While the reconnect() is already protected by `self._reconnect_lock`,
there is a small windows between the `unbind_s()` and `reconnect()`
call, where CPython can switch threads: While the 1st threads is already
waiting inside `reconnect()` for its next try, the 2nd threads will call
`unbind_s()` outside that lock which will race with the 1st thread also
calling `unbind_s()`: Whoever comes first will do the `del self._l`
first, so the later threads will raise the above AttributeError.

Move calling `unbind_s()` inside the locked region so that `self._l` is
handled atomically.

Add a new parameter `force` to either forcefully close any previous
connection or keep re-use the previous connection if it still is
supposed to work.
@pmhahn
Copy link
Contributor Author

pmhahn commented Jan 26, 2023

There seems to be a race-condition in running the tests in pypy3: I can "reproduce" the bug locally on my Debian-11-Bullseye machine, but mostly one test failed; which one is random. Sometimes I get a clean run, but most fail.
I know that pypy3 has a different behavior on when objects are freed as this is unspecified and different Python implementations differ in that detail.

@droideck
Copy link
Contributor

droideck commented Jan 26, 2023

Yeah, it's a known issue: #460. So your change is good as it is.

Regarding your PR, it looks good to me!
I'll leave a bit of a time window for @mistotebe (or anybody else) to review, and we can merge it early next week.
Thank you for your contribution!

@droideck droideck merged commit 105925f into python-ldap:main Feb 8, 2023
@pmhahn pmhahn deleted the unbind branch February 8, 2023 06:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SimpleLDAPObject has no attribute '_l'
2 participants