Skip to content

Conversation

thrau
Copy link
Member

@thrau thrau commented Mar 25, 2023

This PR fixes a race condition in the service plugin loading, where multiple concurrent calls to service_manager.require(...) could lead to Service plugins being instantiated more than once, and subsequently on_after_init being called multiple times, whose implementation often mutates global state.

We haven't seen this manifest yet, other an exception in the logs when running tests/integration/test_terraform.py::TestTerraform::test_bucket_exists:

2023-03-24T17:22:07.962 ERROR --- [   asgi_gw_5] localstack.utils.functions : error calling function on_after_init
Traceback (most recent call last):
  File "/opt/code/localstack/localstack/utils/functions.py", line 41, in call_safe
    return func(*args, **kwargs)
  File "/opt/code/localstack/localstack/services/apigateway/provider.py", line 97, in on_after_init
    apply_patches()
  File "/opt/code/localstack/localstack/services/apigateway/patches.py", line 270, in apply_patches
    def get_rest_api(self, function_id):
  File "/opt/code/localstack/localstack/utils/patch.py", line 170, in wrapper
    fn.patch = Patch.function(target, fn, pass_target=pass_target)
  File "/opt/code/localstack/localstack/utils/patch.py", line 96, in function
    return Patch(obj, name, new)
  File "/opt/code/localstack/localstack/utils/patch.py", line 56, in __init__
    self.old = getattr(self.obj, name)
AttributeError: module 'localstack.services.apigateway.patches' has no attribute 'get_rest_api'

which was an obscure indicator that on_after_init was called twice.

The PR introduces a locking mechanism that makes sure that, while the service is being loaded and instantiated, other calls to require of the same service block. The SynchronizedDefaultDict(RLock) pattern is helpful here - and we make sure we delete the locks after using them (we only need them once).

@thrau thrau requested a review from whummer March 25, 2023 12:30
@coveralls
Copy link

coveralls commented Mar 25, 2023

Coverage Status

Coverage: 81.74% (-0.01%) from 81.753% when pulling b155b49 on fix-plugin-loading-concurrency into 00d3a59 on master.

@thrau thrau requested review from bentsku and removed request for whummer March 25, 2023 14:13
Copy link
Member

@whummer whummer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch @thrau 🚀 . Nice approach with the SynchronizedDefaultDict, and great to see it covered by a test. Only one question / suggestion around introducing explicit guards around self._services..

return None

with self._mutex:
if plugin.service.name() not in self._services:
Copy link
Member

@whummer whummer Mar 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if there can be a situation where this condition is not true, i.e., where an entry for this service name would already exist in self._services (since we're now locking on the service name in this critical section, and _load_service_plugin(..) in line 565 above is side-effect free).

The only case where this could happen is if plugin.service.name() returns a string that is different from name - can this be the case?

If that's something we're not expecting to happen, maybe we could proactively raise an error, to avoid such cases from going unnoticed:

            with self._mutex:
                service_name = plugin.service.name()
                if service_name in self._services:
                    raise ServiceStateException(
                        "Service {service_name} unexpectedly already in list of loaded services")
                    )
                super().add_service(plugin.service)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, can't happen anymore. fixed!

@thrau thrau force-pushed the fix-plugin-loading-concurrency branch from 091dcf8 to b155b49 Compare March 25, 2023 14:44
@thrau thrau merged commit cd84c77 into master Mar 25, 2023
@thrau thrau deleted the fix-plugin-loading-concurrency branch March 25, 2023 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants