Skip to content

fix proxied requests forwarding to Moto #11653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 9, 2024
Merged

Conversation

bentsku
Copy link
Contributor

@bentsku bentsku commented Oct 7, 2024

Motivation

As reported by #11652, we have an issue forwarding requests to moto when those have a full URI in the HTTP request, as explained in #8962.
(Example HTTP request: POST http://sns.eu-central-1.amazonaws.com/ HTTP/1.1 instead of the usual POST / HTTP/1.1)

This PR uses the helper created with the above PR to get the request path to send towards the moto dispatcher, so that we only get the path part of the request. It only changes the logic to get the actual endpoint to call, as we were already using that helper to forward the request to moto. The logic change is very minimal.

Looking at it, I believe this is an issue in how Werkzeug handles those requests, putting the full RAW_URI in the path and not doing sanitization beforehand, as we've seen with https://www.ietf.org/rfc/rfc2616.txt that web servers should handle those.
Also, the question comes as to what changed in 3.7.2, as this used to work with versions before, tested by the user down to 3.4.

Changes

  • update the logic to properly sanitize the path used to get the proper endpoint for moto, to be sure it is only a path and not a full URI
  • add a regression test for it (I also confirmed the user sample now works with the fix)

fixes #11652

@bentsku bentsku added aws:sns Amazon Simple Notification Service area: asf semver: patch Non-breaking changes which can be included in patch releases labels Oct 7, 2024
@bentsku bentsku self-assigned this Oct 7, 2024
@bentsku bentsku requested review from alexrashed and thrau October 7, 2024 22:36
@bentsku bentsku marked this pull request as draft October 8, 2024 00:22
Copy link

github-actions bot commented Oct 8, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 38m 50s ⏱️ - 4m 19s
3 488 tests +1  3 075 ✅ +1  413 💤 ±0  0 ❌ ±0 
3 490 runs  +1  3 075 ✅ +1  415 💤 ±0  0 ❌ ±0 

Results for commit 21ed1a8. ± Comparison against base commit 87104a6.

♻️ This comment has been updated with latest results.

@bentsku bentsku marked this pull request as ready for review October 8, 2024 08:59
Copy link
Member

@alexrashed alexrashed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Great catch! This really is an interesting case! Thanks a lot for jumping on this and digging this deep!
I think we could move forward with this fix, but I think the problem here could actually be caused by rolo (since rolo creates the request).
I think it would be worth it to try to:

  • See if this issue can really be reproduced with werkzeug and raise an issue in their repo is this really is the case. Their own code suggest that you are right and the path variable really shouldn't contain anything before the root path slash.
  • If it is not reproducible in plain werkzeug, I think this would actually be a subtle issue in rolo, and we should fix it there. /cc @thrau

What do you think?

# this is where we skip the HTTP roundtrip between the moto server and the boto client
dispatch = get_dispatcher(service.service_name, request.path)
dispatch = get_dispatcher(service.service_name, raw_path.split("?")[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would be great to add a comment line here explaining what's happening here and why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, agreed. The thing is, the "best" would be to call get_raw_path only to get the path to match the request, but it is called internally in get_full_raw_path and we'd be doing twice. I'll add a comment!

@bentsku
Copy link
Contributor Author

bentsku commented Oct 9, 2024

@alexrashed sure! I'll try recreating against Werkzeug, I believe that during my first investigation I had already found that Werkzeug had the issue itself.

In Rolo, it creates the request by directly instantiating werkzeug.wrappers.request.Request on the passed environ, so I don't think rolo is directly to blame here. (In rolo.gateway.wsgi.WsgiGateway.__call__).
We could fix the issue in rolo, but for now I think we're just having the same behavior than Werkzeug by default.

I'll set up a reproducer for Werkzeug, but I think we should go forward with this fix for now! I'll add the comment

edit: working on the reproducer, I can't reproduce with either base Werkzeug or even base hello world rolo, printing request.path gives the right value when using werkzeug.run_simple. Will continue looking 🔍 might be related to WSGI and the web server in itself, most probably in our twisted bridge or twisted itself.

@bentsku bentsku force-pushed the fix-moto-call-with-proxy branch from a9482e0 to 21ed1a8 Compare October 9, 2024 11:59
@bentsku
Copy link
Contributor Author

bentsku commented Oct 9, 2024

@alexrashed the issue seems to be in Twisted, when getting the WSGI environ when using rolo, this is the PATH_INFO: 'PATH_INFO': '/ttp://sns.eu-central-1.amazonaws.com/', which is why it's wrong down the line. I'll try investigating and fixing it, but I guess in the meantime, we should try to fix the issue itself to unblock users?

Small reproducer:

if __name__ == '__main__':
    from twisted.web import server, resource
    from twisted.internet import reactor

    class Simple(resource.Resource):
        isLeaf = True
        def render_POST(self, request):
            print(f"{request.path=}")
            return "<html>Hello, world!</html>"

    site = server.Site(Simple())
    reactor.listenTCP(4566, site)

Using the reproducer given in https://github.com/Blind-Striker/localstack-sns-issue-sandbox, this prints:
request.path=b'http://sns.eu-central-1.amazonaws.com/', and I can see the environ is wrong.

Copy link
Member

@alexrashed alexrashed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! We can definitely move forward with this fix to unblock users, and then carefully look at the issue in rolo and twisted! Thanks a lot for digging into this and addressing the comment! 🏅 💯

@bentsku bentsku merged commit 3ff3e2a into master Oct 9, 2024
34 checks passed
@bentsku bentsku deleted the fix-moto-call-with-proxy branch October 9, 2024 14:27
@bentsku
Copy link
Contributor Author

bentsku commented Oct 11, 2024

@alexrashed here's the PR in Rolo: localstack/rolo#23
I could track this down to a bug in Twisted, as I wrote the test for rolo I know have a clear reproducer, so I'll open a bug report there when I have the time. We can then revert the change in moto.py once we update the dependencies to use rolo, as we have a regression test for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: asf aws:sns Amazon Simple Notification Service semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: LocalStack.NET all SNS operations fails on LocalStack 3.7.2 and 3.8.0
2 participants