-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
add parsing and routing to APIGW next gen #11051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
LocalStack Community integration with Pro 2 files ± 0 2 suites ±0 1h 36m 24s ⏱️ + 1m 27s Results for commit 82845b6. ± Comparison against base commit c35ce3b. This pull request removes 1 and adds 25 tests. Note that renamed tests count towards both.
This pull request skips 1 test.
♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
The amount of work you put into this really shows!
I have left one question regarding the body in the InvocationRequest
def get_resources_from_deployment(deployment: RestApiDeployment) -> ListOfResource: | ||
""" | ||
This returns the `Resources` from a deployment | ||
This allows to decouple the underlying split of resources between Moto and LocalStack, and always return the right | ||
format. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 Nice!
for resource_path, resource in path_map.items(): | ||
api_path_regex = re.sub(r"{[^+]+\+}", r"[^\?#]+", resource_path) | ||
api_path_regex = re.sub(r"{[^}]+}", r"[^/]+", api_path_regex) | ||
if re.match(r"^%s$" % api_path_regex, path): | ||
matches.append((resource_path, resource)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to see the follow up that aims at getting rid of all this custom routing! 🤘
""" | ||
This can be represented by the following routes: | ||
- / | ||
- GET /test | ||
- PUT /test/{param} | ||
- DELETE /{proxy+} | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and clean! The added comment helps a lot
"""Multi value query string parameters of the request""" | ||
multi_value_headers: Optional[dict[str, list[str]]] | ||
"""Multi value headers of the request""" | ||
body: Optional[bytes] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be useful to have raw_body
and decoded_body
, or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand, how do you mean?
Currently, we use restore_payload
for the body, because the request parsing of Werkzeug might consume the data to create a parsed form, and then we would not be able to forward/read the raw body.
It splits in 3 parts: data
, form
and files
. My belief is that APIGW does not manipulate different parts of the body directly? But I'm not sure.
It's also possible I'm not familiar with what is possible to do with APIGW 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I think the only thing we can do on the body is through the request templates. I haven't looked fully into the integration, so maybe this is the right way to go. I was just concerned about decoding it in with LS parser to restore it, to decode it again to apply the templates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see.
I think the issue here is what restore_payload
does. We don't really "decode", we "re-encode" it to be like what we actually received from the client.
For example, the usual way is to call request.data
to get the bytes
content of the body.
But let's imagine your request is a Content-Type: application/x-www-form-urlencoded
.
You get an HTTP body looking like this: field1=value1&field2=value2
When accessing request.data
, you'll get... b""
, an empty bytes
object because Werkzeug will have parsed the request and consumed the stream, and set request.form
to be equal to an ImmutableMultiDict
looking like {"field1": "value1", "field2": "value2"}
.
One of the way here is to directly consume the request.stream
once to get the raw data out. But anything earlier in the handler chain called request.data
, the stream will have be consumed and you have no way to get it back (yet! with the new twisted
web server, we actually have a way now 🥳 which means we can have a short circuit restore_payload
), which is why we re-encode the body 😕
So in itself I'd say we have the proper raw_body
now, the one the request template should act upon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it makes sense then since only one handler should need to perform operations on the body, that step can ensure the body returns to bytes once it has completed and and shouldn't need to keep a dict representation of it anywhere. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really a great push forward for the Next Gen API GW, providing a solid foundation re-using our internal frameworks 😍
Especially in combination with #11057 (where basically all the parsing / routing is then replaced to directly done by Werkzeug! 🥳
If needed, I can already merge the 2 PRs in one if that makes it easier 😄
Having two separate PRs helped to really review the different iterations (where each are quite big). Now, after the individual reviews, merging one into the other should work perfectly fine. :)
Motivation
This PR introduces the parsing of the request via a handler to populate a context with what I think is necessary right now.
It also introduces the routing handler, which will take the incoming request and match it against the deployment, to get the right
Resource
andMethod
to attach it to the context. Those will be used by all the next handlers, it will allow us to know what to do against the incoming request (method request parameters, integration request, integration itself).Currently, we re-use all the current code to do the routing, as the routing logic is quite complex. This can be replaced in the future by a router like the
ServiceRequestRouter
, and especiallyop_router.py
, as it would help us to extract the parameters, and it seems the logic and even the syntax is pretty similar.I've tackled the migration from the legacy routing to the Werkzeug routing like
op_router.py
in a follow up PR, see #11057.If needed, I can already merge the 2 PRs in one if that makes it easier 😄
Changes
TODO
What's left to do: