-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
fix apigw model resolving when importing and when validating #8446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
34cc7ff
to
2e80c12
Compare
b3531a0
to
1cd0837
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few comments that I hope we can discuss them
@@ -108,12 +119,14 @@ def get_apigateway_store(account_id: str = None, region: str = None) -> ApiGatew | |||
|
|||
|
|||
class Resolver: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the time, Resolver
looked like a ok name, not anymore, we need a better name for this openapi spec reference resolver :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I could rename it to OpenAPISpecificationResolver
or OASResolver
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll also refactor at some point, like Waldemar said here: #8371 (review)
It would be nice to be able to separate the import/export code.
We've also added a bunch a new supported features to the importer, we should try to add them to the exporter too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolver should be independent of import or export. I don't think it matters for the export anyways.
@@ -138,7 +151,20 @@ def _resolve_refpath(self, refpath: str) -> dict: | |||
if refpath in self._refpaths and not self.allow_recursive: | |||
raise Exception("recursion detected with allow_recursive=False") | |||
|
|||
if refpath in self._cache: | |||
# We don't resolve the Model definition, we will return a absolute reference to the model like AWS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be a bug, sure, but the cache should avoid all circular loops. Resolving all refs up front, cleans up a lot of the lookups after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. However, AWS does not resolve the Model at all as shown by the snapshot, they replace it with an absolute reference to the Model. This is also how you set up reference to other Model
with the API outside of importing. I'd say Model
is a specific case here.
Also, not resolving Model makes sense, because we don't use the schema of the Model
but only set its name to the requestModels
and responseModels
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the JSON Schema spec is talking about recursion and circular references, and they say that it's disallowed.
Also, I don't think the cache will avoid circular loop in case of circular references in the model itself once it's resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it's disallowed because if done in a certain way, you get a loop. The cache works as a mechanism to avoid chasing the tail if the reference is already resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache works, it properly reference the object, but the JSON dumps will disallow circular references in itself 😞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow. I guess the goal of our resolver is to allow us to map this OAS to an API Gateway API, not necessarily properly resolve the schema? We don't need a valid OAS schema after resolving? Technically, I think it's still valid, it follows the retrieval URI spec: https://json-schema.org/understanding-json-schema/structuring.html#retrieval-uri
If AWS maps the Model this way in their requestModels
and requestResponses
, we can take this shortcut to allow us to map it in an easier way to what we need to create the API?
Sorry, I might be missing something here, genuine question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AWS does not resolve the Model at all as shown by the snapshot
"house": {
"$ref": "/restapis/rest-id:1/models/House"
}
Having a reference or the inline method is valid. So it doesn't matter what AWS does, in fact, what does it mean? Did you export and that's how it came out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm talking to how it maps a schema in components/schema
to a Model, and how it manages those references inside the Model, see https://stackoverflow.com/questions/37823112/how-do-i-reference-one-model-from-another-model-using-aws-api-gateway
This also about how it maps a relative $ref in schema
, basically, I've imported an API and then fetched the models from AWS to see how they look like. The 3 following JSON in the PR description describes the flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this, we should try to import an API with no Model defined in definitions/schema
, but instead directly set the schema in the method response. I wonder what comes out of this, because AWS would lack the schema name. I think all of this discussion can actually be solved by this test? Maybe AWS creates a schema with a random name, I'll give it a try later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To quickly come back on this, if you don't specify a $ref in responses/<statusCode>/schema
field, AWS will automatically create the Model for you.
Basically done the following (simplified):
{
"paths": "/pet",
"post": {
"responses": {
"200": {
"description": "Successful operation",
"schema": {
"type": "object",
"properties": {
"pet": {
"$ref": "#/definitions/Pet"
},
"message": {
"type": "string"
}
}
},
"headers": {
"Access-Control-Allow-Origin": {
"type": "string",
"description": "URI that may access the resource"
}
}
}
}
}
}
And AWS created the follow Model (normally json encoded Schema field):
{
"name": "MODEL03987f",
"contentType": "application/json",
"schema": {
"type": "object",
"properties": {
"pet": {
"$ref": "https://apigateway.amazonaws.com/restapis/<api-id>/models/Pet"
},
"message": {
"type": "string"
}
}
}
}
Seems like we're still missing that feature, I'll note it and add it later on!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we discussed offline, my comments are not blocking - approved 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did only a cursory review, but looks like @calvernaz already reviewed in detail. 👍 Fantastic work @bentsku churning through these parity enhancements, let's get it out to our users 🚀
Changes made to avoid circular references in
Model
containing references to itself, which can happen easily.Example: A Person contains a
children
property, which is a collection ofPerson
. After resolving the models, it will fail to be serialised to JSON because it contains circular references. Or a container which contains an object with a reference to its container.AWS solves this issue by storing the template with its absolute reference[1][2]. It might be stored fully resolved server-side, or might be resolved when actually validating the template.
Still WIP, so I'll see how I do this, this is only the importer phase now, next step is to properly read those schema when validating.This PR also implements
RequestValidator
when importing (see here), and support forrequestBody
property inMethod
for OpenAPI 3.0 (see here)Example:
Once resolved before, it would have lead to
House
having a reference toPerson
andPerson
having a reference toHouse
. These 2 models could be part of a bigger model, thus being in the same dictionary and make it non-serializable.From this example, we want our Model to be able to be used to validate incoming requests. First, it will be resolved like AWS, with absolute paths. We will replace the relative
$ref
part with an absolute link to it,$ref: "http://apigateway.{localstackhost}/restapis/{apiId}/models/{ModelName}
. This allows us to safely JSON dump them, avoiding circular references.Now, the real issue comes when using those Model. We need to be able to resolve them.
The key is to use some tricks from JSON Schema, namely recursive[3] and
$defs
[4].This will allow use to get our Model schema, and all
$ref
inside will be resolved and put into a$defs
field.We will recursively resolve all models and put them into
$defs
, even transitive ones. If we spot a$ref
to our Model, to avoid infinite looping, we will replace its$ref
with#
, signifying to JSON schema that it references its own model.That's basically the gist of it. The PR validates that, changes a bit our schema Resolver, and adds a new
ModelResolver
before validating a request, and that resolved model would be cached in our store (also with the benefit of notjson.loads
every time).What it gives:
Starting with those 2 schemas:
We then resolve them and store them as
Model
with absolute paths (from the snapshot):And once we want to use them to validate our request, this is the
Person
schema, usingHouse
for one of its field:It's a valid schema, as we can attest here
Notes