Skip to content

Conversation

bentsku
Copy link
Contributor

@bentsku bentsku commented Jun 6, 2023

Changes made to avoid circular references in Model containing references to itself, which can happen easily.

Example: A Person contains a children property, which is a collection of Person. After resolving the models, it will fail to be serialised to JSON because it contains circular references. Or a container which contains an object with a reference to its container.

AWS solves this issue by storing the template with its absolute reference[1][2]. It might be stored fully resolved server-side, or might be resolved when actually validating the template. Still WIP, so I'll see how I do this, this is only the importer phase now, next step is to properly read those schema when validating.

This PR also implements RequestValidator when importing (see here), and support for requestBody property in Method for OpenAPI 3.0 (see here)

Example:

{
  "Person": {
    "type": "object",
    "description": "Random person schema.",
    "properties": {
      "randomProperty": {
        "type": "string",
        "description": "Random property",
        "format": "byte"
      },
      "container": {
        "$ref": "#/components/schemas/House"
      }
    }
  },
  "House": {
    "type": "object",
    "description": "Where Person can live",
    "properties": {
      "randomProperty": {
        "type": "string",
        "description": "Random property",
        "format": "byte"
      },
      "contains": {
        "type": "array",
        "items": {
          "$ref": "#/components/schemas/Person"
        },
        "minItems": 1,
        "description": "The information of who is living in the house"
      }
    }
  }
}

Once resolved before, it would have lead to House having a reference to Person and Person having a reference to House. These 2 models could be part of a bigger model, thus being in the same dictionary and make it non-serializable.

From this example, we want our Model to be able to be used to validate incoming requests. First, it will be resolved like AWS, with absolute paths. We will replace the relative $ref part with an absolute link to it, $ref: "http://apigateway.{localstackhost}/restapis/{apiId}/models/{ModelName}. This allows us to safely JSON dump them, avoiding circular references.

Now, the real issue comes when using those Model. We need to be able to resolve them.
The key is to use some tricks from JSON Schema, namely recursive[3] and $defs [4].
This will allow use to get our Model schema, and all $ref inside will be resolved and put into a $defs field.
We will recursively resolve all models and put them into $defs, even transitive ones. If we spot a $ref to our Model, to avoid infinite looping, we will replace its $ref with #, signifying to JSON schema that it references its own model.

That's basically the gist of it. The PR validates that, changes a bit our schema Resolver, and adds a new ModelResolver before validating a request, and that resolved model would be cached in our store (also with the benefit of not json.loads every time).

What it gives:

Starting with those 2 schemas:

"schemas": {
  "Person": {
    "type": "object",
    "description": "Random person schema.",
    "properties": {
      "name": {
        "type": "string",
        "description": "Random property"
      },
      "b": {
        "type": "number"
      },
      "house": {
        "$ref": "#/components/schemas/House"
      }
    },
    "required": ["name", "b"]
  },
  "House": {
    "type": "object",
    "description": "Where a Person can live",
    "properties": {
      "randomProperty": {
        "type": "string",
        "description": "Random property",
        "format": "byte"
      },
      "contains": {
        "type": "array",
        "items": {
          "$ref": "#/components/schemas/Person"
        },
        "minItems": 1,
        "description": "The information of who is living in the house"
      }
    }
  }
}

We then resolve them and store them as Model with absolute paths (from the snapshot):

[
  {
    "contentType": "application/json",
    "description": "Where a Person can live",
    "id": "<model-id:2>",
    "name": "House",
    "schema": {
      "type": "object",
      "properties": {
        "randomProperty": {
          "type": "string",
          "description": "Random property",
          "format": "byte"
        },
        "contains": {
          "minItems": 1,
          "type": "array",
          "description": "The information of who is living in the house",
          "items": {
            "$ref": "<model-base-url>/restapis/<rest-id:1>/models/Person"
          }
        }
      },
      "description": "Where a Person can live"
    }
  },
  {
    "contentType": "application/json",
    "description": "Random person schema.",
    "id": "<model-id:3>",
    "name": "Person",
    "schema": {
      "required": [
        "b",
        "name"
      ],
      "type": "object",
      "properties": {
        "name": {
          "type": "string",
          "description": "Random property"
        },
        "b": {
          "type": "number"
        },
        "house": {
          "$ref": "<model-base-url>/restapis/<rest-id:1>/models/House"
        }
      },
      "description": "Random person schema."
    }
  }
]

And once we want to use them to validate our request, this is the Person schema, using House for one of its field:
It's a valid schema, as we can attest here

{
  "type": "object",
  "description": "Random person schema.",
  "properties": {
    "name": {
      "type": "string",
      "description": "Random property"
    },
    "b": {
      "type": "number"
    },
    "house": {
      "$ref": "#/$defs/House"
    }
  },
  "required": [
    "name",
    "b"
  ],
  "$defs": {
    "House": {
      "type": "object",
      "description": "Where a Person can live",
      "properties": {
        "randomProperty": {
          "type": "string",
          "description": "Random property",
          "format": "byte"
        },
        "contains": {
          "type": "array",
          "items": {
            "$ref": "#"
          },
          "minItems": 1,
          "description": "The information of who is living in the house"
        }
      }
    }
  }
}

Notes

@bentsku bentsku added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Jun 6, 2023
@bentsku bentsku changed the base branch from master to fix-apigw-apikey-authorizer-import June 6, 2023 02:18
@bentsku bentsku self-assigned this Jun 6, 2023
@bentsku bentsku added the aws:apigateway Amazon API Gateway label Jun 6, 2023
@github-actions
Copy link

github-actions bot commented Jun 6, 2023

LocalStack Community integration with Pro

       2 files         2 suites   1h 24m 34s ⏱️
2 092 tests 1 809 ✔️ 283 💤 0
2 093 runs  1 809 ✔️ 284 💤 0

Results for commit 878b684.

♻️ This comment has been updated with latest results.

@bentsku bentsku force-pushed the fix-apigw-apikey-authorizer-import branch from 34cc7ff to 2e80c12 Compare June 6, 2023 18:05
Base automatically changed from fix-apigw-apikey-authorizer-import to master June 7, 2023 01:28
@bentsku bentsku force-pushed the fix-apigw-model-resolving branch from b3531a0 to 1cd0837 Compare June 7, 2023 01:29
@bentsku bentsku requested a review from whummer June 7, 2023 01:57
@bentsku bentsku marked this pull request as ready for review June 7, 2023 02:21
@bentsku bentsku requested a review from calvernaz as a code owner June 7, 2023 02:21
Copy link
Contributor

@calvernaz calvernaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few comments that I hope we can discuss them

@@ -108,12 +119,14 @@ def get_apigateway_store(account_id: str = None, region: str = None) -> ApiGatew


class Resolver:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time, Resolver looked like a ok name, not anymore, we need a better name for this openapi spec reference resolver :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I could rename it to OpenAPISpecificationResolver or OASResolver!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll also refactor at some point, like Waldemar said here: #8371 (review)

It would be nice to be able to separate the import/export code.

We've also added a bunch a new supported features to the importer, we should try to add them to the exporter too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resolver should be independent of import or export. I don't think it matters for the export anyways.

@@ -138,7 +151,20 @@ def _resolve_refpath(self, refpath: str) -> dict:
if refpath in self._refpaths and not self.allow_recursive:
raise Exception("recursion detected with allow_recursive=False")

if refpath in self._cache:
# We don't resolve the Model definition, we will return a absolute reference to the model like AWS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can be a bug, sure, but the cache should avoid all circular loops. Resolving all refs up front, cleans up a lot of the lookups after.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. However, AWS does not resolve the Model at all as shown by the snapshot, they replace it with an absolute reference to the Model. This is also how you set up reference to other Model with the API outside of importing. I'd say Model is a specific case here.

Also, not resolving Model makes sense, because we don't use the schema of the Model but only set its name to the requestModels and responseModels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the JSON Schema spec is talking about recursion and circular references, and they say that it's disallowed.
Also, I don't think the cache will avoid circular loop in case of circular references in the model itself once it's resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, it's disallowed because if done in a certain way, you get a loop. The cache works as a mechanism to avoid chasing the tail if the reference is already resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache works, it properly reference the object, but the JSON dumps will disallow circular references in itself 😞

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. I guess the goal of our resolver is to allow us to map this OAS to an API Gateway API, not necessarily properly resolve the schema? We don't need a valid OAS schema after resolving? Technically, I think it's still valid, it follows the retrieval URI spec: https://json-schema.org/understanding-json-schema/structuring.html#retrieval-uri

If AWS maps the Model this way in their requestModels and requestResponses, we can take this shortcut to allow us to map it in an easier way to what we need to create the API?
Sorry, I might be missing something here, genuine question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWS does not resolve the Model at all as shown by the snapshot
"house": {
"$ref": "/restapis/rest-id:1/models/House"
}

Having a reference or the inline method is valid. So it doesn't matter what AWS does, in fact, what does it mean? Did you export and that's how it came out?

Copy link
Contributor Author

@bentsku bentsku Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm talking to how it maps a schema in components/schema to a Model, and how it manages those references inside the Model, see https://stackoverflow.com/questions/37823112/how-do-i-reference-one-model-from-another-model-using-aws-api-gateway
This also about how it maps a relative $ref in schema, basically, I've imported an API and then fetched the models from AWS to see how they look like. The 3 following JSON in the PR description describes the flow.

Copy link
Contributor Author

@bentsku bentsku Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about this, we should try to import an API with no Model defined in definitions/schema, but instead directly set the schema in the method response. I wonder what comes out of this, because AWS would lack the schema name. I think all of this discussion can actually be solved by this test? Maybe AWS creates a schema with a random name, I'll give it a try later today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To quickly come back on this, if you don't specify a $ref in responses/<statusCode>/schema field, AWS will automatically create the Model for you.
Basically done the following (simplified):

{
  "paths": "/pet",
  "post": {
    "responses": {
      "200": {
        "description": "Successful operation",
        "schema": {
          "type": "object",
          "properties": {
            "pet": {
              "$ref": "#/definitions/Pet"
            },
            "message": {
              "type": "string"
            }
          }
        },
        "headers": {
          "Access-Control-Allow-Origin": {
            "type": "string",
            "description": "URI that may access the resource"
          }
        }
      }
    }
  }
}

And AWS created the follow Model (normally json encoded Schema field):

{
  "name": "MODEL03987f",
  "contentType": "application/json",
  "schema": {
    "type": "object",
    "properties": {
      "pet": {
        "$ref": "https://apigateway.amazonaws.com/restapis/<api-id>/models/Pet"
      },
      "message": {
        "type": "string"
      }
    }
  }
}

Seems like we're still missing that feature, I'll note it and add it later on!

@bentsku bentsku changed the title wip: fix apigw model resolving when importing and when validating fix apigw model resolving when importing and when validating Jun 7, 2023
Copy link
Contributor

@calvernaz calvernaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we discussed offline, my comments are not blocking - approved 🚀

@coveralls
Copy link

coveralls commented Jun 7, 2023

Coverage Status

coverage: 82.807% (+0.04%) from 82.767% when pulling 878b684 on fix-apigw-model-resolving into 40903fb on master.

Copy link
Member

@whummer whummer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did only a cursory review, but looks like @calvernaz already reviewed in detail. 👍 Fantastic work @bentsku churning through these parity enhancements, let's get it out to our users 🚀

@bentsku bentsku merged commit 31f4e83 into master Jun 7, 2023
@bentsku bentsku deleted the fix-apigw-model-resolving branch June 7, 2023 16:04
@dfangl dfangl added this to the 2.2 milestone Jul 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws:apigateway Amazon API Gateway semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants