Skip to content

feat: Convert rego queries into SQL clauses #4225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Oct 4, 2022
Merged

Conversation

Emyrk
Copy link
Member

@Emyrk Emyrk commented Sep 27, 2022

What this does

Currently Filter works by querying all resources of a given type, then calling Authorize() on each element in the list. This would not work with pagination.

This feature takes the queries resulting from partial execution, and returns a SQL clause that can be added into a WHERE statement of a SQL query. The results are equivalent to all authorized objects in the original Filter call.

This PR only implements this for listing workspaces. This technique can be copied easily to other listing (templates, users, etc). I just started with the one with the largest lists and the one we will likely paginate first.

Example

Partial execution returns these queries. If either is true, the user in question can read the object.

+---------+--------------------------------------------------------------------+
| Query 1 | input.object.org_owner != ""                                       |
|         | input.object.org_owner in {"19a3d781-9e53-4608-948e-ac084470cd99"} |
|         | input.object.owner != ""                                           |
|         | "me" = input.object.owner                                          |
+---------+--------------------------------------------------------------------+
| Query 2 | input.object.org_owner = ""                                        |
|         | input.object.owner != ""                                           |
|         | "me" = input.object.owner                                          |
+---------+--------------------------------------------------------------------+

This code converts these rego queries into the following SQL:

SELECT * FROM workspaces WHERE
-- SQL generated below this line
(
    (
        input.object.org_owner != '' AND 
        input.object.org_owner = ANY(ARRAY ['a207e4ff-2cf4-44be-aadd-8f866452e3b1']) AND 
        input.object.owner != '' AND 'me' = input.object.owner
    ) 
        OR
    (
        input.object.org_owner = '' AND 
        input.object.owner != '' AND 
        'me' = input.object.owner
    )
)

Long Rego Query Example

+---------+--------------------------------------------------------------------+
| Query 1 | input.object.org_owner != ""                                       |
|         | input.object.org_owner in {"19a3d781-9e53-4608-948e-ac084470cd99"} |
|         | input.object.owner != ""                                           |
|         | "me" = input.object.owner                                          |
+---------+--------------------------------------------------------------------+
| Query 2 | input.object.org_owner = ""                                        |
|         | input.object.owner != ""                                           |
|         | "me" = input.object.owner                                          |
+---------+--------------------------------------------------------------------+
| Query 3 | input.object.org_owner != ""                                       |
|         | input.object.org_owner in {"19a3d781-9e53-4608-948e-ac084470cd99"} |
|         | input.object.org_owner != ""                                       |
|         | "read" in input.object.acl_group_list.allUsers                     |
+---------+--------------------------------------------------------------------+
| Query 4 | input.object.org_owner != ""                                       |
|         | input.object.org_owner in {"19a3d781-9e53-4608-948e-ac084470cd99"} |
|         | input.object.org_owner != ""                                       |
|         | "*" in input.object.acl_group_list.allUsers                        |
+---------+--------------------------------------------------------------------+
| Query 5 | input.object.org_owner = ""                                        |
|         | input.object.org_owner != ""                                       |
|         | "read" in input.object.acl_group_list.allUsers                     |
+---------+--------------------------------------------------------------------+
| Query 6 | input.object.org_owner = ""                                        |
|         | input.object.org_owner != ""                                       |
|         | "*" in input.object.acl_group_list.allUsers                        |
+---------+--------------------------------------------------------------------+
| Query 7 | "read" in input.object.acl_user_list.me                            |
+---------+--------------------------------------------------------------------+
| Query 8 | "*" in input.object.acl_user_list.me                               |
**+---------+--------------------------------------------------------------------+**

Future Work

Implement this technique for other list calls.

)

// AuthorizedGetWorkspaces returns all workspaces that the user is authorized to access.
func (q *sqlQuerier) AuthorizedGetWorkspaces(ctx context.Context, arg GetWorkspacesParams, authorizedFilter rbac.AuthorizeFilter) ([]Workspace, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could add hooks into SQLc to do this. Kinda sucks to have to copy the original function and call it like this.

@Emyrk Emyrk marked this pull request as ready for review September 29, 2022 00:37
@Emyrk Emyrk requested a review from kylecarbs September 29, 2022 13:24
Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but I'll defer to other folks who are working on v2.

This solves it without proper types in our AST.
Might bite the bullet and implement some better types
Copy link
Member

@kylecarbs kylecarbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are surprisingly simple to follow, or at least much simpler than I would have expected interfacing with Rego to be.

My primary concern is the query separation between generated and non-generated. I'd love if we could have a single query (maybe even for just getting a single?) that used the same authorization flow, but maybe that's just unreasonable.

eg.

  • GetWorkspaceByID
  • GetWorkspaces
  • GetWorkspaceByOwnerAndName

Could all be replaced by this?

// AuthorizedGetWorkspaces returns all workspaces that the user is authorized to access.
// This code is copied from `GetWorkspaces` and adds the authorized filter WHERE
// clause.
func (q *sqlQuerier) AuthorizedGetWorkspaces(ctx context.Context, arg GetWorkspacesParams, authorizedFilter rbac.AuthorizeFilter) ([]Workspace, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might change the wording of this. Prefixing with Authorized makes me think that I'm authorized to get workspaces, not that the workspaces returned I'm authorized to access.

Thoughts on GetAuthorizedWorkspaces instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for us to get rid of GetWorkspaces in its entirety? There doesn't seem much of a point in getting workspaces that the user doesn't have access to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylecarbs that is a good question. For system level things there might be a need? But we can always add back "GetWorkspaces" later.

I was using SQLc's generated code to make this function. To get rid of GetWorkspaces would be to remove all those SQLc calls. I guess that is ok? It is nice to use it to generate most of this code, it just can't support the dynamic parts of the query.

@Emyrk
Copy link
Member Author

Emyrk commented Sep 30, 2022

AuthorizedGetWorkspaces

This change is only made for "listing" objects, for singular objects running the rego on the returned object is easier. So if we implemented this for the singular too, I would not convert to SQL, I would just call the sqlc code and run Authorize() on that returned object.

I wish there was a way to integrate this into SQLc, as it is a great tool, it just doesn't support this use case.


My goal for this PR was to implement this functionality to make this possible. We next implement it for the 2/3 list queries that are "slow" with the Filter. It is really only needed in 2/3 places to support that dashboard "List all that I can view" view.

@Emyrk Emyrk merged commit cd4ab97 into main Oct 4, 2022
@Emyrk Emyrk deleted the stevenmasley/rego_to_sql branch October 4, 2022 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants