-
Notifications
You must be signed in to change notification settings - Fork 886
chore: add template_with_user
view to include user contextual data
#8568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sold on this. Today we're adding TemplateWithOwner
, tomorrow do we add TemplateWithOwnerAndDependentWorkspaces
? Where do we draw the line?
Given that the goal here is to support removing the ability for regular users to read all other user data, maybe we can allow all users to read a subset of other users' data, but disallow enumeration (e.g. allow GetPublicUserInfo but not ListAllUsers).
I intentionally only included the owner because this information is arguably always relevant and always ok to join. This PR replaces all instances of
So the line would be just that. What is the base standard information to include in the "primitive" resource. If it comes to be we should include say the "Active template version name", then I think it could be argued to join that in too in the future. Naming gets annoying though... |
I did try using The downside to type Template interface {
Template() database.Template
} Then if I join more tables, it just returns interfaces with more fields, but it's still a type TemplateWithOwner interface {
Template
Owner() VisibleUser
} The next type TemplateWithDependencies interface {
Template
TemplateWithOwner
ActiveVersion() database.TemplateVersion
} This would allow us to still write general functions for things like I would use sqlx to make these models and method signatures. |
Would it be reasonable to patch Another idea, would it be reasonable to move the "private" user data into a separate table? This way we can assume querying the regular users table is always OK (sans listing all users). With the private data in a separate table, it requires explicit intent to include it. This may be a bad suggestion as my understanding of what we're trying to do/achieve here is a bit limited. |
This would solve the new struct per query problem, but it would still leave the problem of copy pasting the SQL table being joined. The view prevents having to copy/paste this new "Templates" table on each query. Views cannot use the SELECT
templates.*, users.username AS created_by_username, users.avatar_url AS created_by_avatar_url
FROM
templates
LEFT JOIN
(%s) AS users
ON
templates.created_by = users.id
This was my idea with the |
@johnstcn @mafredri One way to get around a lot of the type stuff is just to make this the new All our code will just use this new view. There still is the issue of updating the view on new columns, and that is just a new chore we will have. I can even add a unit test to compare the fields on @johnstcn I know this doesn't solve the slippery slope of where to stop (TemplateWithDependencies etc) There is the issue of solving the greater |
@coadler also pointed out a concern for rebuilding the views. I think I can come up with a unit test that will error nicely. You would have to run CI to see said error though 😢 EDIT: Maybe in the |
@Emyrk can we just maintain our own slight fork of sqlc? We depend on it so heavily that we shouldn't be limited by the most common use-cases imo. |
I like the idea that @mafredri has with naming embeds... |
We would probably only need to fork the "plugin": https://github.com/sqlc-dev/sqlc-go https://docs.sqlc.dev/en/latest/guides/plugins.html There is 2 things we would need to support to get this "clean".
I would be in favor of this fork idea. For this specific PR though I would prefer just to use the I think forking sqlc would be a very interesting idea though that opens up possibilities. |
@matifali @johnstcn @kylecarbs For this PR and the next types If we want to change this in the future to support more joins or whatever, we can address it then. I do not want to fork SLQc, or use SQLx, or write custom queries, or ..etc.. right now. JOIN support can come, and do not think this PR would prevent that. I say we do this simple thing now, and when we hit another 2 or 3 cases, we find a better way to generalize it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks OK; I realise that there is no easy way around the issues Steven desribed. Deferring to @kylecarbs or @ammario for final approval however.
template, err = tx.GetTemplateByID(ctx, template.ID) | ||
if err != nil { | ||
return xerrors.Errorf("get updated template by ID: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(non-blocking): Needing to re-fetch the template to populate the audit log is an unfortunate side-effect. It would be nice if we could move this into a trigger on the audit_logs table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, it is unfortunate. I could have made the audit log table take TemplateTable
, but then we would have 2 types floating around.
We do have 1 less read in the fetch case now as we do not need to fetch the user, so it's a trade for 1 call on the read, to 1 call on the update.
If we are willing to add postgres RULE
s we can actually "insert into a view". But that would complicate the view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a trade for 1 call on the read, to 1 call on the update.
I think this is a fair trade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could move this into a trigger on the audit_logs table.
Unfortunately we support multiple paths for audit logs to be exported, so it can't just be as simple as a trigger :(
func TestViewSubsetTemplate(t *testing.T) { | ||
t.Parallel() | ||
table := reflect.TypeOf(database.TemplateTable{}) | ||
joined := reflect.TypeOf(database.Template{}) | ||
|
||
tableFields := allFields(table) | ||
joinedFields := allFields(joined) | ||
if !assert.Subset(t, fieldNames(joinedFields), fieldNames(tableFields), "table is not subset") { | ||
t.Log("Some fields were added to the Template Table without updating the 'template_with_users' view.") | ||
t.Log("See migration 000138_join_users_up.sql to create the view.") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 nice molly-guard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will prevent us from forgetting to update the view @coadler
BEGIN; | ||
|
||
CREATE VIEW | ||
visible_users | ||
AS | ||
SELECT | ||
id, username, avatar_url | ||
FROM | ||
users; | ||
|
||
COMMENT ON VIEW visible_users IS 'Visible fields of users are allowed to be joined with other tables for including context of other resources.'; | ||
|
||
CREATE VIEW | ||
template_with_users | ||
AS | ||
SELECT | ||
templates.*, | ||
coalesce(visible_users.avatar_url, '') AS created_by_avatar_url, | ||
coalesce(visible_users.username, '') AS created_by_username | ||
FROM | ||
templates | ||
LEFT JOIN | ||
visible_users | ||
ON | ||
templates.created_by = visible_users.id; | ||
|
||
COMMENT ON VIEW template_with_users IS 'Joins in the username + avatar url of the created by user.'; | ||
|
||
COMMIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the big change. I am introducing views so sqlc creates a consistent return type to be used.
createdByNameMap, err := getCreatedByNamesByTemplateIDs(ctx, api.Database, []database.Template{template}) | ||
if err != nil { | ||
httpapi.Write(ctx, rw, http.StatusInternalServerError, codersdk.Response{ | ||
Message: "Internal error fetching creator name.", | ||
Detail: err.Error(), | ||
}) | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query is no longer needed as now the data is joined
@@ -75,7 +75,7 @@ INSERT INTO | |||
allow_user_cancel_workspace_jobs | |||
) | |||
VALUES | |||
($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14) RETURNING *; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to remove Returning *
. Otherwise we would have 2 types used in the codebase. database.Template
and database.TemplateWithUser
.
This now requires a read to happen after an update or insert to fetch the updated data. This adds 1 extra DB call in those cases, but we remove 1 db call in the read case, and this should be less frequent anyway.
template: TemplateTable | ||
template_with_user: Template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
template_with_user
is the only template to be used throughout the codebase now. The original Template type is needed by DBFake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice if we could patch the db interface to do the update/select in one place vs. everywhere (avoid cost of no returning*). But it's not clear to me how we'd do that.
Btw, do API docs need updating? I see there was no change in documented/generated return types.
We'd need some layer between the db and the
Should not be. The fields were always in the api, we just had to do a |
Related:
This joins
created_by_username
andcreated_by_avatar_url
totemplate
.The created_by data needs to be joined to be apart of the template struct.
If this PR passes, the same will happen with template versions and workspace builds.
Why do this?
This is in anticipation of removing the permission for members to read other members. (#5002)
In order to relate the context data to the primary resource (template, workspace build, etc), it is best to
JOIN
this in SQL. How we currently do it is fetch the user in another query, but with the permission change, that fetch will fail.This has the side effect of reducing the DB calls needed to read a template.
Implementation
database.Template is now always this view. We never return the table without this joined data.
A migration is needed to add a
VIEW
for SQLc to generate a model struct that can be shared by all template related queries.sqlc.embed
was tried first, but this still creates a new struct for each query. Because we use sqlc's method signatures directly throughout the codebase, using these unique structs is a lot more work to deal with.The downside to the
VIEW
is that if a column is added to the view, the view must be dropped and recreated to include the new column. I added a unit test that is run ongenerate.sh
that ensures the view includes all fields from the original table. So if the table is updated, this error will happen if no migration is included to fix it.Slippery Slope
A common objection to this is that this only solves this particular join. What if we want join more information in tomorrow?
We do want to join more today, but we haven't. We only haven't because we do not have a more general solution on how to handle joins. Tackling that problem has proven to be challenging, and has resulted in closed attempts in the past under the common "premature optimization". We currently just resort to making
N
calls to replaceN
joins.This PR is different because I am removing the general permission to read users. So the
+1
db call would now fail.This contextual user information should be available, as we do have
user_id
as apart of thetemplate/workspacebuild/version
, and to the UI,username
is more informative.So this
view
is to make our permission system consistent with our data model.I think this PR helps move the needle on needing
JOIN
s, but is not enough to cause the change. When we finally tackle theJOIN
challenge, this will just serve as more context for that eventual RFC.Other Ideas
A branch was made to not use sqlc at all and just write queries using sqlx. The downside is that all sql queries do not have sql highlighting in the IDE. All
VIEW
s created in this PR become constant strings that use a%s
formatting directive instead in the sqlx query.It works, but you lose the sqlc framework entirely and move to manual queries. This has been opposed in the past.