Use postgres views to reduce database calls in api (allow `JOIN`s in pg queries)

_This is just tracking a suggestion and making a place for discussion._

Branch that implements this suggestion: [`stevenmasley/build_initiator_username`](https://github.com/coder/coder/tree/stevenmasley/build_initiator_username)

PR that started conversation: https://github.com/coder/coder/pull/2137

---

# Enable use of `JOIN` in sqlc queries

Currently the UI requires additional information regarding data structures returned by the database. An example is the `workspace_build` table does not include the following fields that the UI wants and uses [`convertWorkspaceBuild`][convertWorkspaceBuild] to add these fields.

- workspace owner (id & name)
- initiator name
- workspace name
- ~~provisioner job~~ (although included, going to not JOIN this data in right now)

This requires 2 extra database calls (fetch users, and workspace). Ideally we could fetch this information with a `JOIN`.

```sql
-- Make the SELECT a VIEW
CREATE VIEW workspace_build_with_names AS
SELECT
    -- coalesce is used because technically the joins do not guarantee a value.
    -- If we setup proper foreign keys, we can remove the coalesce.
	coalesce(initiator_user.username, 'unknown') AS initiator_username,
	coalesce(workspaces.owner_id, '00000000-00000000-00000000-00000000') AS owner_id,
	coalesce(owner_user.username, 'unknown') AS owner_name,
	coalesce(workspaces.name, 'unknown') AS workspace_name,
        -- This information also seems helpful for UI purposes and is used in some areas of the UI.
	coalesce(templates.id, '00000000-00000000-00000000-00000000') AS template_id,
	coalesce(templates.name, 'unknown') AS template_name,
	coalesce(templates.active_version_id, '00000000-00000000-00000000-00000000') AS template_active_version,
	workspace_builds.*
FROM workspace_builds
		 LEFT JOIN users AS initiator_user ON workspace_builds.initiator_id = initiator_user.id
		 LEFT JOIN workspaces ON workspaces.id = workspace_builds.workspace_id
		 LEFT JOIN users AS owner_user ON workspaces.owner_id = owner_user.id
		 LEFT JOIN templates ON workspaces.template_id = templates.id;
```

The solution to using joins is to make a migration that makes the above a VIEW in our database that can be reused. This simplifies all the queries around `workspace_builds` as we do not need to copy the `JOIN` more than once (DRY). 

# Downsides to views

## Maintaining the view

If a column is added to `workspace_build`, that column is **not** automatically added to the view. So your migration will look like this:

```sql
-- We must drop the old view first, then recreate it. We cannot just update the old view
-- as the new view must have the same columns.
DROP VIEW workspace_build_with_initiator;

-- This alter table could be adding a column OR removing a column. In both cases, we will 
-- need to drop the view first.
-- You cannot delete a column that has a dependency on it, and a view is a dependency.
ALTER TABLE workspace_builds
    ADD extra_column int DEFAULT 1 NOT NULL;

CREATE VIEW workspace_build_with_names AS
SELECT -- ... Copy the Select query again ...
```

The biggest downside is that is adds some complexity to our migrations. The developer writing the migration needs to know which views to also update. Postgres will fail if a view is a dependency and the underlying table is changed. They will likely have to copy the `CREATE VIEW ...` code block from migration to migration.

This is just the cost of the views.

### Con: Defaulting to using joins

One downside is that all our queries now use joins. In the example above, the data being joined is only used by the UI. So internal usage doesn't require the join (at the moment). However, having the additional context could be nice for logging purposes, so it might not be a total waste.

I don't think join performance will be so bad that this is really an issue, but it does cause over-fetching. 

If we use the [sqlc.embed][embed] fork of sqlc, we can make db calls with the join, and some db calls without the join. Having this mix can prevent overfetching if we hit performance issues. Using the `embed` means we can nest data-structs and have the UI data-structure be something like:

```golang
type WorkspaceBuildWithNames struct {
  // The backend can work exclusively with 'WorkspaceBuild' type. So calling the FE version of the call
  // is still usable for BE purposes.
  WorkspaceBuild
  OwnerName string
  // .. The rest of the fields
}
```

## The upside 

- Less db calls as we can merge them into 1 with a join
- Still 1 `database.Type` shared by all sqlc queries
- Join queries are reused, and only typed once as a view.
- Filter options can use names rather than ids. (owner_name vs owner_id from a user perspective)

## Alternative (Status Quo)

The alternative is to do the status quo, where we make multiple db calls and merge the resulting structs. 

- Query for workspace build
- Query for owner & initiator user's
- Merge in a go `convertWorkspaceBuild` function.
  - Handling errors in convert is currently inconsistent. The SQL joins handle these more consistently.

### Extra links

SQLc implementations that could help. But gated since these features do not exist yet :(.

- `sqlc.embed()` fork: https://github.com/kyleconroy/sqlc/pull/1615. 
- This was a design sqlc dropped, but [`template: workspace_build_initiator`](https://github.com/kyleconroy/sqlc/issues/112) support for putting the views in the queries.sql
  - Downside is that we have to copy the view's SELECT query with the join to every query. 

[embed]: https://github.com/kyleconroy/sqlc/pull/1615
[convertWorkspaceBuild]: https://github.com/coder/coder/blob/fff59ef6ad4e1d9026b228e8d2164004a5c7ff60/coderd/workspacebuilds.go#L524-L524

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use postgres views to reduce database calls in api (allow `JOIN`s in pg queries) #2201

Enable use of `JOIN` in sqlc queries

Downsides to views

Maintaining the view

Con: Defaulting to using joins

The upside

Alternative (Status Quo)

Extra links

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Use postgres views to reduce database calls in api (allow JOINs in pg queries) #2201

Description

Enable use of JOIN in sqlc queries

Downsides to views

Maintaining the view

Con: Defaulting to using joins

The upside

Alternative (Status Quo)

Extra links

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Use postgres views to reduce database calls in api (allow `JOIN`s in pg queries) #2201

Enable use of `JOIN` in sqlc queries