-
Notifications
You must be signed in to change notification settings - Fork 889
feat: Add database data generator to make fakedbs easier to populate #5922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
One example shown in groupparm_test.go. More examples coming
coderd/httpmw/groupparam_test.go
Outdated
gen := databasefake.NewGenerator(t, db) | ||
group := gen.Group(ctx, "group", database.Group{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do ya'll think about this syntax?
Also supported:
org := gen.Organization(ctx, "group-org", database.Organization{})
group := gen.Group(ctx, "group", database.Group{
OrganizationID: gen.Lookup("group-org"),
// OR
// OrganizationID: org.ID,
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also go out of order with lookup:
group := gen.Group(ctx, "group", database.Group{
OrganizationID: gen.Lookup("group-org"),
})
org := gen.Organization(ctx, "group-org", database.Organization{})
I refactored a lot of examples. I can do more. |
|
||
// PrimaryOrg is to keep all resources in the same default org if not | ||
// specified. | ||
func (g *Generator) PrimaryOrg(ctx context.Context) database.Organization { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty cool, nice work!
|
||
func must[T any](value T, err error) T { | ||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid panicing in tests, maybe pass t
and do t.Fatal
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at where I do it, it is used for json.Marshal
on a standard struct. This should absolutely never fail. I cannot add another arg, because you cannot do:
must(t, json.Marshal())
You can't use multiple returns and an argument like that 😢. The must
is a nice way to keep the context in the struct body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose the intent of this is to reduce code duplication in favor of creating a common abstraction for populating the database. This PR does the opposite in that case because it nets positive on lines of code.
I'm not sure we should add this level of abstraction right now. I like the experiment, but it's unclear why I'd use this when the lines aren't dramatically reduced, and it becomes more difficult to understand what gets inserted.
Just my 2c, but I'm open to others' thoughts!
I had something similar before. I had a custom case for every enum (because reflect can't get the consts that are valid). I was having trouble with supporting nested structs, which I guess we do not need to support. What do you do for things like Other fields which need custom handling off the top of my head:
I also don't want truly random data. If you don not specify |
I guess in my mind we shouldn't handle these bespoke situations with a generator anyways, because they are bespoke, so why generalize it? |
Pretty much all fakers allow domain knowledge in the generator.
The problem with truly random data, is that is usually isn't useful. If we go that route, it's more like fuzzing than faking. If that is what we want, I can support the reflect path better. Just to put the differences in the switch statement. case database.Workspace:
workspace, err := db.InsertWorkspace(ctx, database.InsertWorkspaceParams{
ID: takeFirst(orig.ID, uuid.New()),
OwnerID: takeFirst(orig.OwnerID, uuid.New()),
CreatedAt: takeFirst(orig.CreatedAt, time.Now()),
UpdatedAt: takeFirst(orig.UpdatedAt, time.Now()),
OrganizationID: takeFirst(orig.OrganizationID, uuid.New()),
TemplateID: takeFirst(orig.TemplateID, uuid.New()),
Name: takeFirst(orig.Name, namesgenerator.GetRandomName(1)),
AutostartSchedule: orig.AutostartSchedule,
Ttl: orig.Ttl,
})
require.NoError(t, err, "insert workspace")
return workspace vs case database.Workspace:
// Reflect based random data seeded with the original data.
value := random(orig)
workspace, err := db.InsertWorkspace(ctx, value)
require.NoError(t, err, "insert workspace")
return workspace |
I think too the reflect based method would be inherently more challenging to maintain (unless we use a library). I think the current code is pretty straight forward for a new person to look at. |
Hmm, I see what you mean. I think what I'd prefer here is: package databasegen
func New(db database.Store) *Generator {
return &Generator{db}
}
type Generator struct {
db database.Store
}
func (g *Generator) Workspace(workspace database.Workspace) database.Workspace {
if workspace.RequiredField == nil {
panic("asdf")
}
}
func (g *Generator) APIKey() (database.APIKey, string) {
} Kinda a combo of what you have now and something new. I don't think it should go in |
I liked the idea of removing state, so I updated the code to look like this: https://github.com/coder/coder/blob/stevenmasley/fake_data/coderd/database/databasefake/generator.go You are right about not storing state in the generator imo. This change does remove the need for the switch, so I could do: func Template(t *testing.T, db database.Store, seed database.Template) database.Template {
template, err := db.InsertTemplate(context.Background(), database.InsertTemplateParams{
ID: takeFirst(seed.ID, uuid.New()),
CreatedAt: takeFirst(seed.CreatedAt, time.Now()),
UpdatedAt: takeFirst(seed.UpdatedAt, time.Now()),
OrganizationID: takeFirst(seed.OrganizationID, uuid.New()),
Name: takeFirst(seed.Name, namesgenerator.GetRandomName(1)),
Provisioner: takeFirst(seed.Provisioner, database.ProvisionerTypeEcho),
ActiveVersionID: takeFirst(seed.ActiveVersionID, uuid.New()),
Description: takeFirst(seed.Description, namesgenerator.GetRandomName(1)),
DefaultTTL: takeFirst(seed.DefaultTTL, 3600),
CreatedBy: takeFirst(seed.CreatedBy, uuid.New()),
Icon: takeFirst(seed.Icon, namesgenerator.GetRandomName(1)),
UserACL: seed.UserACL,
GroupACL: seed.GroupACL,
DisplayName: takeFirst(seed.DisplayName, namesgenerator.GetRandomName(1)),
AllowUserCancelWorkspaceJobs: takeFirst(seed.AllowUserCancelWorkspaceJobs, true),
})
require.NoError(t, err, "insert template")
return template
} We can add required fields, but at present no objects have any required fields. I put it in the |
Example of the new package: coder/coderd/database/databasefake/fakegen/generator.go Lines 20 to 220 in 1113d8f
No generics (except for take), no reflect. Actually pretty clean and similar to our current helper functions we write. |
Agreed. I like that @Emyrk |
We should move it out of |
Agreed. This came out way cleaner. Will refactor the tests I changed to use this. |
It also will work with postgres, the caller just needs to handle the constraints on their own. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I like the simplicity!
Thanks for iterating here @Emyrk. Your explanations helped drive a better UX for all!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might wanna add a package comment to generator.go
that denotes that fields will be generated if not provided. It's pretty implicit, but might be helpful to make it explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice and clean. 👍
Thanks for the input @kylecarbs!
Just a few import ordering nits, and potential bikeshedding over the name dbgen
instead of databasegen
.
I think you are right, |
We should probably rename |
@Emyrk thoughts on the above? |
I like We have |
What this does
Supports an easy way to populate random data into a fake database for unit testing.
How this works
You can create a generator around a fake database.
And generate whatever objects you want. If you omit a field, that field is randomly generated.
Why?
I think there is a good reason to use fake db, and not need to use Postgres for all tests. Postgres tests are much slower, and sometimes using postgres does not provide any additional value.
We often make
setup
functions for these tests to pre-populate data. This is ok, but does constrain each test case to use the same setup. The setup also has to write code like this:coder/coderd/httpmw/groupparam_test.go
Lines 28 to 43 in 3120c94
This code is repetitive and kinda annoying to write. The goal of this PR to is make pre-populating data easy. That previous code now looks like this:
coder/coderd/httpmw/groupparam_test.go
Lines 26 to 28 in 50ea251
This PR also supports some more complex syntax and out-of-order objects.
More complex API
A more complex api also exists. This is to support data sets as a single object to be reused, or easily stored in a field.
Example: