[6.x] Create Factory Primers #30880

Closed
wants to merge 3 commits into from
browner12
Contributor

Problem

Let's take a typical factory with a relationship. We'll use the example right off the Laravel docs.

$factory->define(App\Post::class, function ($faker) {
    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => factory(App\User::class),
    ];
});

When we create a Post, we will also create a User. On the surface this seems okay, but consider what happens when we create 50 Posts.

factory(Post::class, 50)->create();

Without the relationship we'd be running 50 inserts, but with the relationship we double that to 100 queries! Now imagine a factory that has multiple relationships. The number of database queries (and time) grows with each new relationship handled this way.
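To make the arithmetic concrete, here is a rough sketch that models the insert count. The function name `countInserts` is hypothetical, and it assumes one insert per created model plus one insert per relationship that a nested factory has to create. It does not touch a database.

```php
<?php

// Hypothetical helper: one INSERT per model, plus one INSERT per
// relationship that a nested factory() call has to create.
function countInserts(int $models, int $nestedRelationships): int
{
    return $models * (1 + $nestedRelationships);
}

echo countInserts(50, 0), PHP_EOL; // 50 Posts, no nested User factory
echo countInserts(50, 1), PHP_EOL; // 50 Posts, each creating its own User
echo countInserts(50, 3), PHP_EOL; // 50 Posts with three such relationships
```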


What if we assume that Users have already been created, and we'll randomly select 1 to attach the Post to?

$factory->define(App\Post::class, function ($faker) {
    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => App\User::all()->random(),
    ];
});

Unfortunately, this does not cut down on our query count, since the all() query is executed for every single new Post.


Passing it as an override attribute does not solve the problem either.

factory(Post::class, 50)->create([
    'user_id' => App\User::all()->random(),
]);

This will cut down our query count, but since it is only executed once, all 50 Posts will receive the same User ID, which is not what we want.

Solution

The solution to this is a feature I call "primers". At the time of building the factories, you can inject key/value pairs into the builder that the factories can then use to generate their data.

Let's first look at how we call it. Simply chain the prime() method onto your factory prior to calling make() or create(), and pass it a string key and a value of any type you like.

factory(Post::class, 50)->prime('users', User::all())->create();

With this call we've made 1 query to get all the Users. Now let's see how to use primers in the factories.

$factory->define(App\Post::class, function ($faker, $attributes, $primers) {
    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => $primers['users'] ? $primers['users']->random() : factory(App\User::class),
    ];
});

Here we can see the primers are passed into the factory definition as a new 3rd parameter. For properties that we expect primers for, we should first check that the primer exists. If it does, we can use it as needed; if it was not provided, we fall back to the old way of determining the relationship.

With this new feature, and a primed factory, we've now cut our queries down from 100 to 51!
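The mechanism can be sketched without the framework. This is a minimal illustration only, not the code from this PR and not Laravel's FactoryBuilder: the class `SketchBuilder`, the `user_ids` key, and the two-argument definition closure are all assumptions made for the example, and it builds plain arrays instead of models.

```php
<?php

// Hypothetical stand-in for a factory builder; not Laravel's FactoryBuilder.
final class SketchBuilder
{
    /** @var callable */
    private $definition;
    /** @var int */
    private $amount;
    /** @var array */
    private $primers = [];

    public function __construct(callable $definition, int $amount)
    {
        $this->definition = $definition;
        $this->amount = $amount;
    }

    // Store a key/value pair that every call to the definition will receive.
    public function prime(string $key, $value): self
    {
        $this->primers[$key] = $value;
        return $this;
    }

    // Build plain attribute arrays; overrides win over generated values.
    public function make(array $overrides = []): array
    {
        $rows = [];
        for ($i = 0; $i < $this->amount; $i++) {
            $generated = ($this->definition)($overrides, $this->primers);
            $rows[] = array_merge($generated, $overrides);
        }
        return $rows;
    }
}

$definition = function (array $attributes, array $primers): array {
    return [
        'title' => 'Example title',
        // Use the primed IDs if present; otherwise fall back to null here
        // (the real factory would fall back to creating a User).
        'user_id' => isset($primers['user_ids'])
            ? $primers['user_ids'][array_rand($primers['user_ids'])]
            : null,
    ];
};

$posts = (new SketchBuilder($definition, 50))->prime('user_ids', [1, 2, 3])->make();
echo count($posts), PHP_EOL;
```

The sketch uses `isset()` for the existence check so that a missing primer key does not raise an undefined-index notice.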

Primers can also be used with factory states.

$factory->state(App\Post::class, 'admin-posts', function ($faker, $attributes, $primers) {
    return [
        'user_id' => $primers['users'] ? $primers['users']->random() : factory(App\User::class),
    ];
});

factory(Post::class, 50)->state('admin-posts')->prime('users', User::where('type', 'admin')->get())->create();

Practical Usage

There are 2 primary places factories are used: tests and seeders.


With regard to tests, let's assume we have Users, Posts, and Comments. Both Posts and Comments belong to the User that authored them.

Currently to test we may write something like:

public function test_posts()
{
    $posts = factory(Post::class, 10)->create();

    foreach($posts as $post) {
        factory(Comment::class, 5)->create([
            'post_id' => $post->id
        ]);
    }

    $this->get('/posts')
             ->makeAssertions();
}

Both the Post and Comment factory will be deferring to the factory to create new Users. Therefore, this test would run 120 queries. 10 to create the Posts, 50 to create the Comments, 10 to create the Post Users, and 50 to create the Comment Users.

Now let's tweak it just a little to improve our performance.

public function test_posts()
{
    $users = factory(User::class, 10)->create();

    $posts = factory(Post::class, 10)->prime('users', $users)->create();

    foreach($posts as $post) {
        factory(Comment::class, 5)->prime('users', $users)->create([
            'post_id' => $post->id
        ]);
    }

    $this->get('/posts')
             ->makeAssertions();
}

We're now down to 70 queries from 120!


With regard to seeding, we can make some small tweaks to our DatabaseSeeder to improve performance.

Currently you might do something like:

public function run()
{
    factory(User::class, 10)->create();
    factory(Post::class, 50)->create();
    factory(Comment::class, 50)->create();
}

Without primers this will require 210 queries.

Let's use primers to improve this.

public function run()
{
    $users = factory(User::class, 10)->create();
    factory(Post::class, 50)->prime('users', $users)->create();
    factory(Comment::class, 50)->prime('users', $users)->create();
}

This takes us down to 110 queries, hooray!

Caveats

There is one thing to be aware of when using primers. It is very possible the factory could make assumptions about what type of data it is getting from the primer. My guess is the most common use case will be to pass Collections in.

$factory->define(App\Post::class, function ($faker, $attributes, $primers) {
    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => $primers['users'] ? $primers['users']->random() : factory(App\User::class),
    ];
});

You can see that if the primer exists, we call random() on it. If the user passes in the wrong type of data, it will cause an error.

factory(Post::class, 50)->prime('users', [1, 2, 3])->create();

Here you can see we passed an array instead of a Collection, so when the factory is run it will throw an error that random() does not exist.

This error will be easy enough to identify and correct, so I'm not really worried about it, but it is something to be aware of.
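If a factory wanted to tolerate both shapes, one option is a small guard like the sketch below. The helper name `pickFromPrimer` is hypothetical and not part of this PR; it assumes a primer is either a plain array or an object exposing a random() method, as a Collection does.

```php
<?php

// Hypothetical helper: pick one element from a primer that may be a plain
// array or any object exposing a random() method (such as a Collection).
function pickFromPrimer($primer)
{
    if (is_array($primer)) {
        if ($primer === []) {
            throw new InvalidArgumentException('Primer array is empty.');
        }
        return $primer[array_rand($primer)];
    }

    if (is_object($primer) && method_exists($primer, 'random')) {
        return $primer->random();
    }

    throw new InvalidArgumentException('Primer must be an array or expose random().');
}

echo in_array(pickFromPrimer([1, 2, 3]), [1, 2, 3], true) ? 'ok' : 'unexpected', PHP_EOL;
```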

primers are key/value pairs that can be injected into the FactoryBuilder at calling time that can be used by the factories to generate their data.

the primer can also be used by factory "states"

added a couple tests
@GrahamCampbell
Member

Don't worry about the code style. It gets auto-fixed on merge.

@browner12
Contributor Author

gotcha, thanks @GrahamCampbell.

it's more so OCD....

@taylorotwell
Member

What are $attributes (the second argument)? I don't even remember. They don't seem to be documented anywhere on the website.

@browner12
Contributor Author

here is the commit that added it. c238aac

I'm not exactly sure how you would use this....

@autaut03

autaut03 commented Dec 19, 2019

$attributes is actually $overrides - an array you pass to ->make() or ->create(). It has a reason to exist for sure.

@browner12
Contributor Author

What I'm failing to see, though, is why you would want/need to reference the override attributes in the factory definition.

Do you have an example of how you would use this?

Either way I don't think it makes a difference for this PR, since we couldn't remove that parameter in a Minor or Patch release.

@4refael
Contributor

4refael commented Dec 19, 2019

There's another way to handle this which is to cache $users to avoid multiple queries.

So instead of this

$factory->define(App\Post::class, function ($faker) {
    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => App\User::all()->random(),
    ];
});

You can do this

$factory->define(App\Post::class, function ($faker) {
    static $users;
    
    if ($users === null) {
        $users = App\User::all();
    }

    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => $users->random(),
    ];
});
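The caching behaviour of a static variable inside a closure can be checked in isolation. In this sketch `$loadUserIds` is a hypothetical stand-in for App\User::all() that counts how often it runs; no database or framework is involved.

```php
<?php

$queryCount = 0;

// Stand-in for App\User::all(); counts how often the "query" runs.
$loadUserIds = function () use (&$queryCount): array {
    $queryCount++;
    return [1, 2, 3];
};

$definition = function () use ($loadUserIds): array {
    static $userIds = null; // persists across calls to this closure

    if ($userIds === null) {
        $userIds = $loadUserIds();
    }

    return ['user_id' => $userIds[array_rand($userIds)]];
};

for ($i = 0; $i < 50; $i++) {
    $definition();
}

echo $queryCount, PHP_EOL; // the loader ran once for 50 calls
```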

@autaut03

@browner12 We use it on our project to create related entities based on given info and to create belongsTo related entities if no related_entity_id key was given.

We shouldn't be doing this and we no longer are, but there are still hundreds of places where it's necessary. Maybe 7.x is a good place to drop support for it to force usage of states - I don't know, but I'm pretty sure this won't happen and there are probably valid use cases outside of ones I described.

@taylorotwell
Member

The more I think about this I think I will hold off on it. There are already fairly simple ways of doing this. You can use the static variable as pointed out above... You could also create a class within your seeders directory with static methods and return any "primer" data you need and could even use the spatie/once package for clean memoization of that or just use normal properties. I'm not sure we need any additional framework features to make this possible.
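One way to read the "class with static methods" suggestion is sketched below. The class name `PrimerData` and its methods are hypothetical, and the memoization uses a plain static property rather than the spatie/once package; the loader closure stands in for a real query.

```php
<?php

// Hypothetical helper class, e.g. placed alongside the seeders.
final class PrimerData
{
    /** @var array<string, mixed> */
    private static $cache = [];

    // Run $loader once per key and reuse its result afterwards.
    public static function remember(string $key, callable $loader)
    {
        if (!array_key_exists($key, self::$cache)) {
            self::$cache[$key] = $loader();
        }
        return self::$cache[$key];
    }

    // Forget everything, e.g. between tests.
    public static function flush(): void
    {
        self::$cache = [];
    }
}

$calls = 0;
$loader = function () use (&$calls): array {
    $calls++;          // stands in for a query such as App\User::all()
    return [1, 2, 3];
};

PrimerData::remember('users', $loader);
PrimerData::remember('users', $loader);
echo $calls, PHP_EOL; // the loader ran only on the first call
```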

@browner12
Contributor Author

The static suggestion only works at the point where you define the factories. It doesn't solve the need to inject data when the factories are called.

$factory->define(App\Post::class, function ($faker) {
    static $users;
    
    if ($users === null) {
        $users = App\User::all();
    }

    return [
        'title' => $faker->title,
        'content' => $faker->paragraph,
        'user_id' => $users->random(),
    ];
});

This assumes you always want to pick a user_id from all the Users. What if you want it from a specific selection?

$normalUsers = factory(App\User::class, 10)->create();

$superUsers = factory(App\User::class, 5)->state('super')->create();

$departments = factory(App\Department::class, 3)->prime('manager', $superUsers)->create();

5 participants