Skip to content

[Scheduler] [Component] Scheduler #35315

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from
Closed

[Scheduler] [Component] Scheduler #35315

wants to merge 16 commits into from

Conversation

Guikingone
Copy link
Contributor

@Guikingone Guikingone commented Jan 12, 2020

Q A
Branch? master
Bug fix? no
New feature? yes
Deprecations? no
Tickets Implement #36821
License MIT
Doc PR To define

Hi everyone 👋

New year, new idea, first, let me explain why I open this PR as a draft, this new "experimentation" is mostly a draft as the framework integration is not completely done (I've some work pushed on a private repository).

So, what is it?

First, it's a new component, inspired by what Laravel can do with Illuminate/Scheduling but more important, it's inspired by modern platform that allows us to deploy our apps, in most platform, we can define tasks (cron and so on) and if we deploy our apps to "old" platform, we need to edit a crontab file where we need to declare all of our tasks before executing them (and waiting to fail?), the Scheduler component is here to solve most of the problems that can comes with scheduled tasks:

  • What if my platform does not support cron?
  • What if I need to plan tasks using Http?
  • What if I need to store tasks in a queue?
  • Etc...

So, what's inside?

First, we've a Scheduler, that's the entrypoint when it comes to scheduling a Task (this one can be a Http one, Cron, command, queue and so on), once scheduled, a Worker handle a task and execute it.

For now, this experimentation is not complete (the Worker need to be improved, the Messenger "bridge" also and so on) and this PR lack what I'm working on when it comes to the framework integration, this PR is opened in order to discuss and see if it could be a good idea to work on or if it's preferable to stop it now 🙂

Thanks for the feedback 👋

PS: I know that SymfonyCloud use YAML to declare tasks but we don't always use it and the Scheduler idea is here to give a "standard" solution.

Here's a small example:

Standalone

<?php 

use ...; 

$scheduler = new Scheduler('Europe/Paris');

$tasks = new ShellTask('app.test', 'echo Symfony', '*/5 * * * *');
$scheduler->schedule($task);

Framework

framework:
  scheduler:
    timezone: 'Europe/Paris'
    transports:
      local:
        dsn: 'local://'
    schedulers:
      foo:
        transport: 'local'
    tasks:
      app.foo:
        scheduler: 'foo'
        type: 'shell'
        expression: '* * * * *'
        options:
          command: 'echo Symfony!'

Roadmap

Features State
Being able to schedule Symfony commands using the Command FQCN
Adding shortcuts to create "default" cron expressions
Add methods that can trigger some logic before/after the execution of the task
Being able to export the scheduled tasks using txt and json formats
Use transports to ease the configuration
Being able to create cron files
Schedule tasks for specific duration
Schedule tasks for specific execution date ➡️
Adding task (and pool) priority (Pools removed)
Track and optimise tasks execution ➡️
Allow to trigger tasks using a specific path (like fragments) ➡️
Scheduler factory
Cloud platform bridges ➡️
FrameworkBundle integration ➡️
WebProfilerBundle integration ➡️
Notifier integration ➡️
Documentation TODO

Core features:

Transports (under active work since January)

  • Local
  • Null
  • Google Cloud Scheduler
  • Redis
  • Doctrine (under active work test)
  • Kubernetes (under active work)
  • Nomad (under active work)

Commands

  • bin/console scheduler:consume foo, bar --limit 10 --time-limit 10
  • bin/console scheduler:export foo --directory /srv/app --format json --filename foo --limit 10
  • bin/console scheduler:generate foo, bar --directory var/schedulers/export
  • bin/console scheduler:get-tasks foo --expression * * * * * --state paused
  • bin/console scheduler:list-failed
  • bin/console scheduler:reboot foo --retry 10
  • bin/console scheduler:run foo --name app.foo --expression * * * * *

Kernel & classes as Scheduler "entrypoints" (Added on 18/05/2020)

Scheduler defines a SchedulerAwareInterface which allow virtually any class to be a entrypoint to define new tasks:

<?php

/*
 * This file is part of the Symfony package.
 *
 * (c) Fabien Potencier <fabien@symfony.com>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace Symfony\Component\Scheduler;

/**
 * @author Guillaume Loulier <contact@guillaumeloulier.fr>
 */
interface SchedulerAwareInterface
{
    public function schedule(SchedulerRegistryInterface $registry): void;
}

This interface is auto configured and every "class" that implement in will receive the SchedulerRegistry which contain every declared scheduler.

Worker subscribers | Scheduler subscribers (Added on 21/05/2020)

SchedulerSubscriber are specific subscribers which are registered on every|specific scheduler(s), the registration is done thanks to getSubscribedSchedulers().

final class FooSchedulerSubscriber implements SchedulerSubscriberInterface
{
    /**
     * {@inheritdoc}
     */
    public static function getSubscribedEvents()
    {
        return ...;
    }

    /**
     * {@inheritdoc}
     */
    public static function getSubscribedSchedulers(): array
    {
        return ['*']; // But can also be ['foo', 'bar']
    }

    // ...
}

@Guikingone Guikingone changed the title [Feature] Scheduler [Feature][Draft] Scheduler Jan 12, 2020
@nicolas-grekas nicolas-grekas added this to the next milestone Jan 12, 2020
@stephane-lou
Copy link

I actually made a private bundle that does this. It doesn't use Messenger (for now) but allows me to schedule symfony commands using a graphical interface.
Currently I made a main command (taskamanager:execute) that checks every minute if there are scheduler commands to execute.

I also think this would be a great new component !

@Guikingone
Copy link
Contributor Author

Guikingone commented Jan 13, 2020

Hi @stephane-lou 👋

For now, the Messenger implementation is mainly here in order to allow to send tasks to queue and build something similar to what Notifier does, lately, I've planned to build something that allows us to do the following process:

  • Send a task to RabbitMQ queue
  • Consume the message thanks to a handler when the task expression is due
  • If the message task is successful, send a notification (think Slack and so on)

For the graphical interface, I think it could be a great idea (using Security of course 😉 ), this way, we can easily gave the access to this interface to a specific team department that allows them to create task without specific knowledge of the internal logic 🙂

@kbond
Copy link
Member

kbond commented Jan 13, 2020

Wow, what timing, I just open sourced a bundle for this: https://github.com/kbond/schedule-bundle

Preparing for a 1.0 release this week.

I'd be happy to help contribute if this is something that is decided should be in core.

@Guikingone Guikingone requested a review from fabpot January 18, 2020 08:44
@Guikingone
Copy link
Contributor Author

Oups, wrong click, sorry for the ping FabPot :(

@Guikingone Guikingone removed the request for review from fabpot January 18, 2020 08:45
@fabpot
Copy link
Member

fabpot commented Jan 18, 2020

The ping is actually interesting... as I do have a "Cron" component locally. It was supposed to be announced at a Symfony conference last year, but I decided to release the Notifier one first. Might come soon though.

@monteiro
Copy link
Contributor

Thanks a lot for contributing @Guikingone.

Besides handling crons, I think this could also be interesting for one time tasks. For example, after 3 hours on a specific action, send an e-mail to try to convert a the customer instead of using a CRON.

@Guikingone
Copy link
Contributor Author

@monteiro Actually, that's a planned feature, I've also planned to being able to create a pool of tasks (think about user specific tasks) that can be executed anytime 🙂

@Guikingone
Copy link
Contributor Author

Hi everyone 👋

Here's a small list of recently added features (that need to be completely tested):

  • Task pools
  • Task pools & tasks priority
  • Failed tasks list (the Worker will retrieve it when a runner throw an Error).

@Guikingone
Copy link
Contributor Author

@monteiro Regarding your feedback, I've pushed a refactor that allow to pass a valid expression handled by strtotime or even a \DatetimeInterface, the actual CRON lib allow to calculate the due date depending on a valid expression or a \DatetimeInterface

Copy link
Member

@jderusse jderusse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A common issue with scheduler in a cluster, is asserting that the task will be schedule. And be scheduled once.

IMHO the component should at least:

  • store the last scheduled time in order to re-run missed schedules
  • provide a mechanism to avoid scheduling the same task twice.

{
$task = $message->getTask();

if (!CronExpression::factory($task->get('expression'))->isDue(new \DateTimeImmutable())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the messages took 1 hours to be proceed? Maybe isDue should take the date of creation of the task instead of now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved

{
$task = $message->getTask();

if (!CronExpression::factory($task->get('expression'))->isDue(new \DateTimeImmutable())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you take timezone into account?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved

@Guikingone
Copy link
Contributor Author

@jderusse For the first point, that's actually the case, every time a task is runned, the subscriber define a new 'last execution date":

public function onTaskExecuted(TaskExecutedEvent $event): void
{
    $task = $event->getTask();

    $task->set('last_execution', new \DateTimeImmutable());
}

For the second point, that's an interesting point that I've not worked on directly, for now, a task cannot have the same name as an existing one, I'm gonna invest on this point 🙂

@bigfoot90
Copy link

bigfoot90 commented Jan 31, 2020

For tasks that should be executed once per time, maybe an integration with the Lock Component can be interesting.
Locks can be defined as *.pid file (locally per server node) or in a shared database/service like mysql, memcache or redis (globally per application)

@Guikingone
Copy link
Contributor Author

@bigfoot90 I'm not sure that adding a new dependency (even if it's a Symfony one) its a good idea, I'm pretty sure we can find a solution without using Lock.

@bigfoot90
Copy link

bigfoot90 commented Jan 31, 2020

It should not be a hard dependecy, but a soft dependency (mentioned in suggested inside composer.json)
Then a small How to implement locking tasks guide that says:
Run composer require symfony/lock implement LockingTaskInterface in your Task class.

@Guikingone
Copy link
Contributor Author

Guikingone commented Feb 2, 2020

Hi everyone 👋

First, I want to thanks everyone for the first feedbacks and @nicolas-grekas for the mention on the Sf meetup slides 😊

I've pushed a refactoring which remove the notion of TaskPool, the TaskList already play this role, second, I've started an integration of the Lock component (as mentioned by @bigfoot90 and @jderusse), It's not fully in place but I'm gonna improve the tests and continue to test the integration.

Third point, I've push a first functional implementation of the Exporter using the Serializer (for formats like XML and JSON), I've planned to take a moment to look at a CLI "formatter" but I'm not sure if it worth the works.

Fourth, what's the plan now?

As I've a first functional version of the component, I'm gonna polish the tests, finish every case that I can see by far, continue on the framework integration and improve the DX and so on 🙂

@Guikingone
Copy link
Contributor Author

Hi everyone,

I've just added a roadmap in the first post, the idea is to list the features that I've planned to add, which are done and so on.

@Guikingone
Copy link
Contributor Author

Ok, new day, new feature, I've just pushed an ExpressionFactory which is capable of building CRON expressions using shortcuts, as I don't know if the FabPot "CRON component" will support both at and batch syntax, I don't have shortcuts for it.

@Guikingone
Copy link
Contributor Author

Hi everyone 👋

New day, new feature (again, I know 😄 ), I've focused on bringing a DataCollector, for now, it collect the data during the request using a TraceableScheduler (I've planned to do the same thing for Worker), I'm not sure about what to collect by far so I've built a simple array which contains the core data linked to a task.

I've also worked on a core feature of the component, being able to execute tasks using a specific route (like fragments), I'm not sure about the implementation but definitively a fun moment playing with this.

Last but not least, I've improved the output handling when a runner finish to run a task, by default, the runner can return a string or null, the output is also send to an event.

Don't hesitate to give feedbacks 🙂

@Guikingone
Copy link
Contributor Author

Hi everyone 👋

New day, new feature, I've focused the work on the DX, the Scheduler has now an interface which define core methods (resume, pause, etc).

Most of the logic is tested and in place, I've started 3 bridges for the following platforms:

  • Google Cloud Platform
  • Amazon Web Services
  • Microsoft Azure

I've also planned to build a bridge for SymfonyCloud but I don't know if there's a public API/SDK which allows us to plan cron tasks, for now and if it's not possible, this bridge will stay as a "planned one", for the bridges, the Google one is prettymuch in place, I've used the HttpClient component to call the REST API, the Google approach is pretty simple when it comes to scheduling tasks, Microsoft is a bit harder (and I don't talk about Amazon for now ... 😄 ).

That's all for today, don't hesitate to give feedbacks 😉


public function __construct(TaskInterface $task)
{
$this->task = $task;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why passing the whole task and not a guid
And then the worker « restore » the task from its storage via this identifier?

*/
interface RunnerInterface
{
public function run(TaskInterface $task): string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe describe the string that must be returned

For the support: bool it is obvious, but not the run: string

@Guikingone Guikingone changed the title [Feature][Draft] Scheduler [Feature][Draft][WIP] Scheduler Feb 16, 2020
@Guikingone
Copy link
Contributor Author

Hi everyone 👋

Small update about this PR, I've pushed a "first valid implementation" of the framework configuration (not handled via the container for now) in order to demonstrate how to configure it, some parts are missing (but pushed on secondary repo).

I've also improved the syntax, some classes and the internal logic of the component, feel free to give feedbacks 🙂

@Guikingone
Copy link
Contributor Author

Hi everyone 👋

Small update, I've pushed a first valid configuration along with a first YAML / XML / PHP configuration files in order to validate the current implementation, I'm gonna continue to polish the DX and the configuration in order to finish the "framework" integration.

Small ping to @fabpot (sorry, I really don't want to bother you with stupid questions 😅) in order to know if it worth to continue to work on this PR or if its a better idea to wait for your "cron" component (even if it's planned for 5.2/5.3), the idea behind this ping is really to close this PR if its a bad idea to have 2 "WIP" implementations (I'm pretty sure that yours is probably way more advanced than mine 😄).

For now, the planning of this PR is to have a "valid" and "ready to be merged" implementation for 5.2 (5.1 is too close IMO) but it all depends on the validity of working on it 🙂 .

Again, sorry for the ping and have a great evening 🙂

PS: I've use this ping because I don't know if there's a "Symfony Core-Team" or "Symfony deciders" one 😅

@Guikingone
Copy link
Contributor Author

Hi everyone 👋

Small update, I've pushed a "final" version of the configuration, both YAML & PHP format are supported, the XML one continues to fail (I don't have the reason now).

The FrameworkBundle integration is almost done (same for the container injection)

@Guikingone
Copy link
Contributor Author

PR rebased and ready for reviews

Copy link
Contributor

@94noni 94noni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First small round :)

$this->task = $task;
$this->exitCode = $exitCode;
$this->output = $output;
$this->type = $type;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this property used somewhere?

* file that was distributed with this source code.
*/

namespace Symfony\Component\Scheduler\EventListener;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not inside a Scheduler/EventSubscriber folder as its a Subscriber?


public function onWorkerStarted(WorkerStartedEvent $event): void
{

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you forgot to call the private log function on both methods?

Copy link
Contributor

@94noni 94noni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second round :)

return $this->data['schedulers'];
}

public function getTransports(): array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you seem not displaying it in the profiler

return $priority <= 1000 && $priority >= -1000;
});

$resolver->setInfo('arrival_time', 'The time when the task is retrieved in order to execute it');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arrival seems weird :s
if the doc says "task is retreived", why not "retreival_time" ?
just asking/challenging :)


try {
if ($lockedTask->acquire() && !$this->isRunning()) {
$this->running = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, if the method is called execute() and not run()
this should be called executing and not running (and isRunning as well)
wdyt?

$this->dispatch(new WorkerStoppedEvent($this));

return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cant you use a finally block here?
for the process of "lock releasing / memory running false / worker stop even dispatching"

$failedTasks = [];

$table = new Table($output);
$table->setHeaders(['Task', 'Reason', 'Date']);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would put Date first
as date values will always have same length, but reason may not, so that the table is more clearer to read

}

foreach ($failedTasksList as $task) {
$failedTasks[] = [$task->getName(), $task->getReason(), $task->getTriggerDate()->format('m-j-Y H:i:s')];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain the choice of this date format :) ?

{
$this->dispatch(new TaskToExecuteEvent($task));
$output = $runner->run($task);
$task->set('last_execution', new \DateTimeImmutable());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be inside a subscriber? or it is here on purposes

*/
private $workerTimeout;

public function __construct(TaskInterface $task, $workerTimeout = 0.5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

watch out, sleep() waits an int, so need here int $workerTimeout = 1
or uses usleep() :)

@fabpot
Copy link
Member

fabpot commented Sep 1, 2020

Closing this PR for now. @Guikingone and I will work together to submit a new PR with some substantial changes.

@fabpot fabpot closed this Sep 1, 2020
@nicolas-grekas nicolas-grekas modified the milestones: next, 5.2 Oct 5, 2020
@carsonbot carsonbot changed the title [Component] Scheduler [Scheduler] [Component] Scheduler Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.