The document discusses different approaches for writing job workers and servers in Perl, including using Parallel::Prefork for managing preforked processes, Server::Starter for hot-deploying servers, and Parallel::Scoreboard for monitoring prefork-based workers and servers. It provides code examples and compares the prefork and event-driven approaches.
2. Job workers: the application area
Essential for large-scale webapps:
* to communicate with other services
* for synchronizing data between storages
* for resizing images, …
Oct 15 2010 Writing Prefork Workers / Servers
3. Job workers: prefork vs. event-driven
Prefork-based
* good for writing application servers, transcoders (image resizing, etc.), DB-based job workers
  + easy to write and to maintain
  − consumes more memory (improved by CoW)
4. Job workers: prefork vs. event-driven (2)
Event-driven
* good for writing chat servers / clients (IRC), Comet-based applications
  + easy to implement interaction between the tasks (connections)
  + consumes less memory
  − difficult to write and to maintain
  − most modules cannot be called asynchronously
5. Job workers: prefork vs. event-driven (3)
* Starting with prefork is generally a good choice, unless you are writing a server that implements interaction between the clients.
* Switch to an event-driven approach later if performance matters.
6. Agenda
* Parallel::Prefork for managing prefork'ed processes
* Server::Starter for hot-deploying servers
* Parallel::Scoreboard for monitoring prefork-based workers / servers
8. Parallel::ForkManager – the good old way
    # taken from the POD
    my $pm = new Parallel::ForkManager($MAX_PROCESSES);
    foreach $data (@all_data) {
        # Forks and returns the pid for the child:
        $pm->start and next;
        # ... do some work with $data in the child process ...
        $pm->finish; # Terminates the child process
    }
9. Parallel::ForkManager – the problem
No support for shutdown
* designed for operating on already-existing data in parallel, not for receiving and handling data concurrently
* child processes are not killed by the parent
* no clean shutdown or restart
10. Parallel::Prefork – a signal-savvy manager
    my $pm = Parallel::Prefork->new({
        max_workers  => $MAX_PROCESSES,
        trap_signals => {
            TERM => 'TERM', # send TERM to children when parent gets TERM
        },
    });
    while ($pm->signal_received ne 'TERM') {
        $pm->start and next;
        # ... do some work within the child process ...
        $pm->finish;
    }
    $pm->wait_all_children(); # wait for all children to exit
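The signal handling that Parallel::Prefork takes care of can be approximated in core Perl. The following is a minimal sketch of the idea only, not the module's implementation: the parent traps TERM, forwards it to its children, and waits for all of them to exit. The worker count, the sleep-based "work" loop, and the parent signalling itself are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;
use POSIX ();

my $MAX_WORKERS = 3;   # hypothetical value for this sketch
my $got_term    = 0;   # set by the TERM handler in parent and children
my @children;          # pids of the forked workers

# Installed before forking, so children inherit the same handler and
# cannot miss a TERM that arrives right after fork().
$SIG{TERM} = sub { $got_term = 1 };

for (1 .. $MAX_WORKERS) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: pretend to work until TERM is delegated by the parent.
        select(undef, undef, undef, 0.05) until $got_term;
        POSIX::_exit(0);   # leave without running the parent's cleanup code
    }
    push @children, $pid;
}

# Stand-in for an operator running "kill -TERM <parent pid>".
kill 'TERM', $$;
select(undef, undef, undef, 0.01) until $got_term;

# Forward the signal, then wait for every child to exit.
kill 'TERM', @children;
my $reaped = 0;
for my $pid (@children) {
    $reaped++ if waitpid($pid, 0) == $pid && $? == 0;
}
print "reaped $reaped of ", scalar(@children), " children\n";
```

Installing the handler before the fork is the main design choice here: it closes the window in which a freshly forked child could receive TERM while still running with the default action.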
11. Parallel::Prefork – writing a Gearman worker
    my $pm = Parallel::Prefork->new(...);
    my $worker = Gearman::Worker->new;
    ...
    while ($pm->signal_received ne 'TERM') {
        $pm->start and next;
        # gracefully exit the child process when parent delegates TERM
        my $stop_worker = undef;
        local $SIG{TERM} = sub { $stop_worker = 1 };
        $worker->work(stop_if => sub { $stop_worker });
        $pm->finish;
    }
    $pm->wait_all_children(); # wait for all children to exit
12. Parallel::Prefork – writing a prefork server
    my $pm = Parallel::Prefork->new(...);
    my $listen_sock = IO::Socket::INET->new(
        LocalAddr => $hostport,
        Listen    => Socket::SOMAXCONN,
        ReuseAddr => 1,
    );
    while ($pm->signal_received ne 'TERM') {
        $pm->start and next;
        while (my $socket = $listen_sock->accept) {
            # ... communicate with the client ...
        }
        $pm->finish;
    }
    $pm->wait_all_children(); # wait for all children to exit
13. Parallel::Prefork – advanced topics
Graceful reconfiguration
* possible, but often unnecessary, since…
  * for workers, short downtime is acceptable …, or we can run multiple instances of worker processes to hide the downtime
  * for servers, we need hot-deploy to update code, and the same technique can be used for changing configuration
Changing # of worker processes
* in general a wrong idea :-p
14. Parallel::Prefork – graceful reconfiguration
    my $pm = Parallel::Prefork->new({
        max_workers  => $MAX_PROCESSES,
        trap_signals => {
            TERM => 'TERM', # send TERM to children when parent gets TERM
            HUP  => 'TERM', # send TERM to children on graceful reconf.
        },
    });
    while ($pm->signal_received ne 'TERM') {
        reload_config() if $pm->signal_received eq 'HUP';
        $pm->start and next;
        # ... do some work within the child process ...
        $pm->finish;
    }
    $pm->wait_all_children(); # wait for all children to exit
15. Parallel::Prefork – dynamic scaling
    my $pm = Parallel::Prefork::SpareWorkers->new({
        max_workers       => 40,
        min_spare_workers => 5,
        max_spare_workers => 10,
        ...
    });
    while ($pm->signal_received ne 'TERM') {
        $pm->start and next;
        ... # setup signal handlers, etc.
        while (my $sock = $listener->accept()) {
            $pm->set_state('A'); # set state of the worker proc. to active
            ...
            $pm->set_state(Parallel::Prefork::SpareWorkers::STATUS_IDLE);
        }
        $pm->finish();
    }
    ...
17. Hot deployment
What is it?
* upgrading a web application without restarting the application server
The goals
* no downtime
* no resource leaks
* fail-safe
18. Old techniques to restart a webapp. server
restart the interpreter (mod_perl)
* pros: graceful
* cons: XS may cause resource leaks, service-down on deployment failure, cannot implement in pure-perl
bind to unix socket (FastCGI)
* pros: graceful, fail-safe
* cons: only useful for local-machine communication
19. Old techniques to restart a webapp. server (2)
exec(myself) (Net::Server)
* pros: graceful, pure-perl
* cons: file descriptor leaks, service-down on deployment failure
20. The restart method of Server::Starter
* a superdaemon for hot-deploying TCP servers
* superdaemon binds to TCP ports, then spawns the application server
[Diagram: the superdaemon listens and spawns app. servers with fork & exec; each app. server accepts connections and runs the app. logic. SIGHUP spawns a new generation; SIGTERM stops the old one.]
21. Server::Starter – no downtime
* listening socket shared by old and new generation app. servers
* old app. servers receive SIGTERM after new servers start
22. Server::Starter – no resource leaks
* no chance of resource leaks
* every generation of app. servers is spawned from the superdaemon
23. Server::Starter – fail safe
* old app. server is retired if and only if the new app. server starts up successfully
* service continues even if the updated app. server fails to start, in cases like missing modules, etc.
* a good practice is to do self-testing in the app. server before starting to serve client connections
  * self-testing is also an efficient way to preload modules
24. Server::Starter – the code
    # only change the code that listens to a port
    + if (defined $ENV{SERVER_STARTER_PORT}) {
    +     ($hostport, my $fd) = %{Server::Starter::server_ports()};
    +     $hostport = "0.0.0.0:$hostport"
    +         unless $hostport =~ /:/;
    +     $listen_sock = IO::Socket::INET->new(Proto => 'tcp');
    +     $listen_sock->fdopen($fd, 'w')
    +         or die "failed to bind to listening socket:$!";
    + } else {
          $listen_sock = IO::Socket::INET->new(
              LocalAddr => $hostport,
              Listen    => Socket::SOMAXCONN,
              ReuseAddr => 1,
          );
    + }
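The superdaemon hands the listening socket to the app. server through the SERVER_STARTER_PORT environment variable tested in the code above. The following is a minimal sketch of reading such a value by hand, assuming it holds ';'-separated "hostport=fd" pairs; parse_server_ports and the sample value are hypothetical, and real code should keep calling Server::Starter::server_ports.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "hostport=fd;hostport=fd" into a hash reference
# mapping each host:port (or bare port) to its file descriptor number.
# The ';' and '=' separators are an assumption about the variable's format.
sub parse_server_ports {
    my ($env_value) = @_;
    my %ports;
    for my $pair (split /;/, $env_value) {
        my ($hostport, $fd) = split /=/, $pair, 2;
        die "malformed entry: $pair"
            unless defined $fd && $fd =~ /\A\d+\z/;
        $ports{$hostport} = $fd + 0;
    }
    return \%ports;
}

# Sample value made up for this sketch.
my $ports = parse_server_ports('0.0.0.0:80=3;8080=4');
for my $hostport (sort keys %$ports) {
    # A bare port number is treated as "listen on all interfaces".
    my $addr = $hostport =~ /:/ ? $hostport : "0.0.0.0:$hostport";
    print "$addr is inherited as fd $ports->{$hostport}\n";
}
```

Because the app. server only reopens a descriptor it inherited, it never binds the port itself, which is what lets an old and a new generation share one listening socket.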
25. Server::Starter – integration w. daemontools
The "run" script:
    #! /bin/sh
    exec start_server --port=80 -- my_server.pl
To start the server:
    svc -u /service/my_server
To stop the server:
    svc -d /service/my_server
To gracefully restart the updated version of the server:
    svc -h /service/my_server
27. Monitoring the workers / servers
is essential for…
* resource provisioning
* fault management / fixing bugs
but how?
* load average is not an answer
  * the bottleneck might not be CPU or disk I/O
* logging is good for tracking down bugs, but difficult to use for monitoring
28. Parallel::Scoreboard – the caveats
a building block for monitoring processes
* for visualization (like the mod_status of Apache)
* for creating automated monitoring systems
flexible
* any number of processes can be monitored
* any information can be stored for monitoring
* any process can monitor any set of processes
  * no relationship between the monitoring processes and monitored processes is necessary
29. Parallel::Scoreboard – under the hood
one file per monitored process
* <base_dir>/status_<pid>
uses flock for GC
* monitored process locks its status file while alive
* monitoring process removes unlocked status files
uses checksum for detecting r/w collision
* monitoring process retries on collision
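The flock-based garbage collection described above can be illustrated in core Perl. This is a minimal sketch of the idea rather than the module's code: open_status_file, collect_garbage, the temporary directory, and the pids 1001 and 1002 are all hypothetical, and both roles run in one process here, which relies on flock treating separately opened handles as independent lock holders.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use IO::Handle;

my $base_dir = tempdir(CLEANUP => 1);   # stand-in for <base_dir>

# Monitored side: write status_<pid> and keep it locked while alive.
sub open_status_file {
    my ($pid, $status) = @_;
    my $path = "$base_dir/status_$pid";
    open my $fh, '>', $path or die "open $path: $!";
    flock $fh, LOCK_EX or die "flock $path: $!";
    print {$fh} $status;
    $fh->flush;
    return $fh;   # the lock lives as long as this handle stays open
}

# Monitoring side: a status file that can be locked has no living owner.
sub collect_garbage {
    my @removed;
    opendir my $dh, $base_dir or die "opendir $base_dir: $!";
    for my $name (sort grep { /\Astatus_\d+\z/ } readdir $dh) {
        my $path = "$base_dir/$name";
        open my $fh, '+<', $path or next;
        if (flock $fh, LOCK_EX | LOCK_NB) {
            unlink $path;
            push @removed, $name;
        }
        close $fh;
    }
    closedir $dh;
    return @removed;
}

my $live_fh = open_status_file(1001, 'handling job');
my $dead_fh = open_status_file(1002, 'idle');
close $dead_fh;   # simulates process 1002 exiting, which releases its lock

my @removed = collect_garbage();
print "removed: @removed\n";
```

The appeal of this scheme is that the kernel releases the lock when a process dies for any reason, so stale entries are detected without heartbeats or pid checks.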
30. Parallel::Scoreboard – monitored process
    # prepare
    my $scoreboard = Parallel::Scoreboard->new(
        base_dir => '/tmp/scoreboard',
    );
    # set the initial status. Note that the "update" method should only
    # be called from worker processes
    $scoreboard->update('initializing');
    # the main loop
    while (1) {
        $scoreboard->update('waiting for task');
        my $task = get_task();
        $scoreboard->update('handling ' . $task->{id});
        handle_task($task);
    }
    $scoreboard->update('exiting');
31. Parallel::Scoreboard – integrating with P::Prefork
    my $pm = Parallel::Prefork->new(...);
    my $scoreboard = Parallel::Scoreboard->new(...);
    my $worker = Gearman::Worker->new;
    $worker->register_function(handle_job => sub {
        $scoreboard->update('handling ' . ...);
        try {
            # ... handle the job ...
        } finally {
            $scoreboard->update('idle');
        };
    });
    while ($pm->signal_received ne 'TERM') {
        $pm->start and next;
        $scoreboard->update('idle'); # just entered the child process
        ...
32. Parallel::Scoreboard – monitoring process
    # prepare
    my $scoreboard = Parallel::Scoreboard->new(
        base_dir => '/tmp/scoreboard',
    );
    # read the status, and print
    my $stats = $scoreboard->read_all();
    for my $pid (sort { $a <=> $b } keys %$stats) {
        print "PID:$pid is $stats->{$pid}\n";
    }
33. Parallel::Scoreboard – monitoring by HTTP
Parallel::Scoreboard::PSGI::App
* mod_status-like display
Parallel::Scoreboard::PSGI::App::JSON
* scoreboard sent out in JSON
  * expects the status to be a string or a JSON array / object
* useful for gathering the scoreboards of many machines
* also useful for auto-scaling
  * automatically power up / down the app. servers depending on the output of the scoreboards
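An auto-scaling rule of the kind mentioned above could start from the pid-to-status hash that read_all returns. The following is a minimal sketch, assuming every free worker reports the literal status 'idle'; scaling_decision, its thresholds, and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical rule: compare the fraction of busy workers with two thresholds.
# $stats has the shape { pid => status string }.
sub scaling_decision {
    my ($stats, %opt) = @_;
    my $total = keys %$stats;
    return 'hold' unless $total;   # nothing reported, nothing to decide

    my $busy  = grep { $_ ne 'idle' } values %$stats;
    my $ratio = $busy / $total;

    return 'scale_up'   if $ratio >= $opt{high};
    return 'scale_down' if $ratio <= $opt{low};
    return 'hold';
}

# Sample scoreboard made up for this sketch: three of four workers are busy.
my %stats = (
    101 => 'idle',
    102 => 'handling 7',
    103 => 'handling 8',
    104 => 'handling 9',
);
my $decision = scaling_decision(\%stats, high => 0.75, low => 0.25);
print "$decision\n";
```

In practice the input would be merged from the JSON scoreboards of several machines, and the decision would be smoothed over time so that a short burst does not power servers up and down repeatedly.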
36. Summary
* use Parallel::Prefork when writing job workers / servers
* use Server::Starter when hot-deployment is necessary
* use Parallel::Scoreboard to create monitors for your job workers / servers