Background Workers allow you to offload processes from your web component and run them in the background. They are generally used for heavy or long-running processes like mail, billing, image processing, etc. After reading this guide, you should understand:
- The basic infrastructure of workers
- How to deploy workers
- How to configure workers
Worker Logs
Currently Pagoda Box doesn't log output generated by workers. You can have your workers write to a custom log inside a shared writable directory.
Cannot Daemonize
Your worker process must not daemonize itself. Pagoda Box daemonizes your workers for you.
Basic Infrastructure
Whenever code is deployed, workers are built using the same codebase as your web component, so all code necessary to run your workers should be included in your codebase. It should also be noted that you can have more than one worker component.
Workers are scaled in exactly the same way as web components. You can increase the number of instances, the amount of RAM per instance, or both. Which method of scaling is best depends on how your worker is being used. For tasks that require a lot of RAM, increase the amount of RAM per instance. For tasks that don't require a lot of RAM but do require a lot of throughput, you're better off increasing the number of worker instances.
Using Workers
Background workers let you run a task based on various conditions. These conditions could be time, information in a database, files in a folder, etc. The nice thing about background workers is that they never time out. This makes them well suited to anything from sending email to processing and emailing reports that take a while to generate.
In short, you can drop tasks into a queue and have your background worker take care of those tasks. Workers accomplish many of the tasks cron jobs are used for, but are much more flexible.
The following barebones example shows a worker that logs the time a process is run. Code for workers resides in two places: your Boxfile, which houses all of the necessary worker config, and the script that runs the worker.
Worker Example: Boxfile (YAML)
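A minimal sketch of what the Boxfile side of such a worker might look like. It assumes only the `exec` option described under Simple Config. The component key `worker1` and the script name `worker.php` are hypothetical placeholders, not values taken from the original example.

```yaml
# Boxfile (sketch): a single worker component.
# "worker1" and "worker.php" are illustrative placeholders.
worker1:
  # Command that starts the worker process. The script should keep
  # running in the foreground, since Pagoda Box daemonizes workers itself.
  exec: "php worker.php"
```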
Worker Example (PHP)
Whenever the worker is deployed, this process will automatically start running. Workers should never exit unless told to do so, which happens when instances are decommissioned.
Deploying Workers
Deploying workers is done by including a worker component in your Boxfile.
Workers in the Boxfile (YAML)
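Because an app can have more than one worker component, a Boxfile might declare several workers side by side. The following is a sketch under assumptions: the numbered component keys (`worker1`, `worker2`) and both script paths are hypothetical, chosen only to illustrate the idea.

```yaml
# Boxfile (sketch): two independent worker components.
# Component keys and script paths are illustrative placeholders.
worker1:
  # Hypothetical worker that sends queued mail.
  exec: "php workers/mail.php"
worker2:
  # Hypothetical worker that builds long-running reports.
  exec: "php workers/reports.php"
```

Keeping unrelated jobs in separate worker components lets each one be scaled independently, by instance count or by RAM per instance.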
Configuring Workers
All configuration for workers is done in the Boxfile. You can use the majority of the web config options to configure your worker, but there are some added config options specific to workers. Most workers need only a simple configuration, but some advanced options are also available.
Simple Config
All that's needed to get a worker up and running is the command to start your worker.
exec
The command that will start your worker process.
exec (YAML)
Advanced Config
The advanced configuration allows you to customize how Pagoda Box starts and stops your workers. A new worker is started with each deploy, after which the previous worker is stopped and decommissioned.
start
This allows you to set a specific start behavior for your worker instances. The deploy will pause and stream all output from your worker until one of two conditions is met:
1. The output contains your "ready" string.
2. The timeout limit is reached.
There are two config options for start: ready and timeout.
Start Config (YAML)
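As a rough illustration, the two start options could be nested under a `start` key on the worker component. The nesting, the component key, the script name, and the ready string `Worker Ready` are assumptions made for this sketch, and the timeout value is arbitrary.

```yaml
# Boxfile (sketch): custom start behavior for a worker.
# Key nesting and all values are illustrative assumptions.
worker1:
  exec: "php worker.php"
  start:
    # The deploy streams worker output until this string appears...
    ready: "Worker Ready"
    # ...or until this many seconds have passed, whichever comes first.
    timeout: 10
```

With a configuration like this, the worker script would need to print the ready string to its output once it has finished any setup.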
ready
The string your worker will output when it is ready to handle requests.
timeout
The length of time (in seconds) the deploy stream will stay open waiting for your "ready" string. When the timeout is reached, we assume a successful start and continue the deploy process. The default timeout is 2 seconds.
stop
This allows you to set a specific stop behavior for your worker instances. In many cases, workers are assigned tasks that cannot simply be stopped without a proper "shutdown" process. The stop config allows you to define that shutdown process and gracefully stop your worker instances, if necessary. There are three config options for stop: signal, timeout, and exec.
Stop Config (YAML)
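A sketch of how the stop options might be laid out, assuming they nest under a `stop` key in the same way the option names are grouped in this guide. The component key, script names, and values are hypothetical. Since the stop `exec` command runs instead of a kill signal, it is shown commented out as an alternative to `signal`.

```yaml
# Boxfile (sketch): custom stop behavior for a worker.
# Key nesting and all values are illustrative assumptions.
worker1:
  exec: "php worker.php"
  stop:
    # Signal sent to the worker process to ask it to shut down.
    signal: SIGTERM
    # Seconds the worker may keep running after it has been told to stop.
    timeout: 30
    # Alternative to a signal: run a command that tells the worker to exit,
    # for example by setting a flag the worker checks on each iteration.
    # exec: "php stop_worker.php"
```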
signal
Workers are stopped by sending the process a UNIX kill signal. The default kill signal is "SIGQUIT". Here you can specify any valid POSIX kill signal. Your workers are not required to shut down immediately upon receiving the kill signal.
timeout
The length of time (in seconds) worker instances are allowed to run after they've received the kill signal. The default timeout is 60 seconds. NOTE: Instances are decommissioned at the moment the process stops, regardless of the timeout setting.
exec
This command is executed instead of sending your worker a kill signal. It is useful if your worker cannot gracefully trap a UNIX kill signal. One use case might be to trigger a script that would set a value in a database. The worker could recognize the value before the next iteration and exit.