Java Task Scheduling with Cron Using YAML
The App Engine Cron Service allows you to configure regularly scheduled tasks that operate at defined times or regular intervals. These tasks are commonly known as cron jobs, and the App Engine Cron Service triggers them automatically. For instance, you might use a cron job to send a daily report email, to update cached data every 10 minutes, or to update summary information once an hour.
A cron job will invoke a URL, using an HTTP GET request, at a given time of day. An HTTP request invoked by cron is subject to the same limits as other HTTP requests, depending on the scaling type of the module.
Free applications can have up to 20 scheduled tasks. Paid applications can have up to 100 scheduled tasks.
Important: For a cron task to be considered successful it must return an HTTP status code between 200 and 299 (inclusive).
About cron.yaml
The schedule format
Cron retries
Securing URLs for cron
Calling Google Cloud Endpoints
Cron and app versions
Uploading cron jobs
Cron support in the Cloud Platform Console
Cron support in the development server
About cron.yaml
A cron.yaml file in the WEB-INF directory of your application (alongside app.yaml) configures scheduled tasks for your Java application. The following is an example cron.yaml file:
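A minimal sketch of such a file, built only from the fields described in this section (url, schedule, description, timezone, target). The URL paths, descriptions, time zone, and the target name beta are illustrative assumptions, not values from a real application.

```yaml
cron:
# Runs once every 24 hours; no timezone is given, so the schedule is in UTC.
- description: daily summary job
  url: /tasks/summary
  schedule: every 24 hours
# Runs every Monday at 09:00 in the named zoneinfo time zone.
- description: monday morning mailout
  url: /mail/weekly
  schedule: every monday 09:00
  timezone: Australia/NSW
# Same handler as the first job, routed to a module (or version) named beta.
- description: new daily summary job
  url: /tasks/summary
  schedule: every 24 hours
  target: beta
```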
The syntax of cron.yaml is the YAML format. For more information about this syntax, see the YAML website.
A cron.yaml file consists of a number of job definitions. A job definition must have a url and a schedule. You can also optionally specify a description, timezone, and a target. The description is visible in the Cloud Platform Console and the development server's admin interface.
The url field specifies a URL in your application that will be invoked by the Cron Service. See Securing URLs for Cron for more information. The format of the schedule field is covered in The Schedule Format.
The timezone should be the name of a standard zoneinfo time zone. If you don't specify a timezone, the schedule will be in UTC (also known as GMT).
The target string is prepended to your app's hostname. It is usually the name of a module. The cron job will be routed to the default version of the named module. Note that if the default version of the module changes, the job will run in the new default version.
Warning: Be careful if you run a cron job with traffic splitting enabled. The request from the cron job is always sent from the same IP address, so if you've specified IP address splitting, the logic will route the request to the same version every time. If you've specified cookie splitting, the request will not be split at all, since there is no cookie accompanying the request.
If there is no module with the name assigned to target, the name is assumed to be a version of the default module, and App Engine will attempt to route the job to that version.
If you use a dispatch file, your job may be re-routed. For example, given the following cron.yaml and dispatch.yaml files, the job will run in module2, even though its target is module1:
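A sketch of a file pair that would produce this re-routing, shown as two YAML documents in one block. The URL path and description are hypothetical; module1 and module2 are the module names from the sentence above.

```yaml
# cron.yaml: the job names module1 as its target.
cron:
- description: test dispatch vs target
  url: /tasks/hello_module2
  schedule: every 1 mins
  target: module1
---
# dispatch.yaml: a rule matching the job's URL sends the request to module2.
dispatch:
- url: "*/tasks/hello_module2"
  module: module2
```

Because the dispatch rule matches the job's URL, the request ends up in module2 regardless of the target value.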
The schedule format
Cron schedules are specified using a simple English-like format.
The following are examples of schedules:
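A few schedules written as cron.yaml entries, as a sketch of the formats explained in the rest of this section. The handler path /tasks/example is a placeholder reused for brevity; only the schedule values matter here.

```yaml
cron:
# Interval schedules
- url: /tasks/example
  schedule: every 12 hours
- url: /tasks/example
  schedule: every 5 minutes from 10:00 to 14:00
# Specific-timing schedules
- url: /tasks/example
  schedule: 2nd,third mon,wed,thu of march 17:00
- url: /tasks/example
  schedule: every monday 09:00
- url: /tasks/example
  schedule: 1st monday of sep,oct,nov 17:00
- url: /tasks/example
  schedule: every day 00:00
```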
If you don't need to run a recurring job at a specific time, but instead only need to run it at regular intervals, use the form:
every N (hours|mins|minutes) ["from" (time) "to" (time)]
The brackets are for illustration only, and quotes indicate a literal.
N specifies a number.
hours or minutes (you can also use mins) specifies the unit of time.
time specifies a time of day, as HH:MM in 24 hour time.
By default, an interval schedule starts the next interval after the last job has completed. If a from...to clause is specified, however, the jobs are scheduled at regular intervals independent of when the last job completed. For example:
every 2 hours from 10:00 to 14:00
This schedule runs the job three times per day at 10:00, 12:00, and 14:00, regardless of how long it takes to complete. You can use the literal "synchronized" as a synonym for from 00:00 to 23:59:
every 2 hours synchronized
If you want more specific timing, you can specify the schedule as:
("every"|ordinal) (days) ["of" (monthspec)] (time)
Where:
ordinal specifies a comma separated list of "1st", "first" and so forth (both forms are ok)
days specifies a comma separated list of days of the week (for example, "mon", "tuesday", with both short and long forms being accepted); "every day" is equivalent to "every mon,tue,wed,thu,fri,sat,sun"
monthspec specifies a comma separated list of month names (for example, "jan", "march", "sep"). If omitted, implies every month. You can also say "month" to mean every month, as in "1,8,15,22 of month 09:00".
time specifies the time of day, as HH:MM in 24 hour time.
Note: You cannot specify a schedule that includes both regular intervals and specific timing.
Cron retries
If a cron job's request handler returns a status code that is not in the range 200–299 (inclusive), App Engine considers the job to have failed. By default, failed jobs are not retried. You can cause failed jobs to be retried by including a job_retry_parameters block in your configuration file.
Here is a sample cron.yaml file that contains a single cron job configured to retry up to five times (the default) with a starting backoff of 2.5 seconds that doubles each time.
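A sketch of such a file, under stated assumptions. The URL, description, and schedule are hypothetical. The block key is written here as retry_parameters, which is the key used by the cron.yaml schema in the App Engine SDK as far as I know; the text above refers to the block as job_retry_parameters, so confirm which key your SDK version accepts before relying on this.

```yaml
cron:
- description: retry demo
  url: /retry
  schedule: every 10 mins
  retry_parameters:            # assumed key name; see the note above
    min_backoff_seconds: 2.5   # first retry waits at least 2.5 seconds
    max_doublings: 5           # the wait doubles up to five times
```

No retry limit is set, so the default limit of five attempts described above applies.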
The retry parameters are described in the table below.
Setting | Description |
---|---|
job_retry_limit | The maximum number of retry attempts for a failed cron job, not to exceed 5. If specified with job_age_limit, App Engine retries the cron job until both limits are reached. |
job_age_limit | The time limit for retrying a failed cron job, measured from when the cron job was first run. The value is a number followed by a unit of time, where the unit is s for seconds, m for minutes, h for hours, or d for days. For example, the value 5d specifies a limit of five days after the cron job's first execution attempt. If specified with job_retry_limit, App Engine retries the cron job until both limits are reached. |
min_backoff_seconds | The minimum number of seconds to wait before retrying a cron job after it fails. |
max_backoff_seconds | The maximum number of seconds to wait before retrying a cron job after it fails. |
max_doublings | The maximum number of times that the interval between failed cron job retries will be doubled before the increase becomes constant. The constant is: 2**(max_doublings - 1) * min_backoff_seconds. |
Securing URLs for cron
A cron handler is just a normal handler defined in app.yaml. You can prevent users from accessing URLs used by scheduled tasks by restricting access to administrator accounts. Scheduled tasks can access admin-only URLs. You can restrict a URL by adding login: admin to the handler configuration in app.yaml.
An example might look like this in app.yaml:
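A sketch of a handlers fragment with an admin-only cron URL. The URL path and the servlet class name are hypothetical, and the rest of app.yaml (application ID, version, runtime) is omitted.

```yaml
handlers:
- url: /report/weekly
  servlet: mysite.server.CronServlet   # hypothetical servlet class
  login: admin                         # only admins and the Cron Service can reach this URL
```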
Note: While cron jobs can use URL paths restricted with login: admin, they cannot use URL paths restricted with login: required because cron scheduled tasks are not run as any user. The admin restriction is satisfied by the inclusion of the X-Appengine-Cron header described below.
To test a cron job, sign in as an administrator and visit the URL of the handler in your browser.
Requests from the Cron Service will also contain an HTTP header:
X-Appengine-Cron: true
The X-Appengine-Cron header is set internally by Google App Engine. If your request handler finds this header it can trust that the request is a cron request. If the header is present in an external user request to your app, it is stripped, except for requests from logged in administrators of the application, who are allowed to set the header for testing purposes.
Google App Engine issues Cron requests from the IP address 0.1.0.1.
Calling Google Cloud Endpoints
You cannot call a Google Cloud Endpoint from a cron job. Instead, you should issue a request to a target that is served by a handler that's specified in your app's configuration file or in a dispatch file. That handler then calls the appropriate endpoint class and method.
Cron and app versions
If the target parameter has been set for a job, the request is sent to the specified version. Otherwise Cron requests are sent to the default version of the application.
Uploading cron jobs
You can use AppCfg to upload cron jobs. When you upload your application to App Engine using AppCfg update, the Cron Service is updated with the contents of cron.yaml. You can update just the cron configuration without uploading the rest of the application using AppCfg update_cron.
To delete all cron jobs, change the cron.yaml file to just contain:
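A minimal sketch, assuming an empty job list is expressed as the top-level cron key with no job definitions under it:

```yaml
# cron.yaml with no jobs: uploading this removes all existing cron jobs.
cron:
```

Upload the file with AppCfg update_cron, as described above, for the change to take effect.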
Cron support in the Cloud Platform Console
The Cloud Platform Console Task queues page has a tab that shows the tasks that are running cron jobs.
You can also visit the Logs page to see when cron jobs were added or removed.
Cron support in the development server
The development server doesn't automatically run your cron jobs. You can use your local desktop's cron or scheduled tasks interface to trigger the URLs of your jobs with curl or a similar tool.