Blog post

Postgres as a CRON Server

03-05-2021

8 minute read

A Supabase user asked recently if they can trigger a webhook periodically. We haven't yet released Functions yet, so we checked whether it's possible with Postgres.

It is. Here's how.

What's cron?

A "cron job" is a script1 that runs periodically at fixed times, dates, or intervals. Traditionally you'd set it up on a Linux server. An example might be an hourly script that downloads emails to your computer.

These days, cron jobs are set up on a remote servers and in the cloud to run internet-related tasks. Like checking an endpoint every hour, or scraping a website every day.

Postgres + cron

Postgres has "extensions" which allow you to, well, extend the database with "non-core" features. Extensions essentially turn Postgres into an application server.

The team at Citus created pg_cron to run periodic jobs within your Postgres database.

Enabling the extension

If you're using a cloud-hosted Postgres database, make sure that pg_cron is installed first. The easiest way to do this is to run this command:

select name, comment, default_version, installed_version
from pg_available_extensions
where name = 'pg_cron';

If it returns a result then the extension is supported and you can turn it on by running:

create extension if not exists pg_cron;

If you're using Supabase you can also enable it in the Dashboard.

This image shows that pg_cron is enabled in the Supabase Dashboard

Granting access to the extension

If you're planning to use a non-superuser role to schedule jobs, ensure that they are granted access to the cron schema and its underlying objects beforehand.

grant usage on schema cron to {{DB user}};
grant all privileges on all tables in schema cron to {{DB user}};

Failure to do so would result in jobs by these roles to not run at all.

Postgres + webhooks

The Supabase customer wanted to call external endpoints every day. How would we do this? Another extension of course. This time we're going to use pgsql-http by @pramsey. Using the same technique, we can enable the extension (if it exists in your cloud provider).

create extension if not exists http;

This extension can now be used for sending GET, POST, PATCH, and DELETE requests.

For example, this function would get all the people in Star Wars (using the Star Wars API):

select content::json->'results'
from http_get('https://swapi.dev/api/people');

Postgres + cron + webhooks

Now the fun stuff. For this example we're going to call webhook.site every minute with the payload { "hello": "world" }.

Here's the code (with comments --like this).

select
  cron.schedule(
    'webhook-every-minute', -- name of the cron job
    '* * * * *', -- every minute
    $$
    select status
    from
      http_post(
        'https://webhook.site/223c8a43-725b-4cbd-b1fe-d0da73353a6b', -- webhook URL, replace the ID(223c8..) with your own
        '{"hello": "world"}', -- payload
        'application/json'
      )
    $$
  );

Now when we see that the payload is sent every minute, exactly on the minute.

This image shows the website that receives our webhook every minute.

And that's it! We've built a cron webhook. Breaking down the code example above we have 2 key parts:

POSTing data

This is the part that sends the data to the website:

select status
from
  http_post(
    'https://webhook.site/223c8a43-725b-4cbd-b1fe-d0da73353a6b', -- webhook URL
    '{"hello": "world"}', -- payload
    'application/json'
  )

Scheduling the job

The HTTP function is wrapped with the CRON scheduler:

select
  cron.schedule(
    'cron-name', -- name of the cron job
    '* * * * *', -- every minute
    $$
     -- Put your code between two dollar signs so that you can create full statements.
     -- Alternatively, you can write you code in a Postgres Function and call it here.
    $$
  );

The second parameter uses cron syntax:

 ┌───────────── min (0 - 59)
 │ ┌────────────── hour (0 - 23)
 │ │ ┌─────────────── day of month (1 - 31)
 │ │ │ ┌──────────────── month (1 - 12)
 │ │ │ │ ┌───────────────── day of week (0 - 6) (0 to 6 are Sunday to
 │ │ │ │ │                  Saturday, or use names; 7 is also Sunday)
 * * * * *

If you're unfamiliar with the cron syntax, useful shortcuts can be found on crontab.guru

* * * * *       # every minute
*/5 * * * *     # every 5th minute
0 * * * *       # every hour
0 0 * * *       # every day

Managing your cron jobs

To see a list of all your cron jobs, run:

select * from cron.job;

And if you need to see the results of each cron iterations, you can find them in cron.job_run_details:

select * from cron.job_run_details;

To stop a running cron job, you can run:

select cron.unschedule('webhook-every-minute'); -- pass the name of the cron job

What can I do with this?

There are plenty use-cases for this. For example:

  • Sending welcome emails. If you use an email provider with an HTTP API, then you batch emails to that service. Write a function that selects all your signups yesterday, then sends them to your favorite transactional email service. Schedule it every day to run at midnight.
  • Aggregating data. If you're providing analytical data, you might want to aggregate it into time periods for faster querying (which serves a similar purpose as a Materialized View).
  • Deleting old data. Need to free up space? Run a scheduled job to delete data you no longer need.

See a detailed list in the pg_cron README.

Addendum

Postgres background workers

You might have noticed this notice the warning at the bottom of the http readme:

"What happens if the web page takes a long time to return?" Your SQL call will just wait there until it does. Make sure your web service fails fast.

Luckily pg_cron implements Background Workers:

Care is taken that these extra processes do not interfere with other postmaster tasks: only one such process is started on each ServerLoop iteration. This means a large number of them could be waiting to be started up and postmaster is still able to quickly service external connection requests.

This means that even if your endpoint takes a long time to return, it's not going to be blocking your core Postgres functions. Either way, you should probably only call endpoints that will return a response quickly, or set the http extension to fail fast (http.timeout_msec = 300).

If you're familiar with C, you could also help @pramsey to implement async functions: https://github.com/pramsey/pgsql-http/issues/105

Should I use Postgres as a cron server?

There are plenty of ways to run cron jobs these days. You can trigger them from your local machine. You can install them on a VPS. You can schedule Serverless functions. You can use a paid service. You can use GitHub Actions.

Is Postgres the best place to put your cron jobs? ¯\_(ツ)_/¯. Postgres databases are free on Supabase and since it takes only one minute to get started, why not make your next cron server a Postgres database?

More Postgres resources

Footnotes

  1. Not necessarily a script. The cron is really a scheduler which triggers a job (of some sort, usually a bash script).

Share this article

Last post

Toad, a link shortener with simple APIs for low-coders

8 March 2021

Next post

Supabase Beta February 2021

2 March 2021

Related articles

Supabase Vault is now in Beta

PostgREST 11 pre-release

Point in Time Recovery is now available for Pro projects

pg_graphql v1.0

What's new in Postgres 15?

Build in a weekend, scale to millions