Recovery

In EDB Postgres Distributed for Kubernetes, recovery is available as a way to bootstrap a new PGD group starting from an available physical backup of a PGD node. The recovery can't be performed in place on an existing PGD group. EDB Postgres Distributed for Kubernetes also supports point-in-time recovery (PITR), which allows you to restore a PGD group up to any point in time, from the first available backup in your catalog to the last archived WAL. Having a WAL archive is mandatory in this case.

Prerequisite

Before recovering from a backup:

  • Make sure that the PostgreSQL configuration (.spec.cnp.postgresql.parameters) of the recovered cluster is compatible with the original one from a physical replication standpoint.

  • When recovering in a newly created namespace, first set up a cert-manager CA issuer before deploying the recovered PGD group.

For more information, see EDB Postgres for Kubernetes recovery - Additional considerations in the EDB Postgres for Kubernetes documentation.

Recovery from an object store

You can recover from a PGD node backup created by Barman Cloud and stored on supported object storage.

For example, given a PGD groupnamedpgdgroup-example` with three instances with backups available, your object storage contains a directory for each node:

pgdgroup-example-1, pgdgroup-example-2, pgdgroup-example-3

This example defines a full recovery from the object store. The operator transparently selects the latest backup between the defined serverNames and replays up to the last available WAL.

apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore
spec:
  [...]
  restore:
    serverNames:
      - pgdgroup-backup-1
      - pgdgroup-backup-2
      - pgdgroup-backup-3
    barmanObjectStore:
      destinationPath: "<destination path here>"
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
      wal:
        compression: gzip
        encryption: AES256
        maxParallel: 8
Important

Make sure to correctly configure the WAL section according to the source cluster. In the example, since the pgdgroup-example PGD group uses compression and encryption, make sure to set the proper parameters also in the PGD group that's being created by the restore.

Note

The example takes advantage of the parallel WAL restore feature, dedicating up to eight jobs to concurrently fetch the required WAL files from the archive. This feature can appreciably reduce the recovery time. Make sure that you plan ahead for this scenario and tune the value of this parameter for your environment. It makes a difference when you need it.

PITR from an object store

Instead of replaying all the WALs up to the latest one, after extracting a base backup, you can ask PostgreSQL to stop replaying WALs at any point in time. PostgreSQL uses this technique to achieve PITR. (The presence of a WAL archive is mandatory.)

This example defines a time-base target for the recovery:

apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore
spec:
  [...]
  restore:
    recoveryTarget:
      targetTime: "2023-08-11 11:14:21.00000+02"
    serverNames:
      - pgdgroup-backup-1
      - pgdgroup-backup-2
      - pgdgroup-backup-3
    barmanObjectStore:
      destinationPath: "<destination path here>"
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
      wal:
        compression: gzip
        encryption: AES256
        maxParallel: 8
Important

PITR requires you to specify a targetTime recovery target by using the options described in Recovery targets. When you use targetTime or targetLSN, the operator selects the closest backup that was completed before that target. Otherwise, it selects the last available backup in chronological order between the specified serverNames.

Recovery from an object store specifying a backupID

The .spec.restore.recoveryTarget.backupID option allows you to specify a base backup from which to start the recovery process. By default, this value is empty. If you assign a value to it, the operator uses that backup as the base for the recovery. The value must be in the form of a Barman backup ID.

This example recovers a new PGD group from a specific backupID of the pgdgroup-backup-1 PGD node:

apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroup
metadata:
  name: pgdgroup-restore
spec:
  [...]
  restore:
    recoveryTarget:
      backupID: 20230824T133000
    serverNames:
      - pgdgroup-backup-1
    barmanObjectStore:
      destinationPath: "<destination path here>"
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds
          key: ID
        secretAccessKey:
          name: backup-storage-creds
          key: KEY
      wal:
        compression: gzip
        encryption: AES256
        maxParallel: 8
Important

When a backupID is specified, make sure to define only the related PGD node in the serverNames option, and avoid defining the other ones.

Note

Defining a specific backupID is especially needed when using one of the following recovery targets: targetName, targetXID, and targetImmediate. In such cases, it's important to specify backupID, unless the last available backup in the catalog is okay.

Recovery targets

Beyond PITR are other recovery target criteria you can use. For more information on all the available recovery targets, see EDB Postgres for Kubernetes recovery targets in the EDB Postgres for Kubernetes documentation.