PostgreSQL Backup Image

The postgres-backup Docker image provides a containerized solution for PostgreSQL database backups with AWS S3 upload support.

Overview

This image is based on CloudNativePG's PostGIS image and includes additional tools for creating compressed backups and uploading them to AWS S3. It's designed to be used in Kubernetes Jobs for scheduled database backups, particularly in CloudNativePG (CNPG) environments.

Image Details

  • Base Image: ghcr.io/cloudnative-pg/postgis:14
  • Registry: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup
  • Purpose: PostgreSQL database backups with S3 storage
  • PostGIS Support: Includes PostGIS extensions for geospatial data

Included Tools

The image includes:

  • PostgreSQL 14: Base PostgreSQL client tools including pg_dump and pg_dumpall
  • PostGIS: Geospatial extensions for PostgreSQL
  • Python 3: For running scripts and AWS CLI
  • AWS CLI: For uploading backups to S3
  • gzip: For compressing backup files
  • bash: Shell for scripting
  • less: For viewing files
  • groff: Required dependency for AWS CLI

Dockerfile

FROM ghcr.io/cloudnative-pg/postgis:14

USER root

RUN set -eux; \
    # Make sure apt lists dir exists and is writable
    rm -rf /var/lib/apt/lists && \
    mkdir -p /var/lib/apt/lists/partial && \
    chmod -R 755 /var/lib/apt/lists && \
    \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        bash \
        python3-pip \
        gzip \
        less \
        groff && \
    pip3 install --no-cache-dir awscli && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/*

# Back to postgres user (uid 26 in CNPG images)
USER 26

RUN pg_dump --version && aws --version

Key Features

  • CloudNativePG Compatible: Based on CNPG's PostGIS image for compatibility with CNPG clusters
  • PostGIS Support: Includes PostGIS extensions for geospatial database backups
  • Security: Runs as non-root user (UID 26, postgres user) after installation
  • Optimized: Uses --no-install-recommends and cleans up apt cache to minimize image size

Usage

Pulling the Image

docker pull registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest

Running Manually

# The pipeline is passed to bash -c so that it runs inside the container.
# Unquoted, the host shell would run gzip and aws itself and expand $S3_BUCKET locally.
docker run --rm \
  -e PGHOST=your-db-host \
  -e PGPORT=5432 \
  -e PGUSER=backup-user \
  -e PGPASSWORD=your-password \
  -e PGDATABASE=your-database \
  -e AWS_ACCESS_KEY_ID=your-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret \
  -e S3_BUCKET=your-bucket \
  registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest \
  bash -o pipefail -c 'pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz"'

Kubernetes Job Example

apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-backup
spec:
  template:
    spec:
      containers:
        - name: postgres-backup
          image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
          env:
            - name: PGHOST
              value: "postgres-service.default.svc.cluster.local"
            - name: PGPORT
              value: "5432"
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            - name: PGDATABASE
              value: "my-database"
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: secret-access-key
            - name: S3_BUCKET
              value: "my-backup-bucket"
          command:
            - /bin/bash
            - -c
            - |
              set -o pipefail  # fail the Job if pg_dump or gzip fails, not only the upload
              pg_dump | \
                gzip | \
                aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz
      restartPolicy: Never

CronJob Example (Scheduled Backups)

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup-daily
spec:
  schedule: "0 2 * * *" # Run daily at 02:00 (controller time zone, usually UTC)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: postgres-backup
              image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
              env:
                - name: PGHOST
                  value: "postgres-service.default.svc.cluster.local"
                - name: PGPORT
                  value: "5432"
                - name: PGUSER
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: password
                - name: PGDATABASE
                  value: "my-database"
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-credentials
                      key: access-key-id
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-credentials
                      key: secret-access-key
                - name: S3_BUCKET
                  value: "my-backup-bucket"
              command:
                - /bin/bash
                - -c
                - |
                  set -o pipefail  # fail the Job if pg_dump or gzip fails, not only the upload
                  pg_dump | \
                    gzip | \
                    aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz
          restartPolicy: OnFailure

CloudNativePG (CNPG) Example

apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-backup
spec:
  template:
    spec:
      containers:
        - name: postgres-backup
          image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
          env:
            # pg_dump reads the standard libpq variables; point them at the CNPG read-write service
            - name: PGHOST
              value: "my-cluster-rw.default.svc.cluster.local"
            - name: PGPORT
              value: "5432"
            # CNPG generates a <cluster-name>-app secret with username and password keys;
            # adjust the name if your cluster uses a different secret
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: my-cluster-app
                  key: username
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: my-cluster-app
                  key: password
            - name: PGDATABASE
              value: "my-database"
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: secret-access-key
            - name: S3_BUCKET
              value: "my-backup-bucket"
          command:
            - /bin/bash
            - -c
            - |
              set -o pipefail  # fail the Job if pg_dump or gzip fails, not only the upload
              pg_dump | \
                gzip | \
                aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz
      restartPolicy: Never

Environment Variables

pg_dump and pg_dumpall read the standard PostgreSQL (libpq) connection variables, and the AWS CLI reads the AWS_* variables. S3_BUCKET is not read by either tool; it is a convention used by the example commands on this page.

Variable                 Description                   Required
PGHOST                   PostgreSQL server hostname    Yes
PGPORT                   PostgreSQL server port        No (defaults to 5432)
PGUSER                   PostgreSQL username           Yes
PGPASSWORD               PostgreSQL password           Yes
PGDATABASE               Database name to back up      Yes
AWS_ACCESS_KEY_ID        AWS access key for S3         Yes (if uploading to S3)
AWS_SECRET_ACCESS_KEY    AWS secret key for S3         Yes (if uploading to S3)
S3_BUCKET                S3 bucket name                Yes (if uploading to S3)
AWS_DEFAULT_REGION       AWS region                    No (defaults to us-east-1)
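
A missing variable usually surfaces only as a connection or upload error partway through the backup. As a minimal sketch, a Job script could check the required variables up front. The function name require_vars is hypothetical and not part of the image:

```shell
# Hypothetical helper (not shipped in the image): checks that every named
# environment variable is set and non-empty. Prints each missing name to
# stderr and returns 1 if any are missing, 0 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A backup script could then start with `require_vars PGHOST PGUSER PGPASSWORD PGDATABASE S3_BUCKET || exit 1` before calling pg_dump.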

Command Override

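The Dockerfile does not define an ENTRYPOINT or CMD of its own, so each Kubernetes Job specifies the command to run. A Job's command field overrides any entrypoint inherited from the base image, which leaves each Job free to choose its own backup strategy.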
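
Because the Job supplies the command, it also decides how pipeline failures are reported. The sketch below shows why this matters for a dump-compress-upload pipeline. Here `false` stands in for a failing pg_dump and `cat` for the later stages; both are stand-ins chosen for illustration.

```shell
# Without pipefail, a pipeline's exit status is that of its LAST command,
# so a failed dump followed by a successful upload still looks like success.
plain_status=0
bash -c 'false | cat >/dev/null' || plain_status=$?

# With pipefail, the pipeline fails if ANY stage fails.
strict_status=0
bash -o pipefail -c 'false | cat >/dev/null' || strict_status=$?

echo "without pipefail: $plain_status, with pipefail: $strict_status"
# prints "without pipefail: 0, with pipefail: 1"
```

In a Job, the same effect can be had by starting the script with `set -o pipefail` or by invoking `/bin/bash -o pipefail -c`.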

Typical Backup Commands

Single Database Backup

pg_dump | gzip | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz

All Databases Backup

pg_dumpall | gzip | aws s3 cp - s3://$S3_BUCKET/all-databases-backup-$(date +%Y%m%d-%H%M%S).sql.gz

Custom Format Backup

pg_dump -Fc | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).dump
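
The three commands above differ only in the name prefix and file extension of the uploaded object. One way to keep the naming consistent is to build the destination once, as in this sketch. The function name backup_uri and its argument order are assumptions made for illustration:

```shell
# Hypothetical helper: builds the S3 destination used by the commands above.
#   $1 = bucket, $2 = name prefix, $3 = file extension,
#   $4 = optional timestamp (defaults to the current time)
backup_uri() {
  bucket=$1
  prefix=$2
  ext=$3
  stamp=${4:-$(date +%Y%m%d-%H%M%S)}
  printf 's3://%s/%s-%s.%s\n' "$bucket" "$prefix" "$stamp" "$ext"
}
```

For example, `pg_dump -Fc | aws s3 cp - "$(backup_uri "$S3_BUCKET" backup dump)"` uploads to the same kind of key as the custom format command above.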

Advanced Usage

Backup with Options

pg_dump \
  --verbose \
  --no-owner \
  --no-acl \
  --format=custom \
  | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).dump

PostGIS-Specific Backup

pg_dump \
  --schema=public \
  --schema=postgis \
  | gzip | \
  aws s3 cp - s3://$S3_BUCKET/postgis-backup-$(date +%Y%m%d-%H%M%S).sql.gz

Multiple Databases

for db in database1 database2 database3; do
  PGDATABASE=$db pg_dump | \
    gzip | \
    aws s3 cp - s3://$S3_BUCKET/${db}-backup-$(date +%Y%m%d-%H%M%S).sql.gz
done
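
In the loop above, a failure for one database is easy to miss because the loop simply moves on to the next. The following sketch keeps going after a failure but returns a non-zero status at the end, so the Job is marked as failed. The function names backup_one and backup_all are hypothetical:

```shell
# Hypothetical helper: dump, compress and upload a single database.
# pipefail makes a pg_dump or gzip failure visible, not only an upload failure.
backup_one() {
  stamp=$(date +%Y%m%d-%H%M%S)
  PGDATABASE=$1 bash -o pipefail -c \
    'pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/$1-backup-$2.sql.gz"' _ "$1" "$stamp"
}

# Hypothetical helper: back up every database named on the command line.
# Continues after a failure and returns non-zero if any backup failed.
backup_all() {
  failed=""
  for db in "$@"; do
    if backup_one "$db"; then
      echo "ok: $db" >&2
    else
      echo "FAILED: $db" >&2
      failed="$failed $db"
    fi
  done
  [ -z "$failed" ]
}
```

A Job script would call `backup_all database1 database2 database3` as its last command so that the exit status reaches Kubernetes.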

Local Backup (No S3)

pg_dump | gzip > /backup/backup-$(date +%Y%m%d-%H%M%S).sql.gz
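
When writing to a local path, a quick integrity check catches empty or truncated files before they are relied on. This is a minimal sketch; the function name verify_backup is hypothetical, and the check only confirms that the file is a valid gzip stream, not that the SQL inside restores cleanly:

```shell
# Hypothetical helper: succeeds only if the file exists, is non-empty,
# and passes gzip's integrity test (gzip -t decompresses without writing output).
verify_backup() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

For example, `verify_backup /backup/backup-20240101-020000.sql.gz || exit 1` after the dump completes (the file name here is illustrative).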

Differences from MySQL Backup Image

  • Base Image: Uses CloudNativePG's PostGIS image instead of MySQL
  • Package Manager: Uses apt-get (Debian-based) instead of microdnf (Red Hat-based)
  • User: Runs as postgres user (UID 26) instead of root
  • Backup Tool: Uses pg_dump/pg_dumpall instead of mysqldump
  • PostGIS Support: Includes geospatial extensions
  • Environment Variables: Uses PostgreSQL standard variables (PGHOST, PGUSER, etc.) instead of MySQL-specific ones

Prerequisites

  • PostgreSQL Server: Accessible PostgreSQL server with appropriate credentials, running version 14 or older (pg_dump refuses to dump from a server newer than itself)
  • CloudNativePG (Optional): If using CNPG clusters, ensure proper cluster configuration
  • AWS Credentials: If uploading to S3, valid AWS credentials with S3 write permissions
  • S3 Bucket: If uploading to S3, an existing S3 bucket with appropriate permissions

Security Considerations

  • Credentials: Store PostgreSQL and AWS credentials as Kubernetes Secrets, not in plain text
  • Network Access: Ensure the container can reach the PostgreSQL server and AWS S3
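  • IAM Permissions: Grant only the permissions the upload needs, typically s3:PutObject on the backup bucket or prefix
  • Encryption: Encrypt backups at rest in S3, either through the bucket's default encryption or by passing --sse to aws s3 cp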
  • Non-Root User: The image runs as postgres user (UID 26) for better security
  • CNPG Secrets: When using CloudNativePG, use cluster-generated secrets for authentication

CloudNativePG Integration

This image is optimized for use with CloudNativePG (CNPG) clusters:

  • Compatible Base: Uses CNPG's official PostGIS image
  • User Permissions: Runs as postgres user (UID 26) matching CNPG's user model
  • Connection: Works with CNPG's service endpoints (e.g., cluster-name-rw)
  • Secrets: Compatible with CNPG's generated secrets

Notes

  • The image uses apt-get (Debian package manager) for package installation
  • Python packages are installed without cache to reduce image size
  • The image switches to postgres user (UID 26) after installation for security
  • Version information is displayed when building the image for verification
  • The image is optimized for Kubernetes Job usage patterns
  • Backups are compressed with gzip to reduce storage and transfer costs
  • PostGIS extensions are available for geospatial database backups
  • The image follows CloudNativePG best practices for user permissions