PostgreSQL Backup Image
The postgres-backup Docker image provides a containerized solution for PostgreSQL database backups with AWS S3 upload support.
Overview
This image is based on CloudNativePG's PostGIS image and includes additional tools for creating compressed backups and uploading them to AWS S3. It's designed to be used in Kubernetes Jobs for scheduled database backups, particularly in CloudNativePG (CNPG) environments.
Image Details
- Base Image: ghcr.io/cloudnative-pg/postgis:14
- Registry: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup
- Purpose: PostgreSQL database backups with S3 storage
- PostGIS Support: Includes PostGIS extensions for geospatial data
Included Tools
The image includes:
- PostgreSQL 14: Base PostgreSQL client tools including pg_dump and pg_dumpall
- PostGIS: Geospatial extensions for PostgreSQL
- Python 3: For running scripts and AWS CLI
- AWS CLI: For uploading backups to S3
- gzip: For compressing backup files
- bash: Shell for scripting
- less: For viewing files
- groff: Required dependency for AWS CLI
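As a quick sanity check, a backup script can confirm that these tools are on the PATH before doing any work. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of the image.

```bash
#!/usr/bin/env bash
# Print the name of every requested tool that is not found on PATH.
# Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: fail early if something the backup needs is absent.
# missing=$(missing_tools pg_dump pg_dumpall aws gzip)
# [ -z "$missing" ] || { echo "missing tools: $missing" >&2; exit 1; }
```

Running the check at the start of a Job makes a broken image fail immediately instead of partway through a dump.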
Dockerfile
FROM ghcr.io/cloudnative-pg/postgis:14
USER root
RUN set -eux; \
    # Make sure apt lists dir exists and is writable
    rm -rf /var/lib/apt/lists && \
    mkdir -p /var/lib/apt/lists/partial && \
    chmod -R 755 /var/lib/apt/lists && \
    \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        bash \
        python3-pip \
        gzip \
        less \
        groff && \
    pip3 install --no-cache-dir awscli && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/*
# Back to postgres user (uid 26 in CNPG images)
USER 26
RUN pg_dump --version && aws --version
Key Features
- CloudNativePG Compatible: Based on CNPG's PostGIS image for compatibility with CNPG clusters
- PostGIS Support: Includes PostGIS extensions for geospatial database backups
- Security: Runs as non-root user (UID 26, postgres user) after installation
- Optimized: Uses --no-install-recommends and cleans up the apt cache to minimize image size
Usage
Pulling the Image
docker pull registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
Running Manually
# The pipeline is quoted so that it runs inside the container, not on the host
docker run --rm \
  -e PGHOST=your-db-host \
  -e PGPORT=5432 \
  -e PGUSER=backup-user \
  -e PGPASSWORD=your-password \
  -e PGDATABASE=your-database \
  -e AWS_ACCESS_KEY_ID=your-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret \
  -e S3_BUCKET=your-bucket \
  registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest \
  bash -c 'set -o pipefail; pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz"'
Kubernetes Job Example
apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-backup
spec:
  template:
    spec:
      containers:
      - name: postgres-backup
        image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
        env:
        - name: PGHOST
          value: "postgres-service.default.svc.cluster.local"
        - name: PGPORT
          value: "5432"
        - name: PGUSER
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: username
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: password
        - name: PGDATABASE
          value: "my-database"
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: secret-access-key
        - name: S3_BUCKET
          value: "my-backup-bucket"
        command:
        - /bin/bash
        - -c
        - |
          # Fail the Job if any stage of the pipeline fails
          set -euo pipefail
          pg_dump | \
            gzip | \
            aws s3 cp - "s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz"
      restartPolicy: Never
CronJob Example (Scheduled Backups)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup-daily
spec:
  schedule: "0 2 * * *" # Run daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
            env:
            - name: PGHOST
              value: "postgres-service.default.svc.cluster.local"
            - name: PGPORT
              value: "5432"
            - name: PGUSER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            - name: PGDATABASE
              value: "my-database"
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: secret-access-key
            - name: S3_BUCKET
              value: "my-backup-bucket"
            command:
            - /bin/bash
            - -c
            - |
              # Fail the Job if any stage of the pipeline fails
              set -euo pipefail
              pg_dump | \
                gzip | \
                aws s3 cp - "s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz"
          restartPolicy: OnFailure
CloudNativePG (CNPG) Example
apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-backup
spec:
  template:
    spec:
      containers:
      - name: postgres-backup
        image: registry.gitlab.com/welance/platform/pipelines/container/images/postgres-backup:latest
        env:
        # CNPG uses standard PostgreSQL environment variables
        - name: PGHOST
          value: "my-cluster-rw.default.svc.cluster.local"
        - name: PGPORT
          value: "5432"
        # The secret name below is a placeholder. CNPG typically generates
        # <cluster>-app (and <cluster>-superuser when enabled) with
        # username and password keys; adjust to match your cluster.
        - name: PGUSER
          valueFrom:
            secretKeyRef:
              name: my-cluster-replication-secret
              key: username
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: my-cluster-replication-secret
              key: password
        - name: PGDATABASE
          value: "my-database"
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: secret-access-key
        - name: S3_BUCKET
          value: "my-backup-bucket"
        command:
        - /bin/bash
        - -c
        - |
          # Fail the Job if any stage of the pipeline fails
          set -euo pipefail
          pg_dump | \
            gzip | \
            aws s3 cp - "s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz"
      restartPolicy: Never
Environment Variables
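The PostgreSQL client tools and the AWS CLI read their settings from standard environment variables. S3_BUCKET is not read by either tool; it is a convention used by the example commands in this document: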
| Variable | Description | Required |
|---|---|---|
| PGHOST | PostgreSQL server hostname | Yes |
| PGPORT | PostgreSQL server port | No (defaults to 5432) |
| PGUSER | PostgreSQL username | Yes |
| PGPASSWORD | PostgreSQL password | Yes |
| PGDATABASE | Name of the database to back up | Yes |
| AWS_ACCESS_KEY_ID | AWS access key for S3 | Yes (if uploading to S3) |
| AWS_SECRET_ACCESS_KEY | AWS secret key for S3 | Yes (if uploading to S3) |
| S3_BUCKET | S3 bucket name | Yes (if uploading to S3) |
| AWS_DEFAULT_REGION | AWS region | No (defaults to us-east-1) |
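A backup script can verify the required variables from the table before it starts. The sketch below is one possible approach; missing_env is a hypothetical helper name, and the list of variables passed to it is up to the caller.

```bash
#!/usr/bin/env bash
# Print the name of every listed variable that is unset or empty.
# Prints nothing when all of them have a value.
missing_env() {
  for name in "$@"; do
    # Indirect lookup via eval keeps the helper portable to plain sh.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Example: abort before running pg_dump if anything required is missing.
# missing=$(missing_env PGHOST PGUSER PGPASSWORD PGDATABASE S3_BUCKET)
# [ -z "$missing" ] || { echo "missing variables: $missing" >&2; exit 1; }
```

Only pass trusted, literal variable names to the helper, since it uses eval for the lookup.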
Command Override
The Dockerfile adds no ENTRYPOINT or CMD of its own, so each Kubernetes Job supplies the backup command through its command field. This keeps the image usable for different backup strategies.
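One way to use this flexibility is to select the dump pipeline from a single setting. The sketch below maps a mode name to the dump commands listed under Typical Backup Commands; the function dump_pipeline and the mode names single, all, and custom are hypothetical and not defined by the image.

```bash
#!/usr/bin/env bash
# Print the dump pipeline (without the upload step) for a given mode.
dump_pipeline() {
  case "$1" in
    single) echo 'pg_dump | gzip' ;;      # one database, gzip-compressed SQL
    all)    echo 'pg_dumpall | gzip' ;;   # every database in the cluster
    custom) echo 'pg_dump -Fc' ;;         # custom format, already compressed
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

A Job command could then append the upload step to the printed pipeline and run the result with bash -o pipefail -c.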
Typical Backup Commands
Single Database Backup
pg_dump | gzip | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).sql.gz
All Databases Backup
pg_dumpall | gzip | aws s3 cp - s3://$S3_BUCKET/all-databases-backup-$(date +%Y%m%d-%H%M%S).sql.gz
Custom Format Backup
pg_dump -Fc | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).dump
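By default a shell reports only the exit status of the last command in a pipeline, so a failed pg_dump can go unnoticed as long as the upload itself succeeds. The sketch below runs a pipeline under bash with pipefail enabled; run_strict is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Run a pipeline string under bash with pipefail, so that a failure in any
# stage (for example pg_dump) makes the whole pipeline return non-zero.
run_strict() {
  bash -o pipefail -c "$1"
}

# Without pipefail, `false | cat` exits 0 because cat succeeded.
# With pipefail, the same pipeline exits non-zero because `false` failed.
# run_strict 'pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/backup.sql.gz"'
```

Note that a partial object may still reach the bucket when an early stage fails. The non-zero status is what allows the Job to be marked as failed and retried.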
Advanced Usage
Backup with Options
pg_dump \
  --verbose \
  --no-owner \
  --no-acl \
  --format=custom \
  | aws s3 cp - s3://$S3_BUCKET/backup-$(date +%Y%m%d-%H%M%S).dump
PostGIS-Specific Backup
pg_dump \
  --schema=public \
  --schema=postgis \
  | gzip | \
  aws s3 cp - s3://$S3_BUCKET/postgis-backup-$(date +%Y%m%d-%H%M%S).sql.gz
Multiple Databases
for db in database1 database2 database3; do
  PGDATABASE=$db pg_dump | \
    gzip | \
    aws s3 cp - s3://$S3_BUCKET/${db}-backup-$(date +%Y%m%d-%H%M%S).sql.gz
done
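The loop above keeps going when one database fails, and its exit status reflects only the last iteration. The following sketch tracks failures across all databases; backup_all and backup_one are hypothetical helper names, and backup_one assumes S3_BUCKET is exported in the environment.

```bash
#!/usr/bin/env bash
# Run a per-database backup command for each database and report whether
# all of them succeeded. A failing database does not stop the loop.
backup_all() {
  backup_cmd=$1; shift
  failed=0
  for db in "$@"; do
    if ! "$backup_cmd" "$db"; then
      echo "backup failed for $db" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# Hypothetical per-database backup using the same pipeline as above,
# run with pipefail so that a pg_dump failure is detected.
backup_one() {
  PGDATABASE=$1 bash -o pipefail -c \
    'pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/$PGDATABASE-backup-$(date +%Y%m%d-%H%M%S).sql.gz"'
}

# Example: backup_all backup_one database1 database2 database3
```

Returning non-zero when any database fails lets a Kubernetes Job surface the problem while still attempting every backup.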
Local Backup (No S3)
pg_dump | gzip > /backup/backup-$(date +%Y%m%d-%H%M%S).sql.gz
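When writing to a local path, an interrupted dump leaves a truncated file under the final name. One possible safeguard, sketched below, is to write to a temporary name and rename only on success; dump_to_file is a hypothetical helper, and the /backup path is taken from the example above.

```bash
#!/usr/bin/env bash
# Run the given command with stdout redirected to "$dest.partial" and move
# the result to "$dest" only if the command succeeded. On failure the
# partial file is removed and the function returns 1.
dump_to_file() {
  dest=$1; shift
  if "$@" > "$dest.partial"; then
    mv "$dest.partial" "$dest"
  else
    rm -f "$dest.partial"
    return 1
  fi
}

# Example:
# dump_to_file "/backup/backup-$(date +%Y%m%d-%H%M%S).sql.gz" \
#   bash -o pipefail -c 'pg_dump | gzip'
```

Because the rename happens within one directory, anything reading /backup sees either a complete file or no file at all.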
Differences from MySQL Backup Image
- Base Image: Uses CloudNativePG's PostGIS image instead of MySQL
- Package Manager: Uses apt-get (Debian-based) instead of microdnf (Red Hat-based)
- User: Runs as postgres user (UID 26) instead of root
- Backup Tool: Uses pg_dump/pg_dumpall instead of mysqldump
- PostGIS Support: Includes geospatial extensions
- Environment Variables: Uses PostgreSQL standard variables (PGHOST, PGUSER, etc.) instead of MySQL-specific ones
Prerequisites
- PostgreSQL Server: Accessible PostgreSQL server with appropriate credentials
- CloudNativePG (Optional): If using CNPG clusters, ensure proper cluster configuration
- AWS Credentials: If uploading to S3, valid AWS credentials with S3 write permissions
- S3 Bucket: If uploading to S3, an existing S3 bucket with appropriate permissions
Security Considerations
- Credentials: Store PostgreSQL and AWS credentials as Kubernetes Secrets, not in plain text
- Network Access: Ensure the container can reach the PostgreSQL server and AWS S3
- IAM Permissions: Use IAM roles with minimal required permissions (S3 write only)
- Encryption: Consider encrypting backups at rest in S3
- Non-Root User: The image runs as postgres user (UID 26) for better security
- CNPG Secrets: When using CloudNativePG, use cluster-generated secrets for authentication
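For the encryption point, the aws s3 cp command accepts an --sse option that requests server-side encryption for the uploaded object. The sketch below builds that option from a setting; S3_SSE is a hypothetical variable name introduced for this example, not something the image reads.

```bash
#!/usr/bin/env bash
# Print extra `aws s3 cp` arguments for server-side encryption.
# S3_SSE (hypothetical) may be empty, "AES256", or "aws:kms".
sse_args() {
  case "${S3_SSE:-}" in
    "")             ;;  # no encryption option requested
    AES256|aws:kms) printf '%s %s\n' --sse "$S3_SSE" ;;
    *)              echo "unsupported S3_SSE value: $S3_SSE" >&2; return 1 ;;
  esac
}

# Example (the unquoted substitution is intentional so the two words split):
# pg_dump | gzip | aws s3 cp - "s3://$S3_BUCKET/backup.sql.gz" $(sse_args)
```

A bucket-level default encryption policy achieves a similar result without changing the command, if that is available in your account.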
CloudNativePG Integration
This image is optimized for use with CloudNativePG (CNPG) clusters:
- Compatible Base: Uses CNPG's official PostGIS image
- User Permissions: Runs as postgres user (UID 26) matching CNPG's user model
- Connection: Works with CNPG's service endpoints (e.g., cluster-name-rw)
- Secrets: Compatible with CNPG's generated secrets
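CNPG exposes each cluster through services named after the cluster with a role suffix: -rw for the primary, -ro for replicas, and -r for any instance. The sketch below builds the in-cluster DNS name for PGHOST; cnpg_host is a hypothetical helper, and it assumes the default cluster.local cluster domain.

```bash
#!/usr/bin/env bash
# Build the in-cluster DNS name of a CNPG service.
# Arguments: cluster name, namespace (default: default), role (default: rw).
cnpg_host() {
  cluster=$1
  namespace=${2:-default}
  role=${3:-rw}
  case "$role" in
    rw|ro|r) printf '%s-%s.%s.svc.cluster.local\n' "$cluster" "$role" "$namespace" ;;
    *)       echo "unknown role: $role" >&2; return 1 ;;
  esac
}

# Example: export PGHOST=$(cnpg_host my-cluster default rw)
```

With the defaults, cnpg_host my-cluster yields the same host used in the CNPG Job example earlier in this document.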
Notes
- The image uses apt-get (Debian package manager) for package installation
- Python packages are installed without cache to reduce image size
- The image switches to postgres user (UID 26) after installation for security
- Version information is displayed when building the image for verification
- The image is optimized for Kubernetes Job usage patterns
- Backups are compressed with gzip to reduce storage and transfer costs
- PostGIS extensions are available for geospatial database backups
- The image follows CloudNativePG best practices for user permissions