
How to restore Infrahub on Kubernetes

This guide walks you through restoring your Infrahub instance on Kubernetes from a backup stored in S3-compatible storage. The restore process uses the same infrahub-backup Helm chart and runs as a Kubernetes Job.

warning

Restoring a backup will overwrite your current Infrahub data. Create a safety backup before proceeding if you need to preserve any recent changes.

Prerequisites

Before restoring, ensure you have:

  • A backup file stored in S3-compatible storage
  • A backup created from the same Infrahub edition (Community or Enterprise) as the target deployment
  • Access to modify the Helm values for your Infrahub deployment
  • S3 credentials with read access to the backup bucket

Step 1: Identify the backup to restore

List available backups in your S3 bucket:

# AWS S3
aws s3 ls s3://my-infrahub-backups/

# MinIO
mc ls myminio/my-infrahub-backups/

Note the exact filename of the backup you want to restore, for example: infrahub_backup_20250120_020000.tar.gz
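Because the timestamp is embedded in the filename, backup names sort lexicographically in chronological order, so the newest backup can be selected programmatically. A minimal sketch (the listing below is simulated `aws s3 ls` output for illustration; in practice, pipe the real command instead):

```shell
# Simulated `aws s3 ls` output; replace with: aws s3 ls s3://my-infrahub-backups/
listing='2025-01-19 02:00:05 1572864000 infrahub_backup_20250119_020000.tar.gz
2025-01-20 02:00:04 1572913000 infrahub_backup_20250120_020000.tar.gz'

# The embedded timestamps sort lexicographically,
# so the last entry after sorting is the newest backup
LATEST=$(printf '%s\n' "$listing" | awk '{print $4}' | sort | tail -n 1)
echo "$LATEST"
```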

Verify backup metadata

Before restoring, verify the backup is compatible:

# Download and inspect metadata
aws s3 cp s3://my-infrahub-backups/infrahub_backup_20250120_020000.tar.gz ./
tar -xzOf infrahub_backup_20250120_020000.tar.gz backup_information.json | jq '.'

Check that:

  • neo4j_edition matches your current deployment
  • infrahub_version is compatible with your target version
  • components includes the data you need to restore
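These checks can be scripted so a restore fails fast on an incompatible backup. A sketch using jq; the JSON layout below is a hypothetical reconstruction based on the fields listed above, and the real backup_information.json may differ:

```shell
# Hypothetical backup_information.json, based on the fields described above
cat > backup_information.json <<'EOF'
{
  "infrahub_version": "0.15.0",
  "neo4j_edition": "community",
  "components": ["database", "task-manager-db"]
}
EOF

# Abort if the Neo4j edition does not match the target deployment
expected_edition="community"
actual_edition=$(jq -r '.neo4j_edition' backup_information.json)
[ "$actual_edition" = "$expected_edition" ] || { echo "edition mismatch: $actual_edition" >&2; exit 1; }

# Confirm the component you need is included in the backup
jq -e '.components | index("database") != null' backup_information.json > /dev/null \
  && echo "database component present"
```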

Step 2: Configure S3 source

Create the S3 credentials secret, or verify that it already exists:

kubectl create secret generic backup-s3-credentials \
  --namespace infrahub \
  --from-literal=AWS_ACCESS_KEY_ID=your-access-key \
  --from-literal=AWS_SECRET_ACCESS_KEY=your-secret-key

Step 3: Configure the restore Job

Add the restore configuration to your Infrahub Helm values:

# values.yaml for Infrahub Helm chart
infrahub-backup:
  enabled: true

  # Disable backup during restore
  backup:
    enabled: false

  # Enable restore
  restore:
    enabled: true
    s3:
      bucket: "my-infrahub-backups"
      key: "infrahub_backup_20250120_020000.tar.gz"
      endpoint: "https://s3.amazonaws.com"
      region: "us-east-1"
      secretName: "backup-s3-credentials"

Step 4: Deploy the restore Job

warning

The restore process will stop Infrahub services temporarily. Plan for downtime during the restore operation.

Update your Infrahub Helm release:

helm upgrade infrahub opsmill/infrahub \
  --namespace infrahub \
  --values values.yaml

Step 5: Monitor restore progress

Watch the restore Job

# Check Job status
kubectl get job -n infrahub -l app.kubernetes.io/name=infrahub-backup

# Watch pod status
kubectl get pods -n infrahub -l app.kubernetes.io/name=infrahub-backup -w

View restore logs

# Stream logs from the restore pod
kubectl logs -n infrahub -l app.kubernetes.io/name=infrahub-backup -f

Expected output for a successful restore:

INFO[0000] Starting restore process...
INFO[0001] Downloading backup from S3: s3://my-infrahub-backups/infrahub_backup_20250120_020000.tar.gz
INFO[0010] Download completed (1.5GB)
INFO[0010] Extracting backup archive...
INFO[0012] Validating backup metadata...
INFO[0012] Backup ID: 20250120_020000
INFO[0012] Infrahub version: 0.15.0
INFO[0012] Components: database, task-manager-db
INFO[0013] Validating checksums...
INFO[0015] All checksums valid
INFO[0015] Stopping Infrahub services...
INFO[0020] Wiping transient data (cache, message-queue)...
INFO[0022] Restoring PostgreSQL database...
INFO[0030] PostgreSQL restore completed
INFO[0030] Restarting support services...
INFO[0035] Restoring Neo4j database...
INFO[0060] Neo4j restore completed
INFO[0060] Starting Infrahub services...
INFO[0070] All services started
INFO[0070] Restore completed successfully

Step 6: Verify restored instance

Check service health

# Verify all pods are running
kubectl get pods -n infrahub

# Check Infrahub server logs
kubectl logs -n infrahub -l app.kubernetes.io/component=infrahub-server --tail=50

Validate data integrity

  1. Access the Infrahub UI: log in and verify your data is present.
  2. Check the GraphQL API: query a known object to confirm the data was restored.
  3. Review the task manager: verify historical task runs are visible.

Test a sample query

# Port-forward to the Infrahub server (this command blocks; leave it running)
kubectl port-forward -n infrahub svc/infrahub-server 8000:8000

# In a second terminal, query the API
curl -X POST http://localhost:8000/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ InfrahubStatus { summary { schema_hash } } }"}'
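To assert that the response actually contains a schema hash rather than an error, the curl output can be checked with jq. The response body below is illustrative (a hypothetical hash value); a real call returns the same shape with your deployment's hash:

```shell
# Illustrative response body; in practice capture the curl output instead,
# e.g. response=$(curl -s -X POST http://localhost:8000/graphql ...)
response='{"data":{"InfrahubStatus":{"summary":{"schema_hash":"a1b2c3d4"}}}}'

# A non-empty, non-null schema_hash means the server answered
# with restored schema data rather than an error payload
hash=$(printf '%s' "$response" | jq -r '.data.InfrahubStatus.summary.schema_hash')
[ -n "$hash" ] && [ "$hash" != "null" ] && echo "schema hash: $hash"
```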

Step 7: Disable restore and re-enable backups

After a successful restore, update your values to disable the restore Job and re-enable scheduled backups:

infrahub-backup:
  enabled: true

  backup:
    enabled: true
    mode: "cronjob"
    schedule: "0 2 * * *" # daily at 02:00
    storage:
      type: "s3"
      s3:
        bucket: "my-infrahub-backups"
        endpoint: "https://s3.amazonaws.com"
        region: "us-east-1"
        secretName: "backup-s3-credentials"

  restore:
    enabled: false # Disable restore to prevent accidental re-runs

Apply the updated configuration:

helm upgrade infrahub opsmill/infrahub \
  --namespace infrahub \
  --values values.yaml

Troubleshooting

Restore Job fails to start

Check if the S3 credentials secret exists:

kubectl get secret backup-s3-credentials -n infrahub

Verify the secret contains the expected keys:

kubectl get secret backup-s3-credentials -n infrahub -o jsonpath='{.data}' | jq 'keys'

Download fails

Check network connectivity and S3 endpoint configuration:

# View detailed error in logs
kubectl logs -n infrahub -l app.kubernetes.io/name=infrahub-backup

# Common issues:
# - Incorrect bucket name
# - Wrong S3 endpoint URL
# - Invalid credentials
# - Network policy blocking egress

Checksum validation fails

The backup file may be corrupted. Try downloading a fresh copy:

# Verify the backup locally
aws s3 cp s3://my-infrahub-backups/infrahub_backup_20250120_020000.tar.gz ./
tar -tzf infrahub_backup_20250120_020000.tar.gz > /dev/null
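Note that `tar -tzf` only proves the archive is readable, not that its bytes are intact. If you record a checksum at backup time, you can later verify any downloaded copy byte-for-byte. A sketch of the workflow using a stand-in file (substitute your real archive name):

```shell
# Stand-in for the real archive, just to demonstrate the workflow
printf 'example archive bytes' > infrahub_backup_example.tar.gz

# At backup time: record the checksum (store it next to the archive in S3)
sha256sum infrahub_backup_example.tar.gz > infrahub_backup_example.tar.gz.sha256

# At restore time: verify the downloaded copy against the recorded checksum;
# a non-zero exit status means the file was corrupted in transit or at rest
sha256sum -c infrahub_backup_example.tar.gz.sha256
```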

Services fail to restart

Check for resource constraints or scheduling issues:

# Check pod events
kubectl describe pods -n infrahub -l app.kubernetes.io/name=infrahub

# Check node resources
kubectl top nodes

Validation

Confirm your restore completed successfully:

  • Restore Job completed without errors
  • All Infrahub pods are running
  • Infrahub UI is accessible
  • Data is present and correct
  • Task manager shows historical runs
  • Scheduled backups are re-enabled
  • Restore Job is disabled to prevent accidental re-runs