Files
system-docs/05_OPERACIONES/backup-recovery.md
ARCHITECT 0ee01d07a3 fix(arch): Enforce instance autonomy principle across docs
Updates to ensure DECK/CORP are documented as autonomous instances:

- overview.md: Clarify ARCHITECT is for build/deploy only, not runtime
- filosofia.md: Mark shared services (GRACE, etc.) as optional
- backup-recovery.md: Each instance does its own local backup to its own R2 bucket

Key principle: Instances never depend on ARCHITECT at runtime.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 10:40:19 +00:00

423 lines
9.9 KiB
Markdown

# Backup y Recovery TZZR
**Versión:** 5.0
**Fecha:** 2024-12-24
---
## Principio Fundamental
> **Cada instancia es responsable de su propio backup.**
DECK y CORP son instancias autónomas. No dependen de ARCHITECT para hacer backups.
Cada servidor ejecuta su script de backup localmente y sube directamente a R2.
---
## Estado Actual
### Backups Existentes
| Sistema | Backup | Destino | Frecuencia | Estado |
|---------|--------|---------|------------|--------|
| Gitea | Sí | R2 | Manual | Operativo |
| PostgreSQL ARCHITECT | No | - | - | **CRÍTICO** |
| PostgreSQL DECK | No | - | - | **CRÍTICO** |
| PostgreSQL CORP | No | - | - | **CRÍTICO** |
| PostgreSQL HST | No | - | - | **CRÍTICO** |
| R2 buckets | Built-in | R2 | Automático | Operativo |
---
## Arquitectura de Backups
```
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ ARCHITECT │ │ DECK │ │ CORP │
│ │ │ │ │ │
│ backup.sh ───┼────►│ backup.sh ───┼────►│ backup.sh ───┼────► R2
│ (local) │ │ (local) │ │ (local) │
└──────────────┘ └──────────────┘ └──────────────┘
│ │
▼ ▼
Sin dependencia Sin dependencia
de ARCHITECT de ARCHITECT
```
---
## Backup por Servidor (LOCAL)
### ARCHITECT (69.62.126.110)
**Ubicación script:** `/opt/scripts/backup_postgres.sh`
```bash
#!/bin/bash
# Ejecutar EN ARCHITECT - backup local
set -e
DATE=$(date +%F)
# Credenciales R2 (desde Vaultwarden local o .env)
source /opt/architect/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
# Backup local
sudo -u postgres pg_dump architect | gzip > /tmp/architect_$DATE.sql.gz
# Subir a R2
aws s3 cp /tmp/architect_$DATE.sql.gz s3://architect/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
rm /tmp/architect_$DATE.sql.gz
echo "ARCHITECT backup completado: $DATE"
```
### DECK (72.62.1.113)
**Ubicación script:** `/opt/scripts/backup_postgres.sh`
```bash
#!/bin/bash
# Ejecutar EN DECK - backup local (NO depende de ARCHITECT)
set -e
DATE=$(date +%F)
# Credenciales R2 (desde Vaultwarden DECK)
source /opt/deck/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
# Backup local
sudo -u postgres pg_dump tzzr | gzip > /tmp/deck_tzzr_$DATE.sql.gz
# Subir a R2 (bucket propio de DECK)
aws s3 cp /tmp/deck_tzzr_$DATE.sql.gz s3://deck/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
rm /tmp/deck_tzzr_$DATE.sql.gz
echo "DECK backup completado: $DATE"
```
### CORP (92.112.181.188)
**Ubicación script:** `/opt/scripts/backup_postgres.sh`
```bash
#!/bin/bash
# Ejecutar EN CORP - backup local (NO depende de ARCHITECT)
set -e
DATE=$(date +%F)
# Credenciales R2 (desde Vaultwarden CORP)
source /opt/corp/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
# Backup local
sudo -u postgres pg_dump corp | gzip > /tmp/corp_$DATE.sql.gz
# Subir a R2 (bucket propio de CORP)
aws s3 cp /tmp/corp_$DATE.sql.gz s3://corp/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
rm /tmp/corp_$DATE.sql.gz
echo "CORP backup completado: $DATE"
```
### HST (72.62.2.84)
**Ubicación script:** `/opt/scripts/backup_postgres.sh`
```bash
#!/bin/bash
# Ejecutar EN HST - backup local
set -e
DATE=$(date +%F)
source /opt/hst/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
sudo -u postgres pg_dump hst_images | gzip > /tmp/hst_$DATE.sql.gz
aws s3 cp /tmp/hst_$DATE.sql.gz s3://hst/backups/postgres/ \
--endpoint-url $R2_ENDPOINT
rm /tmp/hst_$DATE.sql.gz
echo "HST backup completado: $DATE"
```
---
## Cron en Cada Servidor
Cada instancia configura su propio cron:
```bash
# /etc/cron.d/tzzr-backup (en cada servidor)
# Backup diario a las 3:00 AM
0 3 * * * root /opt/scripts/backup_postgres.sh >> /var/log/backup.log 2>&1
```
---
## Gitea Backup
### Backup Manual
```bash
# En ARCHITECT
docker exec -t gitea bash -c 'gitea dump -c /data/gitea/conf/app.ini'
docker cp gitea:/app/gitea/gitea-dump-*.zip ./
# Subir a R2
aws s3 cp gitea-dump-*.zip s3://architect/backups/gitea/ \
--endpoint-url $R2_ENDPOINT
```
### Backup Automático
```bash
#!/bin/bash
# /opt/scripts/backup_gitea.sh
DATE=$(date +%F_%H%M)
# Crear dump
docker exec -t gitea bash -c "gitea dump -c /data/gitea/conf/app.ini -f /tmp/gitea-dump-$DATE.zip"
# Copiar fuera del container
docker cp gitea:/tmp/gitea-dump-$DATE.zip /tmp/
# Subir a R2
source /home/orchestrator/orchestrator/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
aws s3 cp /tmp/gitea-dump-$DATE.zip s3://architect/backups/gitea/ \
--endpoint-url https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com
# Cleanup
rm /tmp/gitea-dump-$DATE.zip
docker exec gitea rm /tmp/gitea-dump-$DATE.zip
```
---
## Estructura de Backups en R2
Cada instancia usa su propio bucket:
```
s3://architect/backups/
├── postgres/
│ └── architect_2024-12-24.sql.gz
└── gitea/
└── gitea-dump-2024-12-24_0300.zip
s3://deck/backups/
└── postgres/
└── deck_tzzr_2024-12-24.sql.gz
s3://corp/backups/
└── postgres/
└── corp_2024-12-24.sql.gz
s3://hst/backups/
└── postgres/
└── hst_2024-12-24.sql.gz
```
> **Nota:** Cada instancia es dueña de sus backups. No hay dependencia cruzada.
---
## Retención de Backups
### Política Propuesta
| Tipo | Retención | Notas |
|------|-----------|-------|
| Diario | 7 días | Últimos 7 backups |
| Semanal | 4 semanas | Domingos |
| Mensual | 12 meses | Primer día del mes |
### Script de Limpieza
```bash
#!/bin/bash
# /opt/scripts/cleanup_backups.sh
source /home/orchestrator/orchestrator/.env
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_KEY"
R2_ENDPOINT="https://7dedae6030f5554d99d37e98a5232996.r2.cloudflarestorage.com"
# Eliminar backups más antiguos de 30 días
# (Implementar con lifecycle rules de R2 preferentemente)
aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
while read -r line; do
createDate=$(echo $line | awk '{print $1}')
fileName=$(echo $line | awk '{print $4}')
# Comparar fechas y eliminar si > 30 días
# ...
done
```
---
## Recovery
### PostgreSQL - Restaurar Base de Datos
```bash
# Descargar backup
aws s3 cp s3://architect/backups/postgres/architect_2024-12-24.sql.gz . \
--endpoint-url $R2_ENDPOINT
# Descomprimir
gunzip architect_2024-12-24.sql.gz
# Restaurar (¡CUIDADO! Sobrescribe datos existentes)
sudo -u postgres psql -d architect < architect_2024-12-24.sql
```
### PostgreSQL - Restaurar a Nueva Base
```bash
# Crear nueva base de datos
sudo -u postgres createdb architect_restored
# Restaurar
gunzip -c architect_2024-12-24.sql.gz | sudo -u postgres psql -d architect_restored
```
### Gitea - Restaurar
```bash
# Descargar backup
aws s3 cp s3://architect/backups/gitea/gitea-dump-2024-12-24_0300.zip . \
--endpoint-url $R2_ENDPOINT
# Detener Gitea
docker stop gitea
# Copiar al container
docker cp gitea-dump-2024-12-24_0300.zip gitea:/tmp/
# Restaurar
docker exec gitea bash -c "cd /tmp && unzip gitea-dump-2024-12-24_0300.zip"
# Seguir instrucciones de Gitea para restore
# Iniciar Gitea
docker start gitea
```
---
## Disaster Recovery Plan
### Escenario 1: Pérdida de ARCHITECT
1. Provisionar nuevo VPS con misma IP (si posible)
2. Instalar Ubuntu 22.04
3. Configurar usuario orchestrator
4. Restaurar PostgreSQL desde R2
5. Restaurar Gitea desde R2
6. Reinstalar Docker y servicios
7. Verificar conectividad con DECK/CORP/HST
### Escenario 2: Pérdida de DECK
1. Provisionar nuevo VPS
2. Restaurar PostgreSQL (tzzr) desde backup
3. Reinstalar CLARA, ALFRED
4. Reinstalar Mailcow (requiere backup separado)
5. Actualizar DNS si IP cambió
### Escenario 3: Pérdida de CORP
1. Provisionar nuevo VPS
2. Restaurar PostgreSQL (corp) desde backup
3. Reinstalar MARGARET, JARED, MASON, FELDMAN
4. Reinstalar Odoo, Nextcloud
5. Activar UFW (nuevo servidor)
### Escenario 4: Pérdida de R2
**IMPROBABLE** - Cloudflare tiene redundancia multi-región.
Mitigación: Backup mensual a segundo proveedor (AWS S3 Glacier).
---
## Verificación de Backups
### Test Mensual
```bash
#!/bin/bash
# /opt/scripts/verify_backup.sh
# Descargar último backup
LATEST=$(aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
sort | tail -1 | awk '{print $4}')
aws s3 cp s3://architect/backups/postgres/$LATEST /tmp/ \
--endpoint-url $R2_ENDPOINT
# Verificar integridad
gunzip -t /tmp/$LATEST
if [ $? -eq 0 ]; then
echo "Backup válido: $LATEST"
else
echo "ERROR: Backup corrupto: $LATEST"
# Enviar alerta
fi
rm /tmp/$LATEST
```
### Checklist de Verificación
- [ ] Backup PostgreSQL ARCHITECT existe (< 24h)
- [ ] Backup PostgreSQL DECK existe (< 24h)
- [ ] Backup PostgreSQL CORP existe (< 24h)
- [ ] Backup Gitea existe (< 7d)
- [ ] Integridad verificada (gunzip -t)
- [ ] Restore test exitoso (mensual)
---
## Alertas
### Configuración ntfy
```bash
# Notificar si backup falla
if [ $? -ne 0 ]; then
curl -d "Backup FALLIDO: $DATE" ntfy.sh/tzzr-alerts
fi
```
### Monitoreo
```bash
# Verificar último backup
aws s3 ls s3://architect/backups/postgres/ --endpoint-url $R2_ENDPOINT | \
sort | tail -5
```