ai-tax-agent/infra/MIGRATION_GUIDE.md
harkon b324ff09ef
Initial commit
2025-10-11 08:41:36 +01:00

Infrastructure Migration Guide

This guide walks you through migrating from the old infrastructure layout to the new multi-environment layout.

Old Structure vs New Structure

Old Structure

infra/
├── compose/
│   ├── docker-compose.local.yml (1013 lines - everything)
│   ├── docker-compose.backend.yml (1014 lines - everything)
│   ├── authentik/compose.yaml
│   ├── gitea/compose.yaml
│   ├── nextcloud/compose.yaml
│   ├── portainer/docker-compose.yaml
│   └── traefik/compose.yaml
├── production/
│   ├── infrastructure.yaml
│   ├── services.yaml
│   └── monitoring.yaml
├── .env.production
└── various config folders

New Structure

infra/
├── base/                      # Shared compose files
│   ├── infrastructure.yaml
│   ├── services.yaml
│   ├── monitoring.yaml
│   └── external.yaml
├── environments/              # Environment-specific configs
│   ├── local/.env
│   ├── development/.env
│   └── production/.env
├── configs/                   # Service configurations
│   ├── traefik/
│   ├── grafana/
│   ├── prometheus/
│   └── ...
└── scripts/
    └── deploy.sh              # Unified deployment script

Migration Steps

Step 1: Backup Current Setup

# Backup current environment files
cp infra/.env.production infra/.env.production.backup
cp infra/compose/.env infra/compose/.env.backup

# Backup compose files
tar -czf infra-backup-$(date +%Y%m%d).tar.gz infra/
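Before stopping anything, it may be worth confirming that the archive is readable and contains the files you expect. A minimal sketch, assuming a tar that supports `-z`; the function name `verify_backup` is illustrative and not part of the repository:

```shell
# Hypothetical helper: confirm a backup archive is readable and contains a path.
verify_backup() {
  local archive="$1" expected="$2"
  # -t lists the archive without extracting; a corrupt archive makes tar fail.
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "unreadable archive: $archive" >&2
    return 1
  fi
  # Look for the expected entry (for example infra/.env.production).
  if ! tar -tzf "$archive" | grep -qF "$expected"; then
    echo "missing from archive: $expected" >&2
    return 1
  fi
  echo "ok: $archive contains $expected"
}

# Usage: verify_backup "infra-backup-$(date +%Y%m%d).tar.gz" infra/.env.production
```

The check only lists the archive, so it is safe to run repeatedly.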

Step 2: Stop Current Services (if migrating live)

# Stop services (if running)
cd infra/compose
docker compose -f docker-compose.local.yml down

# Or for production
cd infra/production
docker compose -f infrastructure.yaml down
docker compose -f services.yaml down
docker compose -f monitoring.yaml down

Step 3: Create Environment Files

# For local development
cp infra/environments/local/.env.example infra/environments/local/.env
vim infra/environments/local/.env

# For development server
cp infra/environments/development/.env.example infra/environments/development/.env
vim infra/environments/development/.env

# For production (copy from existing)
cp infra/.env.production infra/environments/production/.env
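After copying or editing an env file, a quick check for missing or empty keys can catch mistakes before deployment. The sketch below is an assumption about how you might do this; `check_env_keys` is a hypothetical helper, and the key names you pass depend on what your compose files actually reference (`DOMAIN` is used only as an example):

```shell
# Hypothetical helper: report keys that are missing or empty in an env file.
check_env_keys() {
  local file="$1"; shift
  local key missing=0
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 2; }
  for key in "$@"; do
    # Match KEY=<at least one character> at the start of a line.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_env_keys infra/environments/production/.env DOMAIN
```

The function returns 0 when every key is set, 1 when at least one is missing or empty, and 2 when the file itself does not exist.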

Step 4: Copy Configuration Files

# Copy Traefik configs
cp -r infra/traefik/* infra/configs/traefik/

# Copy Grafana configs
cp -r infra/grafana/* infra/configs/grafana/

# Copy Prometheus configs
cp -r infra/prometheus/* infra/configs/prometheus/

# Copy Loki configs
cp -r infra/loki/* infra/configs/loki/

# Copy Vault configs
cp -r infra/vault/* infra/configs/vault/

# Copy Authentik configs
cp -r infra/authentik/* infra/configs/authentik/
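The same copies can be expressed as a loop that creates each destination folder and skips services whose old folder is absent. This is a sketch under the assumption that the old folders sit directly under `infra/`, as in the commands above; `copy_configs` is a hypothetical helper:

```shell
# Hypothetical helper: copy each legacy config folder into <root>/configs/.
copy_configs() {
  local root="$1"; shift
  local name
  for name in "$@"; do
    if [ -d "$root/$name" ]; then
      mkdir -p "$root/configs/$name"
      # "/." copies the directory contents, including dotfiles.
      cp -r "$root/$name/." "$root/configs/$name/"
      echo "copied $name"
    else
      echo "skipped $name (no $root/$name)" >&2
    fi
  done
}

# Usage: copy_configs infra traefik grafana prometheus loki vault authentik
```

Unlike `cp -r dir/*`, the `dir/.` form also picks up hidden files, which matters if any service keeps dotfiles in its config folder.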

Step 5: Update Volume Names (if needed)

If you want to preserve existing data, you have two options:

Option A: Keep Existing Volume Names

The new compose files use the same volume names, so your data is preserved automatically.

Option B: Rename Volumes

If you want environment-specific volume names:

# List current volumes
docker volume ls

# Docker cannot rename a volume in place: create the new one, then copy the data across (example for production)
docker volume create prod_postgres_data
docker run --rm -v postgres_data:/from -v prod_postgres_data:/to alpine sh -c "cd /from && cp -av . /to"

# Repeat for each volume
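To repeat the copy for several volumes, one option is to generate the commands first and review them before running anything. The sketch below only prints commands; `plan_volume_copies` is a hypothetical helper, and mounting the source read-only (`:ro`) is an added precaution rather than something the original commands do:

```shell
# Hypothetical helper: print the commands that would copy each volume to a
# prefixed name. Nothing is executed here.
plan_volume_copies() {
  local prefix="$1"; shift
  local vol
  for vol in "$@"; do
    echo "docker volume create ${prefix}_${vol}"
    echo "docker run --rm -v ${vol}:/from:ro -v ${prefix}_${vol}:/to alpine sh -c 'cd /from && cp -a . /to'"
  done
}

# Usage: plan_volume_copies prod postgres_data        # review the output
#        plan_volume_copies prod postgres_data | sh   # then execute it
```

The original volumes are left untouched, so they remain available if you need to roll back.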

Step 6: Setup Networks

# Create Docker networks
./infra/scripts/setup-networks.sh
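The contents of `setup-networks.sh` are not shown in this guide. As a rough idea of what an idempotent network setup could look like, here is a sketch that assumes the `frontend` and `backend` network names used elsewhere in this document; `ensure_network` is a hypothetical function and the real script may differ:

```shell
# Hypothetical sketch of an idempotent network setup.
ensure_network() {
  local name="$1"
  # "docker network inspect" exits non-zero when the network does not exist.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    docker network create "$name" >/dev/null && echo "created: $name"
  fi
}

# Usage: for net in frontend backend; do ensure_network "$net"; done
```

Checking before creating means the step can be rerun safely on a host where the networks already exist.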

Step 7: Deploy New Structure

# For local
./infra/scripts/deploy.sh local all

# For development
./infra/scripts/deploy.sh development all

# For production
./infra/scripts/deploy.sh production all
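The internals of `deploy.sh` are not covered here. Judging only from the compose command used in the verification step, an environment and a stack name could map to a compose invocation roughly as follows. This is a guess for illustration; `compose_cmd` is hypothetical and the real script may work differently:

```shell
# Hypothetical sketch: map an environment and stack name to a
# docker compose command line. Prints the command; does not run it.
compose_cmd() {
  local env="$1" stack="$2"
  case "$env" in
    local|development|production) ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  echo "docker compose -f infra/base/${stack}.yaml --env-file infra/environments/${env}/.env"
}

# Usage: compose_cmd production infrastructure
```

Rejecting unknown environment names early avoids running compose against an env file that does not exist.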

Step 8: Verify Services

# Check running services
docker ps

# Check logs
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env logs -f

# Test endpoints
curl https://vault.harkon.co.uk
curl https://minio.harkon.co.uk
curl https://grafana.harkon.co.uk
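A plain `curl` prints the response body, which makes failures easy to miss. The sketch below prints only the HTTP status for each URL; `check_endpoints` is a hypothetical helper, and the choice to treat connection failures and 5xx responses as errors is an assumption you may want to adjust:

```shell
# Hypothetical helper: report the HTTP status for each URL.
check_endpoints() {
  local url code failed=0
  for url in "$@"; do
    # -s silences progress, -o discards the body, -w prints only the status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || true
    echo "$code $url"
    # Treat connection failures (000) and 5xx responses as failures.
    case "$code" in
      000|5??) failed=1 ;;
    esac
  done
  return "$failed"
}

# Usage: check_endpoints https://vault.harkon.co.uk https://minio.harkon.co.uk https://grafana.harkon.co.uk
```

Redirects and 401 responses are not counted as failures here, since services behind Authentik SSO may legitimately answer that way.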

Handling External Services

If you already run Traefik, Authentik, Gitea, Nextcloud, or Portainer, you have two options:

Option 1: Keep Existing Services

Don't deploy external.yaml. Just ensure:

  1. Networks are shared:
networks:
  frontend:
    external: true
  backend:
    external: true
  2. Services can discover each other via the shared networks

Option 2: Migrate to New Structure

  1. Stop existing services
  2. Update their compose files to use new structure
  3. Deploy via external.yaml

Environment-Specific Differences

Local Development

  • Uses localhost or *.local.harkon.co.uk
  • Self-signed SSL certificates
  • Simple passwords
  • Optional Authentik
  • Traefik dashboard exposed on port 8080

Development Server

  • Uses *.dev.harkon.co.uk
  • Let's Encrypt SSL via DNS-01 challenge
  • Strong passwords (generated)
  • Authentik SSO enabled
  • Gitea container registry

Production Server

  • Uses *.harkon.co.uk
  • Let's Encrypt SSL via DNS-01 challenge
  • Strong passwords (generated)
  • Authentik SSO enabled
  • Gitea container registry
  • No debug ports exposed

Troubleshooting

Issue: Services can't find each other

Solution: Ensure networks are created and services are on the correct networks

docker network ls
docker network inspect frontend
docker network inspect backend
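The full `inspect` output is long. To check whether one specific container is attached to a network, a Go template can narrow it down. A minimal sketch; `on_network` is a hypothetical helper and the container names are whatever your compose files assign:

```shell
# Hypothetical helper: succeed if a container is attached to a network.
on_network() {
  local net="$1" container="$2"
  # The template prints one attached container name per line.
  docker network inspect -f '{{range .Containers}}{{println .Name}}{{end}}' "$net" \
    | grep -qxF "$container"
}

# Usage: on_network frontend traefik && echo "traefik is on frontend"
```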

Issue: Volumes not found

Solution: Check volume names match

docker volume ls
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env config

Issue: Environment variables not loaded

Solution: Check that the .env file exists and is in the correct location

ls -la infra/environments/production/.env
grep DOMAIN infra/environments/production/.env

Issue: Traefik routing not working

Solution: Check labels and ensure Traefik can see containers

docker logs traefik 2>&1 | grep -i error
docker inspect <container> | grep -A 20 Labels
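Grepping a fixed number of lines after `Labels` can cut off or over-include output. As an alternative, the labels can be printed one per line and filtered by prefix. A sketch, assuming Traefik labels use the usual `traefik.` prefix; `traefik_labels` is a hypothetical helper:

```shell
# Hypothetical helper: list only the Traefik labels set on a container.
traefik_labels() {
  local container="$1"
  # Print every label as key=value, then keep the ones starting with "traefik.".
  docker inspect -f '{{range $k, $v := .Config.Labels}}{{printf "%s=%s\n" $k $v}}{{end}}' "$container" \
    | grep '^traefik\.'
}

# Usage: traefik_labels <container>
```

Because `grep` exits non-zero when nothing matches, the function also fails when a container carries no Traefik labels at all, which is often the cause of missing routes.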

Rollback Plan

If migration fails:

# Stop new services
./infra/scripts/deploy.sh production down

# Restore old structure
cd infra/compose
docker compose -f docker-compose.backend.yml up -d

# Or for production
cd infra/production
docker compose -f infrastructure.yaml up -d
docker compose -f services.yaml up -d
docker compose -f monitoring.yaml up -d

Post-Migration Cleanup

After successful migration and verification:

# Remove old compose files (optional)
rm -rf infra/compose/docker-compose.*.yml

# Remove old production folder (optional; applies only if you renamed infra/production to infra/production.old)
rm -rf infra/production.old

# Remove backup files
rm infra/.env.production.backup
rm infra-backup-*.tar.gz

Benefits of New Structure

  • Multi-environment support - Easy to deploy to local, dev, prod
  • Cleaner organization - Configs separated by purpose
  • Unified deployment - Single script for all environments
  • Better security - Environment-specific secrets
  • Easier maintenance - Clear separation of concerns
  • Scalable - Easy to add new environments or services

Next Steps

  1. Test in local environment first
  2. Deploy to development server
  3. Verify all services work
  4. Deploy to production
  5. Update documentation
  6. Train team on new structure