# Deployment Checklist

## Pre-Deployment Checklist

### Local Development

- [ ] Docker and Docker Compose installed
- [ ] Git repository cloned
- [ ] Environment file created: `cp infra/environments/local/.env.example infra/environments/local/.env`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Sufficient disk space (10GB+)

### Development Server

- [ ] Server accessible via SSH
- [ ] Docker and Docker Compose installed on server
- [ ] Domain configured: `*.dev.harkon.co.uk`
- [ ] DNS records pointing to server
- [ ] GoDaddy API credentials available
- [ ] Environment file created: `cp infra/environments/development/.env.example infra/environments/development/.env`
- [ ] Secrets generated: `./scripts/generate-secrets.sh`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`

### Production Server

- [ ] Server accessible via SSH (deploy@141.136.35.199)
- [ ] Docker and Docker Compose installed
- [ ] Domain configured: `*.harkon.co.uk`
- [ ] DNS records verified
- [ ] GoDaddy API credentials configured
- [ ] Environment file exists: `infra/environments/production/.env`
- [ ] All secrets verified (no CHANGE_ME values)
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Backup of existing data (if migrating)

---

## Deployment Checklist

### Phase 1: External Services (Production Only)

#### Traefik

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/traefik`
- [ ] Verify config: `cat config/traefik.yaml`
- [ ] Verify provider credentials: `cat .provider.env`
- [ ] Deploy: `docker compose up -d`
- [ ] Check logs: `docker compose logs -f`
- [ ] Verify running: `docker ps | grep traefik`
- [ ] Test dashboard: `https://traefik.harkon.co.uk`
- [ ] Verify SSL certificate obtained

#### Authentik

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/authentik`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f authentik-server`
- [ ] Verify running: `docker ps | grep authentik`
- [ ] Access UI: `https://authentik.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Create admin user
- [ ] Note down API token

#### Gitea

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/gitea`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f gitea-server`
- [ ] Verify running: `docker ps | grep gitea`
- [ ] Access UI: `https://gitea.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Enable container registry
- [ ] Create access token
- [ ] Test docker login: `docker login gitea.harkon.co.uk`

#### Nextcloud (Optional)

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/nextcloud`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://nextcloud.harkon.co.uk`
- [ ] Complete setup

#### Portainer (Optional)

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/portainer`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://portainer.harkon.co.uk`
- [ ] Create admin user

### Phase 2: Application Infrastructure

#### Infrastructure Services

- [ ] Navigate to: `cd /opt/ai-tax-agent`
- [ ] Verify environment: `cat infra/environments/production/.env`
- [ ] Deploy: `./infra/scripts/deploy.sh production infrastructure`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "apa-vault|apa-minio|apa-postgres|apa-neo4j|apa-qdrant|apa-redis|apa-nats"`
- [ ] Verify Vault: `curl https://vault.harkon.co.uk/v1/sys/health`
- [ ] Verify MinIO: `curl https://minio-api.harkon.co.uk/minio/health/live`
- [ ] Verify PostgreSQL: `docker exec apa-postgres pg_isready`
- [ ] Verify Neo4j: `curl http://localhost:7474`
- [ ] Verify Qdrant: `curl http://localhost:6333/health`
- [ ] Verify Redis: `docker exec apa-redis redis-cli ping`
- [ ] Verify NATS: `docker logs nats | grep "Server is ready"`

#### Initialize Vault

- [ ] Access Vault: `docker exec -it vault sh`
- [ ] Initialize: `vault operator init` (if first time)
- [ ] Save unseal keys and root token
- [ ] Unseal: `vault operator unseal` (3 times with different keys)
- [ ] Login: `vault login `
- [ ] Enable KV secrets: `vault secrets enable -path=secret kv-v2`
- [ ] Exit: `exit`

#### Initialize MinIO

- [ ] Access MinIO console: `https://minio.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Create buckets:
  - [ ] `documents`
  - [ ] `embeddings`
  - [ ] `models`
  - [ ] `backups`
- [ ] Set bucket policies (public/private as needed)
- [ ] Create access keys for services

#### Initialize Databases

- [ ] PostgreSQL:
  - [ ] Access: `docker exec -it apa-postgres psql -U postgres`
  - [ ] Create databases: `CREATE DATABASE tax_system;`
  - [ ] Verify: `\l`
  - [ ] Exit: `\q`
- [ ] Neo4j:
  - [ ] Access: `docker exec -it apa-neo4j cypher-shell -u neo4j -p `
  - [ ] Create constraints (if needed)
  - [ ] Exit: `:exit`
- [ ] Qdrant:
  - [ ] Create collections via API or wait for services to create them

### Phase 3: Monitoring Stack

- [ ] Deploy: `./infra/scripts/deploy.sh production monitoring`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "prometheus|grafana|loki|promtail"`
- [ ] Access Grafana: `https://grafana.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Verify Prometheus datasource
- [ ] Verify Loki datasource
- [ ] Import dashboards
- [ ] Test queries

### Phase 4: Application Services

#### Build and Push Images

- [ ] Verify Gitea registry access: `docker login gitea.harkon.co.uk`
- [ ] Build base images: `./scripts/build-base-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Build service images: `./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Verify images in Gitea: `https://gitea.harkon.co.uk/harkon/-/packages`

#### Deploy Services

- [ ] Deploy: `./infra/scripts/deploy.sh production services`
- [ ] Wait for services: `sleep 60`
- [ ] Check status: `docker ps | grep svc-`
- [ ] Check logs: `docker compose -f infra/base/services.yaml --env-file infra/environments/production/.env logs -f`
- [ ] Verify all 14 services running
- [ ] Check health endpoints

### Phase 5: Configure Authentik OAuth

For each service that needs OAuth:

#### Grafana

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `GRAFANA_OAUTH_CLIENT_SECRET` in .env
- [ ] Restart Grafana: `docker restart grafana`
- [ ] Test OAuth login

#### MinIO

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_MINIO_CLIENT_SECRET` in .env
- [ ] Restart MinIO: `docker restart minio`
- [ ] Test OAuth login

#### Vault

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_VAULT_CLIENT_SECRET` in .env
- [ ] Configure Vault OIDC
- [ ] Test OAuth login

#### UI Review

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_UI_REVIEW_CLIENT_SECRET` in .env
- [ ] Restart UI Review: `docker restart ui-review`
- [ ] Test OAuth login

---

## Post-Deployment Verification

### Service Accessibility

- [ ] Traefik Dashboard: `https://traefik.harkon.co.uk`
- [ ] Authentik: `https://auth.harkon.co.uk`
- [ ] Gitea: `https://gitea.harkon.co.uk`
- [ ] Grafana: `https://grafana.harkon.co.uk`
- [ ] Prometheus: `https://prometheus.harkon.co.uk`
- [ ] Vault: `https://vault.harkon.co.uk`
- [ ] MinIO: `https://minio.harkon.co.uk`
- [ ] UI Review: `https://app.harkon.co.uk`

### Health Checks

- [ ] All services show as healthy in `docker ps`
- [ ] No error logs in `docker compose logs`
- [ ] Grafana shows metrics from Prometheus
- [ ] Loki receiving logs
- [ ] Traefik routing working correctly
- [ ] SSL certificates valid

### Functional Tests

- [ ] Can log in to Authentik
- [ ] Can log in to Grafana via OAuth
- [ ] Can access MinIO console
- [ ] Can push/pull from Gitea registry
- [ ] Can access UI Review
- [ ] Can query Prometheus
- [ ] Can view logs in Loki

### Performance Checks

- [ ] Response times acceptable (<2s)
- [ ] No memory leaks (check `docker stats`)
- [ ] No CPU spikes
- [ ] Disk usage reasonable

---

## Rollback Plan

If deployment fails:

### Rollback External Services

- [ ] Stop service: `cd infra/compose/ && docker compose down`
- [ ] Restore previous version
- [ ] Restart: `docker compose up -d`

### Rollback Application Infrastructure

- [ ] Stop services: `./infra/scripts/deploy.sh production down`
- [ ] Restore data from backup
- [ ] Deploy previous version
- [ ] Verify functionality

### Restore Data

- [ ] PostgreSQL: `docker exec -i apa-postgres psql -U postgres -d tax_system < backup.sql`
- [ ] Neo4j: `docker exec apa-neo4j neo4j-admin load --from=/backup/neo4j.dump`
- [ ] MinIO: Restore from backup bucket
- [ ] Vault: Restore from snapshot

---

## Maintenance Checklist

### Daily

- [ ] Check service status: `make status`
- [ ] Check logs for errors: `make logs | grep ERROR`
- [ ] Check disk space: `df -h`
- [ ] Check Grafana dashboards

### Weekly

- [ ] Review Grafana metrics
- [ ] Check for security updates
- [ ] Review logs for anomalies
- [ ] Test backups

### Monthly

- [ ] Update Docker images
- [ ] Rotate secrets
- [ ] Review and update documentation
- [ ] Test disaster recovery

---

## Emergency Contacts

- **Infrastructure Lead**: [Name]
- **DevOps Team**: [Contact]
- **On-Call**: [Contact]

---

## Notes

- Keep this checklist updated
- Document any deviations
- Note any issues encountered
- Update runbooks based on experience
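
The production pre-deployment item "All secrets verified (no CHANGE_ME values)" can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the environment file uses plain `KEY=VALUE` lines with `#` comments; the function name `check_env_placeholders` is hypothetical and is not one of the repository's scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not part of the repository):
# fail if an env file still contains CHANGE_ME placeholder values.
check_env_placeholders() {
  local env_file="$1"

  # Exit code 2: the file itself is missing.
  if [ ! -f "$env_file" ]; then
    echo "missing: $env_file" >&2
    return 2
  fi

  # Number every line, drop comment lines, then print any remaining
  # line that still carries the placeholder (prefixed with its line number).
  if grep -n -v '^[[:space:]]*#' "$env_file" | grep 'CHANGE_ME'; then
    echo "placeholder values found in $env_file" >&2
    return 1
  fi

  echo "ok: no CHANGE_ME values in $env_file"
}
```

After sourcing the function, it could be run as `check_env_placeholders infra/environments/production/.env` before Phase 1; a non-zero exit status means the file is missing or still contains placeholders.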