# Deployment Checklist

## Pre-Deployment Checklist

### Local Development

- [ ] Docker and Docker Compose installed
- [ ] Git repository cloned
- [ ] Environment file created: `cp infra/environments/local/.env.example infra/environments/local/.env`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Sufficient disk space (10GB+)

### Development Server

- [ ] Server accessible via SSH
- [ ] Docker and Docker Compose installed on server
- [ ] Domain configured: `*.dev.harkon.co.uk`
- [ ] DNS records pointing to server
- [ ] GoDaddy API credentials available
- [ ] Environment file created: `cp infra/environments/development/.env.example infra/environments/development/.env`
- [ ] Secrets generated: `./scripts/generate-secrets.sh`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`

### Production Server

- [ ] Server accessible via SSH (deploy@141.136.35.199)
- [ ] Docker and Docker Compose installed
- [ ] Domain configured: `*.harkon.co.uk`
- [ ] DNS records verified
- [ ] GoDaddy API credentials configured
- [ ] Environment file exists: `infra/environments/production/.env`
- [ ] All secrets verified (no CHANGE_ME values)
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Backup of existing data (if migrating)

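
The "no CHANGE_ME values" check can be automated rather than done by eye. The following is a minimal sketch, assuming placeholders appear literally as `CHANGE_ME` in the env file; the function name `check_placeholders` is hypothetical and not part of the repository's scripts.

```bash
#!/bin/sh
# Sketch: fail when an env file still contains CHANGE_ME placeholders.
# check_placeholders is an illustrative helper, not an existing script.

# Prints "<line-number>:<line>" for every non-comment line that still holds
# a placeholder. Returns 1 if any were found, 0 if none, 2 if the file is missing.
check_placeholders() {
    env_file=$1
    if [ ! -f "$env_file" ]; then
        echo "missing env file: $env_file" >&2
        return 2
    fi
    # The first grep numbers the hits; the second drops hits on comment lines.
    if grep -n 'CHANGE_ME' "$env_file" | grep -v '^[0-9]*:[[:space:]]*#'; then
        return 1
    fi
    return 0
}

# Example:
#   check_placeholders infra/environments/production/.env
```

A non-zero exit status makes the helper usable as a gate before running the deploy script.
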
---

## Deployment Checklist

### Phase 1: External Services (Production Only)

#### Traefik

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/traefik`
- [ ] Verify config: `cat config/traefik.yaml`
- [ ] Verify provider credentials: `cat .provider.env`
- [ ] Deploy: `docker compose up -d`
- [ ] Check logs: `docker compose logs -f`
- [ ] Verify running: `docker ps | grep traefik`
- [ ] Test dashboard: `https://traefik.harkon.co.uk`
- [ ] Verify SSL certificate obtained

#### Authentik

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/authentik`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f authentik-server`
- [ ] Verify running: `docker ps | grep authentik`
- [ ] Access UI: `https://authentik.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Create admin user
- [ ] Note down API token

#### Gitea

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/gitea`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f gitea-server`
- [ ] Verify running: `docker ps | grep gitea`
- [ ] Access UI: `https://gitea.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Enable container registry
- [ ] Create access token
- [ ] Test docker login: `docker login gitea.harkon.co.uk`

#### Nextcloud (Optional)

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/nextcloud`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://nextcloud.harkon.co.uk`
- [ ] Complete setup

#### Portainer (Optional)

- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/portainer`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://portainer.harkon.co.uk`
- [ ] Create admin user

### Phase 2: Application Infrastructure

#### Infrastructure Services

- [ ] Navigate to: `cd /opt/ai-tax-agent`
- [ ] Verify environment: `cat infra/environments/production/.env`
- [ ] Deploy: `./infra/scripts/deploy.sh production infrastructure`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "vault|minio|postgres|neo4j|qdrant|redis|nats"`
- [ ] Verify Vault: `curl https://vault.harkon.co.uk/v1/sys/health`
- [ ] Verify MinIO: `curl https://minio-api.harkon.co.uk/minio/health/live`
- [ ] Verify PostgreSQL: `docker exec postgres pg_isready`
- [ ] Verify Neo4j: `curl http://localhost:7474`
- [ ] Verify Qdrant: `curl http://localhost:6333/healthz`
- [ ] Verify Redis: `docker exec redis redis-cli ping`
- [ ] Verify NATS: `docker logs nats 2>&1 | grep "Server is ready"`

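
A fixed `sleep 30` can be too short on a cold start and needlessly long otherwise. As a sketch of an alternative, the hypothetical helper `wait_for` below polls a readiness command until it succeeds or a retry budget runs out. It is not part of `deploy.sh`, and the attempt counts in the examples are arbitrary.

```bash
#!/bin/sh
# Sketch: poll a readiness command instead of relying on a fixed sleep.
# wait_for is an illustrative helper, not an existing script.

# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
# Runs the command until it exits 0 or the attempts are used up.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "ready after $try attempt(s): $*"
            return 0
        fi
        # Pause between attempts, but not after the final one.
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    echo "not ready after $attempts attempt(s): $*" >&2
    return 1
}

# Examples, reusing checks from the list above:
#   wait_for 30 2 docker exec postgres pg_isready
#   wait_for 30 2 curl -fsS https://minio-api.harkon.co.uk/minio/health/live
```
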
#### Initialize Vault

- [ ] Access Vault: `docker exec -it vault sh`
- [ ] Initialize: `vault operator init` (if first time)
- [ ] Save unseal keys and root token
- [ ] Unseal: `vault operator unseal` (3 times with different keys)
- [ ] Login: `vault login <root-token>`
- [ ] Enable KV secrets: `vault secrets enable -path=secret kv-v2`
- [ ] Exit: `exit`

#### Initialize MinIO

- [ ] Access MinIO console: `https://minio.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Create buckets:
  - [ ] `documents`
  - [ ] `embeddings`
  - [ ] `models`
  - [ ] `backups`
- [ ] Set bucket policies (public/private as needed)
- [ ] Create access keys for services

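
The four buckets can also be created from the command line with the MinIO client (`mc`) instead of the console. This is a sketch under assumptions: the alias name `harkon` and the `MINIO_ROOT_USER` / `MINIO_ROOT_PASSWORD` variable names are illustrative, so substitute whatever the production `.env` actually defines.

```bash
#!/bin/sh
# Sketch: generate the `mc mb` commands for the buckets listed above.
# The alias name and the helper name are assumptions for illustration.

BUCKETS="documents embeddings models backups"

# Prints one `mc mb` command per bucket for the given alias.
# --ignore-existing keeps a rerun from failing on buckets that already exist.
bucket_commands() {
    alias_name=$1
    for bucket in $BUCKETS; do
        echo "mc mb --ignore-existing $alias_name/$bucket"
    done
}

# Review the printed commands, then pipe them to sh to execute them:
#   mc alias set harkon https://minio-api.harkon.co.uk "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
#   bucket_commands harkon | sh
```

Printing the commands first gives a chance to confirm the target alias before anything is created.
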
#### Initialize Databases

- [ ] PostgreSQL:
  - [ ] Access: `docker exec -it postgres psql -U postgres`
  - [ ] Create databases: `CREATE DATABASE tax_system;`
  - [ ] Verify: `\l`
  - [ ] Exit: `\q`

- [ ] Neo4j:
  - [ ] Access: `docker exec -it neo4j cypher-shell -u neo4j -p <password>`
  - [ ] Create constraints (if needed)
  - [ ] Exit: `:exit`

- [ ] Qdrant:
  - [ ] Create collections via API or wait for services to create them

### Phase 3: Monitoring Stack

- [ ] Deploy: `./infra/scripts/deploy.sh production monitoring`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "prometheus|grafana|loki|promtail"`
- [ ] Access Grafana: `https://grafana.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Verify Prometheus datasource
- [ ] Verify Loki datasource
- [ ] Import dashboards
- [ ] Test queries

### Phase 4: Application Services

#### Build and Push Images

- [ ] Verify Gitea registry access: `docker login gitea.harkon.co.uk`
- [ ] Build base images: `./scripts/build-base-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Build service images: `./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Verify images in Gitea: `https://gitea.harkon.co.uk/harkon/-/packages`

#### Deploy Services

- [ ] Deploy: `./infra/scripts/deploy.sh production services`
- [ ] Wait for services: `sleep 60`
- [ ] Check status: `docker ps | grep svc-`
- [ ] Check logs: `docker compose -f infra/base/services.yaml --env-file infra/environments/production/.env logs -f`
- [ ] Verify all 14 services running
- [ ] Check health endpoints

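
Counting containers by eye is error-prone with 14 services. The sketch below counts running application containers from `docker ps` output. It assumes the application containers have `svc-` or `ui-review` in their names; that pattern and both helper names are assumptions to adjust to the actual compose naming.

```bash
#!/bin/sh
# Sketch: compare the number of running application containers with the
# expected count. The name pattern is an assumption, not a documented rule.

EXPECTED_SERVICES=14

# Reads container names on stdin (one per line) and prints how many match.
# grep -c exits non-zero when the count is 0, hence the `|| true`.
count_app_containers() {
    grep -c -E '(svc-|ui-review)' || true
}

# Reads container names on stdin; succeeds only if the expected number match.
check_app_containers() {
    running=$(count_app_containers)
    echo "$running of $EXPECTED_SERVICES application containers running"
    [ "$running" -eq "$EXPECTED_SERVICES" ]
}

# Example:
#   docker ps --format '{{.Names}}' | check_app_containers
```
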
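### Phase 5: Configure Authentik OAuth

For each service that needs OAuth, complete the steps below. A plain `docker restart` reuses the environment the container was created with, so if an updated `.env` value is not picked up, recreate the container (for example with `docker compose up -d`) instead.
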
#### Grafana

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `GRAFANA_OAUTH_CLIENT_SECRET` in .env
- [ ] Restart Grafana: `docker restart grafana`
- [ ] Test OAuth login

#### MinIO

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_MINIO_CLIENT_SECRET` in .env
- [ ] Restart MinIO: `docker restart minio`
- [ ] Test OAuth login

#### Vault

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_VAULT_CLIENT_SECRET` in .env
- [ ] Configure Vault OIDC
- [ ] Test OAuth login

#### UI Review

- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_UI_REVIEW_CLIENT_SECRET` in .env
- [ ] Restart UI Review: `docker restart ui-review`
- [ ] Test OAuth login

---

## Post-Deployment Verification

### Service Accessibility

- [ ] Traefik Dashboard: `https://traefik.harkon.co.uk`
- [ ] Authentik: `https://authentik.harkon.co.uk`
- [ ] Gitea: `https://gitea.harkon.co.uk`
- [ ] Grafana: `https://grafana.harkon.co.uk`
- [ ] Prometheus: `https://prometheus.harkon.co.uk`
- [ ] Vault: `https://vault.harkon.co.uk`
- [ ] MinIO: `https://minio.harkon.co.uk`
- [ ] UI Review: `https://ui-review.harkon.co.uk`

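
The endpoints above can be probed in one pass. The following sketch requests each URL with `curl` and reports the HTTP status. Treating any status below 400 as reachable is an assumption (a redirect to the login page counts as reachable), and the helper names are hypothetical.

```bash
#!/bin/sh
# Sketch: request each endpoint from the list above and report its status.
# The "below 400 means reachable" rule is an assumption for illustration.

HOSTS="traefik authentik gitea grafana prometheus vault minio ui-review"
DOMAIN="harkon.co.uk"

# Prints the HTTP status code for a URL; curl prints 000 if the request fails.
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1" || true
}

# Prints one line per host and returns non-zero if any host is unreachable.
check_endpoints() {
    failures=0
    for host in $HOSTS; do
        url="https://$host.$DOMAIN"
        code=$(http_status "$url")
        if [ "$code" -ge 200 ] 2>/dev/null && [ "$code" -lt 400 ]; then
            echo "OK   $code $url"
        else
            echo "FAIL $code $url"
            failures=$((failures + 1))
        fi
    done
    [ "$failures" -eq 0 ]
}

# Example:
#   check_endpoints
```
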
### Health Checks

- [ ] All services show as healthy in `docker ps`
- [ ] No error logs in `docker compose logs`
- [ ] Grafana shows metrics from Prometheus
- [ ] Loki receiving logs
- [ ] Traefik routing working correctly
- [ ] SSL certificates valid

### Functional Tests

- [ ] Can log in to Authentik
- [ ] Can log in to Grafana via OAuth
- [ ] Can access MinIO console
- [ ] Can push/pull from Gitea registry
- [ ] Can access UI Review
- [ ] Can query Prometheus
- [ ] Can view logs in Loki

### Performance Checks

- [ ] Response times acceptable (<2s)
- [ ] No memory leaks (check `docker stats`)
- [ ] No CPU spikes
- [ ] Disk usage reasonable

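
The response-time target can be measured rather than estimated. A minimal sketch, assuming the total request time reported by `curl` is an acceptable proxy for user-facing latency; the 2-second limit comes from the checklist, while the helper names are hypothetical.

```bash
#!/bin/sh
# Sketch: measure total response time with curl and compare it to the target.
# response_time and within_limit are illustrative helpers.

MAX_SECONDS=2

# Prints the total request time in seconds as reported by curl.
response_time() {
    curl -s -o /dev/null --max-time 10 -w '%{time_total}' "$1"
}

# Succeeds if the given time in seconds is strictly below MAX_SECONDS.
# awk does the comparison because POSIX sh has no floating-point arithmetic.
within_limit() {
    awk -v t="$1" -v max="$MAX_SECONDS" 'BEGIN { exit !(t + 0 < max + 0) }'
}

# Example:
#   t=$(response_time https://ui-review.harkon.co.uk)
#   if within_limit "$t"; then echo "ok ($t s)"; else echo "slow ($t s)"; fi
```
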
---

## Rollback Plan

If deployment fails:

### Rollback External Services

- [ ] Stop service: `cd infra/compose/<service> && docker compose down`
- [ ] Restore previous version
- [ ] Restart: `docker compose up -d`

### Rollback Application Infrastructure

- [ ] Stop services: `./infra/scripts/deploy.sh production down`
- [ ] Restore data from backup
- [ ] Deploy previous version
- [ ] Verify functionality

### Restore Data

- [ ] PostgreSQL: `docker exec -i postgres psql -U postgres -d tax_system < backup.sql`
- [ ] Neo4j: `docker exec neo4j neo4j-admin load --from=/backup/neo4j.dump` (Neo4j 4.x syntax; Neo4j 5 uses `neo4j-admin database load`, and the target database must be stopped first)
- [ ] MinIO: Restore from backup bucket
- [ ] Vault: Restore from snapshot

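
The PostgreSQL restore above expects a plain SQL dump. As a sketch of the matching backup step, the helpers below build a `pg_dump` command using the container, user, and database names from this checklist. The timestamped file name and both helper names are assumptions.

```bash
#!/bin/sh
# Sketch: build the pg_dump command that produces a file the psql restore
# above can read. Nothing is executed; the command is only printed.

# Prints a timestamped backup file name, e.g. tax_system-20250101-120000.sql
backup_name() {
    echo "$1-$(date +%Y%m%d-%H%M%S).sql"
}

# Prints the dump command for a database and an output file.
dump_command() {
    db=$1
    out=$2
    echo "docker exec postgres pg_dump -U postgres -d $db > $out"
}

# Example (review the printed command, then run it):
#   dump_command tax_system "$(backup_name tax_system)"
```
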
---

## Maintenance Checklist

### Daily

- [ ] Check service status: `make status`
- [ ] Check logs for errors: `make logs | grep ERROR`
- [ ] Check disk space: `df -h`
- [ ] Check Grafana dashboards

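
The daily disk-space check can be turned into a pass/fail test. The sketch below reads `df -P` output and flags filesystems above a usage threshold; the 80% value is an assumption, and mount points containing spaces are not handled.

```bash
#!/bin/sh
# Sketch: flag filesystems above a usage threshold, reading `df -P` on stdin.
# The threshold and the helper name are assumptions for illustration.

THRESHOLD=80

# Prints "<mount> <use%>" for each filesystem above THRESHOLD and returns 1
# if any were printed. With `df -P`, each filesystem is on one line, with
# the capacity in column 5 and the mount point in column 6.
check_disk() {
    awk -v max="$THRESHOLD" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 > max) { print $6, use "%"; bad = 1 }
        }
        END { exit bad + 0 }
    '
}

# Example:
#   df -P | check_disk
```
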
### Weekly

- [ ] Review Grafana metrics
- [ ] Check for security updates
- [ ] Review logs for anomalies
- [ ] Test backups

### Monthly

- [ ] Update Docker images
- [ ] Rotate secrets
- [ ] Review and update documentation
- [ ] Test disaster recovery

---

## Emergency Contacts

- **Infrastructure Lead**: [Name]
- **DevOps Team**: [Contact]
- **On-Call**: [Contact]

---

## Notes

- Keep this checklist updated
- Document any deviations
- Note any issues encountered
- Update runbooks based on experience