Initial commit
harkon
2025-10-11 08:41:36 +01:00
commit b324ff09ef
276 changed files with 55220 additions and 0 deletions

@@ -0,0 +1,323 @@
# Deployment Checklist
## Pre-Deployment Checklist
### Local Development
- [ ] Docker and Docker Compose installed
- [ ] Git repository cloned
- [ ] Environment file created: `cp infra/environments/local/.env.example infra/environments/local/.env`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Sufficient disk space (10GB+)
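
The disk-space item can be checked with a small script instead of by eye. The sketch below is only an illustration: `has_free_space` is a hypothetical helper that is not part of the repository, and it assumes a POSIX `df` and `awk`.

```shell
# Succeeds when the filesystem holding TARGET has at least MIN_GB gigabytes free.
# has_free_space is a hypothetical helper, not something shipped in this repo.
has_free_space() {
    target=$1
    min_gb=$2
    # df -P keeps the POSIX column layout; -k reports sizes in 1024-byte blocks.
    avail_kb=$(df -Pk "$target" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Usage: check the current directory against the 10GB guideline.
#   has_free_space . 10 && echo "enough space" || echo "free up disk space first"
```
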
### Development Server
- [ ] Server accessible via SSH
- [ ] Docker and Docker Compose installed on server
- [ ] Domain configured: `*.dev.harkon.co.uk`
- [ ] DNS records pointing to server
- [ ] GoDaddy API credentials available
- [ ] Environment file created: `cp infra/environments/development/.env.example infra/environments/development/.env`
- [ ] Secrets generated: `./scripts/generate-secrets.sh`
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
### Production Server
- [ ] Server accessible via SSH (deploy@141.136.35.199)
- [ ] Docker and Docker Compose installed
- [ ] Domain configured: `*.harkon.co.uk`
- [ ] DNS records verified
- [ ] GoDaddy API credentials configured
- [ ] Environment file exists: `infra/environments/production/.env`
- [ ] All secrets verified (no CHANGE_ME values)
- [ ] Docker networks created: `./infra/scripts/setup-networks.sh`
- [ ] Backup of existing data (if migrating)
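
The "no CHANGE_ME values" check can be scripted so that it fails loudly before anything is deployed. This is a minimal sketch; `find_placeholders` is a hypothetical helper, and the only assumption about the env file is that unset secrets contain the literal string `CHANGE_ME`.

```shell
# Prints every line of an env file that still contains the CHANGE_ME placeholder
# (with its line number) and returns non-zero if any are found.
# find_placeholders is a hypothetical helper, not part of the repository.
find_placeholders() {
    env_file=$1
    if grep -n 'CHANGE_ME' "$env_file"; then
        echo "Unresolved placeholders in $env_file" >&2
        return 1
    fi
    echo "No placeholders left in $env_file"
}

# Usage:
#   find_placeholders infra/environments/production/.env
```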
---
## Deployment Checklist
### Phase 1: External Services (Production Only)
#### Traefik
- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/traefik`
- [ ] Verify config: `cat config/traefik.yaml`
- [ ] Verify provider credentials: `cat .provider.env`
- [ ] Deploy: `docker compose up -d`
- [ ] Check logs: `docker compose logs -f`
- [ ] Verify running: `docker ps | grep traefik`
- [ ] Test dashboard: `https://traefik.harkon.co.uk`
- [ ] Verify SSL certificate obtained
#### Authentik
- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/authentik`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f authentik-server`
- [ ] Verify running: `docker ps | grep authentik`
- [ ] Access UI: `https://authentik.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Create admin user
- [ ] Note down API token
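
A fixed `sleep 30` either waits too long or not long enough, depending on the host. One possible alternative is to poll until the service answers. The sketch below assumes nothing about the repository: `wait_for` is a hypothetical helper, and the probe in the usage comment simply requests the UI URL listed above.

```shell
# Retries a command until it succeeds or the timeout (in seconds) is reached.
# wait_for is a hypothetical helper; it is not defined anywhere in the repository.
wait_for() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            echo "Timed out after ${timeout_s}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
}

# Usage: poll every 5 seconds, give up after 2 minutes.
#   wait_for 120 5 curl -fsS https://authentik.harkon.co.uk
```

The same helper could stand in for the other `sleep` steps in this checklist, given a suitable probe command for each service.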
#### Gitea
- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/gitea`
- [ ] Verify environment: `cat .env`
- [ ] Deploy: `docker compose up -d`
- [ ] Wait for startup: `sleep 30`
- [ ] Check logs: `docker compose logs -f gitea-server`
- [ ] Verify running: `docker ps | grep gitea`
- [ ] Access UI: `https://gitea.harkon.co.uk`
- [ ] Complete initial setup
- [ ] Enable container registry
- [ ] Create access token
- [ ] Test docker login: `docker login gitea.harkon.co.uk`
#### Nextcloud (Optional)
- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/nextcloud`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://nextcloud.harkon.co.uk`
- [ ] Complete setup
#### Portainer (Optional)
- [ ] Navigate to: `cd /opt/ai-tax-agent/infra/compose/portainer`
- [ ] Deploy: `docker compose up -d`
- [ ] Access UI: `https://portainer.harkon.co.uk`
- [ ] Create admin user
### Phase 2: Application Infrastructure
#### Infrastructure Services
- [ ] Navigate to: `cd /opt/ai-tax-agent`
- [ ] Verify environment: `cat infra/environments/production/.env`
- [ ] Deploy: `./infra/scripts/deploy.sh production infrastructure`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "vault|minio|postgres|neo4j|qdrant|redis|nats"`
- [ ] Verify Vault: `curl https://vault.harkon.co.uk/v1/sys/health`
- [ ] Verify MinIO: `curl https://minio-api.harkon.co.uk/minio/health/live`
- [ ] Verify PostgreSQL: `docker exec postgres pg_isready`
- [ ] Verify Neo4j: `curl http://localhost:7474`
- [ ] Verify Qdrant: `curl http://localhost:6333/healthz`
- [ ] Verify Redis: `docker exec redis redis-cli ping`
- [ ] Verify NATS: `docker logs nats 2>&1 | grep "Server is ready"`
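
On a first deployment the Vault health check will not return `200`, because Vault is neither initialized nor unsealed yet. The sketch below maps the default status codes of `/v1/sys/health` to a readable state; `vault_health_state` is a hypothetical helper, and the mapping assumes the endpoint's default behaviour (no query parameters overriding the codes).

```shell
# Maps the HTTP status of Vault's /v1/sys/health endpoint to a readable state.
# vault_health_state is a hypothetical helper; codes are Vault's defaults.
vault_health_state() {
    case "$1" in
        200) echo "initialized, unsealed, active" ;;
        429) echo "unsealed, standby" ;;
        501) echo "not initialized" ;;
        503) echo "sealed" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Usage on a machine that can reach Vault:
#   code=$(curl -s -o /dev/null -w '%{http_code}' https://vault.harkon.co.uk/v1/sys/health)
#   vault_health_state "$code"
```

Before the "Initialize Vault" steps have been completed, `not initialized` or `sealed` is the expected answer rather than a failure.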
#### Initialize Vault
- [ ] Access Vault: `docker exec -it vault sh`
- [ ] Initialize: `vault operator init` (if first time)
- [ ] Save unseal keys and root token
- [ ] Unseal: `vault operator unseal` (3 times with different keys)
- [ ] Login: `vault login <root-token>`
- [ ] Enable KV secrets: `vault secrets enable -path=secret kv-v2`
- [ ] Exit: `exit`
#### Initialize MinIO
- [ ] Access MinIO console: `https://minio.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Create buckets:
  - [ ] `documents`
  - [ ] `embeddings`
  - [ ] `models`
  - [ ] `backups`
- [ ] Set bucket policies (public/private as needed)
- [ ] Create access keys for services
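
As an alternative to the console, the buckets could be created with the MinIO client (`mc`). This is a sketch under assumptions: `create_buckets` is a hypothetical helper, `prod` is a made-up alias name, and `$ACCESS_KEY` / `$SECRET_KEY` stand for whatever credentials are stored in `.env`.

```shell
# Creates the four buckets from the checklist with the MinIO client (mc).
# create_buckets is a hypothetical helper. Set DRY_RUN=1 to print the
# commands instead of running them.
create_buckets() {
    alias_name=$1
    for bucket in documents embeddings models backups; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "mc mb --ignore-existing $alias_name/$bucket"
        else
            # --ignore-existing makes the step safe to repeat.
            mc mb --ignore-existing "$alias_name/$bucket" || return 1
        fi
    done
}

# Usage (register the alias first; "prod" is an arbitrary name):
#   mc alias set prod https://minio-api.harkon.co.uk "$ACCESS_KEY" "$SECRET_KEY"
#   create_buckets prod
```
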
#### Initialize Databases
- [ ] PostgreSQL:
  - [ ] Access: `docker exec -it postgres psql -U postgres`
  - [ ] Create databases: `CREATE DATABASE tax_system;`
  - [ ] Verify: `\l`
  - [ ] Exit: `\q`
- [ ] Neo4j:
  - [ ] Access: `docker exec -it neo4j cypher-shell -u neo4j -p <password>`
  - [ ] Create constraints (if needed)
  - [ ] Exit: `:exit`
- [ ] Qdrant:
  - [ ] Create collections via API or wait for services to create them
### Phase 3: Monitoring Stack
- [ ] Deploy: `./infra/scripts/deploy.sh production monitoring`
- [ ] Wait for services: `sleep 30`
- [ ] Check status: `docker ps | grep -E "prometheus|grafana|loki|promtail"`
- [ ] Access Grafana: `https://grafana.harkon.co.uk`
- [ ] Login with credentials from .env
- [ ] Verify Prometheus datasource
- [ ] Verify Loki datasource
- [ ] Import dashboards
- [ ] Test queries
### Phase 4: Application Services
#### Build and Push Images
- [ ] Verify Gitea registry access: `docker login gitea.harkon.co.uk`
- [ ] Build base images: `./scripts/build-base-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Build service images: `./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon`
- [ ] Verify images in Gitea: `https://gitea.harkon.co.uk/harkon/-/packages`
#### Deploy Services
- [ ] Deploy: `./infra/scripts/deploy.sh production services`
- [ ] Wait for services: `sleep 60`
- [ ] Check status: `docker ps | grep -E "svc-|ui-review"`
- [ ] Check logs: `docker compose -f infra/base/services.yaml --env-file infra/environments/production/.env logs -f`
- [ ] Verify all 14 services running
- [ ] Check health endpoints
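
Counting 14 containers by eye is error-prone, so the check can be turned into a comparison against an expected list. The sketch below is illustrative only: `missing_services` is a hypothetical helper, the names are the thirteen `svc-*` services plus `ui-review` from the repository's image build matrix, and it assumes container names match those image names exactly.

```shell
# The 14 application services; container names are ASSUMED to match these.
EXPECTED_SERVICES="svc-coverage svc-extract svc-firm-connectors svc-forms svc-hmrc svc-ingestion svc-kg svc-normalize-map svc-ocr svc-rag-indexer svc-rag-retriever svc-reason svc-rpa ui-review"

# Reads running container names on stdin (one per line) and prints every
# expected service that is absent. Returns non-zero if anything is missing.
# missing_services is a hypothetical helper, not part of the repository.
missing_services() {
    running=$(cat)
    missing=0
    for svc in $EXPECTED_SERVICES; do
        # -x matches whole lines only, -F treats the name as a literal string.
        if ! printf '%s\n' "$running" | grep -qxF "$svc"; then
            echo "$svc"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the server:
#   docker ps --format '{{.Names}}' | missing_services
```

If Compose prefixes container names with a project name, the list (or the matching) would need to be adjusted accordingly.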
### Phase 5: Configure Authentik OAuth
For each service that needs OAuth:
#### Grafana
- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `GRAFANA_OAUTH_CLIENT_SECRET` in .env
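- [ ] Recreate Grafana so the new value is loaded (`docker restart grafana` does not re-read `.env`), e.g. by re-running `./infra/scripts/deploy.sh production monitoring`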
- [ ] Test OAuth login
#### MinIO
- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_MINIO_CLIENT_SECRET` in .env
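- [ ] Recreate MinIO so the new value is loaded (`docker restart minio` does not re-read `.env`), e.g. by re-running `./infra/scripts/deploy.sh production infrastructure`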
- [ ] Test OAuth login
#### Vault
- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_VAULT_CLIENT_SECRET` in .env
- [ ] Configure Vault OIDC
- [ ] Test OAuth login
#### UI Review
- [ ] Create OAuth provider in Authentik
- [ ] Note client ID and secret
- [ ] Update `AUTHENTIK_UI_REVIEW_CLIENT_SECRET` in .env
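- [ ] Recreate UI Review so the new value is loaded (`docker restart ui-review` does not re-read `.env`), e.g. by re-running `./infra/scripts/deploy.sh production services`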
- [ ] Test OAuth login
---
## Post-Deployment Verification
### Service Accessibility
- [ ] Traefik Dashboard: `https://traefik.harkon.co.uk`
- [ ] Authentik: `https://authentik.harkon.co.uk`
- [ ] Gitea: `https://gitea.harkon.co.uk`
- [ ] Grafana: `https://grafana.harkon.co.uk`
- [ ] Prometheus: `https://prometheus.harkon.co.uk`
- [ ] Vault: `https://vault.harkon.co.uk`
- [ ] MinIO: `https://minio.harkon.co.uk`
- [ ] UI Review: `https://ui-review.harkon.co.uk`
### Health Checks
- [ ] All services show as healthy in `docker ps`
- [ ] No error logs in `docker compose logs`
- [ ] Grafana shows metrics from Prometheus
- [ ] Loki receiving logs
- [ ] Traefik routing working correctly
- [ ] SSL certificates valid
### Functional Tests
- [ ] Can log in to Authentik
- [ ] Can log in to Grafana via OAuth
- [ ] Can access MinIO console
- [ ] Can push/pull from Gitea registry
- [ ] Can access UI Review
- [ ] Can query Prometheus
- [ ] Can view logs in Loki
### Performance Checks
- [ ] Response times acceptable (<2s)
- [ ] No memory leaks (check `docker stats`)
- [ ] No CPU spikes
- [ ] Disk usage reasonable
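
The 2-second response-time target can be measured rather than estimated. This is a sketch, not an established project script: `within_budget` is a hypothetical helper, and `awk` is used because plain `[ ]` cannot compare fractional numbers.

```shell
# Succeeds when a measured duration (seconds, may be fractional) is below a budget.
# within_budget is a hypothetical helper; awk does the floating-point comparison.
within_budget() {
    awk -v t="$1" -v max="$2" 'BEGIN { exit !(t + 0 < max + 0) }'
}

# Usage on a machine that can reach the service:
#   t=$(curl -o /dev/null -s -w '%{time_total}' https://ui-review.harkon.co.uk)
#   within_budget "$t" 2 && echo "OK (${t}s)" || echo "SLOW (${t}s)"
```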
---
## Rollback Plan
If deployment fails:
### Rollback External Services
- [ ] Stop service: `cd infra/compose/<service> && docker compose down`
- [ ] Restore previous version
- [ ] Restart: `docker compose up -d`
### Rollback Application Infrastructure
- [ ] Stop services: `./infra/scripts/deploy.sh production down`
- [ ] Restore data from backup
- [ ] Deploy previous version
- [ ] Verify functionality
### Restore Data
- [ ] PostgreSQL: `docker exec -i postgres psql -U postgres -d tax_system < backup.sql`
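- [ ] Neo4j: `docker exec neo4j neo4j-admin load --from=/backup/neo4j.dump` (Neo4j 4.x syntax; the target database must be stopped first, and Neo4j 5 replaces this with `neo4j-admin database load`)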
- [ ] MinIO: Restore from backup bucket
- [ ] Vault: Restore from snapshot
---
## Maintenance Checklist
### Daily
- [ ] Check service status: `make status`
- [ ] Check logs for errors: `make logs | grep ERROR`
- [ ] Check disk space: `df -h`
- [ ] Check Grafana dashboards
### Weekly
- [ ] Review Grafana metrics
- [ ] Check for security updates
- [ ] Review logs for anomalies
- [ ] Test backups
### Monthly
- [ ] Update Docker images
- [ ] Rotate secrets
- [ ] Review and update documentation
- [ ] Test disaster recovery
---
## Emergency Contacts
- **Infrastructure Lead**: [Name]
- **DevOps Team**: [Contact]
- **On-Call**: [Contact]
---
## Notes
- Keep this checklist updated
- Document any deviations
- Note any issues encountered
- Update runbooks based on experience