ai-tax-agent/infra/README.md
harkon b324ff09ef
Initial commit
2025-10-11 08:41:36 +01:00

# AI Tax Agent Infrastructure
Multi-environment Docker Compose infrastructure for AI Tax Agent.
## Directory Structure
```
infra/
├── environments/              # Environment-specific configurations
│   ├── local/                 # Local development (localhost, self-signed certs)
│   ├── development/           # Development server (dev.harkon.co.uk)
│   └── production/            # Production server (harkon.co.uk)
├── base/                      # Base compose files (shared across environments)
│   ├── infrastructure.yaml    # Core infra (Vault, MinIO, DBs, etc.)
│   ├── monitoring.yaml        # Monitoring stack (Prometheus, Grafana, Loki)
│   ├── services.yaml          # Application services
│   └── external.yaml          # External services (Traefik, Authentik, Gitea, etc.)
├── configs/                   # Service configurations
│   ├── traefik/               # Traefik configs
│   ├── grafana/               # Grafana dashboards & provisioning
│   ├── prometheus/            # Prometheus config
│   ├── loki/                  # Loki config
│   ├── vault/                 # Vault config
│   └── authentik/             # Authentik bootstrap
├── certs/                     # SSL certificates (gitignored)
│   ├── local/                 # Self-signed certs for local
│   ├── development/           # Let's Encrypt certs for dev
│   └── production/            # Let's Encrypt certs for prod
└── scripts/                   # Deployment scripts
    ├── deploy.sh              # Main deployment script
    ├── setup-networks.sh      # Create Docker networks
    └── cleanup.sh             # Cleanup script
```
## Environments
### Local Development
- **Domain**: `localhost` / `*.local.harkon.co.uk`
- **SSL**: Self-signed certificates
- **Auth**: Authentik (optional)
- **Registry**: Local Docker registry or Gitea
- **Purpose**: Local development and testing
### Development
- **Domain**: `*.dev.harkon.co.uk`
- **SSL**: Let's Encrypt (DNS-01 challenge)
- **Auth**: Authentik SSO
- **Registry**: Gitea container registry
- **Purpose**: Staging/testing before production
### Production
- **Domain**: `*.harkon.co.uk`
- **SSL**: Let's Encrypt (DNS-01 challenge)
- **Auth**: Authentik SSO
- **Registry**: Gitea container registry
- **Purpose**: Production deployment
## Quick Start
### 1. Setup Environment
```bash
# Choose your environment
export ENV=local # or development, production
# Copy environment template
cp infra/environments/$ENV/.env.example infra/environments/$ENV/.env
# Edit environment variables
vim infra/environments/$ENV/.env
```
### 2. Generate Secrets (Production/Development only)
```bash
./scripts/generate-production-secrets.sh
```
### 3. Create Docker Networks
```bash
./infra/scripts/setup-networks.sh
```
### 4. Deploy Infrastructure
```bash
# Deploy everything
./infra/scripts/deploy.sh $ENV all
# Or deploy specific stacks
./infra/scripts/deploy.sh $ENV infrastructure
./infra/scripts/deploy.sh $ENV monitoring
./infra/scripts/deploy.sh $ENV services
```
## Environment Variables
Each environment has its own `.env` file with:
- **Domain Configuration**: `DOMAIN`, `EMAIL`
- **Database Passwords**: `POSTGRES_PASSWORD`, `NEO4J_PASSWORD`, etc.
- **Object Storage**: `MINIO_ROOT_USER`, `MINIO_ROOT_PASSWORD`
- **Secrets Management**: `VAULT_DEV_ROOT_TOKEN_ID`
- **SSO/Auth**: `AUTHENTIK_SECRET_KEY`, `AUTHENTIK_BOOTSTRAP_PASSWORD`
- **Monitoring**: `GRAFANA_PASSWORD`, OAuth secrets
- **Application**: Service-specific configs
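
Before deploying, it can help to confirm that none of the variables listed above is missing or empty. The following is a minimal sketch, not part of the repository: the function name `missing_env_vars` is hypothetical, and it assumes the `.env` file uses plain `KEY=value` lines without `export` or surrounding whitespace.

```bash
# Hypothetical helper: print every expected variable that is absent or empty
# in the given env file. The variable list mirrors the names in this README.
missing_env_vars() {
  local env_file="$1" var missing=0
  if [ ! -r "$env_file" ]; then
    echo "cannot read $env_file" >&2
    return 2
  fi
  for var in DOMAIN EMAIL POSTGRES_PASSWORD NEO4J_PASSWORD \
             MINIO_ROOT_USER MINIO_ROOT_PASSWORD VAULT_DEV_ROOT_TOKEN_ID \
             AUTHENTIK_SECRET_KEY AUTHENTIK_BOOTSTRAP_PASSWORD GRAFANA_PASSWORD; do
    # A variable counts as set only if "VAR=" is followed by at least one character
    if ! grep -Eq "^${var}=.+" "$env_file"; then
      echo "$var"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `missing_env_vars infra/environments/$ENV/.env` prints one name per unset variable and returns non-zero if any were found. Quoted empty values such as `EMAIL=""` are not detected by this simple pattern.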
## Deployment Commands
### Deploy Full Stack
```bash
# Local
./infra/scripts/deploy.sh local all
# Development
./infra/scripts/deploy.sh development all
# Production
./infra/scripts/deploy.sh production all
```
### Deploy Individual Stacks
```bash
# Infrastructure only (Vault, MinIO, DBs, etc.)
./infra/scripts/deploy.sh production infrastructure
# Monitoring only (Prometheus, Grafana, Loki)
./infra/scripts/deploy.sh production monitoring
# Services only (Application microservices)
./infra/scripts/deploy.sh production services
# External services (Traefik, Authentik, Gitea - usually pre-existing)
./infra/scripts/deploy.sh production external
```
### Stop/Remove Stacks
```bash
# Stop all
./infra/scripts/deploy.sh production down
# Stop specific stack
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env down
```
## Network Architecture
All environments use two Docker networks:
- **frontend**: Public-facing services (Traefik, UI)
- **backend**: Internal services (DBs, message queues, etc.)

Networks are created with:
```bash
docker network create frontend
docker network create backend
```
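
`docker network create` fails if the network already exists, which matters when the commands are re-run. A minimal idempotent sketch follows; the function name `ensure_network` is hypothetical, and this is not necessarily what `setup-networks.sh` does.

```bash
# Hypothetical helper: create a Docker network only if it is not already present.
ensure_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create "$name"
  fi
}

# Usage:
#   ensure_network frontend
#   ensure_network backend
```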
## Volume Management
Volumes are environment-specific and named with an environment prefix:
- Local: `local_postgres_data`, `local_vault_data`, etc.
- Development: `dev_postgres_data`, `dev_vault_data`, etc.
- Production: `prod_postgres_data`, `prod_vault_data`, etc.
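
The naming convention above can be captured in a small helper, which is handy in backup or cleanup scripts. This is a sketch; `volume_prefix` and `volume_name` are hypothetical names, and the prefixes are taken from the list above.

```bash
# Hypothetical helper: map an environment name to its volume prefix.
volume_prefix() {
  case "$1" in
    local)       echo "local" ;;
    development) echo "dev" ;;
    production)  echo "prod" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the full volume name, e.g. prod_postgres_data.
volume_name() {
  local prefix
  prefix="$(volume_prefix "$1")" || return 1
  echo "${prefix}_$2"
}

# Usage: list the volumes whose names contain the production prefix
#   docker volume ls --filter "name=$(volume_prefix production)_"
```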
## SSL Certificates
### Local
- Self-signed certificates in `infra/certs/local/`
- Generated with `scripts/generate-dev-certs.sh`
### Development/Production
- Let's Encrypt certificates via Traefik
- DNS-01 challenge using GoDaddy API
- Stored in `infra/certs/{environment}/`
## External Services
Some services (Traefik, Authentik, Gitea, Nextcloud, Portainer) may already exist on the server.
To use existing services:
1. Don't deploy `external.yaml`
2. Ensure networks are shared
3. Update service discovery labels
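
To decide whether `external.yaml` should be skipped, one option is to check which of these services are already running on the host. The sketch below assumes that the container names contain the service name (for example `traefik` or `authentik-server`), which may not hold on every server; `running_external` is a hypothetical name.

```bash
# Hypothetical helper: print each external service that already has a
# running container whose name contains the service name.
running_external() {
  local svc
  for svc in traefik authentik gitea nextcloud portainer; do
    if [ -n "$(docker ps --filter "name=${svc}" --format '{{.Names}}')" ]; then
      echo "$svc"
    fi
  done
}

# Usage:
#   running_external
```

If the output lists a service, it is probably safer to reuse it than to deploy a second copy from `external.yaml`.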
## Monitoring
Access monitoring dashboards:
- **Grafana**: `https://grafana.{domain}`
- **Prometheus**: `https://prometheus.{domain}`
- **Traefik Dashboard**: `https://traefik.{domain}/dashboard/`
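
A quick reachability check for these dashboards can be scripted with `curl`. This is a sketch under the assumption that the hostnames follow the pattern above; the function names are hypothetical. With SSO enabled, a redirect or `401` status is expected rather than `200`.

```bash
# Hypothetical helper: print the dashboard URLs for a given domain.
monitoring_urls() {
  local domain="$1"
  echo "https://grafana.${domain}"
  echo "https://prometheus.${domain}"
  echo "https://traefik.${domain}/dashboard/"
}

# Hypothetical helper: print the HTTP status code returned by each dashboard.
check_monitoring() {
  local url code
  for url in $(monitoring_urls "$1"); do
    # -k tolerates the self-signed certificates used in the local environment;
    # curl reports 000 when the host cannot be reached
    code="$(curl -ks -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || true
    echo "$code $url"
  done
}

# Usage:
#   check_monitoring dev.harkon.co.uk
```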
## Troubleshooting
### Check Service Status
```bash
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env ps
```
### View Logs
```bash
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env logs -f vault
```
### Restart Service
```bash
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env restart vault
```
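
The troubleshooting commands repeat the same `-f` and `--env-file` pair each time. A small wrapper can cut the typing; `dc` is a hypothetical name, and the paths are the ones used throughout this README, relative to the repository root.

```bash
# Hypothetical wrapper: run docker compose against one stack in one environment.
# Arguments: <environment> <stack> <compose subcommand and options...>
dc() {
  if [ "$#" -lt 3 ]; then
    echo "usage: dc <environment> <stack> <compose args...>" >&2
    return 2
  fi
  local env="$1" stack="$2"
  shift 2
  docker compose \
    -f "infra/base/${stack}.yaml" \
    --env-file "infra/environments/${env}/.env" \
    "$@"
}

# Usage:
#   dc production infrastructure ps
#   dc production infrastructure logs -f vault
#   dc production infrastructure restart vault
```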
## Security Notes
- **Never commit `.env` files** - They contain secrets!
- **Rotate secrets regularly** - Use `generate-production-secrets.sh`
- **Use strong passwords** - Minimum 32 characters
- **Enable Authentik SSO** - For all production services
- **Backup volumes** - Especially databases and Vault
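
Two of these notes lend themselves to short shell helpers. The sketch below is illustrative and independent of `generate-production-secrets.sh`, whose behaviour is not described here; `gen_secret` and `backup_volume` are hypothetical names, and the `alpine` image is an assumption.

```bash
# Hypothetical helper: print a 32-character URL-safe random secret.
# 24 random bytes encode to exactly 32 base64 characters.
gen_secret() {
  head -c 24 /dev/urandom | base64 | tr '+/' '-_' | tr -d '\n'
}

# Hypothetical helper: archive a named volume into <dest_dir>/<volume>.tar.gz.
# dest_dir must be an absolute path because it is bind-mounted.
backup_volume() {
  local volume="$1" dest_dir="$2"
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "${dest_dir}:/backup" \
    alpine tar czf "/backup/${volume}.tar.gz" -C /data .
}

# Usage:
#   gen_secret
#   backup_volume prod_vault_data /srv/backups
```

For databases, a file-level copy of a live volume may be inconsistent; stopping the container first or using the database's own dump tool is usually safer.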
## Migration from Old Structure
If migrating from the old structure:
1. Copy environment variables from old `.env` files
2. Update volume names if needed
3. Migrate data volumes
4. Update Traefik labels if using existing Traefik
5. Test in development first!
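
Migrating a data volume to a new name (step 3) can be done by copying its contents through a throwaway container. This is a sketch: `copy_volume` is a hypothetical name, the `alpine` image is an assumption, and the old volume name in the usage line is a placeholder.

```bash
# Hypothetical helper: copy the contents of one named volume into another.
# Stop the services using the source volume first so the copy is consistent.
copy_volume() {
  local src="$1" dst="$2"
  # Creating a volume that already exists is a no-op
  docker volume create "$dst" >/dev/null
  docker run --rm \
    -v "${src}:/from:ro" \
    -v "${dst}:/to" \
    alpine sh -c 'cp -a /from/. /to/'
}

# Usage (the old volume name is a placeholder):
#   copy_volume old_postgres_data prod_postgres_data
```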
## Support
For issues or questions:
- Check logs: `docker compose -f infra/base/<stack>.yaml --env-file infra/environments/<environment>/.env logs -f <service>`
- Review documentation in `docs/`
- Check Traefik dashboard for routing issues