Infrastructure Structure Overview
New Multi-Environment Structure
infra/
├── README.md  # Main infrastructure documentation
├── DEPLOYMENT_GUIDE.md  # Complete deployment guide
├── MIGRATION_GUIDE.md  # Migration from old structure
├── STRUCTURE_OVERVIEW.md  # This file
│
├── base/  # Base compose files (environment-agnostic)
│   ├── infrastructure.yaml  # Core infrastructure services
│   ├── services.yaml  # Application microservices
│   ├── monitoring.yaml  # Monitoring stack
│   └── external.yaml  # External services (Traefik, Authentik, etc.)
│
├── environments/  # Environment-specific configurations
│   ├── local/  # Local development
│   │   ├── .env.example  # Template
│   │   └── .env  # Actual config (gitignored)
│   ├── development/  # Development server
│   │   ├── .env.example  # Template
│   │   └── .env  # Actual config (gitignored)
│   └── production/  # Production server
│       ├── .env.example  # Template
│       └── .env  # Actual config (gitignored)
│
├── configs/  # Service configuration files
│   ├── traefik/  # Traefik configs
│   │   ├── config/  # Dynamic configuration
│   │   │   ├── middlewares.yml
│   │   │   ├── routers.yml
│   │   │   └── services.yml
│   │   ├── traefik.yml  # Static configuration
│   │   └── .provider.env  # GoDaddy API credentials (gitignored)
│   ├── grafana/  # Grafana configs
│   │   ├── dashboards/  # Dashboard JSON files
│   │   └── provisioning/  # Datasources, dashboards
│   ├── prometheus/  # Prometheus config
│   │   └── prometheus.yml
│   ├── loki/  # Loki config
│   │   └── loki-config.yml
│   ├── promtail/  # Promtail config
│   │   └── promtail-config.yml
│   ├── vault/  # Vault config
│   │   └── config/
│   └── authentik/  # Authentik bootstrap
│       ├── bootstrap.yaml
│       ├── custom-templates/
│       └── media/
│
├── certs/  # SSL certificates (gitignored)
│   ├── local/  # Self-signed certs
│   ├── development/  # Let's Encrypt certs
│   └── production/  # Let's Encrypt certs
│
├── docker/  # Dockerfile templates
│   ├── base-runtime.Dockerfile  # Base image for all services
│   ├── base-ml.Dockerfile  # Base image for ML services
│   └── Dockerfile.ml-service.template
│
└── scripts/  # Deployment and utility scripts
    ├── deploy.sh  # Main deployment script
    ├── setup-networks.sh  # Create Docker networks
    └── cleanup.sh  # Cleanup script
Base Compose Files
infrastructure.yaml
Core infrastructure services needed by the application:
- Vault - Secrets management
- MinIO - Object storage (S3-compatible)
- PostgreSQL - Relational database
- Neo4j - Graph database
- Qdrant - Vector database
- Redis - Cache and session store
- NATS - Message queue (with JetStream)
services.yaml
Application microservices (14 services):
- svc-ingestion - Document ingestion
- svc-extract - Data extraction
- svc-kg - Knowledge graph
- svc-rag-indexer - RAG indexing (ML)
- svc-rag-retriever - RAG retrieval (ML)
- svc-forms - Form processing
- svc-hmrc - HMRC integration
- svc-ocr - OCR processing (ML)
- svc-rpa - RPA automation
- svc-normalize-map - Data normalization
- svc-reason - Reasoning engine
- svc-firm-connectors - Firm integrations
- svc-coverage - Coverage analysis
- ui-review - Review UI (Next.js)
monitoring.yaml
Monitoring and observability stack:
- Prometheus - Metrics collection
- Grafana - Dashboards and visualization
- Loki - Log aggregation
- Promtail - Log collection
external.yaml (optional)
External services that may already exist:
- Traefik - Reverse proxy and load balancer
- Authentik - SSO and authentication
- Gitea - Git repository and container registry
- Nextcloud - File storage
- Portainer - Docker management UI
Environment Configurations
Local Development
- Domain: localhost or *.local.harkon.co.uk
- SSL: Self-signed certificates
- Auth: Optional (can disable Authentik)
- Registry: Local Docker registry or Gitea
- Passwords: Simple (postgres, admin, etc.)
- Purpose: Local development and testing
- Traefik Dashboard: Exposed on port 8080
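The self-signed certificate for local development can be produced with OpenSSL. The following is a minimal sketch, not the project's actual procedure: the file names local.key and local.crt are hypothetical, and the -addext flag assumes OpenSSL 1.1.1 or newer.

```sh
# Sketch: create a self-signed wildcard certificate for local development.
# File names (local.key, local.crt) are assumptions, not taken from the repo.
make_local_cert() {
  out_dir=$1
  mkdir -p "$out_dir" || return 1
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout "$out_dir/local.key" \
    -out "$out_dir/local.crt" \
    -subj "/CN=*.local.harkon.co.uk" \
    -addext "subjectAltName=DNS:*.local.harkon.co.uk,DNS:localhost"
}
```

For example, `make_local_cert infra/certs/local` would write both files into the gitignored certs directory. Browsers will still warn until the certificate is trusted locally.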
Development Server
- Domain: *.dev.harkon.co.uk
- SSL: Let's Encrypt (DNS-01 via GoDaddy)
- Auth: Authentik SSO enabled
- Registry: Gitea container registry
- Passwords: Strong (auto-generated)
- Purpose: Staging and integration testing
- Traefik Dashboard: Protected by Authentik
Production Server
- Domain: *.harkon.co.uk
- SSL: Let's Encrypt (DNS-01 via GoDaddy)
- Auth: Authentik SSO enabled
- Registry: Gitea container registry
- Passwords: Strong (auto-generated)
- Purpose: Production deployment
- Traefik Dashboard: Protected by Authentik
- Monitoring: Full stack enabled
Docker Networks
All environments use two networks:
frontend
- Public-facing services
- Connected to Traefik
- Services: UI, Grafana, Vault, MinIO console
backend
- Internal services
- Not directly accessible
- Services: Databases, message queues, internal APIs
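The contents of setup-networks.sh are not reproduced in this document. As a rough sketch of what creating the two networks idempotently could look like (the function names are illustrative only):

```sh
# Sketch only: the real setup-networks.sh is not shown in this document.

# Create a Docker network unless one with that name already exists.
ensure_network() {
  name=$1
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network '$name' already exists"
  else
    docker network create "$name" >/dev/null && echo "network '$name' created"
  fi
}

# Create both networks used by every environment.
setup_networks() {
  ensure_network frontend && ensure_network backend
}
```

Checking with `docker network inspect` first keeps the script safe to re-run, since `docker network create` fails when the name is already taken.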
Volume Naming
Volumes are named consistently across environments:
- postgres_data
- neo4j_data
- neo4j_logs
- qdrant_data
- minio_data
- vault_data
- redis_data
- nats_data
- prometheus_data
- grafana_data
- loki_data
Deployment Workflow
1. Setup Environment
cp infra/environments/production/.env.example infra/environments/production/.env
vim infra/environments/production/.env
2. Generate Secrets
./scripts/generate-production-secrets.sh
3. Setup Networks
./infra/scripts/setup-networks.sh
4. Deploy Infrastructure
./infra/scripts/deploy.sh production infrastructure
5. Deploy Monitoring
./infra/scripts/deploy.sh production monitoring
6. Deploy Services
./infra/scripts/deploy.sh production services
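deploy.sh itself is not shown here. One plausible shape for it, assuming it pairs a base compose file with the matching environment's .env file and is run from the repository root, is sketched below. The function name and exit codes are hypothetical.

```sh
# Sketch only: assumes deploy.sh maps <environment> <stack> onto
# infra/environments/<environment>/.env and infra/base/<stack>.yaml.
deploy_stack() {
  env_name=$1
  stack=$2
  case "$env_name" in
    local|development|production) ;;
    *) echo "unknown environment: $env_name" >&2; return 2 ;;
  esac
  case "$stack" in
    infrastructure|services|monitoring|external) ;;
    *) echo "unknown stack: $stack" >&2; return 2 ;;
  esac
  # --env-file and -f are global docker compose options, so they precede "up".
  docker compose \
    --env-file "infra/environments/${env_name}/.env" \
    -f "infra/base/${stack}.yaml" \
    up -d
}
```

Validating both arguments before calling Docker means a typo such as `deploy_stack prod services` fails immediately instead of looking for a missing .env file.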
Key Features
✅ Multi-Environment Support
Single codebase deploys to local, development, and production with environment-specific configurations.
✅ Modular Architecture
Services split into logical groups (infrastructure, monitoring, services, external) for independent deployment.
✅ Unified Deployment
Single deploy.sh script handles all environments and stacks.
✅ Environment Isolation
Each environment has its own .env file with appropriate secrets and configurations.
✅ Shared Configurations
Common service configs in configs/ directory, referenced by all environments.
✅ Security Best Practices
- Secrets in gitignored .env files
- Strong password generation
- Authentik SSO integration
- SSL/TLS in every environment (Let's Encrypt on development and production, self-signed locally)
✅ Easy Maintenance
- Clear directory structure
- Comprehensive documentation
- Migration guide from old structure
- Troubleshooting guides
Service Access
Local
- http://localhost:3000 - Grafana
- http://localhost:9093 - MinIO
- http://localhost:8200 - Vault
- http://localhost:8080 - Traefik Dashboard
Development
- https://grafana.dev.harkon.co.uk
- https://minio.dev.harkon.co.uk
- https://vault.dev.harkon.co.uk
- https://ui-review.dev.harkon.co.uk
Production
- https://grafana.harkon.co.uk
- https://minio.harkon.co.uk
- https://vault.harkon.co.uk
- https://ui-review.harkon.co.uk
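After a deployment, the public endpoints listed above can be probed in a loop. This is a sketch under assumptions: the helper names are hypothetical, and a service behind Authentik may answer with a redirect to the login page, which `curl -f` treats as success.

```sh
# Sketch: build the public URL for a service from the hostname patterns above.
service_url() {
  service=$1
  env_name=$2
  case "$env_name" in
    development) printf 'https://%s.dev.harkon.co.uk\n' "$service" ;;
    production)  printf 'https://%s.harkon.co.uk\n' "$service" ;;
    *) echo "no public hostname pattern for: $env_name" >&2; return 1 ;;
  esac
}

# Probe each listed service; return non-zero if any request fails.
check_services() {
  env_name=$1
  failed=0
  for service in grafana minio vault ui-review; do
    url=$(service_url "$service" "$env_name") || return 1
    if curl -fsS -o /dev/null --max-time 10 "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_services development` prints one line per endpoint and exits non-zero if any of them is unreachable or returns an HTTP error status.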
Configuration Management
Environment Variables
All configuration via environment variables in .env files:
- Domain settings
- Database passwords
- API keys
- OAuth secrets
- Registry credentials
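Because each .env file is copied from its .env.example template, the two can drift apart as new variables are added. A small check like the following sketch lists keys present in the template but missing from the real file. The function name is hypothetical.

```sh
# Sketch: print every KEY defined in the template but absent from the actual file.
missing_env_keys() {
  example=$1
  actual=$2
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$example" | cut -d= -f1 | sort -u |
  while IFS= read -r key; do
    grep -q "^${key}=" "$actual" || printf '%s\n' "$key"
  done
}
```

For example, `missing_env_keys infra/environments/production/.env.example infra/environments/production/.env` prints nothing when the real file covers every templated variable.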
Service Configs
Static configurations in configs/ directory:
- Traefik routing rules
- Grafana dashboards
- Prometheus scrape configs
- Loki retention policies
Secrets Management
- Development/Production: Vault
- Local: Environment variables
- Rotation: generate-production-secrets.sh
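How generate-production-secrets.sh produces its values is not documented here. As an illustration only, strong random values can be drawn from /dev/urandom with standard tools. The variable names in the usage example are hypothetical.

```sh
# Sketch: print a random hex string; $1 is the number of random bytes (default 24).
random_hex() {
  bytes=${1:-24}
  od -An -N"$bytes" -tx1 /dev/urandom | tr -d ' \n'
  echo
}

# Emit KEY=value lines for each name given, ready to append to a .env file.
generate_secrets() {
  for key in "$@"; do
    printf '%s=%s\n' "$key" "$(random_hex 24)"
  done
}
```

A call such as `generate_secrets POSTGRES_PASSWORD REDIS_PASSWORD` prints two lines, each carrying a 48-character hex value. Hex output avoids characters that would need quoting inside a .env file.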
Monitoring and Observability
Metrics (Prometheus)
- Service health
- Resource usage
- Request rates
- Error rates
Logs (Loki)
- Centralized logging
- Query via Grafana
- Retention policies
- Log aggregation
Dashboards (Grafana)
- Infrastructure overview
- Service metrics
- Application performance
- Business metrics
Alerts
- Prometheus Alertmanager
- Slack/Email notifications
- PagerDuty integration
Backup Strategy
What to Backup
- PostgreSQL database
- Neo4j graph data
- Vault secrets
- MinIO objects
- Qdrant vectors
- Grafana dashboards
How to Backup
# Automated backup script
./scripts/backup-volumes.sh production
# Manual backup (stop PostgreSQL first so the data directory is copied in a consistent state)
docker run --rm -v postgres_data:/data:ro -v "$(pwd)":/backup alpine tar czf /backup/postgres.tar.gz /data
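The manual command generalizes to the other volumes in the backup list. backup-volumes.sh is not shown in this document, so the following is only a sketch of the same tar-in-a-container approach applied in a loop. The function names and the one-archive-per-volume naming are assumptions.

```sh
# Sketch: archive one named volume into <out_dir>/<volume>.tar.gz.
backup_volume() {
  vol=$1
  out_dir=$2
  docker run --rm \
    -v "${vol}:/data:ro" \
    -v "${out_dir}:/backup" \
    alpine tar czf "/backup/${vol}.tar.gz" /data
}

# Archive the volumes that correspond to the "What to Backup" list.
backup_all() {
  dest=$1
  for volume in postgres_data neo4j_data vault_data minio_data qdrant_data grafana_data; do
    backup_volume "$volume" "$dest" || return 1
  done
}
```

Mounting the source volume read-only guards against accidental writes, but it does not make a copy of a running database consistent, so stopping the service or using its native dump tool remains the safer route.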
Backup Schedule
- Daily: Databases
- Weekly: Full system
- Monthly: Archive
Disaster Recovery
Recovery Steps
- Restore infrastructure
- Restore volumes from backup
- Deploy services
- Verify functionality
- Update DNS if needed
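For the volume restore step, an archive made with the manual backup command shown earlier stores its files under data/, so it can be unpacked at the container root with the target volume mounted at /data. A minimal sketch, with a hypothetical function name:

```sh
# Sketch: restore <src_dir>/<archive> into a named volume.
# Assumes the archive was created with "tar czf ... /data", so paths start with data/.
restore_volume() {
  vol=$1
  src_dir=$2
  archive=$3
  # Creating a volume that already exists is a no-op.
  docker volume create "$vol" >/dev/null || return 1
  docker run --rm \
    -v "${vol}:/data" \
    -v "${src_dir}:/backup:ro" \
    alpine tar xzf "/backup/${archive}" -C /
}
```

For example, `restore_volume postgres_data "$(pwd)" postgres.tar.gz` reverses the manual PostgreSQL backup. Run it before the services that use the volume are started.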
RTO/RPO
- RTO: 4 hours (Recovery Time Objective)
- RPO: 24 hours (Recovery Point Objective)
Next Steps
- Review DEPLOYMENT_GUIDE.md for deployment instructions
- Review MIGRATION_GUIDE.md if migrating from old structure
- Setup environment files
- Deploy to local first
- Test in development
- Deploy to production