Initial commit
Some checks failed
CI/CD Pipeline / Code Quality & Linting (push) Has been cancelled
CI/CD Pipeline / Policy Validation (push) Has been cancelled
CI/CD Pipeline / Test Suite (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-coverage) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-extract) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-firm-connectors) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-forms) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-hmrc) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-ingestion) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-kg) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-normalize-map) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-ocr) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rag-indexer) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rag-retriever) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-reason) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rpa) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (ui-review) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-coverage) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-extract) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-kg) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-rag-retriever) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (ui-review) (push) Has been cancelled
CI/CD Pipeline / Generate SBOM (push) Has been cancelled
CI/CD Pipeline / Deploy to Staging (push) Has been cancelled
CI/CD Pipeline / Deploy to Production (push) Has been cancelled
CI/CD Pipeline / Notifications (push) Has been cancelled
346
infra/STRUCTURE_OVERVIEW.md
Normal file
@@ -0,0 +1,346 @@
# Infrastructure Structure Overview

## New Multi-Environment Structure

```
infra/
├── README.md                          # Main infrastructure documentation
├── DEPLOYMENT_GUIDE.md                # Complete deployment guide
├── MIGRATION_GUIDE.md                 # Migration from old structure
├── STRUCTURE_OVERVIEW.md              # This file
│
├── base/                              # Base compose files (environment-agnostic)
│   ├── infrastructure.yaml            # Core infrastructure services
│   ├── services.yaml                  # Application microservices
│   ├── monitoring.yaml                # Monitoring stack
│   └── external.yaml                  # External services (Traefik, Authentik, etc.)
│
├── environments/                      # Environment-specific configurations
│   ├── local/                         # Local development
│   │   ├── .env.example               # Template
│   │   └── .env                       # Actual config (gitignored)
│   ├── development/                   # Development server
│   │   ├── .env.example               # Template
│   │   └── .env                       # Actual config (gitignored)
│   └── production/                    # Production server
│       ├── .env.example               # Template
│       └── .env                       # Actual config (gitignored)
│
├── configs/                           # Service configuration files
│   ├── traefik/                       # Traefik configs
│   │   ├── config/                    # Dynamic configuration
│   │   │   ├── middlewares.yml
│   │   │   ├── routers.yml
│   │   │   └── services.yml
│   │   ├── traefik.yml                # Static configuration
│   │   └── .provider.env              # GoDaddy API credentials (gitignored)
│   ├── grafana/                       # Grafana configs
│   │   ├── dashboards/                # Dashboard JSON files
│   │   └── provisioning/              # Datasources, dashboards
│   ├── prometheus/                    # Prometheus config
│   │   └── prometheus.yml
│   ├── loki/                          # Loki config
│   │   └── loki-config.yml
│   ├── promtail/                      # Promtail config
│   │   └── promtail-config.yml
│   ├── vault/                         # Vault config
│   │   └── config/
│   └── authentik/                     # Authentik bootstrap
│       ├── bootstrap.yaml
│       ├── custom-templates/
│       └── media/
│
├── certs/                             # SSL certificates (gitignored)
│   ├── local/                         # Self-signed certs
│   ├── development/                   # Let's Encrypt certs
│   └── production/                    # Let's Encrypt certs
│
├── docker/                            # Dockerfile templates
│   ├── base-runtime.Dockerfile        # Base image for all services
│   ├── base-ml.Dockerfile             # Base image for ML services
│   └── Dockerfile.ml-service.template
│
└── scripts/                           # Deployment and utility scripts
    ├── deploy.sh                      # Main deployment script
    ├── setup-networks.sh              # Create Docker networks
    └── cleanup.sh                     # Cleanup script
```

## Base Compose Files

### infrastructure.yaml

Core infrastructure services needed by the application:

- **Vault** - Secrets management
- **MinIO** - Object storage (S3-compatible)
- **PostgreSQL** - Relational database
- **Neo4j** - Graph database
- **Qdrant** - Vector database
- **Redis** - Cache and session store
- **NATS** - Message queue (with JetStream)

### services.yaml

Application microservices (14 services):

- **svc-ingestion** - Document ingestion
- **svc-extract** - Data extraction
- **svc-kg** - Knowledge graph
- **svc-rag-indexer** - RAG indexing (ML)
- **svc-rag-retriever** - RAG retrieval (ML)
- **svc-forms** - Form processing
- **svc-hmrc** - HMRC integration
- **svc-ocr** - OCR processing (ML)
- **svc-rpa** - RPA automation
- **svc-normalize-map** - Data normalization
- **svc-reason** - Reasoning engine
- **svc-firm-connectors** - Firm integrations
- **svc-coverage** - Coverage analysis
- **ui-review** - Review UI (Next.js)

### monitoring.yaml

Monitoring and observability stack:

- **Prometheus** - Metrics collection
- **Grafana** - Dashboards and visualization
- **Loki** - Log aggregation
- **Promtail** - Log collection

### external.yaml (optional)

External services that may already exist:

- **Traefik** - Reverse proxy and load balancer
- **Authentik** - SSO and authentication
- **Gitea** - Git repository and container registry
- **Nextcloud** - File storage
- **Portainer** - Docker management UI

## Environment Configurations

### Local Development

- **Domain**: `localhost` or `*.local.harkon.co.uk`
- **SSL**: Self-signed certificates
- **Auth**: Optional (can disable Authentik)
- **Registry**: Local Docker registry or Gitea
- **Passwords**: Simple (postgres, admin, etc.)
- **Purpose**: Local development and testing
- **Traefik Dashboard**: Exposed on port 8080

### Development Server

- **Domain**: `*.dev.harkon.co.uk`
- **SSL**: Let's Encrypt (DNS-01 via GoDaddy)
- **Auth**: Authentik SSO enabled
- **Registry**: Gitea container registry
- **Passwords**: Strong (auto-generated)
- **Purpose**: Staging and integration testing
- **Traefik Dashboard**: Protected by Authentik

### Production Server

- **Domain**: `*.harkon.co.uk`
- **SSL**: Let's Encrypt (DNS-01 via GoDaddy)
- **Auth**: Authentik SSO enabled
- **Registry**: Gitea container registry
- **Passwords**: Strong (auto-generated)
- **Purpose**: Production deployment
- **Traefik Dashboard**: Protected by Authentik
- **Monitoring**: Full stack enabled

## Docker Networks

All environments use two networks:

### frontend

- Public-facing services
- Connected to Traefik
- Services: UI, Grafana, Vault, MinIO console

### backend

- Internal services
- Not directly accessible
- Services: Databases, message queues, internal APIs

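The contents of `setup-networks.sh` are not reproduced here, so the following is only a minimal sketch of how the two networks could be created idempotently. The function names `ensure_network` and `setup_networks` are hypothetical; the network names `frontend` and `backend` come from this section.

```bash
# Sketch only: the real setup-networks.sh is not shown in this document.

# Create a Docker network only if it does not already exist,
# so the function is safe to call repeatedly.
ensure_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network '$name' already exists"
  else
    docker network create "$name" >/dev/null
    echo "network '$name' created"
  fi
}

# Network names are taken from this section; the loop itself is an assumption.
setup_networks() {
  local net
  for net in frontend backend; do
    ensure_network "$net"
  done
}
```

Calling `setup_networks` once before the first deployment would be enough under these assumptions; later calls only report that the networks exist.
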
## Volume Naming

Volumes are named consistently across environments:

- `postgres_data`
- `neo4j_data`
- `neo4j_logs`
- `qdrant_data`
- `minio_data`
- `vault_data`
- `redis_data`
- `nats_data`
- `prometheus_data`
- `grafana_data`
- `loki_data`

## Deployment Workflow

### 1. Setup Environment

```bash
cp infra/environments/production/.env.example infra/environments/production/.env
vim infra/environments/production/.env
```

### 2. Generate Secrets

```bash
./scripts/generate-production-secrets.sh
```

### 3. Setup Networks

```bash
./infra/scripts/setup-networks.sh
```

### 4. Deploy Infrastructure

```bash
./infra/scripts/deploy.sh production infrastructure
```

### 5. Deploy Monitoring

```bash
./infra/scripts/deploy.sh production monitoring
```

### 6. Deploy Services

```bash
./infra/scripts/deploy.sh production services
```

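The steps above call `deploy.sh` with an environment and a stack name. The script itself is not shown, so the sketch below only illustrates one plausible mapping from those two arguments to a `docker compose` invocation. It assumes the Docker Compose v2 CLI, the file layout from the directory tree, and execution from the repository root; `compose_args` and `deploy_stack` are hypothetical names.

```bash
# Hypothetical sketch: the real deploy.sh is not reproduced in this document.
# Paths follow the directory tree; run from the repository root.

# Print the docker compose arguments for a given environment and stack.
compose_args() {
  local env="$1" stack="$2"
  case "$env" in
    local|development|production) ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  case "$stack" in
    infrastructure|services|monitoring|external) ;;
    *) echo "unknown stack: $stack" >&2; return 1 ;;
  esac
  printf '%s\n' "--env-file infra/environments/$env/.env -f infra/base/$stack.yaml"
}

# Bring a stack up in the background using the arguments above.
deploy_stack() {
  local args
  args="$(compose_args "$1" "$2")" || return 1
  # Word splitting is intentional here; the paths contain no spaces.
  # shellcheck disable=SC2086
  docker compose $args up -d
}
```

Under these assumptions, `deploy_stack production infrastructure` would run `docker compose --env-file infra/environments/production/.env -f infra/base/infrastructure.yaml up -d`. Validating both arguments first keeps a typo from reaching Compose as a missing-file error.
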
## Key Features

### ✅ Multi-Environment Support

A single codebase deploys to local, development, and production with environment-specific configurations.

### ✅ Modular Architecture

Services are split into logical groups (infrastructure, monitoring, services, external) that can be deployed independently.

### ✅ Unified Deployment

A single `deploy.sh` script handles all environments and stacks.

### ✅ Environment Isolation

Each environment has its own `.env` file with appropriate secrets and configurations.

### ✅ Shared Configurations

Common service configs live in the `configs/` directory and are referenced by all environments.

### ✅ Security Best Practices

- Secrets in gitignored `.env` files
- Strong password generation
- Authentik SSO integration
- SSL/TLS everywhere (Let's Encrypt)

### ✅ Easy Maintenance

- Clear directory structure
- Comprehensive documentation
- Migration guide from old structure
- Troubleshooting guides

## Service Access

### Local

- http://localhost:3000 - Grafana
- http://localhost:9093 - MinIO
- http://localhost:8200 - Vault
- http://localhost:8080 - Traefik Dashboard

### Development

- https://grafana.dev.harkon.co.uk
- https://minio.dev.harkon.co.uk
- https://vault.dev.harkon.co.uk
- https://ui-review.dev.harkon.co.uk

### Production

- https://grafana.harkon.co.uk
- https://minio.harkon.co.uk
- https://vault.harkon.co.uk
- https://ui-review.harkon.co.uk

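After a deployment it can help to confirm that the endpoints listed above respond at all. The following is a small, hypothetical smoke check built on `curl`; it is not part of the repository as described here, and it only tests that each URL returns a non-error HTTP status, not that the service is healthy.

```bash
# Hypothetical smoke check; pass it the URLs listed in this section.
# curl flags: -f treat HTTP errors (>= 400) as failure, -s silent,
# -o /dev/null discard the body, --max-time bound the wait in seconds.
check_url() {
  local url="$1"
  if curl -fs -o /dev/null --max-time 5 "$url"; then
    echo "OK   $url"
  else
    echo "FAIL $url"
    return 1
  fi
}

# Check every URL given as an argument; return non-zero if any check fails.
check_all() {
  local url failed=0
  for url in "$@"; do
    check_url "$url" || failed=1
  done
  return "$failed"
}
```

For the local environment this could be called as `check_all http://localhost:3000 http://localhost:8200 http://localhost:8080`.
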
## Configuration Management

### Environment Variables

All configuration is supplied through environment variables in `.env` files:

- Domain settings
- Database passwords
- API keys
- OAuth secrets
- Registry credentials

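Because every setting lives in an `.env` file, a missing or empty value is an easy mistake to make when copying from `.env.example`. The helper below is a hedged sketch of a pre-deployment check; `missing_env_vars` is a hypothetical function, and the variable names in the usage note are placeholders rather than names confirmed by this repository.

```bash
# Hypothetical helper: print each required variable that is missing or
# empty in an .env file. The variable names are supplied by the caller.
missing_env_vars() {
  local file="$1" var
  shift
  for var in "$@"; do
    # Match lines such as VAR=value; VAR= (empty) counts as missing.
    grep -Eq "^${var}=.+" "$file" || echo "$var"
  done
}
```

A call such as `missing_env_vars infra/environments/production/.env DOMAIN POSTGRES_PASSWORD` would print nothing when both placeholder variables are set, and one name per line otherwise.
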
### Service Configs

Static configurations live in the `configs/` directory:

- Traefik routing rules
- Grafana dashboards
- Prometheus scrape configs
- Loki retention policies

### Secrets Management

- Development/Production: Vault
- Local: Environment variables
- Rotation: `generate-production-secrets.sh`

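How `generate-production-secrets.sh` produces its values is not documented here. As an illustration of strong secret generation in general, the sketch below reads random bytes from `/dev/urandom` and prints them as hexadecimal; `generate_secret` is a hypothetical name and this is not claimed to be the script's actual method.

```bash
# Hypothetical sketch; the real generate-production-secrets.sh is not shown.
# Read N random bytes from /dev/urandom and print them as 2N hex characters.
generate_secret() {
  local bytes="${1:-32}"
  od -An -tx1 -N"$bytes" /dev/urandom | tr -d ' \n'
  echo
}
```

With the default of 32 bytes the result is a 64-character string, which could be written into an `.env` file or stored in Vault.
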
## Monitoring and Observability

### Metrics (Prometheus)

- Service health
- Resource usage
- Request rates
- Error rates

### Logs (Loki)

- Centralized logging
- Query via Grafana
- Retention policies
- Log aggregation

### Dashboards (Grafana)

- Infrastructure overview
- Service metrics
- Application performance
- Business metrics

### Alerts

- Prometheus AlertManager
- Slack/Email notifications
- PagerDuty integration

## Backup Strategy

### What to Backup

- PostgreSQL database
- Neo4j graph data
- Vault secrets
- MinIO objects
- Qdrant vectors
- Grafana dashboards

### How to Backup

```bash
# Automated backup script
./scripts/backup-volumes.sh production

# Manual backup (quote the path so directories with spaces work)
docker run --rm -v postgres_data:/data -v "$(pwd)":/backup alpine tar czf /backup/postgres.tar.gz /data
```

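The manual command above handles one volume. The sketch below generalizes it to a list of volumes; it is an assumption about how such a loop could look, not the contents of `backup-volumes.sh`. The function names are hypothetical, and the destination must be an absolute path so Docker treats it as a bind mount.

```bash
# Hypothetical sketch extending the manual command to several volumes;
# the real backup-volumes.sh is not shown. Volume names come from the
# "Volume Naming" section.
backup_volume() {
  local volume="$1" dest="$2"
  # Mount the volume read-only and write <volume>.tar.gz into $dest.
  docker run --rm \
    -v "$volume":/data:ro \
    -v "$dest":/backup \
    alpine tar czf "/backup/$volume.tar.gz" /data
}

# Back up each named volume into $dest, stopping at the first failure.
backup_all() {
  local dest="$1" volume
  shift
  mkdir -p "$dest"
  for volume in "$@"; do
    backup_volume "$volume" "$dest" || return 1
    echo "backed up $volume"
  done
}
```

For example, `backup_all "$(pwd)/backups" postgres_data neo4j_data qdrant_data`. A file-level copy of a volume that a running database is writing to may be inconsistent, so stopping the owning container first, or using the database's own dump tool, is the safer choice.
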
### Backup Schedule

- Daily: Databases
- Weekly: Full system
- Monthly: Archive

## Disaster Recovery

### Recovery Steps

1. Restore infrastructure
2. Restore volumes from backup
3. Deploy services
4. Verify functionality
5. Update DNS if needed

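Step 2 is not spelled out in this document. As a hedged sketch, the function below is the restore counterpart to the manual backup command in "How to Backup". That command archives `/data`, and tar stores the member paths without the leading slash (as `data/...`), so extracting at `/` repopulates the mounted volume. `restore_volume` is a hypothetical name.

```bash
# Hypothetical restore counterpart to the manual backup command.
# Arguments: volume name, absolute directory holding the archive, archive file.
restore_volume() {
  local volume="$1" archive_dir="$2" archive="$3"
  # The archive directory is mounted read-only; the volume is mounted at
  # /data, which is where the data/... member paths land when extracted at /.
  docker run --rm \
    -v "$volume":/data \
    -v "$archive_dir":/backup:ro \
    alpine tar xzf "/backup/$archive" -C /
}
```

For example, `restore_volume postgres_data "$(pwd)" postgres.tar.gz`, run while the service that owns the volume is stopped.
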
### RTO/RPO

- **RTO**: 4 hours (Recovery Time Objective)
- **RPO**: 24 hours (Recovery Point Objective)

## Next Steps

1. Review [DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md) for deployment instructions
2. Review [MIGRATION_GUIDE.md](MIGRATION_GUIDE.md) if migrating from old structure
3. Setup environment files
4. Deploy to local first
5. Test in development
6. Deploy to production