Initial commit
docs/DEPLOYMENT_PLAN.md
# Unified Infrastructure Deployment Plan

## Executive Summary

This plan outlines the strategy to host both the **AI Tax Agent application** and **company services** (Nextcloud, Gitea, Portainer, Authentik) on the remote server at `141.136.35.199` while maintaining an efficient local development workflow.

## Current State Analysis

### Remote Server (`141.136.35.199`)
- **Location**: `/opt/compose/`
- **Existing Services**:
  - Traefik v3.5.1 (reverse proxy with GoDaddy DNS challenge)
  - Authentik 2025.8.1 (SSO/Authentication)
  - Gitea 1.24.5 (Git hosting)
  - Nextcloud (Cloud storage)
  - Portainer 2.33.1 (Docker management)
- **Networks**: `frontend` and `backend` (external)
- **Domain**: `harkon.co.uk`
- **SSL**: Let's Encrypt via GoDaddy DNS challenge
- **Exposed Subdomains**:
  - `traefik.harkon.co.uk`
  - `authentik.harkon.co.uk`
  - `gitea.harkon.co.uk`
  - `cloud.harkon.co.uk`
  - `portainer.harkon.co.uk`

### Local Repository (`infra/compose/`)
- **Compose Files**:
  - `docker-compose.local.yml` - Full stack for local development
  - `docker-compose.backend.yml` - Backend services (appears to be production-ready)
- **Application Services**:
  - 13+ microservices (svc-ingestion, svc-extract, svc-forms, svc-hmrc, etc.)
  - UI Review application
  - Infrastructure: Vault, MinIO, Qdrant, Neo4j, Postgres, Redis, NATS, Prometheus, Grafana, Loki
- **Networks**: `ai-tax-agent-frontend` and `ai-tax-agent-backend`
- **Domain**: `local.lan` (for development)
- **Authentication**: Authentik with ForwardAuth middleware

## Challenges & Conflicts

### 1. **Duplicate Services**
- Both environments have Traefik and Authentik
- Need to decide: shared vs. isolated

### 2. **Network Naming**
- Remote: `frontend`, `backend`
- Local: `ai-tax-agent-frontend`, `ai-tax-agent-backend`
- Production needs: Consistent naming

### 3. **Domain Management**
- Remote: `*.harkon.co.uk` (public)
- Local: `*.local.lan` (development)
- Production: Need subdomains like `app.harkon.co.uk`, `api.harkon.co.uk`

### 4. **SSL Certificates**
- Remote: GoDaddy DNS challenge (production)
- Local: Self-signed certificates
- Production: Must use GoDaddy DNS challenge

### 5. **Resource Isolation**
- Company services need to remain stable
- Application services need independent deployment/rollback

## Recommended Architecture

### Option A: Unified Traefik & Authentik (RECOMMENDED)

**Pros**:
- Single point of entry
- Shared authentication across all services
- Simplified SSL management
- Cost-effective (one Traefik, one Authentik)

**Cons**:
- Application deployments could affect company services
- Requires careful configuration management

**Implementation**:
```
/opt/compose/
├── traefik/              # Shared Traefik (existing)
├── authentik/            # Shared Authentik (existing)
├── company/              # Company services
│   ├── gitea/
│   ├── nextcloud/
│   └── portainer/
└── ai-tax-agent/         # Application services
    ├── infrastructure/   # App-specific infra (Vault, MinIO, Neo4j, etc.)
    └── services/         # Microservices
```

### Option B: Isolated Stacks

**Pros**:
- Complete isolation
- Independent scaling
- No cross-contamination

**Cons**:
- Duplicate Traefik/Authentik
- More complex SSL management
- Higher resource usage
- Users need separate logins

## Proposed Solution: Hybrid Approach

### Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                  Internet (*.harkon.co.uk)                  │
└────────────────────────┬────────────────────────────────────┘
                         │
                    ┌────▼────┐
                    │ Traefik │ (Port 80/443)
                    │ v3.5.1  │
                    └────┬────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
   ┌────▼─────┐     ┌────▼────┐      ┌────▼─────┐
   │Authentik │     │ Company │      │   App    │
   │   SSO    │     │Services │      │ Services │
   └──────────┘     └─────────┘      └──────────┘
                         │                │
                    ┌────┴────┐      ┌────┴────┐
                    │ Gitea   │      │ Vault   │
                    │Nextcloud│      │ MinIO   │
                    │Portainer│      │ Neo4j   │
                    └─────────┘      │ Qdrant  │
                                     │ Postgres│
                                     │ Redis   │
                                     │ NATS    │
                                     │ 13 SVCs │
                                     │ UI      │
                                     └─────────┘
```

### Directory Structure

```
/opt/compose/
├── traefik/                    # Shared reverse proxy
│   ├── compose.yaml
│   ├── config/
│   │   ├── traefik.yaml        # Static config
│   │   ├── dynamic-company.yaml
│   │   └── dynamic-app.yaml
│   └── certs/
├── authentik/                  # Shared SSO
│   ├── compose.yaml
│   └── ...
├── company/                    # Company services namespace
│   ├── gitea/
│   │   └── compose.yaml
│   ├── nextcloud/
│   │   └── compose.yaml
│   └── portainer/
│       └── compose.yaml
└── ai-tax-agent/               # Application namespace
    ├── .env                    # Production environment
    ├── infrastructure.yaml     # Vault, MinIO, Neo4j, Qdrant, etc.
    ├── services.yaml           # All microservices
    └── monitoring.yaml         # Prometheus, Grafana, Loki
```

### Network Strategy

**Shared Networks**:
- `frontend` - For all services exposed via Traefik
- `backend` - For internal service communication

**Application-Specific Networks** (optional):
- `ai-tax-agent-internal` - For app-only internal communication

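The shared networks are declared as external, so they have to exist before any stack that references them is started. A minimal sketch of an idempotent check follows; the helper name `ensure_network` is hypothetical, and it relies only on the `docker network inspect` and `docker network create` subcommands.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a Docker network only if it does not exist yet.
ensure_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network '$name' already exists"
  else
    docker network create "$name" >/dev/null
    echo "network '$name' created"
  fi
}

# Example (run on the remote server):
#   ensure_network frontend
#   ensure_network backend
```

Running the helper twice is harmless, which makes it safe to call at the top of a deployment script.
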
### Domain Mapping

**Company Services** (existing):
- `traefik.harkon.co.uk` - Traefik dashboard
- `authentik.harkon.co.uk` - Authentik SSO
- `gitea.harkon.co.uk` - Git hosting
- `cloud.harkon.co.uk` - Nextcloud
- `portainer.harkon.co.uk` - Docker management

**Application Services** (new):
- `app.harkon.co.uk` - Review UI
- `api.harkon.co.uk` - API Gateway (all microservices)
- `vault.harkon.co.uk` - Vault UI (admin only)
- `minio.harkon.co.uk` - MinIO Console (admin only)
- `neo4j.harkon.co.uk` - Neo4j Browser (admin only)
- `qdrant.harkon.co.uk` - Qdrant UI (admin only)
- `grafana.harkon.co.uk` - Grafana (monitoring)
- `prometheus.harkon.co.uk` - Prometheus (admin only)
- `loki.harkon.co.uk` - Loki (admin only)

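Each new subdomain needs a matching Traefik router rule. As a rough illustration, the sketch below prints the `Host()` rule for every application subdomain in the list; the helper `router_rule` is hypothetical, and `DOMAIN` is assumed to come from the production environment file.

```bash
#!/usr/bin/env bash
# Assumption: DOMAIN is taken from the production environment file.
DOMAIN="${DOMAIN:-harkon.co.uk}"

# Hypothetical helper: print the Traefik Host() rule for a subdomain.
router_rule() {
  printf 'Host(`%s.%s`)\n' "$1" "$DOMAIN"
}

for sub in app api vault minio neo4j qdrant grafana prometheus loki; do
  router_rule "$sub"
done
```

Generating the rules from one list keeps the router labels and the documented domain mapping from drifting apart.
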
### Authentication Strategy

**Authentik Configuration**:
1. **Company Group** - Access to Gitea, Nextcloud, Portainer
2. **App Admin Group** - Full access to all app services
3. **App User Group** - Access to Review UI and API
4. **App Reviewer Group** - Access to Review UI only

**Middleware Configuration**:
- `authentik-forwardauth` - Standard auth for all services
- `admin-auth` - Requires admin group (Vault, MinIO, Neo4j, etc.)
- `reviewer-auth` - Requires reviewer or higher
- `rate-limit` - Standard rate limiting
- `api-rate-limit` - Stricter API rate limiting

## Local Development Workflow

### Development Environment

**Keep Existing Setup**:
- Use `docker-compose.local.yml` as-is
- Domain: `*.local.lan`
- Self-signed certificates
- Isolated networks: `ai-tax-agent-frontend`, `ai-tax-agent-backend`
- Full stack runs locally

**Benefits**:
- No dependency on remote server
- Fast iteration
- Complete isolation
- Works offline

### Development Commands

```bash
# Local development
make bootstrap              # Initial setup
make up                     # Start all services
make down                   # Stop all services
make logs SERVICE=svc-ingestion

# Build and test
make build                  # Build all images
make test                   # Run tests
make test-integration       # Integration tests

# Deploy to production
make deploy-production      # Deploy to remote server
```

## Production Deployment Strategy

### Phase 1: Preparation (Week 1)

1. **Backup Current State**
   ```bash
   ssh deploy@141.136.35.199
   cd /opt/compose
   tar -czf ~/backup-$(date +%Y%m%d).tar.gz .
   ```

2. **Create Production Environment File**
   - Copy `infra/compose/env.example` to `infra/compose/.env.production`
   - Update all secrets and passwords
   - Set `DOMAIN=harkon.co.uk`
   - Configure GoDaddy API credentials

3. **Update Traefik Configuration**
   - Merge local Traefik config with remote
   - Add application routes
   - Configure Authentik ForwardAuth

4. **Prepare Docker Images**
   - Build all application images
   - Push to container registry (Gitea registry or Docker Hub)
   - Tag with version numbers

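Before moving on from Phase 1, it may help to confirm that the production environment file has no empty or missing entries. The sketch below is one way to do that; the helper `missing_env_keys` is hypothetical, and the key names other than `DOMAIN` are assumptions about what the file contains.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list keys that are missing or empty in an env file.
# Usage: missing_env_keys FILE KEY...
missing_env_keys() {
  local file="$1" key
  shift
  for key in "$@"; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    grep -Eq "^${key}=.+" "$file" || echo "$key"
  done
}

# Example (key names other than DOMAIN are assumptions):
#   missing_env_keys infra/compose/.env.production DOMAIN GODADDY_API_KEY GODADDY_API_SECRET
```

An empty result means every listed key has a value; any printed name points at an entry still to be filled in.
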
### Phase 2: Infrastructure Deployment (Week 2)

1. **Deploy Application Infrastructure**
   ```bash
   # On remote server
   cd /opt/compose/ai-tax-agent
   docker compose -f infrastructure.yaml up -d
   ```

2. **Initialize Services**
   - Vault: Unseal and configure
   - Postgres: Run migrations
   - Neo4j: Install plugins
   - MinIO: Create buckets

3. **Configure Authentik**
   - Create application groups
   - Configure OAuth providers
   - Set up ForwardAuth outpost

### Phase 3: Application Deployment (Week 3)

1. **Deploy Microservices**
   ```bash
   docker compose -f services.yaml up -d
   ```

2. **Deploy Monitoring**
   ```bash
   docker compose -f monitoring.yaml up -d
   ```

3. **Verify Health**
   - Check all service health endpoints
   - Verify Traefik routing
   - Test authentication flow

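Services rarely answer on the first request after `up -d`, so health verification usually needs a retry loop. A minimal sketch follows; the helper `retry` is hypothetical, and the `/health` path in the example is an assumption about the services rather than something this plan specifies.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Do not sleep after the final failed attempt.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (the /health path is an assumption about the services):
#   retry 10 5 curl -fsS https://api.harkon.co.uk/health
```

The helper returns non-zero once the attempts are exhausted, so a deployment script can stop instead of continuing with an unhealthy service.
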
### Phase 4: Testing & Validation (Week 4)

1. **Smoke Tests**
2. **Integration Tests**
3. **Performance Tests**
4. **Security Audit**

## Deployment Files Structure

Create three new compose files for production:

1. **`infrastructure.yaml`** - Vault, MinIO, Neo4j, Qdrant, Postgres, Redis, NATS
2. **`services.yaml`** - All 13 microservices + UI
3. **`monitoring.yaml`** - Prometheus, Grafana, Loki

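The three files map onto the order used in Phases 2 and 3: infrastructure first, then services, then monitoring. The sketch below wraps that order in a hypothetical `deploy_all` function; the `DRY_RUN` switch is an illustrative addition, not part of the plan.

```bash
#!/usr/bin/env bash
# Sketch: bring the three stacks up in dependency order.
# DRY_RUN=1 prints the commands instead of executing them.
deploy_all() {
  local file
  for file in infrastructure.yaml services.yaml monitoring.yaml; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker compose -f $file up -d"
    else
      # Stop at the first stack that fails to start.
      docker compose -f "$file" up -d || return 1
    fi
  done
}

# Example (run from /opt/compose/ai-tax-agent):
#   DRY_RUN=1 deploy_all
```

Stopping at the first failure avoids starting microservices against infrastructure that did not come up.
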
## Rollback Strategy

1. **Service-Level Rollback**: Use Docker image tags
2. **Full Rollback**: Restore from backup
3. **Gradual Rollout**: Deploy services incrementally

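A service-level rollback could look roughly like the sketch below. It assumes, hypothetically, that `services.yaml` sets each image as `...:${IMAGE_TAG}` so the tag can be overridden from the environment; the function name `rollback_service` and the `DRY_RUN` switch are also illustrative.

```bash
#!/usr/bin/env bash
# Assumption: services.yaml references images as "...:${IMAGE_TAG}".
# Usage: rollback_service SERVICE TAG
rollback_service() {
  local service="$1" tag="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "IMAGE_TAG=$tag docker compose -f services.yaml up -d --no-deps $service"
  else
    # --no-deps recreates only the named service, leaving its dependencies alone.
    IMAGE_TAG="$tag" docker compose -f services.yaml up -d --no-deps "$service"
  fi
}

# Example (the tag value is a placeholder):
#   DRY_RUN=1 rollback_service svc-ingestion 1.3.2
```

With a single shared tag variable, a later `up -d` without the override would move the service forward again, so a per-service tag variable may be the better fit if services are versioned independently.
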
## Monitoring & Maintenance

- **Logs**: Centralized in Loki
- **Metrics**: Prometheus + Grafana
- **Alerts**: Configure Grafana alerts
- **Backups**: Daily automated backups of volumes

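For the daily backups, a file-level sketch in the spirit of the Phase 1 `tar` command might look like this. The helper `backup_dir` is hypothetical, and the scheduling (for example from cron) is left as an assumption. Databases such as Postgres generally need a dump rather than a copy of live files to get a consistent backup.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive a directory into a dated tarball and verify it.
# Usage: backup_dir SOURCE_DIR DEST_DIR  (prints the archive path)
backup_dir() {
  local src="$1" dest="$2" archive
  archive="$dest/backup-$(date +%Y%m%d).tar.gz"
  tar -czf "$archive" -C "$src" . || return 1
  # Listing the archive catches truncated or corrupt files early.
  tar -tzf "$archive" >/dev/null || return 1
  echo "$archive"
}

# Example: back up the compose tree into the home directory.
#   backup_dir /opt/compose "$HOME"
```

The destination should sit outside the source directory so the archive does not try to include itself.
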
## Next Steps

1. Review and approve this plan
2. Create production environment file
3. Create production compose files
4. Set up CI/CD pipeline for automated deployment
5. Execute Phase 1 (Preparation)