# Unified Infrastructure Deployment Plan

## Executive Summary

This plan outlines the strategy to host both the **AI Tax Agent application** and **company services** (Nextcloud, Gitea, Portainer, Authentik) on the remote server at `141.136.35.199` while maintaining an efficient local development workflow.

## Current State Analysis

### Remote Server (`141.136.35.199`)

- **Location**: `/opt/compose/`
- **Existing Services**:
  - Traefik v3.5.1 (reverse proxy with GoDaddy DNS challenge)
  - Authentik 2025.8.1 (SSO/Authentication)
  - Gitea 1.24.5 (Git hosting)
  - Nextcloud (Cloud storage)
  - Portainer 2.33.1 (Docker management)
- **Networks**: `frontend` and `backend` (external)
- **Domain**: `harkon.co.uk`
- **SSL**: Let's Encrypt via GoDaddy DNS challenge
- **Exposed Subdomains**:
  - `traefik.harkon.co.uk`
  - `authentik.harkon.co.uk`
  - `gitea.harkon.co.uk`
  - `cloud.harkon.co.uk`
  - `portainer.harkon.co.uk`

### Local Repository (`infra/compose/`)

- **Compose Files**:
  - `docker-compose.local.yml` - Full stack for local development
  - `docker-compose.backend.yml` - Backend services (appears to be production-ready)
- **Application Services**:
  - 13+ microservices (svc-ingestion, svc-extract, svc-forms, svc-hmrc, etc.)
  - UI Review application
  - Infrastructure: Vault, MinIO, Qdrant, Neo4j, Postgres, Redis, NATS, Prometheus, Grafana, Loki
- **Networks**: `ai-tax-agent-frontend` and `ai-tax-agent-backend`
- **Domain**: `local.lan` (for development)
- **Authentication**: Authentik with ForwardAuth middleware

## Challenges & Conflicts

### 1. **Duplicate Services**

- Both environments have Traefik and Authentik
- Need to decide: shared vs. isolated

### 2. **Network Naming**

- Remote: `frontend`, `backend`
- Local: `ai-tax-agent-frontend`, `ai-tax-agent-backend`
- Production needs: Consistent naming

### 3. **Domain Management**

- Remote: `*.harkon.co.uk` (public)
- Local: `*.local.lan` (development)
- Production: Need subdomains like `app.harkon.co.uk`, `api.harkon.co.uk`

### 4. **SSL Certificates**

- Remote: GoDaddy DNS challenge (production)
- Local: Self-signed certificates
- Production: Must use GoDaddy DNS challenge

### 5. **Resource Isolation**

- Company services need to remain stable
- Application services need independent deployment/rollback

## Recommended Architecture

### Option A: Unified Traefik & Authentik (RECOMMENDED)

**Pros**:

- Single point of entry
- Shared authentication across all services
- Simplified SSL management
- Cost-effective (one Traefik, one Authentik)

**Cons**:

- Application deployments could affect company services
- Requires careful configuration management

**Implementation**:

```
/opt/compose/
├── traefik/              # Shared Traefik (existing)
├── authentik/            # Shared Authentik (existing)
├── company/              # Company services
│   ├── gitea/
│   ├── nextcloud/
│   └── portainer/
└── ai-tax-agent/         # Application services
    ├── infrastructure/   # App-specific infra (Vault, MinIO, Neo4j, etc.)
    └── services/         # Microservices
```

### Option B: Isolated Stacks

**Pros**:

- Complete isolation
- Independent scaling
- No cross-contamination

**Cons**:

- Duplicate Traefik/Authentik
- More complex SSL management
- Higher resource usage
- Users need separate logins

## Proposed Solution: Hybrid Approach

### Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                  Internet (*.harkon.co.uk)                  │
└────────────────────────┬────────────────────────────────────┘
                         │
                    ┌────▼────┐
                    │ Traefik │ (Port 80/443)
                    │ v3.5.1  │
                    └────┬────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
   ┌────▼─────┐     ┌────▼────┐      ┌────▼─────┐
   │Authentik │     │ Company │      │   App    │
   │   SSO    │     │Services │      │ Services │
   └──────────┘     └─────────┘      └──────────┘
                         │                │
                    ┌────┴────┐      ┌────┴────┐
                    │  Gitea  │      │  Vault  │
                    │Nextcloud│      │  MinIO  │
                    │Portainer│      │  Neo4j  │
                    └─────────┘      │ Qdrant  │
                                     │ Postgres│
                                     │  Redis  │
                                     │  NATS   │
                                     │ 13 SVCs │
                                     │   UI    │
                                     └─────────┘
```

### Directory Structure

```
/opt/compose/
├── traefik/                    # Shared reverse proxy
│   ├── compose.yaml
│   ├── config/
│   │   ├── traefik.yaml        # Static config
│   │   ├── dynamic-company.yaml
│   │   └── dynamic-app.yaml
│   └── certs/
├── authentik/                  # Shared SSO
│   ├── compose.yaml
│   └── ...
├── company/                    # Company services namespace
│   ├── gitea/
│   │   └── compose.yaml
│   ├── nextcloud/
│   │   └── compose.yaml
│   └── portainer/
│       └── compose.yaml
└── ai-tax-agent/               # Application namespace
    ├── .env                    # Production environment
    ├── infrastructure.yaml     # Vault, MinIO, Neo4j, Qdrant, etc.
    ├── services.yaml           # All microservices
    └── monitoring.yaml         # Prometheus, Grafana, Loki
```

### Network Strategy

**Shared Networks**:

- `frontend` - For all services exposed via Traefik
- `backend` - For internal service communication

**Application-Specific Networks** (optional):

- `ai-tax-agent-internal` - For app-only internal communication

### Domain Mapping

**Company Services** (existing):

- `traefik.harkon.co.uk` - Traefik dashboard
- `authentik.harkon.co.uk` - Authentik SSO
- `gitea.harkon.co.uk` - Git hosting
- `cloud.harkon.co.uk` - Nextcloud
- `portainer.harkon.co.uk` - Docker management

**Application Services** (new):

- `app.harkon.co.uk` - Review UI
- `api.harkon.co.uk` - API Gateway (all microservices)
- `vault.harkon.co.uk` - Vault UI (admin only)
- `minio.harkon.co.uk` - MinIO Console (admin only)
- `neo4j.harkon.co.uk` - Neo4j Browser (admin only)
- `qdrant.harkon.co.uk` - Qdrant UI (admin only)
- `grafana.harkon.co.uk` - Grafana (monitoring)
- `prometheus.harkon.co.uk` - Prometheus (admin only)
- `loki.harkon.co.uk` - Loki (admin only)

### Authentication Strategy

**Authentik Configuration**:

1. **Company Group** - Access to Gitea, Nextcloud, Portainer
2. **App Admin Group** - Full access to all app services
3. **App User Group** - Access to Review UI and API
4. **App Reviewer Group** - Access to Review UI only

**Middleware Configuration**:

- `authentik-forwardauth` - Standard auth for all services
- `admin-auth` - Requires admin group (Vault, MinIO, Neo4j, etc.)
- `reviewer-auth` - Requires reviewer or higher
- `rate-limit` - Standard rate limiting
- `api-rate-limit` - Stricter API rate limiting

## Local Development Workflow

### Development Environment

**Keep Existing Setup**:

- Use `docker-compose.local.yml` as-is
- Domain: `*.local.lan`
- Self-signed certificates
- Isolated networks: `ai-tax-agent-frontend`, `ai-tax-agent-backend`
- Full stack runs locally

**Benefits**:

- No dependency on remote server
- Fast iteration
- Complete isolation
- Works offline

### Development Commands

```bash
# Local development
make bootstrap          # Initial setup
make up                 # Start all services
make down               # Stop all services
make logs SERVICE=svc-ingestion

# Build and test
make build              # Build all images
make test               # Run tests
make test-integration   # Integration tests

# Deploy to production
make deploy-production  # Deploy to remote server
```

## Production Deployment Strategy

### Phase 1: Preparation (Week 1)

1. **Backup Current State**

   ```bash
   ssh deploy@141.136.35.199
   cd /opt/compose
   tar -czf ~/backup-$(date +%Y%m%d).tar.gz .
   ```

2. **Create Production Environment File**
   - Copy `infra/compose/env.example` to `infra/compose/.env.production`
   - Update all secrets and passwords
   - Set `DOMAIN=harkon.co.uk`
   - Configure GoDaddy API credentials

3. **Update Traefik Configuration**
   - Merge local Traefik config with remote
   - Add application routes
   - Configure Authentik ForwardAuth

4. **Prepare Docker Images**
   - Build all application images
   - Push to container registry (Gitea registry or Docker Hub)
   - Tag with version numbers

### Phase 2: Infrastructure Deployment (Week 2)

1. **Deploy Application Infrastructure**

   ```bash
   # On remote server
   cd /opt/compose/ai-tax-agent
   docker compose -f infrastructure.yaml up -d
   ```

2. **Initialize Services**
   - Vault: Unseal and configure
   - Postgres: Run migrations
   - Neo4j: Install plugins
   - MinIO: Create buckets

3. **Configure Authentik**
   - Create application groups
   - Configure OAuth providers
   - Set up ForwardAuth outpost

### Phase 3: Application Deployment (Week 3)

1. **Deploy Microservices**

   ```bash
   docker compose -f services.yaml up -d
   ```

2. **Deploy Monitoring**

   ```bash
   docker compose -f monitoring.yaml up -d
   ```

3. **Verify Health**
   - Check all service health endpoints
   - Verify Traefik routing
   - Test authentication flow

### Phase 4: Testing & Validation (Week 4)

1. **Smoke Tests**
2. **Integration Tests**
3. **Performance Tests**
4. **Security Audit**

## Deployment Files Structure

Create three new compose files for production:

1. **`infrastructure.yaml`** - Vault, MinIO, Neo4j, Qdrant, Postgres, Redis, NATS
2. **`services.yaml`** - All 13 microservices + UI
3. **`monitoring.yaml`** - Prometheus, Grafana, Loki

## Rollback Strategy

1. **Service-Level Rollback**: Use Docker image tags
2. **Full Rollback**: Restore from backup
3. **Gradual Rollout**: Deploy services incrementally

## Monitoring & Maintenance

- **Logs**: Centralized in Loki
- **Metrics**: Prometheus + Grafana
- **Alerts**: Configure Grafana alerts
- **Backups**: Daily automated backups of volumes

## Next Steps

1. Review and approve this plan
2. Create production environment file
3. Create production compose files
4. Set up CI/CD pipeline for automated deployment
5. Execute Phase 1 (Preparation)
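
The "Verify Health" step in Phase 3 could be scripted roughly as follows. This is a minimal sketch, not existing project tooling: the function names `app_hosts`, `check_host`, and `verify_app_hosts` are hypothetical, the subdomain list is copied from the Domain Mapping section, and it assumes `curl` is installed and every hostname answers over HTTPS at `/`. Hosts behind Authentik will usually answer with a redirect to the login page, which this check counts as reachable.

```bash
# Sketch only: function names are hypothetical; the subdomains are taken
# from the Domain Mapping section of this plan.

# Print the application hostnames for a base domain, one per line.
app_hosts() {
  local domain="$1" sub
  for sub in app api vault minio neo4j qdrant grafana prometheus loki; do
    printf '%s.%s\n' "$sub" "$domain"
  done
}

# Return 0 if the host answers over HTTPS with a status below 400.
# Redirects (for example to the Authentik login page) count as reachable.
check_host() {
  curl --fail --silent --show-error --output /dev/null --max-time 10 "https://$1/"
}

# Check every application host, print one line per host, and return
# non-zero if any host could not be reached.
verify_app_hosts() {
  local domain="$1" host failed=0
  for host in $(app_hosts "$domain"); do
    if check_host "$host"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Running `verify_app_hosts harkon.co.uk` after Phase 3 would print one `OK` or `FAIL` line per subdomain. A service that returns a 4xx status at `/` would need its own health path instead of the root URL.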
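
The service-level rollback listed under Rollback Strategy might look like the sketch below. It rests on an assumption this plan does not state: that `services.yaml` references images with a `${IMAGE_TAG}` variable so the tag can be supplied from the environment. `rollback_service` and `DRY_RUN` are illustrative names, not existing tooling.

```bash
# Sketch only: assumes services.yaml interpolates ${IMAGE_TAG} in its image
# references (an assumption, not something this plan specifies).

# Recreate a single service from a previously published image tag without
# restarting its dependencies. With DRY_RUN=1 the command is only printed.
rollback_service() {
  local service="$1" tag="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "IMAGE_TAG=$tag docker compose -f services.yaml up -d --no-deps $service"
    return 0
  fi
  IMAGE_TAG="$tag" docker compose -f services.yaml up -d --no-deps "$service"
}
```

For example, `DRY_RUN=1 rollback_service svc-ingestion v1.2.3` prints the command that would be run from `/opt/compose/ai-tax-agent`; dropping `DRY_RUN` executes it. Using `--no-deps` keeps the rollback scoped to one service, which matches the goal of independent deployment and rollback for application services.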