Isolated Stacks Deployment Plan
Executive Summary
This plan outlines the strategy to host both the AI Tax Agent application and company services (Nextcloud, Gitea, Portainer, Authentik) on the remote server at 141.136.35.199 while maintaining an efficient local development workflow.
Current State Analysis
Remote Server (141.136.35.199)
- Location: /opt/compose/
- Existing Services:
  - Traefik v3.5.1 (reverse proxy with GoDaddy DNS challenge)
  - Authentik 2025.8.1 (SSO/Authentication)
  - Gitea 1.24.5 (Git hosting)
  - Nextcloud (Cloud storage)
  - Portainer 2.33.1 (Docker management)
- Networks: frontend and backend (external)
- Domain: harkon.co.uk
- SSL: Let's Encrypt via GoDaddy DNS challenge
- Exposed Subdomains:
  - traefik.harkon.co.uk
  - auth.harkon.co.uk
  - gitea.harkon.co.uk
  - cloud.harkon.co.uk
  - portainer.harkon.co.uk
Local Repository (infra/compose/)
- Compose Files:
  - docker-compose.local.yml - Full stack for local development
  - docker-compose.backend.yml - Backend services (appears to be production-ready)
- Application Services:
  - 13+ microservices (svc-ingestion, svc-extract, svc-forms, svc-hmrc, etc.)
  - UI Review application
  - Infrastructure: Vault, MinIO, Qdrant, Neo4j, Postgres, Redis, NATS, Prometheus, Grafana, Loki
- Networks: ai-tax-agent-frontend and ai-tax-agent-backend
- Domain: local.lan (for development)
- Authentication: Authentik with ForwardAuth middleware
Challenges & Conflicts
1. Duplicate Services
- Both environments have Traefik and Authentik
- Need to decide: shared vs. isolated
2. Network Naming
- Remote: frontend, backend
- Local: ai-tax-agent-frontend, ai-tax-agent-backend
- Production needs: Consistent naming
3. Domain Management
- Remote: *.harkon.co.uk (public)
- Local: *.local.lan (development)
- Production: Need subdomains like app.harkon.co.uk, api.harkon.co.uk
4. SSL Certificates
- Remote: GoDaddy DNS challenge (production)
- Local: Self-signed certificates
- Production: Must use GoDaddy DNS challenge
5. Resource Isolation
- Company services need to remain stable
- Application services need independent deployment/rollback
Decision: Keep Stacks Completely Separate
We will deploy the company services and the AI Tax Agent as two fully isolated stacks, each with its own Traefik and Authentik. This maximizes blast-radius isolation and avoids naming and DNS conflicts across environments.
Key implications:
- Separate external networks and DNS namespaces per stack
- Duplicate edge (Traefik) and IdP (Authentik), independent upgrades and rollbacks
- Slightly higher resource usage in exchange for strong isolation
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│                  Internet (*.harkon.co.uk)                  │
└────────────────────────┬────────────────────────────────────┘
                         │
                    ┌────▼────┐
                    │ Traefik │ (Port 80/443)
                    │ v3.5.1  │
                    └────┬────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
   ┌────▼─────┐     ┌────▼────┐      ┌────▼─────┐
   │Authentik │     │ Company │      │   App    │
   │   SSO    │     │Services │      │ Services │
   └──────────┘     └─────────┘      └──────────┘
                         │                │
                    ┌────┴────┐      ┌────┴────┐
                    │  Gitea  │      │  Vault  │
                    │Nextcloud│      │  MinIO  │
                    │Portainer│      │  Neo4j  │
                    └─────────┘      │ Qdrant  │
                                     │ Postgres│
                                     │  Redis  │
                                     │  NATS   │
                                     │ 13 SVCs │
                                     │   UI    │
                                     └─────────┘
Directory Structure (per stack)
/opt/compose/<stack>/
├── traefik/                    # Stack-local reverse proxy
│   ├── compose.yaml
│   ├── config/
│   │   ├── traefik.yaml        # Static config
│   │   ├── dynamic-company.yaml
│   │   └── dynamic-app.yaml
│   └── certs/
├── authentik/                  # Stack-local SSO
│   ├── compose.yaml
│   └── ...
├── company/                    # Company services namespace
│   ├── gitea/
│   │   └── compose.yaml
│   ├── nextcloud/
│   │   └── compose.yaml
│   └── portainer/
│       └── compose.yaml
└── ai-tax-agent/               # Application namespace (if this is the app stack)
    ├── .env                    # Production environment
    ├── infrastructure.yaml     # Vault, MinIO, Neo4j, Qdrant, etc.
    ├── services.yaml           # All microservices
    └── monitoring.yaml         # Prometheus, Grafana, Loki
Network Strategy
- Use stack-scoped network names to avoid collisions: apa-frontend, apa-backend.
- Only attach services that must be public to apa-frontend.
- Keep internal communication on apa-backend.
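The naming rule above can be sketched as a small shell helper that creates the stack-scoped networks only when they are missing. The apa prefix and the frontend/backend split come from this plan; the function names are hypothetical, and the sketch assumes the Docker CLI is available on the host.

```shell
#!/usr/bin/env bash
# Sketch: create stack-scoped Docker networks if they do not exist yet.
# The "apa" prefix comes from this plan; function names are illustrative.

# Print the two network names for a stack prefix, one per line.
stack_networks() {
  local prefix="$1"
  printf '%s-frontend\n%s-backend\n' "$prefix" "$prefix"
}

# Create each network only when `docker network inspect` cannot find it.
ensure_stack_networks() {
  local prefix="$1"
  local net
  for net in $(stack_networks "$prefix"); do
    if docker network inspect "$net" >/dev/null 2>&1; then
      echo "exists: $net"
    else
      docker network create "$net" >/dev/null && echo "created: $net"
    fi
  done
}
```

Calling ensure_stack_networks apa is safe to repeat, since existing networks are left untouched.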
Domain Mapping
Company Services (existing):
- traefik.harkon.co.uk - Traefik dashboard
- auth.harkon.co.uk - Authentik SSO
- gitea.harkon.co.uk - Git hosting
- cloud.harkon.co.uk - Nextcloud
- portainer.harkon.co.uk - Docker management
Application Services (app stack):
- review.<domain> - Review UI
- api.<domain> - API Gateway (microservices via Traefik)
- vault.<domain> - Vault UI (admin only)
- minio.<domain> - MinIO Console (admin only)
- neo4j.<domain> - Neo4j Browser (admin only)
- qdrant.<domain> - Qdrant UI (admin only)
- grafana.<domain> - Grafana (monitoring)
- prometheus.<domain> - Prometheus (admin only)
- loki.<domain> - Loki (admin only)
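As a rough illustration of how the app-stack mapping could feed Traefik, the sketch below prints one Host() router rule label per subdomain. The subdomain list mirrors the mapping above; the apa-<name> router naming and the function names are assumptions, and the domain is passed in rather than fixed because the plan leaves <domain> open.

```shell
#!/usr/bin/env bash
# Sketch: derive Traefik router rule labels for the app-stack subdomains.
# Router names (apa-<sub>) and function names are assumptions.

APP_SUBDOMAINS="review api vault minio neo4j qdrant grafana prometheus loki"

# Print "<sub>.<domain>" for one subdomain.
app_host() {
  printf '%s.%s\n' "$1" "$2"
}

# Print a Traefik Host() rule label for every app subdomain.
app_router_labels() {
  local domain="$1"
  local sub
  for sub in $APP_SUBDOMAINS; do
    printf 'traefik.http.routers.apa-%s.rule=Host(`%s`)\n' \
      "$sub" "$(app_host "$sub" "$domain")"
  done
}
```

Running app_router_labels example.test prints nine label lines that could be pasted into the matching service definitions or compared against the dynamic config.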
Authentication Strategy
Authentik Configuration:
- Company Group - Access to Gitea, Nextcloud, Portainer
- App Admin Group - Full access to all app services
- App User Group - Access to Review UI and API
- App Reviewer Group - Access to Review UI only
Middleware Configuration:
- authentik-forwardauth - Standard auth for all services
- admin-auth - Requires admin group (Vault, MinIO, Neo4j, etc.)
- reviewer-auth - Requires reviewer or higher
- rate-limit - Standard rate limiting
- api-rate-limit - Stricter API rate limiting
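One possible way to apply these middlewares consistently is to map each access tier to a fixed chain and emit the router label from that. The middleware names are the ones listed above; the tier names, the @file provider suffix (which assumes the middlewares are defined in Traefik's file-based dynamic config), and the pairing of rate limiters with tiers are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: map an access tier to a Traefik middleware chain.
# Tier names, the @file suffix, and the pairings are assumptions.

middleware_chain() {
  case "$1" in
    admin)    echo "admin-auth@file,rate-limit@file" ;;
    reviewer) echo "reviewer-auth@file,rate-limit@file" ;;
    api)      echo "authentik-forwardauth@file,api-rate-limit@file" ;;
    *)        echo "authentik-forwardauth@file,rate-limit@file" ;;
  esac
}

# Print the middlewares label for one router.
router_middleware_label() {
  local router="$1"
  local tier="$2"
  printf 'traefik.http.routers.%s.middlewares=%s\n' \
    "$router" "$(middleware_chain "$tier")"
}
```

For example, router_middleware_label apa-vault admin yields a label that puts the admin check and the standard rate limit in front of the Vault UI.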
Implementation Notes
- infra/base/infrastructure.yaml now includes Traefik and Authentik in the infrastructure stack with stack-scoped networks and service names.
- All infrastructure component service keys and container names use the apa- prefix to avoid DNS collisions on shared Docker hosts.
- Traefik static and dynamic configs live under infra/base/traefik/config/.
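The prefix convention is easy to drift from, so a small lint step may help. The sketch below reads container names from standard input and prints any that lack the prefix; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list container names that do not start with "<prefix>-".
# Reads one name per line from stdin; prints nothing when all names comply.

unprefixed_names() {
  local prefix="$1"
  # grep exits non-zero when nothing matches; that is the "all good" case here.
  grep -v "^${prefix}-" || true
}
```

It could be fed from docker ps --format '{{.Names}}' | unprefixed_names apa. On a host that also runs the company stack, that stack's containers will appear in the output as well, so the input would need filtering to the app stack first.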
Local Development Workflow
Development Environment
Keep Existing Setup:
- Use docker-compose.local.yml as-is
- Domain: *.local.lan
- Self-signed certificates
- Isolated networks: ai-tax-agent-frontend, ai-tax-agent-backend
- Full stack runs locally
Benefits:
- No dependency on remote server
- Fast iteration
- Complete isolation
- Works offline
Development Commands
# Local development
make bootstrap # Initial setup
make up # Start all services
make down # Stop all services
make logs SERVICE=svc-ingestion
# Build and test
make build # Build all images
make test # Run tests
make test-integration # Integration tests
# Deploy to production
make deploy-production # Deploy to remote server
Production Deployment Strategy
Phase 1: Preparation (Week 1)
- Backup Current State
    ssh deploy@141.136.35.199
    cd /opt/compose
    tar -czf ~/backup-$(date +%Y%m%d).tar.gz .
- Create Production Environment File
  - Copy infra/compose/env.example to infra/compose/.env.production
  - Update all secrets and passwords
  - Set DOMAIN=harkon.co.uk
  - Configure GoDaddy API credentials
- Update Traefik Configuration
  - Merge local Traefik config with remote
  - Add application routes
  - Configure Authentik ForwardAuth
- Prepare Docker Images
  - Build all application images
  - Push to container registry (Gitea registry or Docker Hub)
  - Tag with version numbers
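Before moving on to Phase 2, it may be worth checking the production environment file mechanically. The sketch below reports required keys that are missing or still hold a placeholder. DOMAIN comes from this plan; the GoDaddy variable names and the changeme placeholder convention are assumptions about what env.example contains.

```shell
#!/usr/bin/env bash
# Sketch: fail fast if the production env file is incomplete.
# GODADDY_* names and the "changeme" placeholder are assumptions.

REQUIRED_KEYS="DOMAIN GODADDY_API_KEY GODADDY_API_SECRET"

# Print the value of KEY from an env file (last assignment wins).
env_value() {
  local key="$1"
  local file="$2"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Print one line per problem; return 1 if any problem was found.
check_env_file() {
  local file="$1"
  local status=0
  local key
  local value
  for key in $REQUIRED_KEYS; do
    value="$(env_value "$key" "$file")"
    if [ -z "$value" ]; then
      echo "missing: $key"
      status=1
    elif [ "$value" = "changeme" ]; then
      echo "placeholder: $key"
      status=1
    fi
  done
  return "$status"
}
```

A deploy script could call check_env_file infra/compose/.env.production and stop when it returns non-zero.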
Phase 2: Infrastructure Deployment (Week 2)
- Deploy Application Infrastructure
    # On remote server
    cd /opt/compose/ai-tax-agent
    docker compose -f infrastructure.yaml up -d
- Initialize Services
  - Vault: Unseal and configure
  - Postgres: Run migrations
  - Neo4j: Install plugins
  - MinIO: Create buckets
- Configure Authentik
  - Create application groups
  - Configure OAuth providers
  - Set up ForwardAuth outpost
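The initialization steps in this phase only make sense once the underlying containers are actually ready. A minimal sketch of a wait loop follows. It assumes each service defines a Docker healthcheck, and the container name used in the usage note is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: block until a container's healthcheck reports "healthy".
# Arguments: container name, max attempts (default 30), delay seconds (default 2).

wait_healthy() {
  local container="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=0
  local status=""
  while [ "$i" -lt "$attempts" ]; do
    # Health.Status is only populated when the image or service defines a healthcheck.
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || true)"
    if [ "$status" = "healthy" ]; then
      echo "healthy: $container"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timeout: $container (last status: ${status:-unknown})" >&2
  return 1
}
```

For example, wait_healthy apa-postgres && echo "run migrations" would gate the migration step on the database being ready.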
Phase 3: Application Deployment (Week 3)
- Deploy Microservices
    docker compose -f services.yaml up -d
- Deploy Monitoring
    docker compose -f monitoring.yaml up -d
- Verify Health
  - Check all service health endpoints
  - Verify Traefik routing
  - Test authentication flow
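The health verification could be scripted roughly as follows. The sketch requests each URL with curl and treats only HTTP 200 as healthy; the /healthz path and the hostnames in the usage note are assumptions, since this plan does not define the services' health endpoints.

```shell
#!/usr/bin/env bash
# Sketch: check a list of HTTP health endpoints.
# Only HTTP 200 counts as healthy; URLs are supplied by the caller.

# Print the HTTP status code for a URL (curl prints 000 when it cannot connect).
http_status() {
  curl -sS -o /dev/null -m 10 -w '%{http_code}' "$1" 2>/dev/null || true
}

# Print one result line per URL; return 1 if any endpoint failed.
check_endpoints() {
  local failed=0
  local url
  local code
  for url in "$@"; do
    code="$(http_status "$url")"
    if [ "$code" = "200" ]; then
      echo "ok   $url"
    else
      echo "FAIL $url (HTTP ${code:-none})"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as check_endpoints https://review.example.test/healthz https://api.example.test/healthz covers routing and service health in one pass. Routes protected by ForwardAuth will typically answer with a redirect or 401 for unauthenticated requests, so those would need either an unauthenticated health route or a different success criterion.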
Phase 4: Testing & Validation (Week 4)
- Smoke Tests
- Integration Tests
- Performance Tests
- Security Audit
Deployment Files Structure
Create three new compose files for production:
- infrastructure.yaml - Vault, MinIO, Neo4j, Qdrant, Postgres, Redis, NATS
- services.yaml - All 13 microservices + UI
- monitoring.yaml - Prometheus, Grafana, Loki
Rollback Strategy
- Service-Level Rollback: Use Docker image tags
- Full Rollback: Restore from backup
- Gradual Rollout: Deploy services incrementally
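A service-level rollback might look like the sketch below, which re-creates a single service from an earlier image tag without touching its dependencies. It assumes services.yaml references images with an ${IMAGE_TAG} variable; that variable name, the function name, and the tag in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: roll one service back to a previous image tag.
# Assumes services.yaml uses ${IMAGE_TAG} in its image references.

rollback_service() {
  local service="$1"
  local tag="$2"
  if [ -z "$service" ] || [ -z "$tag" ]; then
    echo "usage: rollback_service <service> <tag>" >&2
    return 2
  fi
  # --no-deps re-creates only the named service, leaving the rest running.
  IMAGE_TAG="$tag" docker compose -f services.yaml up -d --no-deps "$service"
}
```

For instance, rollback_service svc-ingestion v1.4.2 would restart only svc-ingestion on the older tag, which fits the goal of independent rollback per service.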
Monitoring & Maintenance
- Logs: Centralized in Loki
- Metrics: Prometheus + Grafana
- Alerts: Configure Grafana alerts
- Backups: Daily automated backups of volumes
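A daily backup job could follow the same backup-YYYYMMDD.tar.gz naming used in Phase 1. The sketch below archives a directory, verifies the archive is readable, and prunes older archives. The function names and the retention count are assumptions, and it covers directory trees only; named Docker volumes would need their data exported to a directory first.

```shell
#!/usr/bin/env bash
# Sketch: dated directory backup with simple retention.
# Function names and retention policy are illustrative.

# Archive SRC into DEST/backup-YYYYMMDD.tar.gz and print the archive path.
backup_dir() {
  local src="$1"
  local dest="$2"
  local stamp
  stamp="$(date +%Y%m%d)"
  local archive="$dest/backup-$stamp.tar.gz"
  # Listing the archive afterwards is a cheap integrity check.
  tar -czf "$archive" -C "$src" . && tar -tzf "$archive" >/dev/null && echo "$archive"
}

# Keep the KEEP newest archives in DEST (by date in the file name), remove the rest.
prune_backups() {
  local dest="$1"
  local keep="$2"
  ls -1 "$dest"/backup-*.tar.gz 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r old; do
      rm -f "$old"
      echo "removed: $old"
    done
}
```

Scheduled from cron, a pair of calls like backup_dir /opt/compose /var/backups followed by prune_backups /var/backups 7 would keep one week of archives; both paths here are placeholders.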
Next Steps
- Review and approve this plan
- Create production environment file
- Create production compose files
- Set up CI/CD pipeline for automated deployment
- Execute Phase 1 (Preparation)