ai-tax-agent/infra/DEPLOYMENT_GUIDE.md
harkon b324ff09ef
Initial commit
2025-10-11 08:41:36 +01:00


AI Tax Agent Infrastructure Deployment Guide

Complete guide for deploying AI Tax Agent infrastructure across all environments.

Table of Contents

  1. Prerequisites
  2. Quick Start
  3. Local Development
  4. Development Server
  5. Production Server
  6. Troubleshooting
  7. Maintenance
  8. Security Best Practices
  9. Support

Prerequisites

Required Software

  • Docker 24.0+ with Compose V2
  • Git
  • SSH access (for remote deployments)
  • Domain with DNS access (for dev/prod)

Required Accounts

  • GoDaddy account (for DNS-01 challenge)
  • Gitea account (for container registry)
  • OpenAI/Anthropic API keys (optional)

Network Requirements

  • Ports 80, 443 open (for Traefik)
  • Docker networks: frontend, backend

Quick Start

1. Clone Repository

git clone <repository-url>
cd ai-tax-agent

2. Choose Environment

# Local development
export ENV=local

# Development server
export ENV=development

# Production server
export ENV=production

3. Set Up Environment File

# Copy template
cp infra/environments/$ENV/.env.example infra/environments/$ENV/.env

# Edit configuration
vim infra/environments/$ENV/.env
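
The copy step above overwrites an existing `.env` if it is run twice, and a mistyped `ENV` produces a confusing path error. A minimal sketch of a guard follows, assuming the three environment names used in this guide; `validate_env` and `init_env_file` are hypothetical helpers, not scripts that ship with the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# validate_env: accept only the environment names used in this guide.
validate_env() {
  case "$1" in
    local|development|production) return 0 ;;
    *) echo "Unknown environment: '$1'" >&2; return 1 ;;
  esac
}

# init_env_file: copy .env.example to .env unless .env already exists.
# The second argument (environments root) defaults to infra/environments.
init_env_file() {
  local env="$1" root="${2:-infra/environments}"
  validate_env "$env" || return 1
  local target="$root/$env/.env"
  if [ -e "$target" ]; then
    echo "Keeping existing $target" >&2
    return 0
  fi
  cp "$root/$env/.env.example" "$target"
}

# Usage:
# init_env_file "$ENV"
```

Leaving an existing file untouched means previously generated secrets are not lost when the Quick Start is repeated.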

4. Generate Secrets (Dev/Prod only)

./scripts/generate-production-secrets.sh

5. Deploy

# Set up networks
./infra/scripts/setup-networks.sh

# Deploy all services
./infra/scripts/deploy.sh $ENV all

Local Development

Setup

  1. Create environment file:
cp infra/environments/local/.env.example infra/environments/local/.env
  2. Edit configuration:
vim infra/environments/local/.env

Key settings for local:

DOMAIN=localhost
POSTGRES_PASSWORD=postgres
MINIO_ROOT_PASSWORD=minioadmin
GRAFANA_PASSWORD=admin
  3. Generate self-signed certificates (optional):
./scripts/generate-dev-certs.sh

Deploy

# Set up networks
./infra/scripts/setup-networks.sh

# Deploy infrastructure
./infra/scripts/deploy.sh local infrastructure

# Deploy monitoring
./infra/scripts/deploy.sh local monitoring

# Deploy services
./infra/scripts/deploy.sh local services

Access Services

Development Workflow

  1. Make code changes
  2. Build images: ./scripts/build-and-push-images.sh localhost:5000 latest local
  3. Restart services: ./infra/scripts/deploy.sh local services
  4. Test changes
  5. Check logs: docker compose -f infra/base/services.yaml --env-file infra/environments/local/.env logs -f

Development Server

Prerequisites

  • Server with Docker installed
  • Domain: dev.harkon.co.uk
  • GoDaddy API credentials
  • SSH access to server

Setup

  1. SSH to development server:
ssh deploy@dev-server.harkon.co.uk
  2. Clone repository:
cd /opt
git clone <repository-url> ai-tax-agent
cd ai-tax-agent
  3. Create environment file:
cp infra/environments/development/.env.example infra/environments/development/.env
  4. Generate secrets:
./scripts/generate-production-secrets.sh
  5. Edit environment file:
vim infra/environments/development/.env

Update:

  • DOMAIN=dev.harkon.co.uk
  • EMAIL=dev@harkon.co.uk
  • API keys
  • Registry credentials
  6. Set up GoDaddy DNS:
# Create Traefik provider file
vim infra/configs/traefik/.provider.env

Add:

GODADDY_API_KEY=your-api-key
GODADDY_API_SECRET=your-api-secret
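
Because the provider file holds live DNS credentials, it is worth creating it with owner-only permissions. The following is a sketch under that assumption; `write_provider_env` is a hypothetical helper and the file layout simply mirrors the two variables shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# write_provider_env: write the GoDaddy credentials to FILE with mode 600.
write_provider_env() {
  local file="$1" key="$2" secret="$3"
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077
    printf 'GODADDY_API_KEY=%s\nGODADDY_API_SECRET=%s\n' "$key" "$secret" > "$file" )
  # Also tighten the mode if the file already existed with looser permissions.
  chmod 600 "$file"
}

# Usage (prompts keep the values out of shell history):
# read -r  -p 'API key: '    key
# read -rs -p 'API secret: ' secret; echo
# write_provider_env infra/configs/traefik/.provider.env "$key" "$secret"
```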

Deploy

# Set up networks
./infra/scripts/setup-networks.sh

# Deploy infrastructure
./infra/scripts/deploy.sh development infrastructure

# Give infrastructure services time to start (a fixed delay does not confirm they are healthy)
sleep 30

# Deploy monitoring
./infra/scripts/deploy.sh development monitoring

# Deploy services
./infra/scripts/deploy.sh development services
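
The fixed `sleep 30` in the sequence above can be too short on a slow host and needlessly long on a fast one. One alternative is to poll a readiness command until it succeeds. The sketch below assumes a container named `postgres` (the name used in Troubleshooting); `wait_for` is a hypothetical helper, not something `deploy.sh` is known to provide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# wait_for TIMEOUT INTERVAL COMMAND...
# Re-run COMMAND every INTERVAL seconds until it succeeds,
# or return 1 once TIMEOUT seconds have passed.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Usage between the infrastructure and monitoring steps:
# wait_for 120 5 docker exec postgres pg_isready -U postgres
```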

Verify Deployment

# Check services
docker ps

# Check logs
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/development/.env logs -f

# Test endpoints
curl https://vault.dev.harkon.co.uk
curl https://grafana.dev.harkon.co.uk

Access Services


Production Server

Prerequisites

  • Production server (141.136.35.199)
  • Domain: harkon.co.uk
  • Existing Traefik, Authentik, Gitea
  • SSH access as deploy user

Pre-Deployment Checklist

  • Backup existing data
  • Test in development first
  • Generate production secrets
  • Update DNS records
  • Configure Authentik OAuth providers
  • Set up Gitea container registry
  • Build and push Docker images

Setup

  1. SSH to production server:
ssh deploy@141.136.35.199
  2. Update the project checkout:
cd /opt/ai-tax-agent
git pull origin main
  3. Verify environment file:
grep '^DOMAIN=' infra/environments/production/.env

Should show:

DOMAIN=harkon.co.uk
  4. Verify secrets are set:
# Confirm that no CHANGE_ME placeholders remain
grep -i "CHANGE_ME" infra/environments/production/.env

Should return nothing.
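
The placeholder check above prints whole matching lines and relies on someone reading the output. A sketch of a scripted variant follows; it reports only the variable names and returns a non-zero status so it can gate a deploy. `check_env_file` is a hypothetical helper, and the `CHANGE_ME` marker is taken from the step above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# check_env_file FILE
# Return 1 if any uncommented KEY=VALUE line still contains CHANGE_ME.
# Only line numbers and key names are printed, never the values.
check_env_file() {
  local file="$1" bad
  bad=$(grep -niE '^[A-Za-z_][A-Za-z0-9_]*=.*CHANGE_ME' "$file" | cut -d= -f1 || true)
  if [ -n "$bad" ]; then
    echo "Placeholder values remain in $file:" >&2
    echo "$bad" >&2
    return 1
  fi
}

# Usage:
# check_env_file infra/environments/production/.env && echo "secrets look set"
```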

Deploy Infrastructure

# Set up networks (if not already created)
./infra/scripts/setup-networks.sh

# Deploy infrastructure services
./infra/scripts/deploy.sh production infrastructure

This deploys:

  • Vault (secrets management)
  • MinIO (object storage)
  • PostgreSQL (relational database)
  • Neo4j (graph database)
  • Qdrant (vector database)
  • Redis (cache)
  • NATS (message queue)

Deploy Monitoring

./infra/scripts/deploy.sh production monitoring

This deploys:

  • Prometheus (metrics)
  • Grafana (dashboards)
  • Loki (logs)
  • Promtail (log collector)

Deploy Services

./infra/scripts/deploy.sh production services

This deploys all 14 microservices.

Post-Deployment

  1. Verify all services are running:
docker ps | grep ai-tax-agent
  2. Check health:
curl https://vault.harkon.co.uk/v1/sys/health
curl https://minio-api.harkon.co.uk/minio/health/live
  3. Configure Authentik OAuth:
     • Create OAuth providers for each service
     • Update environment variables with client secrets
     • Restart services
  4. Initialize Vault:
# Access Vault
docker exec -it vault sh

# Initialize (first time only; store the unseal keys and root token securely)
vault operator init

# Unseal (repeat with different unseal keys until the threshold is reached; 3 of 5 by default)
vault operator unseal
  5. Set up MinIO buckets:
# Access MinIO console
# https://minio.harkon.co.uk

# Create buckets:
# - documents
# - embeddings
# - models
# - backups
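
For the health check in step 2, Vault's `/v1/sys/health` endpoint signals its state through the HTTP status code as well as the response body: by default 200 means initialized, unsealed and active, 429 means an unsealed standby, 501 means not initialized, and 503 means sealed. A small sketch that turns the code into a label, which helps decide whether step 4 is still needed; `vault_state` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# vault_state CODE
# Translate the HTTP status returned by /v1/sys/health into a short label.
vault_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown ($1)" ;;
  esac
}

# Usage against the endpoint from step 2 (requires network access):
# code=$(curl -s -o /dev/null -w '%{http_code}' https://vault.harkon.co.uk/v1/sys/health)
# vault_state "$code"
```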

Access Services

All services are available at https://<service>.harkon.co.uk.


Troubleshooting

Services Not Starting

# Check logs
docker compose -f infra/base/infrastructure.yaml --env-file infra/environments/production/.env logs -f

# Check specific service
docker logs vault

# Check Docker daemon
sudo systemctl status docker

Network Issues

# Check networks exist
docker network ls | grep -E "frontend|backend"

# Inspect network
docker network inspect frontend

# Recreate networks
docker network rm frontend backend
./infra/scripts/setup-networks.sh
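
Removing a network fails while containers are still attached to it, so recreating is not always possible on a running host. A non-destructive alternative is to create only the networks that are missing. This is a sketch that assumes plain default-driver networks; `setup-networks.sh` may apply options that are not visible in this guide, and `ensure_networks` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# ensure_networks NAME...
# Create each named Docker network only if it does not exist yet.
ensure_networks() {
  local net
  for net in "$@"; do
    if docker network inspect "$net" >/dev/null 2>&1; then
      echo "exists: $net"
    else
      docker network create "$net" >/dev/null && echo "created: $net"
    fi
  done
}

# Usage with the two networks named in this guide:
# ensure_networks frontend backend
```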

Traefik Routing Issues

# Check Traefik logs
docker logs traefik | grep -i error

# Check container labels
docker inspect vault | grep -A 20 Labels

# Check Traefik dashboard
https://traefik.harkon.co.uk/dashboard/

Database Connection Issues

# Check PostgreSQL
docker exec -it postgres psql -U postgres -c "\l"

# Check Neo4j (NEO4J_PASSWORD must be set in the current shell)
docker exec -it neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD"

# Check Redis
docker exec -it redis redis-cli ping

Volume/Data Issues

# List volumes
docker volume ls

# Inspect volume
docker volume inspect postgres_data

# Back up volume (stop the container first so the database files are consistent)
docker run --rm -v postgres_data:/data -v "$(pwd)":/backup alpine tar czf /backup/postgres_backup.tar.gz /data

SSL Certificate Issues

# Check Traefik logs for ACME errors
docker logs traefik | grep -i acme

# Check GoDaddy credentials
cat infra/configs/traefik/.provider.env

# Force certificate re-issuance (deletes all stored certificates; ACME rate limits apply)
docker exec traefik rm -rf /var/traefik/certs/acme.json
docker restart traefik

Maintenance

Update Services

# Pull latest code
git pull origin main

# Rebuild images
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.2 harkon

# Deploy updates
./infra/scripts/deploy.sh production services --pull

Backup Data

# Backup all volumes
./scripts/backup-volumes.sh production

# Back up a specific service volume
docker run --rm -v postgres_data:/data -v "$(pwd)":/backup alpine tar czf /backup/postgres_backup.tar.gz /data
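
The single-volume command above always writes `postgres_backup.tar.gz`, so each run replaces the previous archive. A sketch that adds a UTC timestamp to the file name and mounts the volume read-only follows; `backup_name` and `backup_volume` are hypothetical helpers, and nothing here describes what `backup-volumes.sh` does internally.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# backup_name VOLUME [STAMP]
# Build an archive name such as postgres_data_20251011T084136Z.tar.gz.
backup_name() {
  local volume="$1" stamp="${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
  echo "${volume}_${stamp}.tar.gz"
}

# backup_volume VOLUME [DEST_DIR]
# Archive a named Docker volume into DEST_DIR and print the archive path.
backup_volume() {
  local volume="$1" dest="${2:-$PWD}" name
  name=$(backup_name "$volume")
  docker run --rm -v "$volume":/data:ro -v "$dest":/backup alpine \
    tar czf "/backup/$name" -C /data . \
  && echo "$dest/$name"
}

# Usage:
# backup_volume postgres_data /opt/backups
```

A file-level copy of a running PostgreSQL data directory may be inconsistent; stopping the container first, or taking a logical dump with `pg_dump`, is the safer choice for database volumes.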

Scale Services

# Scale a service
docker compose -f infra/base/services.yaml --env-file infra/environments/production/.env up -d --scale svc-ingestion=3

View Logs

# All services
docker compose -f infra/base/services.yaml --env-file infra/environments/production/.env logs -f

# Specific service
docker logs -f svc-ingestion

# With Loki (via Grafana)
https://grafana.harkon.co.uk/explore

Security Best Practices

  1. Rotate secrets regularly - Use generate-production-secrets.sh
  2. Use Authentik SSO - Enable for all services
  3. Keep images updated - Regular security patches
  4. Monitor logs - Check for suspicious activity
  5. Backup regularly - Automated daily backups
  6. Use strong passwords - Minimum 32 characters
  7. Limit network exposure - Only expose necessary ports
  8. Enable audit logging - Track all access
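
For item 6, one way to produce a value of at least 32 characters without extra tooling is to read random bytes from `/dev/urandom` and hex-encode them. This is a sketch only; it is not known how `generate-production-secrets.sh` generates its values, and `gen_secret` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).

# gen_secret [BYTES]
# Print BYTES random bytes as lowercase hex (2 * BYTES characters).
# The default of 16 bytes yields a 32-character string.
gen_secret() {
  local bytes="${1:-16}"
  od -An -tx1 -N"$bytes" /dev/urandom | tr -d ' \n'
  echo
}

# Usage:
# POSTGRES_PASSWORD=$(gen_secret 24)   # 48 hex characters
```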

Support

For issues:

  1. Check logs
  2. Review documentation
  3. Check Traefik dashboard
  4. Verify environment variables
  5. Test in development first