# Deployment, linting, and infra configuration
## Current State Analysis

### Remote Server (`141.136.35.199`)

- **Location**: `/opt/compose/`
- **Existing Services**:
  - Traefik v3.5.1 (reverse proxy with GoDaddy DNS challenge)
- `portainer.harkon.co.uk`

### Local Repository (`infra/compose/`)

- **Compose Files**:
  - `docker-compose.local.yml` - Full stack for local development
  - `docker-compose.backend.yml` - Backend services (appears to be production-ready)
## Challenges & Conflicts

### 1. **Duplicate Services**

- Both environments have Traefik and Authentik
- Need to decide: shared vs. isolated

### 2. **Network Naming**

- Remote: `frontend`, `backend`
- Local: `ai-tax-agent-frontend`, `ai-tax-agent-backend`
- Production needs: consistent naming

### 3. **Domain Management**

- Remote: `*.harkon.co.uk` (public)
- Local: `*.local.lan` (development)
- Production: needs subdomains such as `app.harkon.co.uk` and `api.harkon.co.uk`

### 4. **SSL Certificates**

- Remote: GoDaddy DNS challenge (production)
- Local: self-signed certificates
- Production: must use GoDaddy DNS challenge
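For reference, a Traefik static-configuration sketch of the GoDaddy DNS challenge described above; the resolver name, contact email, and storage path are assumptions, and the GoDaddy API credentials are expected in environment variables:

```yaml
# Sketch only - resolver name, email, and storage path are assumptions.
certificatesResolvers:
  letsencrypt:
    acme:
      email: admin@harkon.co.uk        # assumed contact address
      storage: /certs/godaddy-acme.json
      dnsChallenge:
        provider: godaddy              # reads GODADDY_API_KEY / GODADDY_API_SECRET from env
```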
### 5. **Resource Isolation**

- Company services need to remain stable
- Application services need independent deployment/rollback
We will deploy the company services and the AI Tax Agent as two fully isolated stacks, each with its own Traefik and Authentik. This maximizes blast-radius isolation and avoids naming and DNS conflicts across environments.

Key implications:

- Separate external networks and DNS namespaces per stack
- Duplicate edge (Traefik) and IdP (Authentik), with independent upgrades and rollbacks
- Slightly higher resource usage in exchange for strong isolation
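One way to realize the per-stack network separation is for each stack's compose files to reference pre-created external networks; the network names below follow the naming discussed in this plan but are otherwise assumptions:

```yaml
# Hypothetical fragment for the app stack's compose files.
networks:
  ai-tax-agent-frontend:
    external: true   # created once with: docker network create ai-tax-agent-frontend
  ai-tax-agent-backend:
    external: true   # created once with: docker network create ai-tax-agent-backend
```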
### Domain Mapping

**Company Services** (existing):

- `traefik.harkon.co.uk` - Traefik dashboard
- `auth.harkon.co.uk` - Authentik SSO
- `gitea.harkon.co.uk` - Git hosting
- `portainer.harkon.co.uk` - Docker management

**Application Services** (app stack):

- `review.<domain>` - Review UI
- `api.<domain>` - API Gateway (microservices via Traefik)
- `vault.<domain>` - Vault UI (admin only)
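For illustration, a compose-label sketch showing how one app-stack domain could be wired through Traefik; the router name and cert resolver are assumptions:

```yaml
services:
  ui-review:
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.ui-review.rule=Host(`review.harkon.co.uk`)"
      - "traefik.http.routers.ui-review.entrypoints=websecure"
      - "traefik.http.routers.ui-review.tls.certresolver=letsencrypt"
```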
### Authentication Strategy

**Authentik Configuration**:

1. **Company Group** - Access to Gitea, Nextcloud, Portainer
2. **App Admin Group** - Full access to all app services
3. **App User Group** - Access to Review UI and API
4. **App Reviewer Group** - Access to Review UI only

**Middleware Configuration**:

- `authentik-forwardauth` - Standard auth for all services
- `admin-auth` - Requires admin group (Vault, MinIO, Neo4j, etc.)
- `reviewer-auth` - Requires reviewer or higher
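A minimal sketch of how the `authentik-forwardauth` middleware could be declared in a Traefik dynamic-config file; the outpost address and response headers are assumptions based on Authentik's standard Traefik integration:

```yaml
http:
  middlewares:
    authentik-forwardauth:
      forwardAuth:
        # Assumed Authentik outpost address; adjust to your deployment.
        address: "http://authentik-server:9000/outpost.goauthentik.io/auth/traefik"
        trustForwardHeader: true
        authResponseHeaders:
          - X-authentik-username
          - X-authentik-groups
```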
### Development Environment

**Keep Existing Setup**:

- Use `docker-compose.local.yml` as-is
- Domain: `*.local.lan`
- Self-signed certificates
- Full stack runs locally

**Benefits**:

- No dependency on remote server
- Fast iteration
- Complete isolation
### Phase 1: Preparation (Week 1)

1. **Backup Current State**

   ```bash
   ssh deploy@141.136.35.199
   cd /opt
   tar -czf ~/backup-$(date +%Y%m%d).tar.gz .
   ```

2. **Create Production Environment File**

   - Copy `infra/environments/production/.env.example` to `infra/environments/production/.env`
   - Update all secrets and passwords
   - Set `DOMAIN=harkon.co.uk`
   - Configure GoDaddy API credentials
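A hypothetical `.env` fragment illustrating the settings listed above; the GoDaddy variable names are assumptions, and real secrets should of course never be committed:

```bash
# infra/environments/production/.env (illustrative values only)
DOMAIN=harkon.co.uk
GODADDY_API_KEY=<your-godaddy-api-key>
GODADDY_API_SECRET=<your-godaddy-api-secret>
POSTGRES_PASSWORD=<generated-secret>
```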
3. **Update Traefik Configuration**

   - Merge local Traefik config with remote
   - Add application routes
   - Configure Authentik ForwardAuth
### Phase 2: Infrastructure Deployment (Week 2)

1. **Deploy Application Infrastructure**

   ```bash
   # On remote server
   cd /opt/ai-tax-agent
   docker compose -f infrastructure.yaml up -d
   ```

2. **Initialize Services**

   - Vault: Unseal and configure
   - Postgres: Run migrations
   - Neo4j: Install plugins
### Phase 3: Application Deployment (Week 3)

1. **Deploy Microservices**

   ```bash
   docker compose -f services.yaml up -d
   ```

2. **Deploy Monitoring**

   ```bash
   docker compose -f monitoring.yaml up -d
   ```
### 1. Production Compose Files Created

Created three production-ready Docker Compose files in `infra/base/`:

#### **infrastructure.yaml**

- Vault (secrets management)
### 3. Documentation Created

#### **infra/base manifests**

Comprehensive production deployment guide, including:

- Prerequisites checklist
- Three deployment options (automated, step-by-step, manual)
1. **Initialize Vault**

   ```bash
   ssh deploy@141.136.35.199
   cd /opt/ai-tax-agent
   docker exec -it vault vault operator init
   # Save the unseal keys!
   docker exec -it vault vault operator unseal
   ```
If you encounter issues:

1. Check logs: `./scripts/deploy-to-production.sh logs <service>`
2. Verify status: `./scripts/deploy-to-production.sh verify`
3. Review manifests: `infra/base/*.yaml`
4. Check deployment plan: `docs/DEPLOYMENT_PLAN.md`
5. Follow checklist: `docs/DEPLOYMENT_CHECKLIST.md`
- ✅ Created quick start guide (`docs/QUICK_START.md`)

### 3. Production Configuration Files

- ✅ Created `infra/base/infrastructure.yaml` (infrastructure, incl. Traefik + Authentik)
- ✅ Created `infra/base/services.yaml` (application services + UI)
- ✅ Created `infra/base/monitoring.yaml` (Prometheus, Grafana, Loki, Promtail)

### 4. Monitoring Configuration

- ✅ Created Prometheus configuration (`infra/base/prometheus/prometheus.yml`)
- ✅ Created Loki configuration (`infra/base/loki/loki-config.yml`)
- ✅ Created Promtail configuration (`infra/base/promtail/promtail-config.yml`)
- ✅ Configured service discovery for all 14 services
- ✅ Set up 30-day metrics retention
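As a sketch of what that service discovery can look like, here is a trimmed `prometheus.yml` fragment; the service names and port are assumptions, and the 30-day retention is set via Prometheus's `--storage.tsdb.retention.time=30d` startup flag rather than in this file:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "ai-tax-agent-services"
    metrics_path: /metrics
    static_configs:
      - targets:              # one entry per service; port is an assumption
          - "svc-ingestion:8000"
          - "svc-extract:8000"
          - "svc-kg:8000"
```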
- `docs/ENVIRONMENT_COMPARISON.md` - Local vs. production comparison

2. **Configuration:**
   - `infra/base/infrastructure.yaml` - Infrastructure services
   - `infra/base/services.yaml` - Application services
   - `infra/base/monitoring.yaml` - Monitoring stack

3. **Deployment:**
   - `docs/POST_BUILD_DEPLOYMENT.md` - Post-build deployment steps

- 🟡 In Progress
- ⏳ Pending
- ❌ Blocked
| **SSL** | Self-signed certificates | Let's Encrypt (GoDaddy DNS) |
| **Networks** | `ai-tax-agent-frontend`<br/>`ai-tax-agent-backend` | `frontend`<br/>`backend` |
| **Compose File** | `docker-compose.local.yml` | `infrastructure.yaml`<br/>`services.yaml`<br/>`monitoring.yaml` |
| **Location** | Local machine | `deploy@141.136.35.199:/opt/ai-tax-agent/` |
| **Traefik** | Isolated instance | Shared with company services |
| **Authentik** | Isolated instance | Shared with company services |
| **Data Persistence** | Local Docker volumes | Remote Docker volumes + backups |
#### Production

```bash
# Deploy infrastructure
cd /opt/ai-tax-agent
docker compose -f infrastructure.yaml up -d

# Deploy services
docker compose -f services.yaml up -d
```
4. **Deploy to production**:

   ```bash
   ssh deploy@141.136.35.199
   cd /opt/ai-tax-agent
   docker compose -f services.yaml pull
   docker compose -f services.yaml up -d
   ```

6. **Backups**: Local has none; production has automated backups

Both environments use the same application code and Docker images, ensuring consistency and reducing deployment risks.
# Gitea Container Registry Debugging Guide

## Common Issues When Pushing Large Docker Images

### Issue 1: Not Logged In

**Symptom**: `unauthorized: authentication required`

**Solution**:

```bash
# On remote server
docker login gitea.harkon.co.uk
# Username: blue (or your Gitea username)
# Password: <your-gitea-access-token>
```

---
### Issue 2: Upload Size Limit (413 Request Entity Too Large)

**Symptom**: Push fails with `413 Request Entity Too Large` or a similar error

**Root Cause**: Traefik or Gitea enforces a limit on request body size

**Solution A: Configure Traefik Middleware**

1. Find your Traefik configuration directory:

   ```bash
   docker inspect traefik | grep -A 10 Mounts
   ```

2. Create the middleware configuration:

   ```bash
   # Example: /opt/traefik/config/middlewares.yml
   sudo tee /opt/traefik/config/middlewares.yml > /dev/null << 'EOF'
   http:
     middlewares:
       large-upload:
         buffering:
           maxRequestBodyBytes: 5368709120   # 5GB
           memRequestBodyBytes: 104857600    # 100MB
           maxResponseBodyBytes: 5368709120  # 5GB
           memResponseBodyBytes: 104857600   # 100MB
   EOF
   ```

3. Update the Gitea container labels:

   ```yaml
   labels:
     - "traefik.http.routers.gitea.middlewares=large-upload@file"
   ```

4. Restart Traefik:

   ```bash
   docker restart traefik
   ```

**Solution B: Configure Gitea Directly**

1. Edit the Gitea configuration:

   ```bash
   docker exec -it gitea-server vi /data/gitea/conf/app.ini
   ```

2. Add or modify these settings:

   ```ini
   [server]
   LFS_MAX_FILE_SIZE = 5368709120 ; 5GB

   [repository.upload]
   FILE_MAX_SIZE = 5368709120 ; 5GB
   ```

3. Restart Gitea:

   ```bash
   docker restart gitea-server
   ```

---
### Issue 3: Network Timeout

**Symptom**: Push hangs or times out partway through the upload

**Root Cause**: Network instability or a slow connection

**Solution**: Use chunked uploads or increase the timeout

1. Configure the Docker daemon:

   ```bash
   # Edit /etc/docker/daemon.json
   sudo tee /etc/docker/daemon.json > /dev/null << 'EOF'
   {
     "max-concurrent-uploads": 1,
     "max-concurrent-downloads": 3,
     "registry-mirrors": []
   }
   EOF

   sudo systemctl restart docker
   ```

2. Or use a Traefik retry middleware:

   ```yaml
   http:
     middlewares:
       long-timeout:
         buffering:
           retryExpression: "IsNetworkError() && Attempts() < 3"
   ```

---
### Issue 4: Disk Space

**Symptom**: Push fails with "no space left on device"

**Solution**:

```bash
# Check disk space
df -h

# Clean up Docker
docker system prune -a --volumes -f

# Check again
df -h
```

---
### Issue 5: Gitea Registry Not Enabled

**Symptom**: `404 Not Found` when accessing `/v2/`

**Solution**:

```bash
# Check whether the registry is enabled
docker exec gitea-server cat /data/gitea/conf/app.ini | grep -A 5 "\[packages\]"

# Should show:
# [packages]
# ENABLED = true
```

If it is not enabled, add this to `app.ini`:

```ini
[packages]
ENABLED = true
```

Restart Gitea:

```bash
docker restart gitea-server
```

---
## Debugging Steps

### Step 1: Verify the Gitea Registry is Accessible

```bash
# Should return 401 Unauthorized (which is good - it means the registry is working)
curl -I https://gitea.harkon.co.uk/v2/

# Should return 200 OK after login
docker login gitea.harkon.co.uk
curl -u "username:token" https://gitea.harkon.co.uk/v2/
```

### Step 2: Test with a Small Image

```bash
# Pull a small image
docker pull alpine:latest

# Tag it for your registry
docker tag alpine:latest gitea.harkon.co.uk/harkon/test:latest

# Try to push
docker push gitea.harkon.co.uk/harkon/test:latest
```

If this works, the issue is specific to large images (a size limit).

### Step 3: Check Gitea Logs

```bash
# Check for errors
docker logs gitea-server --tail 100 | grep -i error

# Watch logs in real time while pushing
docker logs -f gitea-server
```

### Step 4: Check Traefik Logs

```bash
# Check for 413 or 502 errors
docker logs traefik --tail 100 | grep -E "413|502|error"

# Watch logs in real time
docker logs -f traefik
```

### Step 5: Check Docker Daemon Logs

```bash
sudo journalctl -u docker --since "1 hour ago" | grep -i error
```

---
## Quick Fix: Bypass Traefik for the Registry

If Traefik is causing issues, you can expose Gitea's registry directly:

1. Update the Gitea docker-compose file to expose port 3000:

   ```yaml
   services:
     gitea:
       ports:
         - "3000:3000" # HTTP
   ```

2. Use a direct connection:

   ```bash
   docker login gitea.harkon.co.uk:3000
   docker push gitea.harkon.co.uk:3000/harkon/base-ml:v1.0.1
   ```

**Note**: This bypasses SSL, so use it only for debugging!

---
## Recommended Configuration for Large Images

### Traefik Configuration

Create `/opt/traefik/config/gitea-registry.yml`:

```yaml
http:
  middlewares:
    gitea-registry:
      buffering:
        maxRequestBodyBytes: 5368709120   # 5GB
        memRequestBodyBytes: 104857600    # 100MB in memory
        maxResponseBodyBytes: 5368709120  # 5GB
        memResponseBodyBytes: 104857600   # 100MB in memory

  routers:
    gitea-registry:
      rule: "Host(`gitea.harkon.co.uk`) && PathPrefix(`/v2/`)"
      entryPoints:
        - websecure
      middlewares:
        - gitea-registry
      service: gitea
      tls:
        certResolver: letsencrypt
```

### Gitea Configuration

In `/data/gitea/conf/app.ini`:

```ini
[server]
PROTOCOL = http
DOMAIN = gitea.harkon.co.uk
ROOT_URL = https://gitea.harkon.co.uk/
HTTP_PORT = 3000
LFS_MAX_FILE_SIZE = 5368709120

[repository.upload]
FILE_MAX_SIZE = 5368709120
ENABLED = true

[packages]
ENABLED = true
CHUNKED_UPLOAD_PATH = /data/gitea/tmp/package-upload
```

---
## Testing the Fix

After applying the configuration changes:

1. Restart the services:

   ```bash
   docker restart traefik
   docker restart gitea-server
   ```

2. Test with a large layer:

   ```bash
   # Build base-ml (has large layers)
   cd /home/deploy/ai-tax-agent
   docker build -f infra/docker/base-ml.Dockerfile -t gitea.harkon.co.uk/harkon/base-ml:test .

   # Try to push
   docker push gitea.harkon.co.uk/harkon/base-ml:test
   ```

3. Monitor the logs:

   ```bash
   # Terminal 1: Watch Traefik
   docker logs -f traefik

   # Terminal 2: Watch Gitea
   docker logs -f gitea-server

   # Terminal 3: Push the image
   docker push gitea.harkon.co.uk/harkon/base-ml:test
   ```

---
## Alternative: Use Docker Hub or GitHub Container Registry

If Gitea continues to have issues with large images, consider:

1. **Docker Hub**: Free for public images
2. **GitHub Container Registry (ghcr.io)**: Free for public and private images
3. **GitLab Container Registry**: Free tier available

These registries are battle-tested for large ML images and have better defaults for large uploads.
# Gitea Container Registry - Image Naming Fix

## Issue

The initial build script used an incorrect image naming convention for Gitea's container registry.

### Incorrect Format

```
gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0
```

### Correct Format (Per Gitea Documentation)

```
gitea.harkon.co.uk/{owner}/{image}:{tag}
```

Here `{owner}` must be your **Gitea username** or **organization name**.

**Using organization:** `harkon` (Gitea team/organization)
## Solution

Updated the build script and production compose files to use the correct naming convention.

### Changes Made

#### 1. Build Script (`scripts/build-and-push-images.sh`)

**Before:**

```bash
REGISTRY="${1:-gitea.harkon.co.uk}"
VERSION="${2:-latest}"
PROJECT="ai-tax-agent"

IMAGE_NAME="$REGISTRY/$PROJECT/$service:$VERSION"
```

**After:**

```bash
REGISTRY="${1:-gitea.harkon.co.uk}"
VERSION="${2:-latest}"
OWNER="${3:-harkon}" # Gitea organization/team name

IMAGE_NAME="$REGISTRY/$OWNER/$service:$VERSION"
```
#### 2. Production Services (`infra/compose/production/services.yaml`)

**Before:**

```yaml
svc-ingestion:
  image: gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:latest
```

**After:**

```yaml
svc-ingestion:
  image: gitea.harkon.co.uk/harkon/svc-ingestion:latest
```

All 14 services updated:

- svc-ingestion
- svc-extract
- svc-kg
- svc-rag-retriever
- svc-rag-indexer
- svc-forms
- svc-hmrc
- svc-ocr
- svc-rpa
- svc-normalize-map
- svc-reason
- svc-firm-connectors
- svc-coverage
- ui-review
## Usage

### Build and Push Images

```bash
# With the default owner (harkon organization)
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1

# With a custom owner
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 <your-gitea-org>
```

### Pull Images

```bash
docker pull gitea.harkon.co.uk/harkon/svc-ingestion:v1.0.1
```

### Push Images Manually

```bash
# Tag the image
docker tag my-image:latest gitea.harkon.co.uk/harkon/my-image:v1.0.1

# Push the image
docker push gitea.harkon.co.uk/harkon/my-image:v1.0.1
```
## Gitea Registry Documentation Reference

From Gitea's official documentation:

### Image Naming Convention

Images must follow this naming convention:

```
{registry}/{owner}/{image}
```

When building your Docker image with this convention, it looks like:

```bash
# Build an image with a tag
docker build -t {registry}/{owner}/{image}:{tag} .

# Name an existing image with a tag
docker tag {some-existing-image}:{tag} {registry}/{owner}/{image}:{tag}
```

### Valid Examples

For owner `testuser` on `gitea.example.com`:

- ✅ `gitea.example.com/testuser/myimage`
- ✅ `gitea.example.com/testuser/my-image`
- ✅ `gitea.example.com/testuser/my/image`

### Important Notes

1. **Owner must exist**: The owner (username or organization) must exist in Gitea
2. **Case-insensitive tags**: `image:tag` and `image:Tag` are treated as the same tag
3. **Authentication required**: Use a personal access token with the `write:package` scope
4. **Registry URL**: Use the main Gitea domain, not a separate registry subdomain
## Verification

After the fix, verify that images are pushed correctly:

```bash
# Log in to Gitea
docker login gitea.harkon.co.uk

# Check the pushed images in the Gitea UI
# Navigate to: https://gitea.harkon.co.uk/blue/-/packages
```
## Current Build Status

✅ **Fixed and working!**

Build command:

```bash
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon
```

Expected output:

```
ℹ️ Logging in to registry: gitea.harkon.co.uk
Login Succeeded
ℹ️ Building svc-ingestion...
ℹ️ Building: gitea.harkon.co.uk/harkon/svc-ingestion:v1.0.1
✅ Built: gitea.harkon.co.uk/harkon/svc-ingestion:v1.0.1
ℹ️ Pushing: gitea.harkon.co.uk/harkon/svc-ingestion:v1.0.1
✅ Pushed: gitea.harkon.co.uk/harkon/svc-ingestion:v1.0.1
```
## Next Steps

1. ✅ Build script fixed
2. ✅ Production compose files updated
3. 🟡 Build in progress (14 services)
4. ⏳ Deploy to production (after the build completes)

## References

- [Gitea Container Registry Documentation](https://docs.gitea.com/usage/packages/container)
- Build script: `scripts/build-and-push-images.sh`
- Production services: `infra/compose/production/services.yaml`
### 5. Update Production Deployment

Update `infra/base/services.yaml` to use `v1.0.1`:

```bash
# Find and replace v1.0.0 with v1.0.1
sed -i '' 's/:v1.0.0/:v1.0.1/g' infra/base/services.yaml

# Or use the latest tag (already configured)
# No changes needed if using :latest
```
**SSH to server:**

```bash
ssh deploy@141.136.35.199
cd /opt/ai-tax-agent
```

**Initialize Vault:**

**Create MinIO Buckets:**

```bash
docker exec -it apa-minio mc alias set local http://localhost:9000 admin <MINIO_PASSWORD>
docker exec -it apa-minio mc mb local/documents
docker exec -it apa-minio mc mb local/models
```
**Create NATS Streams:**

```bash
docker exec -it apa-nats nats stream add TAX_AGENT_EVENTS \
  --subjects="tax.>" --storage=file --retention=limits --max-age=7d
```
**Configure Authentik:**

1. Go to https://auth.harkon.co.uk
2. Create groups: `app-admin`, `app-user`, `app-reviewer`
3. Create OAuth providers for:
   - Review UI: `app.harkon.co.uk`
```bash
curl -I https://api.harkon.co.uk/healthz
curl -I https://grafana.harkon.co.uk

# View logs
./scripts/deploy-to-production.sh logs apa-svc-ingestion
```

---
### Restart Service

```bash
ssh deploy@141.136.35.199
cd /opt/ai-tax-agent
docker compose -f services.yaml restart apa-svc-ingestion
```

### Check Status
```bash
docker compose -f services.yaml logs svc-ingestion
docker compose -f infrastructure.yaml ps

# Restart
docker compose -f services.yaml restart apa-svc-ingestion
```

### SSL Issues

```bash
# Check Traefik logs
docker logs apa-traefik

# Check certificates
sudo cat /opt/ai-tax-agent/traefik/certs/godaddy-acme.json | jq
```

### Database Connection

```bash
# Test Postgres
docker exec -it apa-postgres pg_isready -U postgres

# Check env vars
docker exec -it apa-svc-ingestion env | grep POSTGRES
```

---
@@ -190,7 +190,7 @@ docker exec -it svc-ingestion env | grep POSTGRES
 ```bash
 ssh deploy@141.136.35.199
-cd /opt/compose/ai-tax-agent
+cd /opt/ai-tax-agent

 # Stop services
 docker compose -f services.yaml down
@@ -198,12 +198,11 @@ docker compose -f infrastructure.yaml down
 docker compose -f monitoring.yaml down

 # Restore backup
-cd /opt/compose
+cd /opt
 tar -xzf ~/backups/backup-YYYYMMDD-HHMMSS.tar.gz

-# Restart company services
-cd /opt/compose/traefik && docker compose up -d
-cd /opt/compose/authentik && docker compose up -d
+# Restart application infra
+cd /opt/ai-tax-agent && docker compose -f infrastructure.yaml up -d
 ```

 ---
@@ -242,4 +241,3 @@ cd /opt/compose/authentik && docker compose up -d
 ```bash
 ./scripts/deploy-to-production.sh logs <service>
 ```

docs/SRE.md (Normal file, 555 lines)
@@ -0,0 +1,555 @@

# ROLE

You are a **Senior Platform Engineer + Backend Lead** generating **production code** and **ops assets** for a microservice suite that powers an accounting Knowledge Graph + Vector RAG platform. Authentication/authorization are centralized at the **edge via Traefik + Authentik** (ForwardAuth). **Services are trust-bound** to Traefik and consume user/role claims via forwarded headers/JWT.

# MISSION

Produce fully working code for **all application services** (FastAPI + Python 3.12) with:

- Solid domain models, Pydantic v2 schemas, type hints, strict mypy, ruff lint.
- OpenTelemetry tracing, Prometheus metrics, structured logging.
- Vault-backed secrets, MinIO S3 client, Qdrant client, Neo4j driver, Postgres (SQLAlchemy), Redis.
- Eventing (Kafka or SQS/SNS behind an interface).
- Deterministic data contracts, end-to-end tests, Dockerfiles, Compose, CI for Gitea.
- Traefik labels + Authentik Outpost integration for every exposed route.
- Zero PII in vectors (Qdrant), evidence-based lineage in KG, and bitemporal writes.
# GLOBAL CONSTRAINTS (APPLY TO ALL SERVICES)

- **Language & Runtime:** Python **3.12**.
- **Frameworks:** FastAPI, Pydantic v2, SQLAlchemy 2, httpx, aiokafka or boto3 (pluggable), redis-py, opentelemetry-instrumentation-fastapi, prometheus-fastapi-instrumentator.
- **Config:** `pydantic-settings` with `.env` overlay. Provide `Settings` class per service.
- **Secrets:** HashiCorp **Vault** (AppRole/JWT). Use Vault Transit to **envelope-encrypt** sensitive fields before persistence (helpers provided in `libs/security.py`).
- **Auth:** No OIDC in services. Add `TrustedProxyMiddleware`:

  - Reject if request not from internal network (configurable CIDR).
  - Require headers set by Traefik+Authentik (`X-Authenticated-User`, `X-Authenticated-Email`, `X-Authenticated-Groups`, `Authorization: Bearer …`).
  - Parse groups → `roles` list on `request.state`.
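
The three checks above can be sketched framework-free; the helper names (`is_internal`, `parse_roles`, `check_request`) are illustrative, and the real middleware would wrap them in a Starlette `BaseHTTPMiddleware` and set `request.state.roles`:

```python
import ipaddress

# Exact header keys and exempt paths from this spec.
REQUIRED_HEADERS = (
    "X-Authenticated-User",
    "X-Authenticated-Email",
    "X-Authenticated-Groups",
    "Authorization",
)
EXEMPT_PATHS = {"/healthz", "/readyz", "/livez", "/metrics"}

def is_internal(client_ip: str, internal_cidrs: list[str]) -> bool:
    """True when the caller's address falls inside a configured internal CIDR."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in internal_cidrs)

def parse_roles(groups_header: str) -> list[str]:
    """Split the comma-separated groups header into a roles list."""
    return [g.strip() for g in groups_header.split(",") if g.strip()]

def check_request(path: str, client_ip: str, headers: dict[str, str],
                  internal_cidrs: list[str]) -> tuple[int, list[str]]:
    """Return (status, roles): 200 with parsed roles, or 403 when a trust check fails."""
    if not is_internal(client_ip, internal_cidrs):
        return 403, []
    if path in EXEMPT_PATHS:
        return 200, []  # probes/metrics skip the header checks
    if any(h not in headers for h in REQUIRED_HEADERS):
        return 403, []
    return 200, parse_roles(headers["X-Authenticated-Groups"])
```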

- **Observability:**

  - OpenTelemetry (traceparent propagation), span attrs (service, route, user, tenant).
  - Prometheus metrics endpoint `/metrics` protected by internal network check.
  - Structured JSON logs (timestamp, level, svc, trace_id, msg) via `structlog`.

- **Errors:** Global exception handler → RFC7807 Problem+JSON (`type`, `title`, `status`, `detail`, `instance`, `trace_id`).
- **Testing:** `pytest`, `pytest-asyncio`, `hypothesis` (property tests for calculators), `coverage ≥ 90%` per service.
- **Static:** `ruff`, `mypy --strict`, `bandit`, `safety`, `licensecheck`.
- **Perf:** Each service exposes `/healthz`, `/readyz`, `/livez`; cold start < 500ms; p95 endpoint < 250ms (local).
- **Containers:** Distroless or slim images; non-root user; read-only FS; `/tmp` mounted for OCR where needed.
- **Docs:** OpenAPI JSON + ReDoc; MkDocs site with service READMEs.
# SHARED LIBS (GENERATE ONCE, REUSE)

Create `libs/` used by all services:

- `libs/config.py` – base `Settings`, env parsing, Vault client factory, MinIO client factory, Qdrant client factory, Neo4j driver factory, Redis factory, Kafka/SQS client factory.
- `libs/security.py` – Vault Transit helpers (`encrypt_field`, `decrypt_field`), header parsing, internal-CIDR validator.
- `libs/observability.py` – otel init, prometheus instrumentor, logging config.
- `libs/events.py` – abstract `EventBus` with `publish(topic, payload: dict)`, `subscribe(topic, handler)`. Two impls: Kafka (`aiokafka`) and SQS/SNS (`boto3`).
- `libs/schemas.py` – **canonical Pydantic models** shared across services (Document, Evidence, IncomeItem, etc.) mirroring the ontology schemas. Include JSONSchema exports.
- `libs/storage.py` – S3/MinIO helpers (bucket ensure, put/get, presigned).
- `libs/neo.py` – Neo4j session helpers, Cypher runner with retry, SHACL validator invoker (pySHACL on exported RDF).
- `libs/rag.py` – Qdrant collections CRUD, hybrid search (dense+sparse), rerank wrapper, de-identification utilities (regex + NER; hash placeholders).
- `libs/forms.py` – PDF AcroForm fill via `pdfrw` with overlay fallback via `reportlab`.
- `libs/calibration.py` – `calibrated_confidence(raw_score, method="temperature_scaling", params=...)`.
# EVENT TOPICS (STANDARDIZE)

- `doc.ingested`, `doc.ocr_ready`, `doc.extracted`, `kg.upserted`, `rag.indexed`, `calc.schedule_ready`, `form.filled`, `hmrc.submitted`, `review.requested`, `review.completed`, `firm.sync.completed`

Each payload MUST include: `event_id (ulid)`, `occurred_at (iso)`, `actor`, `tenant_id`, `trace_id`, `schema_version`, and a `data` object (service-specific).
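
A minimal sketch of an envelope builder enforcing that contract (the function name and topic allow-list handling are assumptions; the real implementation would live in `libs/events.py` and generate ULIDs rather than take them as input):

```python
from datetime import datetime, timezone

# Topic allow-list taken from the standardized list above.
TOPICS = {
    "doc.ingested", "doc.ocr_ready", "doc.extracted", "kg.upserted",
    "rag.indexed", "calc.schedule_ready", "form.filled", "hmrc.submitted",
    "review.requested", "review.completed", "firm.sync.completed",
}

def make_envelope(topic: str, event_id: str, actor: str, tenant_id: str,
                  trace_id: str, data: dict, schema_version: str = "1.0") -> dict:
    """Wrap service-specific `data` in the mandatory envelope fields."""
    if topic not in TOPICS:
        raise ValueError(f"unknown topic: {topic}")
    occurred_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    return {
        "event_id": event_id,          # caller supplies a ULID
        "occurred_at": occurred_at,
        "actor": actor,
        "tenant_id": tenant_id,
        "trace_id": trace_id,
        "schema_version": schema_version,
        "data": data,
    }
```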

# TRUST HEADERS FROM TRAEFIK + AUTHENTIK (USE EXACT KEYS)

- `X-Authenticated-User` (string)
- `X-Authenticated-Email` (string)
- `X-Authenticated-Groups` (comma-separated)
- `Authorization` (`Bearer <jwt>` from Authentik)

Reject any request missing these (except `/healthz|/readyz|/livez|/metrics` from internal CIDR).

---

## SERVICES TO IMPLEMENT (CODE FOR EACH)

### 1) `svc-ingestion`

**Purpose:** Accept uploads or URLs, checksum, store to MinIO, emit `doc.ingested`.

**Endpoints:**

- `POST /v1/ingest/upload` (multipart file, metadata: `tenant_id`, `kind`, `source`) → `{doc_id, s3_url, checksum}`
- `POST /v1/ingest/url` (json: `{url, kind, tenant_id}`) → downloads to MinIO
- `GET /v1/docs/{doc_id}` → metadata

**Logic:**

- Compute SHA256, dedupe by checksum; MinIO path `tenants/{tenant_id}/raw/{doc_id}.pdf`.
- Store metadata in Postgres table `ingest_documents` (alembic migrations).
- Publish `doc.ingested` with `{doc_id, bucket, key, pages?, mime}`.
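
The checksum/dedupe logic can be sketched as a pure helper. Deriving `doc_id` from the checksum is an illustrative assumption made here, not mandated above; it makes dedupe automatic because identical bytes always map to the same key:

```python
import hashlib

def checksum_and_key(content: bytes, tenant_id: str) -> tuple[str, str]:
    """Compute the sha256 checksum and the MinIO key from the path convention above."""
    digest = hashlib.sha256(content).hexdigest()
    doc_id = f"d_{digest[:12]}"  # assumption: content-addressed doc_id for dedupe
    key = f"tenants/{tenant_id}/raw/{doc_id}.pdf"
    return f"sha256:{digest}", key
```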

**Env:** `S3_BUCKET_RAW`, `MINIO_*`, `DB_URL`.

**Traefik labels:** route `/ingest/*`.

---
### 2) `svc-rpa`

**Purpose:** Scheduled RPA pulls from firm/client portals via Playwright.

**Tasks:**

- Playwright login flows (credentials from Vault), 2FA via Authentik OAuth device or OTP secret in Vault.
- Download statements/invoices; hand off to `svc-ingestion` via internal POST.
- Prefect flows: `pull_portal_X()`, `pull_portal_Y()` with schedules.

**Endpoints:**

- `POST /v1/rpa/run/{connector}` (manual trigger)
- `GET /v1/rpa/status/{run_id}`

**Env:** `VAULT_ADDR`, `VAULT_ROLE_ID`, `VAULT_SECRET_ID`.

---
### 3) `svc-ocr`

**Purpose:** OCR & layout extraction.

**Pipeline:**

- Pull object from MinIO, detect rotation/de-skew (`opencv-python`), split pages (`pymupdf`), OCR (`pytesseract`) or bypass if text layer present (`pdfplumber`).
- Output per-page text + **bbox** for lines/words.
- Write JSON to MinIO `tenants/{tenant_id}/ocr/{doc_id}.json` and emit `doc.ocr_ready`.

**Endpoints:**

- `POST /v1/ocr/{doc_id}` (idempotent trigger)
- `GET /v1/ocr/{doc_id}` (fetch OCR JSON)

**Env:** `TESSERACT_LANGS`, `S3_BUCKET_EVIDENCE`.

---
### 4) `svc-extract`

**Purpose:** Classify docs and extract KV + tables into **schema-constrained JSON** (with bbox/page).

**Endpoints:**

- `POST /v1/extract/{doc_id}` body: `{strategy: "llm|rules|hybrid"}`
- `GET /v1/extract/{doc_id}` → structured JSON

**Implementation:**

- Use prompt files in `prompts/`: `doc_classify.txt`, `kv_extract.txt`, `table_extract.txt`.
- **Validator loop**: run LLM → validate JSONSchema → retry with error messages up to N times.
- Return Pydantic models from `libs/schemas.py`.
- Emit `doc.extracted`.
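
The validator loop might look like the sketch below, with the LLM client and JSONSchema validator abstracted as callables (both are stand-ins, not real APIs):

```python
def extract_with_validation(llm, validate, prompt: str, max_retries: int = 3) -> dict:
    """Run llm -> validate -> retry, feeding validation errors back into the prompt.

    `llm(prompt) -> dict` and `validate(payload) -> list[str]` (empty list = valid)
    are assumed interfaces standing in for the real LLM client and schema validator.
    """
    errors: list[str] = []
    for _ in range(max_retries):
        attempt_prompt = prompt if not errors else (
            prompt + "\n\nThe previous output failed validation:\n" + "\n".join(errors)
        )
        payload = llm(attempt_prompt)
        errors = validate(payload)
        if not errors:
            return payload
    raise ValueError(f"extraction failed after {max_retries} attempts: {errors}")
```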

**Env:** `LLM_ENGINE`, `TEMPERATURE`, `MAX_TOKENS`.

---
### 5) `svc-normalize-map`

**Purpose:** Normalize & map extracted data to KG.

**Logic:**

- Currency normalization (ECB or static fx table), dates, UK tax year/basis period inference.
- Entity resolution (blocking + fuzzy).
- Generate nodes/edges (+ `Evidence` with doc_id/page/bbox/text_hash).
- Use `libs/neo.py` to write with **bitemporal** fields; run **SHACL** validator; on violation, queue `review.requested`.
- Emit `kg.upserted`.
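
Tax year inference from the first bullet is a small pure function; UK tax years run 6 April to 5 April:

```python
from datetime import date

def uk_tax_year(d: date) -> str:
    """Label the UK tax year containing d, e.g. '2024-25' (6 April to 5 April)."""
    start_year = d.year if (d.month, d.day) >= (4, 6) else d.year - 1
    return f"{start_year}-{str(start_year + 1)[-2:]}"
```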

**Endpoints:**

- `POST /v1/map/{doc_id}`
- `GET /v1/map/{doc_id}/preview` (diff view, to be used by UI)

**Env:** `NEO4J_*`.

---
### 6) `svc-kg`

**Purpose:** Graph façade + RDF/SHACL utility.

**Endpoints:**

- `GET /v1/kg/nodes/{label}/{id}`
- `POST /v1/kg/cypher` (admin-gated inline query; must check `admin` role)
- `POST /v1/kg/export/rdf` (returns RDF for SHACL)
- `POST /v1/kg/validate` (run pySHACL against `schemas/shapes.ttl`)
- `GET /v1/kg/lineage/{node_id}` (traverse `DERIVED_FROM` → Evidence)

**Env:** `NEO4J_*`.

---
### 7) `svc-rag-indexer`

**Purpose:** Build Qdrant indices (firm knowledge, legislation, best practices, glossary).

**Workflow:**

- Load sources (filesystem, URLs, Firm DMS via `svc-firm-connectors`).
- **De-identify PII** (regex + NER), replace with placeholders; store mapping only in Postgres.
- Chunk (layout-aware) per `retrieval/chunking.yaml`.
- Compute **dense** embeddings (e.g., `bge-small-en-v1.5`) and **sparse** (Qdrant sparse).
- Upsert to Qdrant with payload `{jurisdiction, tax_years[], topic_tags[], version, pii_free: true, doc_id/section_id/url}`.
- Emit `rag.indexed`.
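
The regex half of the de-identification step could be sketched as follows (the NINO pattern is simplified and the placeholder format is an assumption; production would layer NER on top and persist the mapping Transit-encrypted in Postgres):

```python
import hashlib
import re

# Simplified, illustrative patterns only; the real pipeline adds NER on top.
PII_PATTERNS = {
    "NINO": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def deidentify(text: str) -> tuple[str, dict[str, str]]:
    """Replace PII with hash placeholders; return (clean_text, placeholder -> original).

    Only the mapping links placeholders back to raw values, so the text sent to
    Qdrant stays pii_free.
    """
    mapping: dict[str, str] = {}

    def replace(kind: str, match: re.Match) -> str:
        digest = hashlib.sha256(match.group(0).encode()).hexdigest()[:12]
        token = f"<{kind}:{digest}>"
        mapping[token] = match.group(0)
        return token

    for kind, pattern in PII_PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: replace(k, m), text)
    return text, mapping
```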

**Endpoints:**

- `POST /v1/index/run`
- `GET /v1/index/status/{run_id}`

**Env:** `QDRANT_URL`, `RAG_EMBEDDING_MODEL`, `RAG_RERANKER_MODEL`.

---
### 8) `svc-rag-retriever`

**Purpose:** Hybrid search + KG fusion with rerank and calibrated confidence.

**Endpoint:**

- `POST /v1/rag/search` `{query, tax_year?, jurisdiction?, k?}` →

```
{
  "chunks": [...],
  "citations": [{doc_id|url, section_id?, page?, bbox?}],
  "kg_hints": [{rule_id, formula_id, node_ids[]}],
  "calibrated_confidence": 0.0-1.0
}
```

**Implementation:**

- Hybrid score: `alpha * dense + beta * sparse`; rerank top-K via cross-encoder; **KG fusion** (boost chunks citing Rules/Calculations relevant to schedule).
- Use `libs/calibration.py` to expose calibrated confidence.
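
A sketch of the fusion and calibration logic (parameter defaults and the additive `kg_boost` are illustrative assumptions, and the cross-encoder rerank step is omitted):

```python
import math

def hybrid_score(dense: float, sparse: float, alpha: float = 0.7, beta: float = 0.3) -> float:
    """Linear fusion of dense and sparse retrieval scores."""
    return alpha * dense + beta * sparse

def calibrated_confidence(raw_score: float, temperature: float = 1.5) -> float:
    """Temperature scaling: squash a raw logit-like score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-raw_score / temperature))

def fuse(chunks: list[dict], alpha: float = 0.7, beta: float = 0.3,
         kg_boost: float = 0.1, kg_chunk_ids: set = frozenset()) -> list[dict]:
    """Score each chunk, boost the ones cited by KG rules, and rank descending."""
    ranked = []
    for chunk in chunks:
        score = hybrid_score(chunk["dense"], chunk["sparse"], alpha, beta)
        if chunk["id"] in kg_chunk_ids:
            score += kg_boost  # KG fusion: chunk is tied to a relevant Rule/Calculation
        ranked.append({**chunk, "score": score})
    return sorted(ranked, key=lambda c: c["score"], reverse=True)
```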

---

### 9) `svc-reason`

**Purpose:** Deterministic calculators + materializers (UK SA).

**Endpoints:**

- `POST /v1/reason/compute_schedule` `{tax_year, taxpayer_id, schedule_id}`
- `GET /v1/reason/explain/{schedule_id}` → rationale & lineage paths

**Implementation:**

- Pure functions for: employment, self-employment, property (FHL, 20% interest credit), dividends/interest, allowances, NIC (Class 2/4), HICBC, student loans (Plans 1/2/4/5, PGL).
- **Deterministic order** as defined; rounding per `FormBox.rounding_rule`.
- Use Cypher from `kg/reasoning/schedule_queries.cypher` to materialize box values; attach `DERIVED_FROM` evidence.
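
As an example of the pure-function style, the personal allowance taper (allowance reduced by £1 for every £2 of adjusted net income above £100,000; the figures shown are the 2024-25 values, and the rounding here is illustrative rather than per `FormBox.rounding_rule`):

```python
from decimal import ROUND_DOWN, Decimal

def tapered_personal_allowance(
    adjusted_net_income: Decimal,
    base_allowance: Decimal = Decimal("12570"),    # 2024-25 personal allowance
    taper_threshold: Decimal = Decimal("100000"),
) -> Decimal:
    """Reduce the allowance by GBP 1 for every GBP 2 of income above the threshold."""
    if adjusted_net_income <= taper_threshold:
        return base_allowance
    excess = adjusted_net_income - taper_threshold
    # Illustrative rounding; the real service rounds per FormBox.rounding_rule.
    reduction = (excess / 2).quantize(Decimal("1"), rounding=ROUND_DOWN)
    return max(base_allowance - reduction, Decimal("0"))
```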

---

### 10) `svc-forms`

**Purpose:** Fill PDFs and assemble evidence bundles.

**Endpoints:**

- `POST /v1/forms/fill` `{tax_year, taxpayer_id, form_id}` → returns PDF (binary)
- `POST /v1/forms/evidence_pack` `{scope}` → ZIP + manifest + signed hashes (sha256)

**Implementation:**

- `pdfrw` for AcroForm; overlay with ReportLab if needed.
- Manifest includes `doc_id/page/bbox/text_hash` for every numeric field.
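
The manifest construction can be sketched like this (the input field layout is an assumption; hashing the evidence text keeps raw document snippets out of the manifest while preserving `text_hash` lineage):

```python
import hashlib
import json

def build_manifest(fields: dict[str, dict]) -> dict:
    """fields maps a form box id -> {value, doc_id, page, bbox, text}.

    The raw evidence `text` is replaced by its sha256 so the manifest carries
    text_hash lineage for every numeric field without embedding the snippet.
    """
    entries = {}
    for box_id, ev in fields.items():
        entries[box_id] = {
            "value": ev["value"],
            "doc_id": ev["doc_id"],
            "page": ev["page"],
            "bbox": ev["bbox"],
            "text_hash": "sha256:" + hashlib.sha256(ev["text"].encode()).hexdigest(),
        }
    manifest_bytes = json.dumps(entries, sort_keys=True).encode()
    return {"entries": entries, "manifest_sha256": hashlib.sha256(manifest_bytes).hexdigest()}
```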

---

### 11) `svc-hmrc`

**Purpose:** HMRC submitter (stub|sandbox|live).

**Endpoints:**

- `POST /v1/hmrc/submit` `{tax_year, taxpayer_id, dry_run}` → `{status, submission_id?, errors[]}`
- `GET /v1/hmrc/submissions/{id}`

**Implementation:**

- Rate limits, retries/backoff, signed audit log; environment toggle.
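
A minimal retry-with-backoff helper in the spirit of that bullet (the injectable `sleep` is there so tests do not wait; jitter and rate-limit-header handling are omitted):

```python
import time

def with_backoff(call, attempts: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Retry `call` with exponential backoff; re-raise after the final attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```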

---

### 12) `svc-firm-connectors`

**Purpose:** Read-only connectors to Firm Databases (Practice Mgmt, DMS).

**Endpoints:**

- `POST /v1/firm/sync` `{since?}` → `{objects_synced, errors[]}`
- `GET /v1/firm/objects` (paged)

**Implementation:**

- Data contracts in `config/firm_contracts/`; mappers → Secure Client Data Store (Postgres) with lineage columns (`source`, `source_id`, `synced_at`).

---

### 13) `ui-review` (outline only)

- Next.js (SSO handled by Traefik+Authentik), shows extracted fields + evidence snippets; POST overrides to `svc-extract`/`svc-normalize-map`.

---
## DATA CONTRACTS (ESSENTIAL EXAMPLES)

**Event: `doc.ingested`**

```json
{
  "event_id": "01J...ULID",
  "occurred_at": "2025-09-13T08:00:00Z",
  "actor": "svc-ingestion",
  "tenant_id": "t_123",
  "trace_id": "abc-123",
  "schema_version": "1.0",
  "data": {
    "doc_id": "d_abc",
    "bucket": "raw",
    "key": "tenants/t_123/raw/d_abc.pdf",
    "checksum": "sha256:...",
    "kind": "bank_statement",
    "mime": "application/pdf",
    "pages": 12
  }
}
```
**RAG search response shape**

```json
{
  "chunks": [
    {
      "id": "c1",
      "text": "...",
      "score": 0.78,
      "payload": {
        "jurisdiction": "UK",
        "tax_years": ["2024-25"],
        "topic_tags": ["FHL"],
        "pii_free": true
      }
    }
  ],
  "citations": [
    { "doc_id": "leg-ITA2007", "section_id": "s272A", "url": "https://..." }
  ],
  "kg_hints": [
    {
      "rule_id": "UK.FHL.Qual",
      "formula_id": "FHL_Test_v1",
      "node_ids": ["n123", "n456"]
    }
  ],
  "calibrated_confidence": 0.81
}
```

---

## PERSISTENCE SCHEMAS (POSTGRES; ALEMBIC)

- `ingest_documents(id pk, tenant_id, doc_id, kind, checksum, bucket, key, mime, pages, created_at)`
- `firm_objects(id pk, tenant_id, source, source_id, type, payload jsonb, synced_at)`
- Qdrant PII mapping table (if absolutely needed): `pii_links(id pk, placeholder_hash, client_id, created_at)` — **encrypt with Vault Transit**; do NOT store raw values.

---
## TRAEFIK + AUTHENTIK (COMPOSE LABELS PER SERVICE)

For every service container in `infra/compose/docker-compose.local.yml`, add labels:

```
- "traefik.enable=true"
- "traefik.http.routers.svc-extract.rule=Host(`api.local`) && PathPrefix(`/extract`)"
- "traefik.http.routers.svc-extract.entrypoints=websecure"
- "traefik.http.routers.svc-extract.tls=true"
- "traefik.http.routers.svc-extract.middlewares=authentik-forwardauth,rate-limit"
- "traefik.http.services.svc-extract.loadbalancer.server.port=8000"
```

Use the shared dynamic file `traefik-dynamic.yml` with `authentik-forwardauth` and `rate-limit` middlewares.

---
## OUTPUT FORMAT (STRICT)

Implement a **multi-file codebase** as fenced blocks, EXACTLY in this order:

```txt
# FILE: libs/config.py
# factories for Vault/MinIO/Qdrant/Neo4j/Redis/EventBus, Settings base
...
```

```txt
# FILE: libs/security.py
# Vault Transit helpers, header parsing, internal CIDR checks, middleware
...
```

```txt
# FILE: libs/observability.py
# otel init, prometheus, structlog
...
```

```txt
# FILE: libs/events.py
# EventBus abstraction with Kafka and SQS/SNS impls
...
```

```txt
# FILE: libs/schemas.py
# Shared Pydantic models mirroring ontology entities
...
```

```txt
# FILE: apps/svc-ingestion/main.py
# FastAPI app, endpoints, MinIO write, Postgres, publish doc.ingested
...
```

```txt
# FILE: apps/svc-rpa/main.py
# Playwright flows, Prefect tasks, triggers
...
```

```txt
# FILE: apps/svc-ocr/main.py
# OCR pipeline, endpoints
...
```

```txt
# FILE: apps/svc-extract/main.py
# Classifier + extractors with validator loop
...
```

```txt
# FILE: apps/svc-normalize-map/main.py
# normalization, entity resolution, KG mapping, SHACL validation call
...
```

```txt
# FILE: apps/svc-kg/main.py
# KG façade, RDF export, SHACL validate, lineage traversal
...
```

```txt
# FILE: apps/svc-rag-indexer/main.py
# chunk/de-id/embed/upsert to Qdrant
...
```

```txt
# FILE: apps/svc-rag-retriever/main.py
# hybrid retrieval + rerank + KG fusion
...
```

```txt
# FILE: apps/svc-reason/main.py
# deterministic calculators, schedule compute/explain
...
```

```txt
# FILE: apps/svc-forms/main.py
# PDF fill + evidence pack
...
```

```txt
# FILE: apps/svc-hmrc/main.py
# submit stub|sandbox|live with audit + retries
...
```

```txt
# FILE: apps/svc-firm-connectors/main.py
# connectors to practice mgmt & DMS, sync to Postgres
...
```

```txt
# FILE: infra/compose/docker-compose.local.yml
# Traefik, Authentik, Vault, MinIO, Qdrant, Neo4j, Postgres, Redis, Prom+Grafana, Loki, Unleash, all services
...
```

```txt
# FILE: infra/compose/traefik.yml
# static Traefik config
...
```

```txt
# FILE: infra/compose/traefik-dynamic.yml
# forwardAuth middleware + routers/services
...
```

```txt
# FILE: .gitea/workflows/ci.yml
# lint->test->build->scan->push->deploy
...
```

```txt
# FILE: Makefile
# bootstrap, run, test, lint, build, deploy, format, seed
...
```

```txt
# FILE: tests/e2e/test_happy_path.py
# end-to-end: ingest -> ocr -> extract -> map -> compute -> fill -> (stub) submit
...
```

```txt
# FILE: tests/unit/test_calculators.py
# boundary tests for UK SA logic (NIC, HICBC, PA taper, FHL)
...
```

```txt
# FILE: README.md
# how to run locally with docker-compose, Authentik setup, Traefik certs
...
```
## DEFINITION OF DONE

- `docker compose up` brings the full stack up; SSO via Authentik; routes secured via Traefik ForwardAuth.
- Running `pytest` yields ≥ 90% coverage; `make e2e` passes the ingest→…→submit stub flow.
- All services expose `/healthz|/readyz|/livez|/metrics`; OpenAPI at `/docs`.
- No PII stored in Qdrant; vectors carry `pii_free=true`.
- KG writes are SHACL-validated; violations produce `review.requested` events.
- Evidence lineage is present for every numeric box value.
- Gitea pipeline passes: lint, test, build, scan, push, deploy.

# START

Generate the full codebase and configs in the **exact file blocks and order** specified above.