Initial commit

harkon
2025-10-11 08:41:36 +01:00
commit b324ff09ef
276 changed files with 55220 additions and 0 deletions

@@ -0,0 +1,439 @@
# Environment Comparison: Local vs Production
## Overview
This document compares the local development environment with the production environment to help developers understand the differences and ensure smooth transitions between environments.
## Quick Reference
| Aspect | Local Development | Production |
|--------|------------------|------------|
| **Domain** | `*.local.lan` | `*.harkon.co.uk` |
| **SSL** | Self-signed certificates | Let's Encrypt (GoDaddy DNS) |
| **Networks** | `ai-tax-agent-frontend`<br/>`ai-tax-agent-backend` | `frontend`<br/>`backend` |
| **Compose File** | `docker-compose.local.yml` | `infrastructure.yaml`<br/>`services.yaml`<br/>`monitoring.yaml` |
| **Location** | Local machine | `deploy@141.136.35.199:/opt/compose/ai-tax-agent/` |
| **Traefik** | Isolated instance | Shared with company services |
| **Authentik** | Isolated instance | Shared with company services |
| **Data Persistence** | Local Docker volumes | Remote Docker volumes + backups |
## Detailed Comparison
### 1. Domain & URLs
#### Local Development
```
Frontend:
- Review UI: https://review.local.lan
- Authentik: https://auth.local.lan
- Grafana: https://grafana.local.lan
API:
- API Gateway: https://api.local.lan
Admin Interfaces:
- Traefik: http://localhost:8080
- Vault: https://vault.local.lan
- MinIO: https://minio.local.lan
- Neo4j: https://neo4j.local.lan
- Qdrant: https://qdrant.local.lan
- Prometheus: https://prometheus.local.lan
- Loki: https://loki.local.lan
```
#### Production
```
Frontend:
- Review UI: https://app.harkon.co.uk
- Authentik: https://authentik.harkon.co.uk (shared)
- Grafana: https://grafana.harkon.co.uk
API:
- API Gateway: https://api.harkon.co.uk
Admin Interfaces:
- Traefik: https://traefik.harkon.co.uk (shared)
- Vault: https://vault.harkon.co.uk
- MinIO: https://minio.harkon.co.uk
- Neo4j: https://neo4j.harkon.co.uk
- Qdrant: https://qdrant.harkon.co.uk
- Prometheus: https://prometheus.harkon.co.uk
- Loki: https://loki.harkon.co.uk
Company Services (shared):
- Gitea: https://gitea.harkon.co.uk
- Nextcloud: https://cloud.harkon.co.uk
- Portainer: https://portainer.harkon.co.uk
```
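The hostname pattern in the two lists above is regular enough to script. The following is a minimal sketch, assuming a hypothetical helper named `service_url` that is not part of the repository. It encodes the three entries that do not follow the `<service>.<domain>` pattern.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the URL of a
# service in the given environment, based on the two lists above.
service_url() {
  local service="$1" env="$2" domain host
  case "$env" in
    local)      domain="local.lan" ;;
    production) domain="harkon.co.uk" ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  host="$service"
  case "$service:$env" in
    # The local Traefik dashboard is served on a plain port, not a hostname.
    traefik:local)     echo "http://localhost:8080"; return 0 ;;
    # These two hostnames differ between environments.
    review:production) host="app" ;;
    authentik:local)   host="auth" ;;
  esac
  printf 'https://%s.%s\n' "$host" "$domain"
}
```

For example, `service_url review production` prints `https://app.harkon.co.uk`, while `service_url review local` prints `https://review.local.lan`.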
### 2. SSL/TLS Configuration
#### Local Development
- **Certificate Type**: Self-signed
- **Generation**: `scripts/generate-dev-certs.sh`
- **Location**: `infra/compose/certs/local.crt`, `infra/compose/certs/local.key`
- **Browser Warning**: Yes (must accept)
- **Renewal**: Manual (when expired)
#### Production
- **Certificate Type**: Let's Encrypt
- **Challenge**: DNS-01 (GoDaddy)
- **Location**: `/opt/compose/traefik/certs/godaddy-acme.json`
- **Browser Warning**: No
- **Renewal**: Automatic (handled by Traefik)
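Since local renewal is manual and production renewal depends on Traefik, it can help to check expiry dates directly. This is a sketch using standard `openssl` subcommands; the function names `cert_expiry` and `cert_file_expiry` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the expiry date of the certificate a host serves.
cert_expiry() {
  local host="$1"
  # s_client fetches the served certificate; x509 prints "notAfter=<date>".
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate \
    | sed 's/^notAfter=//'
}

# Hypothetical helper: print the expiry date of a certificate file,
# such as the local self-signed certificate.
cert_file_expiry() {
  openssl x509 -noout -enddate -in "$1" | sed 's/^notAfter=//'
}
```

Typical calls would be `cert_expiry app.harkon.co.uk` for production and `cert_file_expiry infra/compose/certs/local.crt` for local development.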
### 3. Network Configuration
#### Local Development
```yaml
networks:
  frontend:
    external: true
    name: ai-tax-agent-frontend
  backend:
    external: true
    name: ai-tax-agent-backend
```
**Creation**:
```bash
docker network create ai-tax-agent-frontend
docker network create ai-tax-agent-backend
```
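`docker network create` fails when a network with the same name already exists, so a repeatable setup needs a guard. The following is a minimal sketch of such a guard. The name `ensure_network` is hypothetical, and whether `scripts/create-networks.sh` already works this way is not shown in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a Docker network only if it does not exist yet.
ensure_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "exists: $name"
  else
    docker network create "$name" >/dev/null && echo "created: $name"
  fi
}
```

Calling `ensure_network ai-tax-agent-frontend` and `ensure_network ai-tax-agent-backend` is then safe to repeat.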
#### Production
```yaml
networks:
  frontend:
    external: true
    name: frontend
  backend:
    external: true
    name: backend
```
**Note**: Networks are shared with company services (Gitea, Nextcloud, Portainer)
### 4. Service Isolation
#### Local Development
- **Traefik**: Dedicated instance for AI Tax Agent
- **Authentik**: Dedicated instance for AI Tax Agent
- **Isolation**: Complete - no shared services
- **Impact**: Changes don't affect other services
#### Production
- **Traefik**: Shared with company services
- **Authentik**: Shared with company services
- **Isolation**: Partial - infrastructure shared, application isolated
- **Impact**: Traefik/Authentik changes affect all services
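Because the production networks are shared, it is worth seeing which containers are attached before changing Traefik or Authentik. This is a small sketch built on `docker network inspect` with a Go template; the function name `network_members` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the containers attached to a Docker network,
# one name per line (blank lines from the template output are dropped).
network_members() {
  docker network inspect \
    --format '{{range .Containers}}{{println .Name}}{{end}}' "$1" \
    | sed '/^$/d'
}
```

On the production host, `network_members frontend` would list both the application containers and the company services that share the network.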
### 5. Authentication & Authorization
#### Local Development
- **Bootstrap Admin**: `admin@local.lan` / `admin123`
- **Groups**: Auto-created via bootstrap
- **OAuth Clients**: Auto-configured
- **Users**: Test users only
#### Production
- **Bootstrap Admin**: Real admin credentials
- **Groups**:
  - `company` - Company services access
  - `app-admin` - Full app access
  - `app-user` - App user access
  - `app-reviewer` - Reviewer access
- **OAuth Clients**: Manually configured
- **Users**: Real users with proper onboarding
### 6. Data Persistence
#### Local Development
```bash
# Volume location
/var/lib/docker/volumes/
# Volumes
- postgres_data
- neo4j_data
- qdrant_data
- minio_data
- vault_data
- redis_data
- nats_data
- authentik_data
```
**Backup**: Manual (not automated)
**Retention**: Until `make clean`
#### Production
```bash
# Volume location
/var/lib/docker/volumes/
# Volumes (prefixed with project name)
- ai-tax-agent_postgres_data
- ai-tax-agent_neo4j_data
- ai-tax-agent_qdrant_data
- ai-tax-agent_minio_data
- ai-tax-agent_vault_data
- ai-tax-agent_redis_data
- ai-tax-agent_nats_data
```
**Backup**: Automated daily backups
**Retention**: 30 days
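The document does not show how the daily backups are implemented. One common pattern is sketched below, under stated assumptions: the `alpine` image, the archive naming, and the function names `backup_volume` and `prune_backups` are all hypothetical. Archiving a live database volume may not give a consistent snapshot, so a real job would likely stop the service or use the database's own dump tool first.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive a named volume into a dated tarball.
# "dest" must be an absolute host path because it is used as a bind mount.
backup_volume() {
  local volume="$1" dest="$2" stamp
  stamp="$(date +%Y%m%d)"
  docker run --rm \
    -v "${volume}:/data:ro" \
    -v "${dest}:/backup" \
    alpine tar czf "/backup/${volume}-${stamp}.tar.gz" -C /data .
}

# Hypothetical helper: delete archives older than the retention period
# (30 days by default, matching the retention stated above).
prune_backups() {
  local dest="$1" days="${2:-30}"
  find "$dest" -name '*.tar.gz' -type f -mtime "+${days}" -delete
}
```

A daily job could call `backup_volume ai-tax-agent_postgres_data /opt/backups` for each volume and finish with `prune_backups /opt/backups`; the `/opt/backups` path is an example only.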
### 7. Environment Variables
#### Local Development (`.env`)
```bash
DOMAIN=local.lan
EMAIL=admin@local.lan
POSTGRES_PASSWORD=postgres
NEO4J_PASSWORD=neo4jpass
AUTHENTIK_SECRET_KEY=changeme
VAULT_DEV_ROOT_TOKEN_ID=root
DEBUG=true
DEVELOPMENT_MODE=true
```
#### Production (`.env.production`)
```bash
DOMAIN=harkon.co.uk
EMAIL=admin@harkon.co.uk
POSTGRES_PASSWORD=<strong-password>
NEO4J_PASSWORD=<strong-password>
AUTHENTIK_SECRET_KEY=<generated-secret>
# Note: VAULT_DEV_ROOT_TOKEN_ID only takes effect when Vault runs in dev mode
VAULT_DEV_ROOT_TOKEN_ID=<production-token>
DEBUG=false
DEVELOPMENT_MODE=false
```
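A simple guard against shipping the local defaults to production is to scan the env file before deploying. This is a minimal sketch; `check_env_file` is a hypothetical name, and the patterns are taken from the local `.env` values and the `<...>` placeholders shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fail if an env file still contains the local
# development defaults or unfilled <placeholder> values.
check_env_file() {
  local file="$1" status=0 pattern
  for pattern in \
    '^POSTGRES_PASSWORD=postgres$' \
    '^NEO4J_PASSWORD=neo4jpass$' \
    '^AUTHENTIK_SECRET_KEY=changeme$' \
    '^VAULT_DEV_ROOT_TOKEN_ID=root$' \
    '^DEBUG=true$' \
    '^DEVELOPMENT_MODE=true$' \
    '=<[^>]*>$'
  do
    if grep -Eq "$pattern" "$file"; then
      echo "insecure or unfilled value matches: ${pattern}" >&2
      status=1
    fi
  done
  return "$status"
}
```

Running `check_env_file .env.production` before a deployment returns a non-zero status if any of these values slipped through.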
### 8. Resource Limits
#### Local Development
- **No limits**: Uses available resources
- **Suitable for**: Development and testing
- **Scaling**: Not configured
#### Production
```yaml
# Example resource limits
services:
  svc-ingestion:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M
```
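To confirm that limits like these actually reached a running container, the values can be read back with `docker inspect`. The sketch below assumes Compose applied the `deploy.resources.limits` values to the container's host configuration; `container_limits` is a hypothetical name, and the container name has to be supplied by the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the effective memory and CPU limits of a container.
container_limits() {
  local mem nano
  mem="$(docker inspect --format '{{.HostConfig.Memory}}' "$1")" || return 1
  nano="$(docker inspect --format '{{.HostConfig.NanoCpus}}' "$1")" || return 1
  # Memory is reported in bytes and CPUs in billionths of a CPU.
  # A value of 0 means that no limit is set.
  echo "memory=$((mem / 1048576))MiB cpus=$(awk -v n="$nano" 'BEGIN { printf "%.2f", n / 1e9 }')"
}
```

With the example limits above, the expected report would be 1024 MiB of memory and 1.00 CPUs.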
### 9. Logging & Monitoring
#### Local Development
- **Logs**: Docker logs (`docker compose logs`)
- **Retention**: Until the container is removed or recreated
- **Monitoring**: Optional (Grafana available but not required)
- **Alerts**: Disabled
#### Production
- **Logs**: Centralized in Loki
- **Retention**: 30 days
- **Monitoring**: Required (Prometheus + Grafana)
- **Alerts**: Enabled (email/Slack notifications)
### 10. Deployment Process
#### Local Development
```bash
# Start everything
make bootstrap
make up
# Or step by step
./scripts/create-networks.sh
./scripts/generate-dev-certs.sh
cd infra/compose
docker compose -f docker-compose.local.yml up -d
# Stop everything
make down
# Clean everything
make clean
```
#### Production
```bash
# Deploy infrastructure
cd /opt/compose/ai-tax-agent
docker compose -f infrastructure.yaml up -d
# Deploy services
docker compose -f services.yaml up -d
# Deploy monitoring
docker compose -f monitoring.yaml up -d
# Update single service
docker compose -f services.yaml up -d --no-deps svc-ingestion
```
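After updating a service it is useful to wait until the container reports healthy before moving on. This is a sketch that assumes the service defines a Docker healthcheck, which this document does not confirm; `wait_healthy` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll a container's health status until it is healthy.
# Arguments: container name, number of checks (default 30), delay in seconds (default 2).
wait_healthy() {
  local container="$1" attempts="${2:-30}" delay="${3:-2}" status="" i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      echo "healthy: $container"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts checks: $container (last status: ${status:-unknown})" >&2
  return 1
}
```

A container without a healthcheck never reports `healthy`, so for those services the function times out and logs or metrics have to be checked instead.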
### 11. Database Migrations
#### Local Development
- **Automatic**: Migrations run on startup
- **Rollback**: `make clean` and restart
- **Data Loss**: Acceptable
#### Production
- **Manual**: Migrations run explicitly
- **Rollback**: Requires backup restoration
- **Data Loss**: NOT acceptable
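Since a production rollback requires restoring a backup, a migration should only start once a fresh backup exists. The following sketch shows that ordering. The wrapper `migrate_with_backup` is hypothetical, and in `backup_postgres` the Compose service name `postgres`, the database user, and the output path are assumptions rather than details taken from the actual deployment.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run the backup step, then migrate only if it succeeded.
# Both arguments are names of commands or shell functions supplied by the caller.
migrate_with_backup() {
  local backup_cmd="$1" migrate_cmd="$2"
  if ! "$backup_cmd"; then
    echo "backup failed; migration not started" >&2
    return 1
  fi
  "$migrate_cmd"
}

# Example backup step. The service name "postgres", the user "postgres" and
# the output path are assumptions; adjust them to the real deployment.
backup_postgres() {
  docker compose -f infrastructure.yaml exec -T postgres \
    pg_dumpall -U postgres > "/tmp/pre-migration-$(date +%Y%m%d%H%M%S).sql"
}
```

It would be used as `migrate_with_backup backup_postgres run_migrations`, where `run_migrations` stands for whatever command the project uses to apply migrations.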
### 12. Secrets Management
#### Local Development
- **Storage**: `.env` file (an example is committed to git as `env.example`)
- **Vault**: Dev mode (unsealed automatically)
- **Security**: Low (development only)
#### Production
- **Storage**: `.env.production` (NOT committed to git)
- **Vault**: Production mode (manual unseal required)
- **Security**: High (encrypted, access controlled)
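In production mode Vault starts sealed after every restart, so a quick state check is handy. The sketch below relies on the exit codes of `vault status` (0 when unsealed, 2 when sealed, 1 on error) and assumes `VAULT_ADDR` points at the right instance; `vault_seal_state` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report Vault's seal state from the exit code of
# `vault status` (0 = unsealed, 2 = sealed, anything else = error).
vault_seal_state() {
  local rc=0
  vault status >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error"; return 1 ;;
  esac
}
```

When the result is `sealed`, an operator has to run `vault operator unseal` with enough key shares to reach the threshold.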
### 13. CI/CD Integration
#### Local Development
- **CI/CD**: Not applicable
- **Testing**: Manual
- **Deployment**: Manual
#### Production
- **CI/CD**: Gitea Actions (planned)
- **Testing**: Automated (unit, integration, e2e)
- **Deployment**: Automated with approval gates
### 14. Backup & Recovery
#### Local Development
- **Backup**: Not configured
- **Recovery**: Rebuild from scratch
- **RTO**: N/A
- **RPO**: N/A
#### Production
- **Backup**: Daily automated backups
- **Recovery**: Restore from backup
- **RTO**: 1 hour
- **RPO**: 24 hours
### 15. Cost Considerations
#### Local Development
- **Infrastructure**: Free (local machine)
- **Compute**: Uses local resources
- **Storage**: Uses local disk
#### Production
- **Infrastructure**: Server rental (~$50/month)
- **Compute**: Shared with company services
- **Storage**: Included in server
- **Domain**: ~$15/year
- **SSL**: Free (Let's Encrypt)
## Migration Path
### From Local to Production
1. **Build images locally**:
```bash
docker compose -f docker-compose.local.yml build
```
2. **Tag for production**:
```bash
docker tag svc-ingestion:latest gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0
```
3. **Push to registry**:
```bash
docker push gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0
```
4. **Deploy to production**:
```bash
ssh deploy@141.136.35.199
cd /opt/compose/ai-tax-agent
docker compose -f services.yaml pull
docker compose -f services.yaml up -d
```
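Steps 2 and 3 repeat for every service, so they are natural candidates for a small helper. This is a sketch, assuming locally built images are named `<service>:latest` as in the example above; `image_ref` and `release_image` are hypothetical names. The helper refuses `latest` as a release version so that production images always carry an explicit version tag.

```bash
#!/usr/bin/env bash
REGISTRY="gitea.harkon.co.uk/ai-tax-agent"

# Hypothetical helper: build the registry reference for a service and version.
image_ref() {
  local service="$1" version="$2"
  if [ -z "$version" ] || [ "$version" = "latest" ]; then
    echo "refusing to release without an explicit version tag" >&2
    return 1
  fi
  printf '%s/%s:%s\n' "$REGISTRY" "$service" "$version"
}

# Hypothetical helper: tag the locally built image and push it.
# Stops at the first failing step.
release_image() {
  local ref
  ref="$(image_ref "$1" "$2")" || return 1
  docker tag "$1:latest" "$ref" && docker push "$ref"
}
```

For example, `release_image svc-ingestion v1.0.0` tags and pushes `gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0`.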
### From Production to Local (for debugging)
1. **Pull production image**:
```bash
docker pull gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0
```
2. **Tag for local use**:
```bash
docker tag gitea.harkon.co.uk/ai-tax-agent/svc-ingestion:v1.0.0 svc-ingestion:latest
```
3. **Run locally**:
```bash
docker compose -f docker-compose.local.yml up -d svc-ingestion
```
## Best Practices
### Local Development
1. ✅ Use `make` commands for consistency
2. ✅ Keep `.env` file updated from `env.example`
3. ✅ Run tests before committing
4. ✅ Use `docker compose logs -f` for debugging
5. ✅ Clean up regularly with `make clean`
### Production
1. ✅ Never commit `.env.production` to git
2. ✅ Always backup before making changes
3. ✅ Test in local environment first
4. ✅ Use versioned image tags (not `latest`)
5. ✅ Monitor logs and metrics after deployment
6. ✅ Have a rollback plan ready
7. ✅ Document all changes
## Troubleshooting
### Local Development Issues
- **Port conflicts**: Check if ports 80, 443, 8080 are in use
- **Network errors**: Recreate networks with `make networks`
- **Certificate errors**: Regenerate with `./scripts/generate-dev-certs.sh`
- **Service won't start**: Check logs with `docker compose logs <service>`
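The port-conflict check can be scripted. This is a sketch using `lsof`, which may need elevated privileges to see listeners owned by other users; `check_ports` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which of the given TCP ports already have a
# listener. Returns non-zero if at least one port is in use.
check_ports() {
  local port busy=0
  for port in "$@"; do
    if lsof -nP -iTCP:"$port" -sTCP:LISTEN -t >/dev/null 2>&1; then
      echo "port $port is in use"
      busy=1
    fi
  done
  return "$busy"
}
```

Running `check_ports 80 443 8080` before `make up` shows whether another process already holds one of the ports Traefik needs.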
### Production Issues
- **Service unreachable**: Check Traefik routing and DNS
- **Authentication fails**: Verify Authentik configuration
- **SSL errors**: Check certificate renewal in Traefik
- **Performance issues**: Check resource usage with `docker stats`
## Summary
The key differences between local and production environments are:
1. **Isolation**: Local is fully isolated; production shares Traefik/Authentik
2. **Security**: Local uses weak credentials; production uses strong secrets
3. **Domains**: Local uses `.local.lan`; production uses `.harkon.co.uk`
4. **SSL**: Local uses self-signed; production uses Let's Encrypt
5. **Monitoring**: Optional locally; required in production
6. **Backups**: Local has none; production has automated backups
Both environments use the same application code and Docker images, ensuring consistency and reducing deployment risks.