Initial commit
Some checks failed
CI/CD Pipeline / Code Quality & Linting (push) Has been cancelled
CI/CD Pipeline / Policy Validation (push) Has been cancelled
CI/CD Pipeline / Test Suite (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-coverage) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-extract) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-firm-connectors) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-forms) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-hmrc) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-ingestion) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-kg) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-normalize-map) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-ocr) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rag-indexer) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rag-retriever) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-reason) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (svc-rpa) (push) Has been cancelled
CI/CD Pipeline / Build Docker Images (ui-review) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-coverage) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-extract) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-kg) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (svc-rag-retriever) (push) Has been cancelled
CI/CD Pipeline / Security Scanning (ui-review) (push) Has been cancelled
CI/CD Pipeline / Generate SBOM (push) Has been cancelled
CI/CD Pipeline / Deploy to Staging (push) Has been cancelled
CI/CD Pipeline / Deploy to Production (push) Has been cancelled
CI/CD Pipeline / Notifications (push) Has been cancelled
This commit is contained in:
313
docs/REMOTE_BUILD_TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,313 @@
# Remote Build Troubleshooting Guide

## Problem: Docker Push Failing on Remote Server

When the `base-ml` image is built on the remote server and pushed to Gitea, the push fails on large image layers (>1GB).

---

## Root Cause

The issue is likely one of these:

1. **Upload size limit in Traefik** (Traefik applies no body-size cap by default; this applies when a buffering middleware or another proxy in front of Gitea sets one)
2. **Upload size limit in Gitea** (default varies)
3. **Network timeout** during large uploads
4. **Not logged in** to Gitea registry
5. **Disk space** issues

---

## Quick Diagnosis

### On Remote Server (ssh deploy@141.136.35.199)

Run these commands to diagnose:

```bash
# 1. Check if logged in
cat ~/.docker/config.json

# 2. Test registry endpoint (a 401 response means the registry is reachable)
curl -I https://gitea.harkon.co.uk/v2/

# 3. Check Gitea logs for errors (2>&1 so that stderr output is also searched)
docker logs --tail 50 gitea-server 2>&1 | grep -i error

# 4. Check Traefik logs for 413 errors
docker logs --tail 50 traefik 2>&1 | grep -E "413|error"

# 5. Check disk space
df -h

# 6. Test with small image
docker pull alpine:latest
docker tag alpine:latest gitea.harkon.co.uk/harkon/test:latest
docker push gitea.harkon.co.uk/harkon/test:latest
```

---

## Solution 1: Automated Fix (Recommended)

Copy the fix script to the remote server and run it:

```bash
# On your local machine
scp scripts/fix-gitea-upload-limit.sh deploy@141.136.35.199:~/

# SSH to remote
ssh deploy@141.136.35.199

# Run the fix script
chmod +x fix-gitea-upload-limit.sh
./fix-gitea-upload-limit.sh
```

This script will:
- ✅ Create Traefik middleware for large uploads (5GB limit)
- ✅ Update Gitea configuration for large files
- ✅ Restart both services
- ✅ Test the registry endpoint

---

## Solution 2: Manual Fix

### Step 1: Configure Traefik

```bash
# SSH to remote
ssh deploy@141.136.35.199

# Create Traefik middleware config
# (assumes Traefik's file provider is configured to load /opt/traefik/config)
sudo mkdir -p /opt/traefik/config
sudo tee /opt/traefik/config/gitea-large-upload.yml > /dev/null << 'EOF'
http:
  middlewares:
    gitea-large-upload:
      buffering:
        maxRequestBodyBytes: 5368709120   # 5GB
        memRequestBodyBytes: 104857600    # 100MB
        maxResponseBodyBytes: 5368709120  # 5GB
        memResponseBodyBytes: 104857600   # 100MB
EOF

# Restart Traefik
docker restart traefik
```

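The byte values in the middleware config are plain powers of 1024, so the "5GB" and "100MB" labels are really 5 GiB and 100 MiB. A minimal sketch for deriving them when choosing a different limit; the function names are illustrative and not part of Traefik or Docker:

```bash
# Sketch: convert GiB/MiB to the byte counts used in the buffering config.
# gib_to_bytes and mib_to_bytes are hypothetical helper names.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }

gib_to_bytes 5     # 5368709120 (maxRequestBodyBytes / maxResponseBodyBytes)
mib_to_bytes 100   # 104857600  (memRequestBodyBytes / memResponseBodyBytes)
```

If the largest layer of `base-ml` is bigger than 5 GiB, the `max*BodyBytes` values would need to be raised accordingly.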
### Step 2: Update Gitea Container Labels

Find your Gitea docker-compose file and add this label. If the `gitea` router already has a `middlewares` label, append `gitea-large-upload@file` to its comma-separated list instead of replacing it:

```yaml
services:
  gitea:
    labels:
      - "traefik.http.routers.gitea.middlewares=gitea-large-upload@file"
```

Then restart:
```bash
docker-compose up -d gitea
```

### Step 3: Configure Gitea Settings

```bash
# Backup config
docker exec gitea-server cp /data/gitea/conf/app.ini /data/gitea/conf/app.ini.backup

# Edit config
docker exec -it gitea-server vi /data/gitea/conf/app.ini
```

Add these settings:

```ini
[server]
; 5GB (comment kept on its own line so it is not read as part of the value)
LFS_MAX_FILE_SIZE = 5368709120

[packages]
ENABLED = true
CHUNKED_UPLOAD_PATH = /data/gitea/tmp/package-upload
```

Restart Gitea:
```bash
docker restart gitea-server
```

---

## Solution 3: Alternative - Use GitHub Container Registry

If Gitea continues to have issues, use GitHub Container Registry instead:

### On Remote Server:

```bash
# Login to GitHub Container Registry
echo "$GITHUB_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin

# Build and push to GitHub
cd /home/deploy/ai-tax-agent
docker build -f infra/docker/base-ml.Dockerfile -t ghcr.io/harkon/base-ml:v1.0.1 .
docker push ghcr.io/harkon/base-ml:v1.0.1
```

### Update Dockerfiles:

Change `FROM` statements from:
```dockerfile
FROM gitea.harkon.co.uk/harkon/base-ml:v1.0.1
```

To:
```dockerfile
FROM ghcr.io/harkon/base-ml:v1.0.1
```

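With many service Dockerfiles, the `FROM` change above could be scripted rather than edited by hand. The following is a rough sketch only: the function name, the temp-file approach, and the example path are assumptions, and it does not handle `FROM --platform=...` lines.

```bash
# Sketch: rewrite the base-ml registry prefix in the FROM lines of one Dockerfile.
# OLD_PREFIX and NEW_PREFIX mirror the image names shown above;
# rewrite_base_image is a hypothetical helper name.
OLD_PREFIX="gitea.harkon.co.uk/harkon/base-ml"
NEW_PREFIX="ghcr.io/harkon/base-ml"

rewrite_base_image() {
  rbi_file="$1"
  rbi_tmp="$(mktemp)" || return 1
  # Escape dots so the old prefix is matched literally.
  rbi_old_re="$(printf '%s' "$OLD_PREFIX" | sed 's/\./\\./g')"
  # Only lines starting with FROM are touched; tags and "AS name" are kept.
  sed -E "s#^(FROM[[:space:]]+)${rbi_old_re}#\1${NEW_PREFIX}#" "$rbi_file" > "$rbi_tmp" \
    && cat "$rbi_tmp" > "$rbi_file"
  rbi_status=$?
  rm -f "$rbi_tmp"
  return "$rbi_status"
}

# Example (path is hypothetical):
#   rewrite_base_image path/to/Dockerfile
```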
---

## Testing the Fix

After applying the fix:

### 1. Test with Small Image

```bash
docker pull alpine:latest
docker tag alpine:latest gitea.harkon.co.uk/harkon/test:latest
docker push gitea.harkon.co.uk/harkon/test:latest
```

Expected: ✅ Push succeeds

### 2. Test with Large Image

```bash
cd /home/deploy/ai-tax-agent
docker build -f infra/docker/base-ml.Dockerfile -t gitea.harkon.co.uk/harkon/base-ml:test .
docker push gitea.harkon.co.uk/harkon/base-ml:test
```

Expected: ✅ Push succeeds (may take 5-10 minutes)

### 3. Monitor Logs

In separate terminals:

```bash
# Terminal 1: Traefik logs
docker logs -f traefik

# Terminal 2: Gitea logs
docker logs -f gitea-server

# Terminal 3: Push image
docker push gitea.harkon.co.uk/harkon/base-ml:test
```

Look for:
- ❌ `413 Request Entity Too Large` - Upload limit still too low
- ❌ `502 Bad Gateway` - Timeout issue
- ❌ `unauthorized` - Not logged in
- ✅ `Pushed` - Success!

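The patterns in the list above can also be matched automatically while a push runs. This is a minimal sketch, assuming the messages appear verbatim in the push or log output; the function name and the category labels are made up for illustration.

```bash
# Sketch: map one line of push/log output to the likely cause listed above.
# classify_push_line is a hypothetical helper; the labels are illustrative.
classify_push_line() {
  case "$1" in
    *"413 Request Entity Too Large"*) echo "upload-limit" ;;
    *"502 Bad Gateway"*)              echo "timeout" ;;
    *unauthorized*)                   echo "not-logged-in" ;;
    *Pushed*)                         echo "success" ;;
    *)                                echo "unknown" ;;
  esac
}

# Example:
#   docker push gitea.harkon.co.uk/harkon/base-ml:test 2>&1 |
#     while IFS= read -r line; do classify_push_line "$line"; done
```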
---

## Common Errors and Fixes

### Error: `413 Request Entity Too Large`

**Fix**: Increase Traefik buffering limit (see Solution 1 or 2 above)

### Error: `unauthorized: authentication required`

**Fix**: Log in to Gitea registry
```bash
docker login gitea.harkon.co.uk
```

### Error: `no space left on device`

**Fix**: Clean up Docker
```bash
# Warning: removes all stopped containers, unused images, and unused volumes
docker system prune -a --volumes -f
df -h
```

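To catch a full disk before a long build rather than after it, free space can be checked up front. A small sketch, assuming Docker's data lives under `/var/lib/docker` (the usual Linux default, not confirmed for this server); the helper names and the 10 GiB threshold are illustrative.

```bash
# Sketch: check available space on the filesystem holding a given path.
# free_kb and has_free_space are hypothetical helper names.
free_kb() {
  # df -P prints one header line, then: filesystem, blocks, used, available, ...
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

has_free_space() {
  # $1 = path to check, $2 = required free space in KB
  [ "$(free_kb "$1")" -ge "$2" ]
}

# Example: warn when less than 10 GiB is free
#   has_free_space /var/lib/docker $((10 * 1024 * 1024)) || echo "low disk space"
```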
### Error: `net/http: request canceled while waiting for connection`

**Fix**: Network timeout - increase timeout or use chunked uploads
```yaml
# Add to the Traefik middleware, under `buffering`
retryExpression: "IsNetworkError() && Attempts() < 3"
```

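On the client side, a failed push can simply be repeated, since layers that already reached the registry are not uploaded again. A minimal retry wrapper as a sketch; the function name, the `RETRY_DELAY` variable, and the linear backoff are assumptions, not part of Docker:

```bash
# Sketch: run a command up to N times, pausing longer after each failure.
# retry and RETRY_DELAY (seconds, default 5) are hypothetical names.
retry() {
  retry_max="$1"
  shift
  retry_n=1
  while :; do
    "$@" && return 0
    if [ "$retry_n" -ge "$retry_max" ]; then
      return 1
    fi
    sleep "$(( retry_n * ${RETRY_DELAY:-5} ))"
    retry_n=$(( retry_n + 1 ))
  done
}

# Example:
#   retry 3 docker push gitea.harkon.co.uk/harkon/base-ml:test
```

This only papers over transient network errors; a push that fails every time at the same layer still points to a size limit or timeout on the server side.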
### Error: `received unexpected HTTP status: 500 Internal Server Error`

**Fix**: Check Gitea logs for the actual error
```bash
docker logs --tail 100 gitea-server
```

---

## Verification Checklist

After fixing, verify:

- [ ] Traefik middleware created and loaded
- [ ] Gitea container has middleware label
- [ ] Gitea app.ini has LFS_MAX_FILE_SIZE set
- [ ] Gitea packages enabled
- [ ] Both services restarted
- [ ] Registry endpoint returns 401 (not 404)
- [ ] Logged in to registry
- [ ] Small image push works
- [ ] Large image push works

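The "returns 401 (not 404)" item can be checked with a short script. A sketch under the assumption that an unauthenticated request to `/v2/` is used; the function name and the wording of the messages are illustrative.

```bash
# Sketch: interpret the HTTP status code returned by the registry's /v2/ endpoint.
# registry_status_meaning is a hypothetical helper name.
registry_status_meaning() {
  case "$1" in
    401) echo "ok: registry reachable, authentication required" ;;
    200) echo "ok: registry reachable" ;;
    404) echo "fail: registry endpoint not found" ;;
    *)   echo "fail: unexpected status $1" ;;
  esac
}

# Example:
#   code="$(curl -s -o /dev/null -w '%{http_code}' https://gitea.harkon.co.uk/v2/)"
#   registry_status_meaning "$code"
```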
---

## Next Steps After Fix

Once the fix is applied and tested:

1. **Build base-ml on remote**:
   ```bash
   cd /home/deploy/ai-tax-agent
   docker build -f infra/docker/base-ml.Dockerfile -t gitea.harkon.co.uk/harkon/base-ml:v1.0.1 .
   docker push gitea.harkon.co.uk/harkon/base-ml:v1.0.1
   ```

2. **Build services locally** (they'll pull base-ml from Gitea):
   ```bash
   # On local machine
   ./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon
   ```

3. **Deploy to production**:
   ```bash
   ./scripts/deploy-to-production.sh
   ```

---

## Support Resources

- **Gitea Registry Docs**: https://docs.gitea.io/en-us/packages/container/
- **Traefik Buffering**: https://doc.traefik.io/traefik/middlewares/http/buffering/
- **Docker Registry API**: https://docs.docker.com/registry/spec/api/

---

## Files Created

- `scripts/fix-gitea-upload-limit.sh` - Automated fix script
- `scripts/remote-debug-commands.txt` - Manual debug commands
- `docs/GITEA_REGISTRY_DEBUG.md` - Detailed debugging guide
- `docs/REMOTE_BUILD_TROUBLESHOOTING.md` - This file