# Remote Build Troubleshooting Guide

## Problem: Docker Push Failing on Remote Server

When the `base-ml` image is built on the remote server and pushed to Gitea, the push fails on large image layers (>1GB).

---

## Root Cause

The issue is likely one of these:

1. **Upload size limit in Traefik** (enforced by a `buffering` middleware's `maxRequestBodyBytes`; requests over the limit get a `413`)
2. **Upload size limit in Gitea** (default varies)
3. **Network timeout** during large uploads
4. **Not logged in** to Gitea registry
5. **Disk space** issues

---

## Quick Diagnosis

### On Remote Server (ssh deploy@141.136.35.199)

Run these commands to diagnose:

```bash
# 1. Check if logged in (look for gitea.harkon.co.uk under "auths")
cat ~/.docker/config.json

# 2. Test registry endpoint (a 401 response means the registry is reachable)
curl -I https://gitea.harkon.co.uk/v2/

# 3. Check Gitea logs for errors (2>&1 so grep also sees the container's stderr)
docker logs --tail 50 gitea-server 2>&1 | grep -i error

# 4. Check Traefik logs for 413 errors
docker logs --tail 50 traefik 2>&1 | grep -E "413|error"

# 5. Check disk space
df -h

# 6. Test with small image
docker pull alpine:latest
docker tag alpine:latest gitea.harkon.co.uk/harkon/test:latest
docker push gitea.harkon.co.uk/harkon/test:latest
```
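
The status code from check 2 is the quickest signal, so it can help to interpret it in a script. The following is a minimal sketch, not part of the repository's scripts: the function name `interpret_registry_status` and its messages are assumptions made for illustration, and the meaning given to `404` (registry route missing) is a likely reading rather than a guarantee.

```bash
# Sketch: interpret the HTTP status of the registry's /v2/ endpoint.
# The function name and messages are illustrative, not part of any tool.
interpret_registry_status() {
  case "$1" in
    200) echo "reachable (already authenticated)" ;;
    401) echo "reachable (login required)" ;;
    404) echo "registry route not found" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage on the remote server (requires network access):
#   status=$(curl -s -o /dev/null -w '%{http_code}' https://gitea.harkon.co.uk/v2/)
#   interpret_registry_status "$status"
```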

---

## Solution 1: Automated Fix (Recommended)

Copy the fix script to the remote server and run it:

```bash
# On your local machine
scp scripts/fix-gitea-upload-limit.sh deploy@141.136.35.199:~/

# SSH to remote
ssh deploy@141.136.35.199

# Run the fix script
chmod +x fix-gitea-upload-limit.sh
./fix-gitea-upload-limit.sh
```

This script will:
- ✅ Create Traefik middleware for large uploads (5GB limit)
- ✅ Update Gitea configuration for large files
- ✅ Restart both services
- ✅ Test the registry endpoint

---

## Solution 2: Manual Fix

### Step 1: Configure Traefik

```bash
# SSH to remote
ssh deploy@141.136.35.199

# Create Traefik middleware config
sudo mkdir -p /opt/traefik/config
sudo tee /opt/traefik/config/gitea-large-upload.yml > /dev/null << 'EOF'
http:
  middlewares:
    gitea-large-upload:
      buffering:
        maxRequestBodyBytes: 5368709120   # 5GB
        memRequestBodyBytes: 104857600    # 100MB
        maxResponseBodyBytes: 5368709120  # 5GB
        memResponseBodyBytes: 104857600   # 100MB
EOF

# Restart Traefik
docker restart traefik
```
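
The byte values in the middleware are binary multiples: 5GB is 5 × 1024³ and 100MB is 100 × 1024². To pick a different limit, the arithmetic can be sketched as below. The helper `to_bytes` is hypothetical and exists only for this illustration.

```bash
# Sketch: convert a size in GiB or MiB to bytes (hypothetical helper).
to_bytes() {
  value=$1
  unit=$2
  case "$unit" in
    GiB) echo $(( value * 1024 * 1024 * 1024 )) ;;
    MiB) echo $(( value * 1024 * 1024 )) ;;
    *)   echo "unknown unit: $unit" >&2; return 1 ;;
  esac
}

to_bytes 5 GiB     # 5368709120, the value used for maxRequestBodyBytes
to_bytes 100 MiB   # 104857600, the value used for memRequestBodyBytes
```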

### Step 2: Update Gitea Container Labels

Find your Gitea docker-compose file and add this label (the router name `gitea` in the label must match the router name used in your existing Traefik labels):

```yaml
services:
  gitea:
    labels:
      - "traefik.http.routers.gitea.middlewares=gitea-large-upload@file"
```

Then restart:
```bash
docker-compose up -d gitea
```

### Step 3: Configure Gitea Settings

```bash
# Backup config
docker exec gitea-server cp /data/gitea/conf/app.ini /data/gitea/conf/app.ini.backup

# Edit config
docker exec -it gitea-server vi /data/gitea/conf/app.ini
```

Add these settings:

```ini
[server]
; 5GB limit for Git LFS objects (comment kept on its own line so it is not read as part of the value)
LFS_MAX_FILE_SIZE = 5368709120

[packages]
ENABLED = true
CHUNKED_UPLOAD_PATH = /data/gitea/tmp/package-upload
```

Restart Gitea:
```bash
docker restart gitea-server
```

---

## Solution 3: Alternative - Use GitHub Container Registry

If Gitea continues to have issues, use GitHub Container Registry instead:

### On Remote Server:

```bash
# Login to GitHub Container Registry (replace USERNAME with your GitHub username)
echo "$GITHUB_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin

# Build and push to GitHub
cd /home/deploy/ai-tax-agent
docker build -f infra/docker/base-ml.Dockerfile -t ghcr.io/harkon/base-ml:v1.0.1 .
docker push ghcr.io/harkon/base-ml:v1.0.1
```

### Update Dockerfiles:

Change `FROM` statements from:
```dockerfile
FROM gitea.harkon.co.uk/harkon/base-ml:v1.0.1
```

To:
```dockerfile
FROM ghcr.io/harkon/base-ml:v1.0.1
```
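
If several Dockerfiles reference the base image, the `FROM` change can be scripted. This is a rough sketch under stated assumptions: `rewrite_base_image` is a hypothetical helper, it only matches lines that start with `FROM gitea.harkon.co.uk/harkon/base-ml:`, and it does not handle `FROM --platform=...` forms. Review the resulting diff before committing.

```bash
# Sketch: rewrite the base-ml registry prefix in one Dockerfile (hypothetical helper).
# Writes to a temporary file first so it works with both GNU and BSD sed.
rewrite_base_image() {
  file=$1
  tmp="$file.tmp"
  sed 's|^FROM gitea\.harkon\.co\.uk/harkon/base-ml:|FROM ghcr.io/harkon/base-ml:|' "$file" > "$tmp" \
    && mv "$tmp" "$file"
}

# Usage (the path is an example):
#   rewrite_base_image path/to/Dockerfile
```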

---

## Testing the Fix

After applying the fix:

### 1. Test with Small Image

```bash
docker pull alpine:latest
docker tag alpine:latest gitea.harkon.co.uk/harkon/test:latest
docker push gitea.harkon.co.uk/harkon/test:latest
```

Expected: ✅ Push succeeds

### 2. Test with Large Image

```bash
cd /home/deploy/ai-tax-agent
docker build -f infra/docker/base-ml.Dockerfile -t gitea.harkon.co.uk/harkon/base-ml:test .
docker push gitea.harkon.co.uk/harkon/base-ml:test
```

Expected: ✅ Push succeeds (may take 5-10 minutes)

### 3. Monitor Logs

In separate terminals:

```bash
# Terminal 1: Traefik logs
docker logs -f traefik

# Terminal 2: Gitea logs
docker logs -f gitea-server

# Terminal 3: Push image
docker push gitea.harkon.co.uk/harkon/base-ml:test
```

Look for:
- ❌ `413 Request Entity Too Large` - Upload limit still too low
- ❌ `502 Bad Gateway` - Timeout issue
- ❌ `unauthorized` - Not logged in
- ✅ `Pushed` - Success!

---

## Common Errors and Fixes

### Error: `413 Request Entity Too Large`

**Fix**: Increase Traefik buffering limit (see Solution 1 or 2 above)

### Error: `unauthorized: authentication required`

**Fix**: Log in to Gitea registry
```bash
docker login gitea.harkon.co.uk
```
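
To avoid logging in repeatedly from scripts, the login state can be checked first. The sketch below is a heuristic only: `has_registry_auth` is a hypothetical helper that looks for the registry host in the Docker config file. It does not prove that the stored credentials are still valid.

```bash
# Sketch: succeed if a Docker config file mentions the given registry host.
# A plain-text match is a heuristic; it does not validate the credentials.
# $1 = registry host, $2 = config path (defaults to ~/.docker/config.json)
has_registry_auth() {
  config=${2:-$HOME/.docker/config.json}
  [ -f "$config" ] && grep -qF "\"$1\"" "$config"
}

# Usage:
#   has_registry_auth gitea.harkon.co.uk || docker login gitea.harkon.co.uk
```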

### Error: `no space left on device`

**Fix**: Clean up Docker
```bash
# Warning: deletes all stopped containers, unused images, unused networks,
# and unused volumes without asking for confirmation
docker system prune -a --volumes -f
df -h
```
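
Before a large build it may be worth checking free space up front rather than waiting for the error. A minimal sketch follows, assuming a POSIX `df` and `awk`. The helper names are hypothetical, and the path and threshold in the usage comment are examples, not values taken from this setup.

```bash
# Sketch: print the free space, in KiB, of the filesystem holding a path.
# `df -Pk` prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
available_kib() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if at least $2 KiB are free at path $1 (hypothetical helper).
has_free_space() {
  [ "$(available_kib "$1")" -ge "$2" ]
}

# Usage (example path and a 10 GiB threshold, both assumptions):
#   has_free_space /var/lib/docker $((10 * 1024 * 1024)) || echo "low disk space"
```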

### Error: `net/http: request canceled while waiting for connection`

**Fix**: Network timeout - increase timeout or use chunked uploads
```yaml
# Add under `buffering:` in the gitea-large-upload middleware
retryExpression: "IsNetworkError() && Attempts() < 3"
```
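
A client-side retry is a complementary workaround, since layers that were already uploaded are normally skipped on the next push. The wrapper below is a sketch, not part of the repository's scripts: `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```bash
# Sketch: run a command up to $1 times, waiting $2 seconds between attempts.
# Hypothetical helper; a fixed delay is used for simplicity.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage:
#   retry 3 30 docker push gitea.harkon.co.uk/harkon/base-ml:test
```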

### Error: `received unexpected HTTP status: 500 Internal Server Error`

**Fix**: Check Gitea logs for the actual error
```bash
docker logs --tail 100 gitea-server
```
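
The error-to-fix pairs above can be collected into one lookup, for example to annotate the output of a failed push in a script. This is an illustrative sketch only: `suggest_fix` is a hypothetical helper, and its patterns are substrings of the error messages listed in this section.

```bash
# Sketch: map a push error message to the fix named in this guide.
# Hypothetical helper; patterns are substrings of the errors listed above.
suggest_fix() {
  case "$1" in
    *"413 Request Entity Too Large"*) echo "increase the Traefik buffering limit" ;;
    *"unauthorized"*)                 echo "docker login gitea.harkon.co.uk" ;;
    *"no space left on device"*)      echo "clean up Docker and check df -h" ;;
    *"request canceled"*)             echo "network timeout: increase the timeout" ;;
    *"500 Internal Server Error"*)    echo "check the Gitea logs" ;;
    *)                                echo "unrecognized error" ;;
  esac
}

# Usage:
#   suggest_fix "unauthorized: authentication required"
```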

---

## Verification Checklist

After fixing, verify:

- [ ] Traefik middleware created and loaded
- [ ] Gitea container has middleware label
- [ ] Gitea app.ini has LFS_MAX_FILE_SIZE set
- [ ] Gitea packages enabled
- [ ] Both services restarted
- [ ] Registry endpoint returns 401 (not 404)
- [ ] Logged in to registry
- [ ] Small image push works
- [ ] Large image push works

---

## Next Steps After Fix

Once the fix is applied and tested:

1. **Build base-ml on remote**:
   ```bash
   cd /home/deploy/ai-tax-agent
   docker build -f infra/docker/base-ml.Dockerfile -t gitea.harkon.co.uk/harkon/base-ml:v1.0.1 .
   docker push gitea.harkon.co.uk/harkon/base-ml:v1.0.1
   ```

2. **Build services locally** (they'll pull base-ml from Gitea):
   ```bash
   # On local machine
   ./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon
   ```

3. **Deploy to production**:
   ```bash
   ./scripts/deploy-to-production.sh
   ```

---

## Support Resources

- **Gitea Registry Docs**: https://docs.gitea.io/en-us/packages/container/
- **Traefik Buffering**: https://doc.traefik.io/traefik/middlewares/http/buffering/
- **Docker Registry API**: https://docs.docker.com/registry/spec/api/

---

## Files Created

- `scripts/fix-gitea-upload-limit.sh` - Automated fix script
- `scripts/remote-debug-commands.txt` - Manual debug commands
- `docs/GITEA_REGISTRY_DEBUG.md` - Detailed debugging guide
- `docs/REMOTE_BUILD_TROUBLESHOOTING.md` - This file