# Base Image Architecture

## Overview

To optimize Docker image sizes and build times, we use a **layered base image architecture**:

```
python:3.12-slim (150MB)
├─> base-runtime (300MB) - Core deps for ALL services
└─> base-ml (1.2GB)      - ML deps (sentence-transformers, PyTorch, etc.)
    ├─> svc-ocr           (1.25GB = base-ml + 50MB app)
    ├─> svc-rag-indexer   (1.25GB = base-ml + 50MB app)
    └─> svc-rag-retriever (1.25GB = base-ml + 50MB app)
```

## Benefits

### 1. **Build ML Dependencies Once**

- Heavy ML libraries (PyTorch, transformers, sentence-transformers) are built once in `base-ml`
- All ML services reuse the same base image
- No need to rebuild 1GB+ of dependencies for each service

### 2. **Faster Builds**

- **Before**: Each ML service took 10-15 minutes to build
- **After**: ML services build in 1-2 minutes (only app code + small deps)

### 3. **Faster Pushes**

- **Before**: Pushing 1.3GB per service = 3.9GB total for 3 ML services
- **After**: Push base-ml once (1.2GB) + 3 small app layers (50MB each) = 1.35GB total
- **Savings**: ~65% reduction in data pushed (3.9GB → 1.35GB)

### 4. **Layer Caching**

- Docker reuses base-ml layers across all ML services
- Only the small application layer (~50MB) needs to be pushed/pulled (see the verification sketch after the Architecture section)
- Faster deployments and rollbacks

### 5. **Easy Updates**

- Update ML library versions in one place (`base-ml`)
- Rebuild base-ml once, then rebuild all ML services quickly
- Consistent ML library versions across all services

## Image Sizes

| Image Type         | Size    | Contents                                                                                      |
| ------------------ | ------- | --------------------------------------------------------------------------------------------- |
| **base-runtime**   | ~300MB  | FastAPI, uvicorn, database drivers, Redis, NATS, MinIO, Qdrant, etc.                          |
| **base-ml**        | ~1.2GB  | base-runtime + sentence-transformers, PyTorch, transformers, numpy, scikit-learn, spacy, nltk |
| **ML Service**     | ~1.25GB | base-ml + service-specific deps (faiss, tiktoken, etc.) + app code (~50MB)                    |
| **Non-ML Service** | ~350MB  | python:3.12-slim + base deps + service deps + app code                                        |

## Architecture

### Base Images

#### 1. base-runtime

- **Location**: `infra/docker/base-runtime.Dockerfile`
- **Registry**: `gitea.harkon.co.uk/harkon/base-runtime:v1.0.1`
- **Contents**: Core dependencies for ALL services
  - FastAPI, uvicorn, pydantic
  - Database drivers (asyncpg, psycopg2, neo4j, redis)
  - Object storage (minio)
  - Vector DB (qdrant-client)
  - Event bus (nats-py)
  - Secrets (hvac)
  - Monitoring (prometheus-client)
  - HTTP client (httpx)
  - Utilities (ulid-py, python-dateutil, orjson)

#### 2. base-ml

- **Location**: `infra/docker/base-ml.Dockerfile`
- **Registry**: `gitea.harkon.co.uk/harkon/base-ml:v1.0.1`
- **Contents**: base-runtime + ML dependencies
  - sentence-transformers (includes PyTorch)
  - transformers
  - scikit-learn
  - numpy
  - spacy
  - nltk
  - fuzzywuzzy
  - python-Levenshtein

### Service Images

#### ML Services (use base-ml)

1. **svc-ocr** - OCR and document AI
   - Additional deps: pytesseract, PyMuPDF, pdf2image, Pillow, opencv-python-headless, torchvision
   - System deps: tesseract-ocr, poppler-utils
2. **svc-rag-indexer** - Document indexing and embedding
   - Additional deps: tiktoken, beautifulsoup4, faiss-cpu, python-docx, python-pptx, openpyxl, sparse-dot-topn
3. **svc-rag-retriever** - Semantic search and retrieval
   - Additional deps: rank-bm25, faiss-cpu, sparse-dot-topn

#### Non-ML Services (use python:3.12-slim directly)

- All other services (svc-ingestion, svc-extract, svc-kg, svc-forms, etc.)
- Build from scratch with base requirements + service-specific deps
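The layer sharing this architecture relies on can be verified with stock Docker commands. A minimal sketch, assuming service images are tagged the same way as the base images (`svc-ocr:v1.0.1` is an assumed tag):

```bash
# Pull the base image and one ML service; the service pull should report
# "Already exists" for every base-ml layer and download only the app layers.
docker pull gitea.harkon.co.uk/harkon/base-ml:v1.0.1
docker pull gitea.harkon.co.uk/harkon/svc-ocr:v1.0.1   # assumed service tag

# Dump each image's layer digests, then count how many service layers are
# byte-identical to base-ml layers; the count should equal base-ml's layer count.
docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' \
  gitea.harkon.co.uk/harkon/base-ml:v1.0.1 > /tmp/base-ml.layers
docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' \
  gitea.harkon.co.uk/harkon/svc-ocr:v1.0.1 > /tmp/svc-ocr.layers
grep -cFxf /tmp/base-ml.layers /tmp/svc-ocr.layers
```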
## Build Process

### Step 1: Build Base Images (One Time)

**IMPORTANT**: Build `base-ml` on the remote server to avoid pushing 1.2GB+ over the network!

#### Option A: Build base-ml on Remote Server (Recommended)

```bash
# Build base-ml on remote server (fast push to Gitea on same network)
./scripts/remote-build-base-ml.sh deploy@141.136.35.199 /home/deploy/ai-tax-agent gitea.harkon.co.uk v1.0.1 harkon

# Or use defaults (deploy user, /home/deploy/ai-tax-agent)
./scripts/remote-build-base-ml.sh
```

This will:

1. Sync code to remote server
2. Build `base-ml` on remote (~1.2GB, 10-15 min)
3. Push to Gitea from remote (fast, same network)

**Why build base-ml remotely?**

- ✅ Faster push to Gitea (same datacenter/network)
- ✅ Saves local network bandwidth
- ✅ Image is cached on remote server for faster service builds
- ✅ Only need to do this once

**Time**: 10-15 minutes (one time only)

#### Option B: Build Locally (Not Recommended for base-ml)

```bash
# Build both base images locally
./scripts/build-base-images.sh gitea.harkon.co.uk v1.0.1 harkon
```

This builds:

- `gitea.harkon.co.uk/harkon/base-runtime:v1.0.1` (~300MB)
- `gitea.harkon.co.uk/harkon/base-ml:v1.0.1` (~1.2GB)

**Note**: Pushing 1.2GB base-ml from a local machine is slow and may fail due to network issues.

### Step 2: Build Service Images

```bash
# Build and push all services
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.1 harkon
```

ML services will:

1. Pull `base-ml:v1.0.1` from registry (if not cached)
2. Install service-specific deps (~10-20 packages)
3. Copy application code
4. Build final image (~1.25GB)

**Time per ML service**: 1-2 minutes (vs 10-15 minutes before)

### Step 3: Update Base Images (When Needed)

When you need to update ML library versions:

```bash
# 1. Update libs/requirements-ml.txt
vim libs/requirements-ml.txt

# 2. Rebuild base-ml with new version
./scripts/build-base-images.sh gitea.harkon.co.uk v1.0.2 harkon

# 3. Update service Dockerfiles to use new base version
#    Change: ARG BASE_VERSION=v1.0.2

# 4. Rebuild ML services
./scripts/build-and-push-images.sh gitea.harkon.co.uk v1.0.2 harkon
```

## Requirements Files

### libs/requirements-base.txt

Core dependencies for ALL services (included in base-runtime and base-ml).

### libs/requirements-ml.txt

ML dependencies (included in base-ml only).

### apps/svc_*/requirements.txt

Service-specific dependencies:

- **ML services**: Only additional deps NOT in base-ml (e.g., faiss-cpu, tiktoken)
- **Non-ML services**: Service-specific deps (e.g., aiofiles, openai, anthropic)

## Dockerfile Templates

### ML Service Dockerfile Pattern

```dockerfile
# Use pre-built ML base image
ARG REGISTRY=gitea.harkon.co.uk
ARG OWNER=harkon
ARG BASE_VERSION=v1.0.1
FROM ${REGISTRY}/${OWNER}/base-ml:${BASE_VERSION}

USER root
WORKDIR /app

# Install service-specific deps (minimal)
COPY apps/SERVICE_NAME/requirements.txt /tmp/service-requirements.txt
RUN pip install --no-cache-dir -r /tmp/service-requirements.txt

# Copy app code
COPY libs/ ./libs/
COPY apps/SERVICE_NAME/ ./apps/SERVICE_NAME/

RUN chown -R appuser:appuser /app
USER appuser

# Health check, expose, CMD...
```
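To build one service by hand against a specific base version, pass the build args explicitly. A hypothetical invocation (the Dockerfile path `apps/svc_ocr/Dockerfile` and the output tag are assumptions based on the layout above):

```bash
# Run from the repo root so the build context includes libs/ and apps/.
docker build \
  --build-arg REGISTRY=gitea.harkon.co.uk \
  --build-arg OWNER=harkon \
  --build-arg BASE_VERSION=v1.0.1 \
  -f apps/svc_ocr/Dockerfile \
  -t gitea.harkon.co.uk/harkon/svc-ocr:v1.0.1 \
  .
```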
### Non-ML Service Dockerfile Pattern

```dockerfile
# Multi-stage build from scratch
FROM python:3.12-slim AS builder

# Install build deps
RUN apt-get update && apt-get install -y build-essential curl && rm -rf /var/lib/apt/lists/*

# Create venv and install deps
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY libs/requirements-base.txt /tmp/libs-requirements.txt
COPY apps/SERVICE_NAME/requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/libs-requirements.txt -r /tmp/requirements.txt

# Production stage
FROM python:3.12-slim
# ... copy venv, app code, etc.
```

## Comparison: Before vs After

### Before (Monolithic Approach)

```
Each ML service:
- Build time: 10-15 minutes
- Image size: 1.6GB
- Push time: 5-10 minutes
- Total for 3 services: 30-45 min build + 15-30 min push = 45-75 minutes
```

### After (Base Image Approach)

```
Base-ml (one time):
- Build time: 10-15 minutes
- Image size: 1.2GB
- Push time: 5-10 minutes

Each ML service:
- Build time: 1-2 minutes
- Image size: 1.25GB (but only 50MB of new layers)
- Push time: 30-60 seconds (only new layers)
- Total for 3 services: 3-6 min build + 2-3 min push = 5-9 minutes

Total time savings: 40-66 minutes (up to ~89% faster)
```

## Best Practices

1. **Version base images**: Always tag with a version (e.g., v1.0.1, v1.0.2)
2. **Update base images infrequently**: Only when ML library versions need updating
3. **Keep service requirements minimal**: Only add deps NOT in base-ml
4. **Use build args**: Make registry/owner/version configurable
5. **Test base images**: Ensure health checks pass before building services
6. **Document changes**: Update this file when modifying base images

## Troubleshooting

### Issue: Service can't find ML library

**Cause**: The library was removed from the service requirements on the assumption that base-ml provides it, but base-ml does not

**Solution**: Add the library to `libs/requirements-ml.txt` and rebuild base-ml

### Issue: Base image not found

**Cause**: Base image not pushed to the registry, or wrong version referenced

**Solution**: Run `./scripts/build-base-images.sh` first

### Issue: Service image too large

**Cause**: Duplicate dependencies in service requirements

**Solution**: Remove deps already in base-ml from the service requirements.txt (a scriptable check is sketched at the end of this document)

## Future Improvements

1. **base-runtime for non-ML services**: Use base-runtime instead of building from scratch
2. **Multi-arch builds**: Support ARM64 for Apple Silicon
3. **Automated base image updates**: CI/CD pipeline to rebuild base images on dependency updates
4. **Layer analysis**: Tools to analyze and optimize layer sizes (see the sketch below)
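Until dedicated tooling exists, plain `docker history` already gives a per-layer size breakdown for item 4 above. A minimal sketch; the service tag is an assumption, and `dive` is a richer interactive alternative:

```bash
# List each layer's size and the instruction that created it (newest first);
# large unexpected layers are the first thing to investigate.
docker history --format 'table {{.Size}}\t{{.CreatedBy}}' \
  gitea.harkon.co.uk/harkon/svc-rag-indexer:v1.0.1
```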
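Relatedly, the duplicate-dependency check from Troubleshooting can be scripted. A rough sketch that compares bare package names while ignoring version pins (assumes one requirement per line; `apps/svc_ocr/requirements.txt` is a placeholder path):

```bash
# Print packages a service requirements file duplicates from base-ml.
# The sed strips version specifiers (==, >=, <, !=, ~=) before comparing.
comm -12 \
  <(sed 's/[<>=!~].*//' libs/requirements-ml.txt | sort -u) \
  <(sed 's/[<>=!~].*//' apps/svc_ocr/requirements.txt | sort -u)
```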