# Encryption Strategy
## AI Tax Agent System

**Document Version:** 1.0
**Date:** 2024-01-31
**Owner:** Security Architecture Team

## 1. Executive Summary

This document defines the encryption strategy for the AI Tax Agent System, covering data at rest, in transit, and in use. The strategy applies defense in depth: transport, application, field, database, and storage encryption layers, backed by centralized key management in HashiCorp Vault and an HSM.

## 2. Encryption Requirements

### 2.1 Regulatory Requirements
- **GDPR Article 32**: Appropriate technical measures including encryption
- **UK Data Protection Act 2018**: Security of processing requirements
- **HMRC Security Standards**: Government security classifications
- **ISO 27001**: Information security management requirements
- **SOC 2 Type II**: Security and availability controls

### 2.2 Business Requirements
- **Client Data Protection**: Financial and personal information
- **Intellectual Property**: Proprietary algorithms and models
- **Regulatory Compliance**: Audit trail and evidence integrity
- **Business Continuity**: Key recovery and disaster recovery

## 3. Encryption Architecture

### 3.1 Encryption Layers

```mermaid
graph TB
    A[Client Browser] -->|TLS 1.3| B[Traefik Gateway]
    B -->|mTLS| C[Application Services]
    C -->|Application-Level| D[Database Layer]
    D -->|Transparent Data Encryption| E[Storage Layer]
    E -->|Volume Encryption| F[Disk Storage]

    G[Key Management] --> H[Vault HSM]
    H --> I[Encryption Keys]
    I --> C
    I --> D
    I --> E
```

### 3.2 Encryption Domains

| Domain | Technology | Key Size | Algorithm | Rotation |
|--------|------------|----------|-----------|----------|
| **Transport** | TLS 1.3 | 256-bit | AES-GCM, ChaCha20-Poly1305 | Annual |
| **Application** | AES-GCM | 256-bit | AES-256-GCM | Quarterly |
| **Database** | TDE | 256-bit | AES-256-CBC | Quarterly |
| **Storage** | LUKS/dm-crypt | 256-bit | AES-256-XTS | Annual |
| **Backup** | GPG | 4096-bit | RSA-4096 + AES-256 | Annual |

## 4. Data Classification and Encryption

### 4.1 Data Classification Matrix

| Classification | Examples | Encryption Level | Key Access |
|----------------|----------|------------------|------------|
| **PUBLIC** | Marketing materials, documentation | TLS only | Public |
| **INTERNAL** | System logs, metrics | TLS + Storage | Service accounts |
| **CONFIDENTIAL** | Client names, addresses | TLS + App + Storage | Authorized users |
| **RESTRICTED** | Financial data, UTR, NI numbers | TLS + App + Field + Storage | Need-to-know |
| **SECRET** | Encryption keys, certificates | HSM + Multiple layers | Key custodians |
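
The matrix can also be expressed as data so that a deployment check can compare it against what a data store actually has enabled. The following is a minimal sketch under stated assumptions: `Classification`, `REQUIRED_LAYERS`, and `missing_layers` are hypothetical names introduced for illustration and are not part of the system.

```python
from enum import Enum


class Classification(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


# Layers transcribed from the "Encryption Level" column of the matrix.
# SECRET is left out because "HSM + Multiple layers" does not enumerate
# specific layers.
REQUIRED_LAYERS = {
    Classification.PUBLIC: {"tls"},
    Classification.INTERNAL: {"tls", "storage"},
    Classification.CONFIDENTIAL: {"tls", "app", "storage"},
    Classification.RESTRICTED: {"tls", "app", "field", "storage"},
}


def missing_layers(classification: Classification, applied: set[str]) -> set[str]:
    """Return the required layers that have not been applied yet."""
    return REQUIRED_LAYERS[classification] - applied


# A RESTRICTED store that only has TLS and volume encryption enabled:
gaps = missing_layers(Classification.RESTRICTED, {"tls", "storage"})
print(sorted(gaps))  # ['app', 'field']
```

A check like this could run in CI or at service start-up; where it belongs is a design decision this document does not prescribe.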

### 4.2 Field-Level Encryption

**Sensitive Fields Requiring Field-Level Encryption:**
```python
ENCRYPTED_FIELDS = {
    'taxpayer_profile': ['utr', 'ni_number', 'full_name', 'address'],
    'financial_data': ['account_number', 'sort_code', 'iban', 'amount'],
    'document_content': ['ocr_text', 'extracted_fields'],
    'authentication': ['password_hash', 'api_keys', 'tokens']
}
```

**Implementation Example:**
```python
import base64

# `vault_client` is assumed to be a project wrapper around the Vault API that
# exposes encrypt()/decrypt() for the transit engine; it is passed in rather
# than imported.


class FieldEncryption:
    def __init__(self, vault_client):
        self.vault = vault_client

    def encrypt_field(self, field_name: str, value: str) -> str:
        """Encrypt sensitive field using Vault transit engine"""
        key_name = f"field-{field_name}"
        response = self.vault.encrypt(
            mount_point='transit',
            name=key_name,
            # The transit engine expects base64-encoded plaintext
            plaintext=base64.b64encode(value.encode()).decode()
        )
        return response['data']['ciphertext']

    def decrypt_field(self, field_name: str, ciphertext: str) -> str:
        """Decrypt sensitive field using Vault transit engine"""
        key_name = f"field-{field_name}"
        response = self.vault.decrypt(
            mount_point='transit',
            name=key_name,
            ciphertext=ciphertext
        )
        # The transit engine returns base64-encoded plaintext
        return base64.b64decode(response['data']['plaintext']).decode()
```
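
The transit engine accepts and returns plaintext as base64, which is why the example encodes before encrypting and decodes after decrypting. The sketch below isolates that round trip. `FakeTransit` is a hypothetical in-memory stand-in that performs no encryption at all; it exists only so the example runs without a Vault server, and `to_transit` / `from_transit` are illustrative helper names.

```python
import base64


class FakeTransit:
    """Hypothetical stand-in for the Vault wrapper. NOT real encryption."""

    def encrypt(self, mount_point: str, name: str, plaintext: str) -> dict:
        # Mimic the shape of a transit ciphertext: "vault:v<version>:<payload>"
        return {"data": {"ciphertext": f"vault:v1:{plaintext}"}}

    def decrypt(self, mount_point: str, name: str, ciphertext: str) -> dict:
        payload = ciphertext.split(":", 2)[2]
        return {"data": {"plaintext": payload}}


def to_transit(value: str) -> str:
    """Encode a string the way the transit engine expects plaintext."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def from_transit(encoded: str) -> str:
    """Decode the base64 plaintext returned by the transit engine."""
    return base64.b64decode(encoded).decode("utf-8")


vault = FakeTransit()
ct = vault.encrypt("transit", "field-utr", to_transit("1234567890"))["data"]["ciphertext"]
pt = from_transit(vault.decrypt("transit", "field-utr", ct)["data"]["plaintext"])
print(pt)  # 1234567890
```

Forgetting either base64 step is a common source of errors, so a round-trip test of this kind is worth keeping next to the real client.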

## 5. Key Management Strategy

### 5.1 Key Hierarchy

```
Root Key (HSM)
├── Master Encryption Key (MEK)
│   ├── Data Encryption Keys (DEK)
│   │   ├── Database DEK
│   │   ├── Application DEK
│   │   └── Storage DEK
│   └── Key Encryption Keys (KEK)
│       ├── Field Encryption KEK
│       ├── Backup KEK
│       └── Archive KEK
└── Signing Keys
    ├── JWT Signing Key
    ├── Document Signing Key
    └── API Signing Key
```
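
The hierarchy determines both how a key is protected (its chain up to the root) and which keys are exposed if a parent is compromised. As an illustrative sketch, the tree can be held as a parent map; `KEY_PARENT`, `chain_to_root`, and `affected_by` are hypothetical names, not existing tooling.

```python
# Parent of each key, transcribed from the tree above.
KEY_PARENT = {
    "Master Encryption Key": "Root Key",
    "Signing Keys": "Root Key",
    "Data Encryption Keys": "Master Encryption Key",
    "Key Encryption Keys": "Master Encryption Key",
    "Database DEK": "Data Encryption Keys",
    "Application DEK": "Data Encryption Keys",
    "Storage DEK": "Data Encryption Keys",
    "Field Encryption KEK": "Key Encryption Keys",
    "Backup KEK": "Key Encryption Keys",
    "Archive KEK": "Key Encryption Keys",
    "JWT Signing Key": "Signing Keys",
    "Document Signing Key": "Signing Keys",
    "API Signing Key": "Signing Keys",
}


def chain_to_root(key: str) -> list[str]:
    """Return the key followed by each ancestor up to the root."""
    chain = [key]
    while chain[-1] in KEY_PARENT:
        chain.append(KEY_PARENT[chain[-1]])
    return chain


def affected_by(compromised: str) -> set[str]:
    """Return every key that sits below the compromised key."""
    return {k for k in KEY_PARENT if compromised in chain_to_root(k)[1:]}


print(chain_to_root("Backup KEK"))
# ['Backup KEK', 'Key Encryption Keys', 'Master Encryption Key', 'Root Key']
print(sorted(affected_by("Data Encryption Keys")))
# ['Application DEK', 'Database DEK', 'Storage DEK']
```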

### 5.2 HashiCorp Vault Configuration

**Vault Policies:**
```hcl
# Database encryption policy
path "transit/encrypt/database-*" {
  capabilities = ["create", "update"]
}

path "transit/decrypt/database-*" {
  capabilities = ["create", "update"]
}

# Application encryption policy
path "transit/encrypt/app-*" {
  capabilities = ["create", "update"]
}

path "transit/decrypt/app-*" {
  capabilities = ["create", "update"]
}

# Field encryption policy (restricted)
path "transit/encrypt/field-*" {
  capabilities = ["create", "update"]
  allowed_parameters = {
    "plaintext" = []
  }
  denied_parameters = {
    "batch_input" = []
  }
}
```
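
**Key Rotation Policy:**

Rotation settings are properties of the transit key itself, not of an ACL policy, so they are written to the key's `config` endpoint:
```bash
# Automatic key rotation for the database-primary transit key
vault write transit/keys/database-primary/config \
    min_decryption_version=1 \
    min_encryption_version=2 \
    deletion_allowed=false \
    auto_rotate_period=2160h  # 90 days
```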

### 5.3 Hardware Security Module (HSM)

**HSM Configuration:**
- **Type**: AWS CloudHSM / Azure Dedicated HSM
- **FIPS Level**: FIPS 140-2 Level 3
- **High Availability**: Multi-AZ deployment
- **Backup**: Encrypted key backup to secure offline storage

## 6. Transport Layer Security

### 6.1 TLS Configuration

**Traefik TLS Configuration:**
```yaml
tls:
  options:
    default:
      minVersion: "VersionTLS13"
      maxVersion: "VersionTLS13"
      # Note: TLS 1.3 cipher suites are not configurable in Traefik, so this
      # list documents the expected suites rather than enforcing them.
      cipherSuites:
        - "TLS_AES_256_GCM_SHA384"
        - "TLS_CHACHA20_POLY1305_SHA256"
        - "TLS_AES_128_GCM_SHA256"
      curvePreferences:
        - "X25519"
        - "secp384r1"
      sniStrict: true

  certificates:
    - certFile: /certs/wildcard.crt
      keyFile: /certs/wildcard.key
```

### 6.2 Certificate Management

**Certificate Lifecycle:**
- **Issuance**: Let's Encrypt with DNS challenge
- **Rotation**: Automated 30-day renewal
- **Monitoring**: Certificate expiry alerts
- **Backup**: Encrypted certificate backup
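
One way to read the 30-day renewal rule is "renew once a certificate is within 30 days of expiry". The sketch below assumes that interpretation; `RENEWAL_WINDOW` and `should_renew` are hypothetical names, and the dates are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: renewal is triggered 30 days before the certificate expires.
RENEWAL_WINDOW = timedelta(days=30)


def should_renew(not_after: datetime, now: datetime) -> bool:
    """Return True once the certificate is inside the renewal window."""
    return not_after - now <= RENEWAL_WINDOW


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(should_renew(datetime(2024, 2, 20, tzinfo=timezone.utc), now))  # True
print(should_renew(datetime(2024, 4, 15, tzinfo=timezone.utc), now))  # False
```

The same predicate can drive the expiry alerts listed above by using a shorter window for the alert than for the renewal.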

**Internal PKI:**
```bash
# Vault PKI setup
vault secrets enable -path=pki-root pki
vault secrets tune -max-lease-ttl=87600h pki-root

vault write pki-root/root/generate/internal \
    common_name="AI Tax Agent Root CA" \
    ttl=87600h \
    key_bits=4096

vault secrets enable -path=pki-int pki
vault secrets tune -max-lease-ttl=43800h pki-int

vault write pki-int/intermediate/generate/internal \
    common_name="AI Tax Agent Intermediate CA" \
    ttl=43800h \
    key_bits=4096
```

## 7. Database Encryption

### 7.1 PostgreSQL Encryption

**Transparent Data Encryption (TDE):**
```sql
-- Enable pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Create encrypted table
CREATE TABLE taxpayer_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    utr_encrypted BYTEA NOT NULL,
    ni_number_encrypted BYTEA NOT NULL,
    name_encrypted BYTEA NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Encryption functions
CREATE OR REPLACE FUNCTION encrypt_pii(data TEXT, key_id TEXT)
RETURNS BYTEA AS $$
BEGIN
    -- Use Vault transit engine for encryption.
    -- vault_encrypt() is not a built-in PostgreSQL function; it is assumed
    -- to be provided separately (for example by a custom extension).
    RETURN vault_encrypt(data, key_id);
END;
$$ LANGUAGE plpgsql;
```
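
**Column-Level Encryption:**
```python
import uuid

from sqlalchemy import Column, LargeBinary
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base

Base = declarative_base()

# `vault_client` is assumed to be the project's Vault wrapper, available in
# this module's scope.


class EncryptedTaxpayerProfile(Base):
    __tablename__ = 'taxpayer_profiles'

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    utr_encrypted = Column(LargeBinary, nullable=False)
    ni_number_encrypted = Column(LargeBinary, nullable=False)

    @hybrid_property
    def utr(self):
        # Decrypt on read; only ciphertext is stored in the column
        return vault_client.decrypt('field-utr', self.utr_encrypted)

    @utr.setter
    def utr(self, value):
        # Encrypt on write
        self.utr_encrypted = vault_client.encrypt('field-utr', value)
```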

### 7.2 Neo4j Encryption

**Enterprise Edition Features:**
```cypher
// Enable encryption at rest
CALL dbms.security.setConfigValue('dbms.security.encryption.enabled', 'true');

// Require the encrypted UTR property on every TaxpayerProfile node
CREATE CONSTRAINT encrypted_utr IF NOT EXISTS
FOR (tp:TaxpayerProfile)
REQUIRE tp.utr_encrypted IS NOT NULL;

// Digest UDF. A hash is one-way and is not encryption, so this function is
// suitable for matching values only; SHA-256 replaces the original MD5.
CALL apoc.custom.asFunction(
  'digest',
  'RETURN apoc.util.sha256([$text, $key])',
  'STRING',
  [['text', 'STRING'], ['key', 'STRING']]
);
```

## 8. Application-Level Encryption

### 8.1 Microservice Encryption

**Service-to-Service Communication:**
```python
import httpx
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding


class SecureServiceClient:
    # encrypt_payload(), sign_request() and decrypt_response() are assumed to
    # be implemented elsewhere in this class; they are not shown here.

    def __init__(self, service_url: str, private_key: rsa.RSAPrivateKey):
        self.service_url = service_url
        self.private_key = private_key

    async def make_request(self, endpoint: str, data: dict):
        # Encrypt request payload
        encrypted_data = self.encrypt_payload(data)

        # Sign request
        signature = self.sign_request(encrypted_data)

        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.service_url}/{endpoint}",
                json={"data": encrypted_data, "signature": signature},
                headers={"Content-Type": "application/json"}
            )

        # Fail on HTTP errors before attempting to decrypt the body
        response.raise_for_status()

        # Decrypt response
        return self.decrypt_response(response.json())
```

### 8.2 Document Encryption

**Document Storage Encryption:**
```python
from cryptography.fernet import Fernet


class DocumentEncryption:
    def __init__(self, vault_client):
        # Assumed project wrapper exposing generate_data_key()
        self.vault = vault_client

    def encrypt_document(self, document_content: bytes, doc_id: str) -> dict:
        """Encrypt document with unique DEK"""
        # Generate document-specific DEK
        dek = self.vault.generate_data_key('document-master-key')

        # Encrypt document with DEK.
        # Fernet requires a url-safe base64-encoded 32-byte key, so
        # 'plaintext_key' is assumed to be returned in that form.
        cipher = Fernet(dek['plaintext_key'])
        encrypted_content = cipher.encrypt(document_content)

        # Store encrypted DEK
        encrypted_dek = dek['ciphertext_key']

        return {
            'encrypted_content': encrypted_content,
            'encrypted_dek': encrypted_dek,
            'key_version': dek['key_version']
        }
```

## 9. Backup and Archive Encryption

### 9.1 Backup Encryption Strategy

**Multi-Layer Backup Encryption:**
```bash
#!/bin/bash
# Backup encryption script
set -euo pipefail

# Compute the date once so every artifact shares the same stamp
STAMP=$(date +%Y%m%d)

# 1. Database dump with encryption
pg_dump tax_system | gpg --cipher-algo AES256 --compress-algo 2 \
    --symmetric --output "backup_${STAMP}.sql.gpg"

# 2. Neo4j backup with encryption
neo4j-admin backup --backup-dir=/backups/neo4j \
    --name="graph_${STAMP}" --encrypt

# 3. Document backup with encryption
tar -czf - /data/documents | gpg --cipher-algo AES256 \
    --symmetric --output "documents_${STAMP}.tar.gz.gpg"

# 4. Upload to encrypted cloud storage
aws s3 cp "backup_${STAMP}.sql.gpg" \
    s3://tax-agent-backups/ --sse aws:kms --sse-kms-key-id alias/backup-key
```

### 9.2 Archive Encryption

**Long-Term Archive Strategy:**
- **Encryption**: AES-256 with 10-year key retention
- **Integrity**: SHA-256 checksums with digital signatures
- **Storage**: Geographically distributed encrypted storage
- **Access**: Multi-person authorization for archive access
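
The integrity requirement can be illustrated with a streaming SHA-256 checksum. This is a sketch of the checksum half only: signing the recorded digest is out of scope here, and `sha256_stream` and `verify_archive` are hypothetical helper names.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a stream in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_archive(stream: BinaryIO, expected_hex: str) -> bool:
    """Compare the computed digest with the recorded one in constant time."""
    return hmac.compare_digest(sha256_stream(stream), expected_hex.lower())


archive = io.BytesIO(b"encrypted archive bytes")
recorded = sha256_stream(archive)  # stored alongside the archive at write time
archive.seek(0)
print(verify_archive(archive, recorded))  # True
```

In practice the recorded digest would itself be signed, so that an attacker who can modify the archive cannot simply replace the checksum as well.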

## 10. Key Rotation and Recovery

### 10.1 Automated Key Rotation

**Rotation Schedule:**
```python
from datetime import timedelta

ROTATION_SCHEDULE = {
    'transport_keys': timedelta(days=365),        # Annual
    'application_keys': timedelta(days=90),       # Quarterly
    'database_keys': timedelta(days=90),          # Quarterly
    'field_encryption_keys': timedelta(days=30),  # Monthly
    'signing_keys': timedelta(days=180),          # Semi-annual
}


class KeyRotationManager:
    # get_keys_due_for_rotation(), rotate_key(), update_applications() and
    # verify_rotation() are assumed to be implemented elsewhere.

    def __init__(self, vault_client):
        self.vault = vault_client

    async def rotate_keys(self):
        """Automated key rotation process"""
        for key_type, rotation_period in ROTATION_SCHEDULE.items():
            keys = await self.get_keys_due_for_rotation(key_type, rotation_period)

            for key in keys:
                await self.rotate_key(key)
                await self.update_applications(key)
                await self.verify_rotation(key)
```
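
The due-date test that a method such as `get_keys_due_for_rotation` would need can be sketched as a pure function. The schedule is repeated so the example is self-contained; `is_rotation_due` is a hypothetical helper and the timestamps are invented for the example.

```python
from datetime import datetime, timedelta, timezone

# Same periods as the rotation schedule above.
ROTATION_SCHEDULE = {
    "transport_keys": timedelta(days=365),
    "application_keys": timedelta(days=90),
    "database_keys": timedelta(days=90),
    "field_encryption_keys": timedelta(days=30),
    "signing_keys": timedelta(days=180),
}


def is_rotation_due(key_type: str, last_rotated: datetime, now: datetime) -> bool:
    """Return True when the key has reached or passed its rotation period."""
    return now - last_rotated >= ROTATION_SCHEDULE[key_type]


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
last = datetime(2023, 12, 15, tzinfo=timezone.utc)  # 47 days earlier

print(is_rotation_due("field_encryption_keys", last, now))  # True
print(is_rotation_due("database_keys", last, now))          # False
```

Passing `now` explicitly keeps the function deterministic and easy to test.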

### 10.2 Key Recovery Procedures

**Emergency Key Recovery:**
1. **Multi-Person Authorization**: Require 3 of 5 key custodians
2. **Secure Communication**: Use encrypted channels for coordination
3. **Audit Trail**: Log all recovery activities
4. **Verification**: Verify key integrity before use
5. **Re-encryption**: Re-encrypt data with new keys if compromise suspected
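
The 3-of-5 authorization rule in step 1 amounts to a quorum check over distinct, recognized custodians. A minimal sketch follows; the custodian identifiers and the `quorum_met` helper are hypothetical.

```python
from typing import Iterable

# Hypothetical custodian identifiers; real ones would come from the key
# custodian register.
CUSTODIANS = frozenset(
    {"custodian-1", "custodian-2", "custodian-3", "custodian-4", "custodian-5"}
)
THRESHOLD = 3


def quorum_met(approvals: Iterable[str]) -> bool:
    """Return True when at least THRESHOLD distinct custodians approved."""
    # A set removes duplicate approvals; the intersection drops unknown IDs.
    valid = set(approvals) & CUSTODIANS
    return len(valid) >= THRESHOLD


print(quorum_met(["custodian-1", "custodian-3", "custodian-5"]))  # True
print(quorum_met(["custodian-1", "custodian-1", "custodian-2"]))  # False
print(quorum_met(["custodian-1", "custodian-2", "intruder"]))     # False
```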

## 11. Monitoring and Compliance

### 11.1 Encryption Monitoring

**Key Metrics:**
- Key rotation compliance rate
- Encryption coverage percentage
- Failed encryption/decryption attempts
- Key access patterns and anomalies
- Certificate expiry warnings
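
The first two metrics are simple ratios. The sketch below shows one possible definition; `percentage` is a hypothetical helper, the counts are invented, and treating an empty population as 100% is a design choice rather than something this document specifies.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage; an empty population counts as 100%."""
    if whole == 0:
        return 100.0
    return 100.0 * part / whole


# Key rotation compliance: keys rotated within their period / all keys
rotation_compliance = percentage(part=6, whole=8)

# Encryption coverage: sensitive fields encrypted / sensitive fields total
encryption_coverage = percentage(part=13, whole=13)

print(rotation_compliance)  # 75.0
print(encryption_coverage)  # 100.0
```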

**Alerting Rules:**
```yaml
groups:
  - name: encryption_alerts
    rules:
      - alert: KeyRotationOverdue
        expr: vault_key_age_days > 90
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Encryption key rotation overdue"

      - alert: EncryptionFailure
        expr: rate(encryption_errors_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High encryption failure rate detected"
```

### 11.2 Compliance Reporting

**Quarterly Encryption Report:**
- Encryption coverage by data classification
- Key rotation compliance status
- Security incidents related to encryption
- Vulnerability assessment results
- Compliance gap analysis

## 12. Incident Response

### 12.1 Key Compromise Response

**Response Procedures:**
1. **Immediate**: Revoke compromised keys
2. **Assessment**: Determine scope of compromise
3. **Containment**: Isolate affected systems
4. **Recovery**: Generate new keys and re-encrypt data
5. **Lessons Learned**: Update procedures and controls

### 12.2 Encryption Failure Response

**Failure Scenarios:**
- HSM hardware failure
- Key corruption or loss
- Encryption service outage
- Certificate expiry

**Recovery Procedures:**
- Activate backup HSM
- Restore keys from secure backup
- Implement manual encryption processes
- Emergency certificate issuance

---

**Document Classification**: CONFIDENTIAL
**Next Review Date**: 2024-07-31
**Approval**: Security Architecture Team