# Encryption Strategy

## AI Tax Agent System

**Document Version:** 1.0
**Date:** 2024-01-31
**Owner:** Security Architecture Team

## 1. Executive Summary

This document defines the encryption strategy for the AI Tax Agent System, covering data at rest, in transit, and in use. The strategy applies defense in depth through multiple encryption layers and centralized key management.

## 2. Encryption Requirements

### 2.1 Regulatory Requirements

- **GDPR Article 32**: Appropriate technical measures including encryption
- **UK Data Protection Act 2018**: Security of processing requirements
- **HMRC Security Standards**: Government security classifications
- **ISO 27001**: Information security management requirements
- **SOC 2 Type II**: Security and availability controls

### 2.2 Business Requirements

- **Client Data Protection**: Financial and personal information
- **Intellectual Property**: Proprietary algorithms and models
- **Regulatory Compliance**: Audit trail and evidence integrity
- **Business Continuity**: Key recovery and disaster recovery

## 3. Encryption Architecture

### 3.1 Encryption Layers

```mermaid
graph TB
    A[Client Browser] -->|TLS 1.3| B[Traefik Gateway]
    B -->|mTLS| C[Application Services]
    C -->|Application-Level| D[Database Layer]
    D -->|Transparent Data Encryption| E[Storage Layer]
    E -->|Volume Encryption| F[Disk Storage]

    G[Key Management] --> H[Vault HSM]
    H --> I[Encryption Keys]
    I --> C
    I --> D
    I --> E
```

### 3.2 Encryption Domains

| Domain | Technology | Key Size | Algorithm | Rotation |
|--------|------------|----------|-----------|----------|
| **Transport** | TLS 1.3 | 256-bit | AES-GCM, ChaCha20-Poly1305 | Annual |
| **Application** | AES-GCM | 256-bit | AES-256-GCM | Quarterly |
| **Database** | TDE | 256-bit | AES-256-CBC | Quarterly |
| **Storage** | LUKS/dm-crypt | 256-bit | AES-256-XTS | Annual |
| **Backup** | GPG | 4096-bit | RSA-4096 + AES-256 | Annual |

## 4. Data Classification and Encryption

### 4.1 Data Classification Matrix

| Classification | Examples | Encryption Level | Key Access |
|----------------|----------|------------------|------------|
| **PUBLIC** | Marketing materials, documentation | TLS only | Public |
| **INTERNAL** | System logs, metrics | TLS + Storage | Service accounts |
| **CONFIDENTIAL** | Client names, addresses | TLS + App + Storage | Authorized users |
| **RESTRICTED** | Financial data, UTR, NI numbers | TLS + App + Field + Storage | Need-to-know |
| **SECRET** | Encryption keys, certificates | HSM + Multiple layers | Key custodians |

### 4.2 Field-Level Encryption

**Sensitive Fields Requiring Field-Level Encryption:**

```python
ENCRYPTED_FIELDS = {
    'taxpayer_profile': ['utr', 'ni_number', 'full_name', 'address'],
    'financial_data': ['account_number', 'sort_code', 'iban', 'amount'],
    'document_content': ['ocr_text', 'extracted_fields'],
    'authentication': ['password_hash', 'api_keys', 'tokens']
}
```

**Implementation Example:**

```python
import base64


class FieldEncryption:
    def __init__(self, vault_client):
        # vault_client: wrapper around the Vault transit engine, assumed to
        # expose encrypt/decrypt with the keyword arguments used below
        self.vault = vault_client

    def encrypt_field(self, field_name: str, value: str) -> str:
        """Encrypt sensitive field using Vault transit engine"""
        key_name = f"field-{field_name}"
        response = self.vault.encrypt(
            mount_point='transit',
            name=key_name,
            # The transit engine expects base64-encoded plaintext
            plaintext=base64.b64encode(value.encode()).decode()
        )
        return response['data']['ciphertext']

    def decrypt_field(self, field_name: str, ciphertext: str) -> str:
        """Decrypt sensitive field using Vault transit engine"""
        key_name = f"field-{field_name}"
        response = self.vault.decrypt(
            mount_point='transit',
            name=key_name,
            ciphertext=ciphertext
        )
        return base64.b64decode(response['data']['plaintext']).decode()
```

## 5. Key Management Strategy

### 5.1 Key Hierarchy

```
Root Key (HSM)
├── Master Encryption Key (MEK)
│   ├── Data Encryption Keys (DEK)
│   │   ├── Database DEK
│   │   ├── Application DEK
│   │   └── Storage DEK
│   └── Key Encryption Keys (KEK)
│       ├── Field Encryption KEK
│       ├── Backup KEK
│       └── Archive KEK
└── Signing Keys
    ├── JWT Signing Key
    ├── Document Signing Key
    └── API Signing Key
```

### 5.2 HashiCorp Vault Configuration

**Vault Policies:**

```hcl
# Database encryption policy
path "transit/encrypt/database-*" {
  capabilities = ["create", "update"]
}

path "transit/decrypt/database-*" {
  capabilities = ["create", "update"]
}

# Application encryption policy
path "transit/encrypt/app-*" {
  capabilities = ["create", "update"]
}

path "transit/decrypt/app-*" {
  capabilities = ["create", "update"]
}

# Field encryption policy (restricted)
path "transit/encrypt/field-*" {
  capabilities = ["create", "update"]
  allowed_parameters = {
    "plaintext" = []
  }
  denied_parameters = {
    "batch_input" = []
  }
}
```

**Key Rotation Policy:**

```bash
# Automatic key rotation.
# These are transit key configuration parameters rather than ACL policy
# rules, so they are applied to the key's config endpoint.
vault write transit/keys/database-primary/config \
    min_decryption_version=1 \
    min_encryption_version=2 \
    deletion_allowed=false \
    auto_rotate_period=2160h  # 90 days
```

### 5.3 Hardware Security Module (HSM)

**HSM Configuration:**

- **Type**: AWS CloudHSM / Azure Dedicated HSM
- **FIPS Level**: FIPS 140-2 Level 3
- **High Availability**: Multi-AZ deployment
- **Backup**: Encrypted key backup to secure offline storage

## 6. Transport Layer Security

### 6.1 TLS Configuration

**Traefik TLS Configuration:**

```yaml
tls:
  options:
    default:
      minVersion: "VersionTLS13"
      maxVersion: "VersionTLS13"
      # Go's TLS 1.3 implementation does not allow cipher suite selection,
      # so this list documents the expected suites rather than enforcing them.
      cipherSuites:
        - "TLS_AES_256_GCM_SHA384"
        - "TLS_CHACHA20_POLY1305_SHA256"
        - "TLS_AES_128_GCM_SHA256"
      curvePreferences:
        - "X25519"
        - "secp384r1"
      sniStrict: true
  certificates:
    - certFile: /certs/wildcard.crt
      keyFile: /certs/wildcard.key
```

### 6.2 Certificate Management

**Certificate Lifecycle:**

- **Issuance**: Let's Encrypt with DNS challenge
- **Rotation**: Automated 30-day renewal
- **Monitoring**: Certificate expiry alerts
- **Backup**: Encrypted certificate backup

**Internal PKI:**

```bash
# Vault PKI setup
vault secrets enable -path=pki-root pki
vault secrets tune -max-lease-ttl=87600h pki-root
vault write pki-root/root/generate/internal \
    common_name="AI Tax Agent Root CA" \
    ttl=87600h \
    key_bits=4096

vault secrets enable -path=pki-int pki
vault secrets tune -max-lease-ttl=43800h pki-int
# This step returns a CSR. It still has to be signed by the root
# (pki-root/root/sign-intermediate) and imported (pki-int/intermediate/set-signed).
vault write pki-int/intermediate/generate/internal \
    common_name="AI Tax Agent Intermediate CA" \
    ttl=43800h \
    key_bits=4096
```

## 7. Database Encryption

### 7.1 PostgreSQL Encryption

**Transparent Data Encryption (TDE):**

```sql
-- Enable pgcrypto extension (column-level cryptographic functions)
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Create encrypted table
CREATE TABLE taxpayer_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    utr_encrypted BYTEA NOT NULL,
    ni_number_encrypted BYTEA NOT NULL,
    name_encrypted BYTEA NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Encryption functions
CREATE OR REPLACE FUNCTION encrypt_pii(data TEXT, key_id TEXT)
RETURNS BYTEA AS $$
BEGIN
    -- Use Vault transit engine for encryption.
    -- vault_encrypt is a placeholder for a user-defined function that
    -- calls Vault; pgcrypto does not provide it.
    RETURN vault_encrypt(data, key_id);
END;
$$ LANGUAGE plpgsql;
```

**Column-Level Encryption:**

```python
import uuid

from sqlalchemy import Column, LargeBinary
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class EncryptedTaxpayerProfile(Base):
    __tablename__ = 'taxpayer_profiles'

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    utr_encrypted = Column(LargeBinary, nullable=False)
    ni_number_encrypted = Column(LargeBinary, nullable=False)

    # vault_client is assumed to be a shared Vault wrapper instance
    # configured at application startup.
    @hybrid_property
    def utr(self):
        return vault_client.decrypt('field-utr', self.utr_encrypted)

    @utr.setter
    def utr(self, value):
        self.utr_encrypted = vault_client.encrypt('field-utr', value)
```

### 7.2 Neo4j Encryption

**Enterprise Edition Features:**

```cypher
// Enable encryption at rest
// (verify this procedure and setting name against the deployed Neo4j version)
CALL dbms.security.setConfigValue('dbms.security.encryption.enabled', 'true');

// Require the encrypted property to be present
CREATE CONSTRAINT encrypted_utr IF NOT EXISTS
FOR (tp:TaxpayerProfile) REQUIRE tp.utr_encrypted IS NOT NULL;

// Encryption UDF
// Note: apoc.util.md5 is a one-way hash, not encryption. Values produced
// here cannot be decrypted, so RESTRICTED fields should be encrypted in
// the application layer (section 4.2) before they are written.
CALL apoc.custom.asFunction(
    'encrypt',
    'RETURN apoc.util.md5([text, $key])',
    'STRING',
    [['text', 'STRING'], ['key', 'STRING']]
);
```

## 8. Application-Level Encryption

### 8.1 Microservice Encryption

**Service-to-Service Communication:**

```python
import httpx
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding


class SecureServiceClient:
    def __init__(self, service_url: str, private_key: rsa.RSAPrivateKey):
        self.service_url = service_url
        self.private_key = private_key

    # encrypt_payload, sign_request and decrypt_response are omitted here.
    async def make_request(self, endpoint: str, data: dict):
        # Encrypt request payload
        encrypted_data = self.encrypt_payload(data)

        # Sign request
        signature = self.sign_request(encrypted_data)

        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.service_url}/{endpoint}",
                json={"data": encrypted_data, "signature": signature},
                headers={"Content-Type": "application/json"}
            )
            # Fail on HTTP errors before attempting to decrypt
            response.raise_for_status()

        # Decrypt response
        return self.decrypt_response(response.json())
```

### 8.2 Document Encryption

**Document Storage Encryption:**

```python
from cryptography.fernet import Fernet


class DocumentEncryption:
    def __init__(self, vault_client):
        self.vault = vault_client

    def encrypt_document(self, document_content: bytes, doc_id: str) -> dict:
        """Encrypt document with unique DEK"""
        # Generate document-specific DEK
        dek = self.vault.generate_data_key('document-master-key')

        # Encrypt document with DEK.
        # Fernet requires a URL-safe base64-encoded 32-byte key and uses
        # AES-128-CBC with HMAC-SHA256, which differs from the AES-256-GCM
        # listed for the application domain in section 3.2.
        cipher = Fernet(dek['plaintext_key'])
        encrypted_content = cipher.encrypt(document_content)

        # Store encrypted DEK
        encrypted_dek = dek['ciphertext_key']

        return {
            'encrypted_content': encrypted_content,
            'encrypted_dek': encrypted_dek,
            'key_version': dek['key_version']
        }
```

## 9. Backup and Archive Encryption

### 9.1 Backup Encryption Strategy

**Multi-Layer Backup Encryption:**

```bash
#!/bin/bash
# Backup encryption script
set -euo pipefail

# Compute the date once so all artifacts share the same suffix
BACKUP_DATE=$(date +%Y%m%d)

# 1. Database dump with encryption
pg_dump tax_system | gpg --cipher-algo AES256 --compress-algo 2 \
    --symmetric --output "backup_${BACKUP_DATE}.sql.gpg"

# 2. Neo4j backup with encryption
# (confirm that the installed neo4j-admin supports --encrypt)
neo4j-admin backup --backup-dir=/backups/neo4j \
    --name="graph_${BACKUP_DATE}" --encrypt

# 3. Document backup with encryption
tar -czf - /data/documents | gpg --cipher-algo AES256 \
    --symmetric --output "documents_${BACKUP_DATE}.tar.gz.gpg"

# 4. Upload to encrypted cloud storage
aws s3 cp "backup_${BACKUP_DATE}.sql.gpg" \
    s3://tax-agent-backups/ --sse aws:kms --sse-kms-key-id alias/backup-key
```

### 9.2 Archive Encryption

**Long-Term Archive Strategy:**

- **Encryption**: AES-256 with 10-year key retention
- **Integrity**: SHA-256 checksums with digital signatures
- **Storage**: Geographically distributed encrypted storage
- **Access**: Multi-person authorization for archive access

## 10. Key Rotation and Recovery

### 10.1 Automated Key Rotation

**Rotation Schedule:**

```python
from datetime import timedelta

ROTATION_SCHEDULE = {
    'transport_keys': timedelta(days=365),        # Annual
    'application_keys': timedelta(days=90),       # Quarterly
    'database_keys': timedelta(days=90),          # Quarterly
    'field_encryption_keys': timedelta(days=30),  # Monthly
    'signing_keys': timedelta(days=180),          # Semi-annual
}


class KeyRotationManager:
    def __init__(self, vault_client):
        self.vault = vault_client

    # get_keys_due_for_rotation, rotate_key, update_applications and
    # verify_rotation are omitted here.
    async def rotate_keys(self):
        """Automated key rotation process"""
        for key_type, rotation_period in ROTATION_SCHEDULE.items():
            keys = await self.get_keys_due_for_rotation(key_type, rotation_period)

            for key in keys:
                await self.rotate_key(key)
                await self.update_applications(key)
                await self.verify_rotation(key)
```

### 10.2 Key Recovery Procedures

**Emergency Key Recovery:**

1. **Multi-Person Authorization**: Require 3 of 5 key custodians
2. **Secure Communication**: Use encrypted channels for coordination
3. **Audit Trail**: Log all recovery activities
4. **Verification**: Verify key integrity before use
5. **Re-encryption**: Re-encrypt data with new keys if compromise is suspected

## 11. Monitoring and Compliance

### 11.1 Encryption Monitoring

**Key Metrics:**

- Key rotation compliance rate
- Encryption coverage percentage
- Failed encryption/decryption attempts
- Key access patterns and anomalies
- Certificate expiry warnings

**Alerting Rules:**

```yaml
groups:
  - name: encryption_alerts
    rules:
      - alert: KeyRotationOverdue
        expr: vault_key_age_days > 90
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Encryption key rotation overdue"

      - alert: EncryptionFailure
        expr: rate(encryption_errors_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High encryption failure rate detected"
```

### 11.2 Compliance Reporting

**Quarterly Encryption Report:**

- Encryption coverage by data classification
- Key rotation compliance status
- Security incidents related to encryption
- Vulnerability assessment results
- Compliance gap analysis

## 12. Incident Response

### 12.1 Key Compromise Response

**Response Procedures:**

1. **Immediate**: Revoke compromised keys
2. **Assessment**: Determine scope of compromise
3. **Containment**: Isolate affected systems
4. **Recovery**: Generate new keys and re-encrypt data
5. **Lessons Learned**: Update procedures and controls

### 12.2 Encryption Failure Response

**Failure Scenarios:**

- HSM hardware failure
- Key corruption or loss
- Encryption service outage
- Certificate expiry

**Recovery Procedures:**

- Activate backup HSM
- Restore keys from secure backup
- Implement manual encryption processes
- Emergency certificate issuance

---

**Document Classification**: CONFIDENTIAL
**Next Review Date**: 2024-07-31
**Approval**: Security Architecture Team