Initial commit
harkon
2025-10-11 08:41:36 +01:00
commit b324ff09ef
276 changed files with 55220 additions and 0 deletions

docs/ONTOLOGY.md Normal file

@@ -0,0 +1,121 @@
# Concept Model
## Core Entities and Relationships
```mermaid
graph TB
TP[TaxpayerProfile] --> TY[TaxYear]
TY --> J[Jurisdiction]
TF[TaxForm] --> TY
TF --> S[Schedule]
S --> FB[FormBox]
D[Document] --> E[Evidence]
E --> II[IncomeItem]
E --> EI[ExpenseItem]
E --> P[Payment]
TP --> II
TP --> EI
TP --> PA[PropertyAsset]
TP --> BA[BusinessActivity]
TP --> PC[PensionContribution]
TP --> SLP[StudentLoanPlan]
Party --> II
Party --> EI
Party --> Account
II --> S
EI --> S
PA --> S
C[Calculation] --> FB
R[Rule] --> C
ER[ExchangeRate] --> II
ER --> EI
NE[NormalizationEvent] --> II
NE --> EI
ETL[ETLRun] --> D
ETL --> E
CB[Consent] --> TP
```
## Entity Descriptions
### Core Tax Entities
- **TaxpayerProfile**: Individual, partnership, or company with tax obligations
- **TaxYear**: Fiscal period (UK: 6 April to 5 April of the following year) with jurisdiction-specific rules
- **Jurisdiction**: Tax authority region (UK, with potential for other jurisdictions)
- **TaxForm**: Official HMRC Self Assessment forms (SA100, SA102, SA103, SA105, SA108, SA110)
- **Schedule**: Sections within forms (Employment, Self-Employment, Property, etc.)
- **FormBox**: Individual fields/boxes on forms with specific calculation rules
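
The UK tax-year boundary noted under **TaxYear** is easy to get wrong by one day, so a small sketch may help. The function names and the `2025-26` label format are illustrative assumptions, not part of the ontology.

```python
from datetime import date


def uk_tax_year_start(d: date) -> int:
    """Return the calendar year in which the UK tax year containing `d` starts."""
    # The UK tax year runs from 6 April to 5 April of the following year,
    # so dates before 6 April belong to the tax year that started the year before.
    return d.year if (d.month, d.day) >= (4, 6) else d.year - 1


def uk_tax_year_bounds(start_year: int) -> tuple[date, date]:
    """Return the inclusive first and last day of the tax year starting in `start_year`."""
    return date(start_year, 4, 6), date(start_year + 1, 4, 5)


def uk_tax_year_label(d: date) -> str:
    """Format the tax year containing `d` as 'YYYY-YY' (assumed label format)."""
    start = uk_tax_year_start(d)
    return f"{start}-{(start + 1) % 100:02d}"
```

Comparing `(month, day)` tuples keeps the boundary check in one place and avoids constructing a date for every year.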
### Document & Evidence
- **Document**: Source materials (bank statements, invoices, receipts, P&L, etc.)
- **Evidence**: Specific snippets from documents with provenance (page, bbox, text hash)
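
As a rough illustration of the provenance fields listed for **Evidence** (page, bbox, text hash), the sketch below builds one record. The field names, the `(x0, y0, x1, y1)` bbox layout, the whitespace normalization, and the choice of SHA-256 are all assumptions made for the example; the ontology does not specify them.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    doc_id: str                               # hypothetical identifier of the source Document
    page: int                                 # page number within the document
    bbox: tuple[float, float, float, float]   # assumed layout: (x0, y0, x1, y1)
    text_hash: str                            # hex digest of the normalized snippet text


def text_hash(text: str) -> str:
    """Hash a snippet after collapsing whitespace (normalization is an assumption)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_evidence(doc_id: str, page: int,
                  bbox: tuple[float, float, float, float], text: str) -> Evidence:
    """Build an Evidence record that stores the hash, not the text itself."""
    return Evidence(doc_id=doc_id, page=page, bbox=bbox, text_hash=text_hash(text))
```

Hashing normalized text means the same snippet extracted twice with different line breaks yields the same `text_hash`, which makes duplicate evidence easier to detect.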
### Financial Entities
- **IncomeItem**: Employment, self-employment, property, dividend, interest income
- **ExpenseItem**: Business expenses, property costs, allowable deductions
- **Payment**: Transactions to/from HMRC, employers, clients
- **PropertyAsset**: Real estate holdings with usage classification
- **BusinessActivity**: Trading activities with SIC codes and basis periods
### Parties & Accounts
- **Party**: Employers, payers, banks, landlords, tenants with identification numbers
- **Account**: Bank accounts with IBAN, sort codes, account numbers
### Calculation & Rules
- **Calculation**: Formula applications with versioned inputs/outputs
- **Rule**: Tax regulations with effective periods and references
- **Allowance/Relief**: Tax allowances with caps, rates, eligibility
- **ExchangeRate**: Currency conversions with date and source
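
To make the **ExchangeRate** entry concrete, here is a minimal sketch of applying a dated, sourced rate to an amount. The field names, the rounding mode, and the rate in the usage note are illustrative assumptions rather than the project's actual schema or data.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class ExchangeRate:
    base: str        # currency being converted from, e.g. "USD"
    quote: str       # currency being converted to, e.g. "GBP"
    rate: Decimal    # units of `quote` per one unit of `base`
    on: date         # date the rate applies to
    source: str      # where the rate came from (hypothetical label)


def convert(amount: Decimal, rate: ExchangeRate) -> Decimal:
    """Convert `amount` in the base currency to the quote currency, to 2 decimal places."""
    # Decimal avoids binary floating-point error; ROUND_HALF_UP is an assumed policy.
    return (amount * rate.rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, with a made-up rate of `Decimal("0.7512")` USD to GBP, `convert(Decimal("100.00"), rate)` gives `Decimal("75.12")`. Keeping `on` and `source` on the rate lets a converted amount be traced back to the exact rate used.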
### Compliance & Operations
- **Consent/LegalBasis**: GDPR compliance with purpose and scope
- **ETLRun**: Data processing jobs with success/error tracking
- **NormalizationEvent**: Data cleaning and standardization records
## Cardinalities
| Relationship | From | To | Cardinality |
| --------------- | ---------------------- | ---------------------- | ----------: |
| BELONGS_TO | Schedule | TaxForm | N:1 |
| OF_TAX_YEAR | TaxForm | TaxYear | N:1 |
| IN_JURISDICTION | TaxYear | Jurisdiction | N:1 |
| HAS_BOX | Schedule | FormBox | 1:N |
| DERIVED_FROM | IncomeItem/ExpenseItem | Evidence | N:N |
| SUPPORTED_BY | Evidence | Document | N:1 |
| PAID_BY | Payment | Party | N:1 |
| OWNS | TaxpayerProfile | PropertyAsset | N:N |
| EMPLOYED_BY | TaxpayerProfile | Party | N:N |
| APPLIES_TO | ExchangeRate | IncomeItem/ExpenseItem | 1:N |
| COMPUTES | Calculation | FormBox | N:1 |
| HAS_VALID_BASIS | TaxpayerProfile | Consent | 1:N |
| CITES | Calculation/Rule | RAGChunk | N:N |
| DESCRIBES | RAGChunk | IncomeItem/ExpenseItem | N:N |
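
The cardinalities above can be checked mechanically against a set of edges. The sketch below is one possible reading, assuming `N:1` means each From node links to at most one To node and `1:N` means each To node is linked from at most one From node; the `violations` helper and the node identifiers in the usage note are hypothetical.

```python
from collections import defaultdict

# A few rows copied from the cardinality table; extend as needed.
CARDINALITY = {
    "BELONGS_TO": "N:1",    # Schedule -> TaxForm
    "HAS_BOX": "1:N",       # Schedule -> FormBox
    "DERIVED_FROM": "N:N",  # IncomeItem/ExpenseItem -> Evidence
}


def violations(rel: str, edges: list[tuple[str, str]]) -> list[str]:
    """Return the sorted node ids that break the declared cardinality of `rel`."""
    card = CARDINALITY[rel]
    targets = defaultdict(set)  # from-node -> set of to-nodes
    sources = defaultdict(set)  # to-node -> set of from-nodes
    for src, dst in edges:
        targets[src].add(dst)
        sources[dst].add(src)
    if card == "N:1":
        # Each From node may point at only one To node.
        return sorted(s for s, t in targets.items() if len(t) > 1)
    if card == "1:N":
        # Each To node may be pointed at by only one From node.
        return sorted(d for d, s in sources.items() if len(s) > 1)
    return []  # N:N places no restriction on either side
```

For instance, `violations("BELONGS_TO", [("sched-a", "SA100"), ("sched-a", "SA103")])` reports `["sched-a"]`, because a schedule may belong to only one form.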
## Temporal Model
All financial facts implement **bitemporal** modeling:
- **valid_time**: When the fact was true in reality (valid_from, valid_to)
- **system_time**: When the fact was recorded in the system (asserted_at, retracted_at)
This enables:
- Time-travel queries: what was true on a given date, as the system knew it at a given moment
- Audit trails of all changes
- Correction of historical data without losing provenance
- Multi-year tax calculations with proper period alignment
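
A bitemporal "as of" lookup can be sketched as a filter over both time axes. This is a minimal in-memory illustration, not the system's storage layer: the `Fact` class, the `key`/`value` fields, and the half-open interval convention (`valid_to` and `retracted_at` are exclusive, `None` means open-ended) are assumptions.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    key: str                           # hypothetical identifier of the thing described
    value: float
    valid_from: date                   # valid_time start (inclusive)
    valid_to: Optional[date]           # valid_time end (exclusive); None = open-ended
    asserted_at: datetime              # system_time start (inclusive)
    retracted_at: Optional[datetime]   # system_time end (exclusive); None = still current


def as_of(facts: list[Fact], key: str, valid_on: date, known_at: datetime) -> list[Fact]:
    """Return facts for `key` that were true on `valid_on` as recorded at `known_at`."""
    return [
        f for f in facts
        if f.key == key
        and f.valid_from <= valid_on
        and (f.valid_to is None or valid_on < f.valid_to)
        and f.asserted_at <= known_at
        and (f.retracted_at is None or known_at < f.retracted_at)
    ]
```

A correction is recorded by setting `retracted_at` on the old row and asserting a new one, so a query with an earlier `known_at` still returns the original value while a later one returns the corrected value. Nothing is overwritten, which is what preserves the audit trail.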