Implement partial failure handling in orchestrator
epic-bufdir-reporting-export-core-logic-task-009 — Add structured error propagation to BufdirExportOrchestratorService: catch sub-service exceptions individually, classify them as fatal (abort export, mark audit as failed) vs. non-fatal (log to PartialFailureReport, continue), and surface a human-readable failure summary in ExportFailure. Ensure audit record always transitions to a terminal status even when exceptions escape.
Acceptance Criteria
Technical Requirements
Execution Context
Tier 4 - 323 tasks
Can start after Tier 3 completes
Implementation Notes
Introduce a BufdirExportErrorCode enum with values: network_timeout, storage_write_failed, file_generation_failed, activity_query_failed, row_mapping_failed, attachment_missing, unknown. The BufdirExportErrorClassifier.classify(Object exception) method returns a record ({bool isFatal, BufdirExportErrorCode code, String message}). Implement the top-level finally block inside runExport as: `try { ... } catch (e, st) { await _markFailed(auditId, classifier.classify(e)); rethrow; } finally { await _ensureTerminalStatus(auditId); }` where _ensureTerminalStatus is a no-op if already terminal.
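The classifier and the try/catch/finally shape described above could look roughly like the sketch below. It is illustrative only: the camelCase enum identifiers with a snake_case wireName, the choice of which exception types count as fatal, the InMemoryAuditRecord stand-in, and the 'pending' / 'completed' status strings are all assumptions, not part of the task definition.

```dart
import 'dart:async';

/// Error codes from the implementation notes. The snake_case names are kept
/// as `wireName` (assumption: that is the serialised form).
enum BufdirExportErrorCode {
  networkTimeout('network_timeout'),
  storageWriteFailed('storage_write_failed'),
  fileGenerationFailed('file_generation_failed'),
  activityQueryFailed('activity_query_failed'),
  rowMappingFailed('row_mapping_failed'),
  attachmentMissing('attachment_missing'),
  unknown('unknown');

  const BufdirExportErrorCode(this.wireName);
  final String wireName;
}

typedef ErrorClassification = ({
  bool isFatal,
  BufdirExportErrorCode code,
  String message,
});

class BufdirExportErrorClassifier {
  const BufdirExportErrorClassifier();

  /// Assumed mapping: timeouts are fatal, format errors are row-level and
  /// non-fatal, anything unrecognised is fatal with code `unknown`.
  ErrorClassification classify(Object exception) => switch (exception) {
        TimeoutException() => (
            isFatal: true,
            code: BufdirExportErrorCode.networkTimeout,
            message: 'A network call timed out.',
          ),
        FormatException(:final message) => (
            isFatal: false,
            code: BufdirExportErrorCode.rowMappingFailed,
            message: message,
          ),
        _ => (
            isFatal: true,
            code: BufdirExportErrorCode.unknown,
            message: exception.toString(),
          ),
      };
}

/// Hypothetical in-memory stand-in for the audit record.
class InMemoryAuditRecord {
  String status = 'pending';
  String? failureSummary;
  bool get isTerminal => status != 'pending';
}

Future<void> runExport(
  InMemoryAuditRecord audit,
  Future<void> Function() pipeline, {
  BufdirExportErrorClassifier classifier = const BufdirExportErrorClassifier(),
}) async {
  try {
    await pipeline();
    audit.status = 'completed';
  } catch (e) {
    // Plays the role of _markFailed: record a readable summary, then rethrow.
    final c = classifier.classify(e);
    audit.failureSummary = '${c.code.wireName}: ${c.message}';
    audit.status = 'failed';
    rethrow;
  } finally {
    // Plays the role of _ensureTerminalStatus: a no-op if already terminal.
    if (!audit.isTerminal) audit.status = 'failed';
  }
}
```

Because the finally block only acts when the record is still non-terminal, it cannot overwrite a status that the try or catch branch has already set.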
Use a try/catch per row inside the mapping loop (not around the whole mapper call) so that row-level failures can be captured as non-fatal. Store PartialFailureReport as a JSONB column, partial_failures, on the audit record rather than in a separate table; this keeps queries simple.
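A minimal sketch of the per-row capture, assuming a hypothetical RowFailure shape and a generic mapRows helper (neither is defined by the task). The toJson output is the kind of value that could be written to the partial_failures column.

```dart
/// Hypothetical shape of one captured row failure.
class RowFailure {
  const RowFailure(this.rowIndex, this.code, this.message);
  final int rowIndex;
  final String code;
  final String message;

  Map<String, Object?> toJson() =>
      {'row_index': rowIndex, 'code': code, 'message': message};
}

class PartialFailureReport {
  final List<RowFailure> failures = [];

  /// JSON-compatible map, suitable for a JSONB column.
  Map<String, Object?> toJson() =>
      {'failures': [for (final f in failures) f.toJson()]};
}

/// Maps rows one by one. A failing row is recorded and skipped, so a single
/// bad row does not abort the whole mapping phase.
({List<T> rows, PartialFailureReport report}) mapRows<S, T>(
  List<S> source,
  T Function(S row) mapRow,
) {
  final rows = <T>[];
  final report = PartialFailureReport();
  for (var i = 0; i < source.length; i++) {
    try {
      rows.add(mapRow(source[i]));
    } on Exception catch (e) {
      // The try/catch wraps one row only, not the whole loop.
      report.failures.add(RowFailure(i, 'row_mapping_failed', e.toString()));
    }
  }
  return (rows: rows, report: report);
}
```

For example, mapRows(['1', 'x', '3'], (s) => int.parse(s)) keeps the first and third rows and records one failure at index 1.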
Testing Requirements
Unit tests (flutter_test): test BufdirExportErrorClassifier with every known exception type and verify the correct fatal/non-fatal classification. Test that crossing the 20% row-failure threshold triggers reclassification to fatal. Test that the finally block transitions the audit to 'failed' even when an unexpected error (such as a StateError) escapes all catch blocks; simulate this by throwing from the mock audit service itself. Test that ExportSuccess with partial failures returns the 'completed_with_warnings' status.
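The threshold rule these tests exercise might be expressed as a small pure function, which keeps it easy to unit test. This is a sketch under assumptions: the function name, the 'completed' status, and treating the threshold as "strictly more than 20% of rows failed" are not specified by the task.

```dart
/// Resolves the final export status from row counts.
/// Assumption: the export becomes fatal only when MORE than
/// `fatalThresholdPercent` of rows failed; exactly 20% still passes.
String resolveExportStatus({
  required int totalRows,
  required int failedRows,
  int fatalThresholdPercent = 20,
}) {
  if (failedRows == 0) return 'completed';
  // Integer arithmetic avoids floating-point edge cases at the boundary.
  final overThreshold = failedRows * 100 > totalRows * fatalThresholdPercent;
  return overThreshold ? 'failed' : 'completed_with_warnings';
}
```

Under this reading, 20 failed rows out of 100 give 'completed_with_warnings' and 21 out of 100 give 'failed'. If the acceptance criteria define the boundary as inclusive, the comparison changes to >=.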
Test PartialFailureReport serialises and deserialises without data loss. Use a recorded fake for BufdirExportAuditService to assert exactly one terminal status transition occurs regardless of failure mode. Target ≥90% branch coverage on error handling paths.
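One way to build the recorded fake is sketched below. The class name, the transition method, and the set of terminal status strings are assumptions; the real BufdirExportAuditService interface is not shown in this task.

```dart
/// Hypothetical recording fake: stores every status transition so a test
/// can count how many of them were terminal.
class RecordingAuditService {
  static const terminalStatuses = {
    'completed',
    'completed_with_warnings',
    'failed',
  };

  final List<String> transitions = [];

  bool get isTerminal =>
      transitions.isNotEmpty && terminalStatuses.contains(transitions.last);

  Future<void> transition(String status) async {
    transitions.add(status);
  }

  int get terminalTransitionCount =>
      transitions.where(terminalStatuses.contains).length;
}

/// Mirrors the described _ensureTerminalStatus: a no-op if already terminal.
Future<void> ensureTerminalStatus(RecordingAuditService audit) async {
  if (!audit.isTerminal) {
    await audit.transition('failed');
  }
}
```

A test can then drive each failure mode through the orchestrator and assert that terminalTransitionCount equals 1 afterwards.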
Bufdir's column schema may have per-field business rules (conditional required fields, cross-field validation, organisation-specific category taxonomies) that cannot be expressed in a simple key-value mapping configuration. If the configuration model is too simple, supporting NHF's specific requirements will require hardcoded organisation logic, undermining the configuration-driven design.
Mitigation & Contingency
Mitigation: Design the column configuration schema as a full JSON document supporting field-level transformation rules, conditional expressions, and org-specific value enumerations. Validate the design against a real NHF Bufdir Excel template before implementation begins.
Contingency: If the configuration model cannot express all required rules, implement a thin transformation plugin interface where org-specific logic can be added as a named Dart class registered against the organisation ID, with the JSON config covering only the common cases.
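The plugin interface in the contingency could be as small as the sketch below. All names here (BufdirRowTransformer, TransformerRegistry, the example rule) are hypothetical, and the upper-casing rule is an invented placeholder, not a real NHF requirement.

```dart
/// Hypothetical plugin interface for organisation-specific row logic.
abstract interface class BufdirRowTransformer {
  Map<String, Object?> transform(Map<String, Object?> row);
}

/// Registry keyed by organisation ID. Organisations without a registered
/// transformer fall through to the JSON-config-only path (row unchanged).
class TransformerRegistry {
  final Map<String, BufdirRowTransformer> _byOrganisationId = {};

  void register(String organisationId, BufdirRowTransformer transformer) {
    _byOrganisationId[organisationId] = transformer;
  }

  Map<String, Object?> apply(String organisationId, Map<String, Object?> row) {
    final transformer = _byOrganisationId[organisationId];
    return transformer == null ? row : transformer.transform(row);
  }
}

/// Placeholder rule for illustration only: upper-cases the category value.
class UpperCaseCategoryTransformer implements BufdirRowTransformer {
  @override
  Map<String, Object?> transform(Map<String, Object?> row) {
    final category = row['category'];
    return {
      ...row,
      if (category is String) 'category': category.toUpperCase(),
    };
  }
}
```

Keeping the registry lookup outside the mapper means the common path stays configuration-driven, and any hardcoded logic is confined to named classes that can be audited per organisation.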
For large organisations like NHF with potentially tens of thousands of activity records, the full export pipeline (query + map + generate + bundle + upload) may exceed Supabase Edge Function execution time limits (typically 150s), causing silent timeouts that leave audit records in a pending state indefinitely.
Mitigation & Contingency
Mitigation: Implement the orchestrator as a background Dart isolate with progress streaming rather than a synchronous Edge Function call. Use chunked processing for the query and mapping phases to reduce peak memory usage. Profile against realistic NHF data volumes in a staging environment.
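The chunking and progress-streaming part of this mitigation can be sketched as follows. The helper names and the progress representation (a fraction between 0 and 1) are assumptions, and the isolate wiring itself is left out for brevity.

```dart
import 'dart:async';

/// Splits a list into consecutive chunks of at most `size` items.
Iterable<List<T>> chunked<T>(List<T> items, int size) sync* {
  if (size <= 0) throw ArgumentError.value(size, 'size', 'must be positive');
  for (var start = 0; start < items.length; start += size) {
    final end = start + size < items.length ? start + size : items.length;
    yield items.sublist(start, end);
  }
}

/// Processes items chunk by chunk and emits the completed fraction after
/// each chunk, so a caller can surface progress while the export runs.
Stream<double> processInChunks<T>(
  List<T> items,
  int chunkSize,
  Future<void> Function(List<T> chunk) handleChunk,
) async* {
  var done = 0;
  for (final chunk in chunked(items, chunkSize)) {
    await handleChunk(chunk);
    done += chunk.length;
    yield done / items.length;
  }
}
```

With five items and a chunk size of two, the handler sees chunks of 2, 2 and 1 items and the stream emits three progress values, the last being 1.0. In a real pipeline the query phase would fetch each chunk lazily instead of holding the full list in memory.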
Contingency: If processing time cannot be reduced below the timeout threshold, implement an asynchronous job model where the export is queued, processed in the background, and the user is notified via push notification when the download is ready — treating it as an eventual rather than synchronous operation.