High priority | Medium complexity | Backend | Pending | Backend specialist | Tier 4

Acceptance Criteria

A `certification_expiry_audit_log` table exists in Supabase with columns: id (UUID), run_id (UUID), mentor_id (UUID), certification_id (UUID), event_type (enum: status_transitioned | reminder_dispatched | transition_failed | reminder_failed), old_status (nullable text), new_status (nullable text), threshold_days (nullable int), timestamp (timestamptz), error_message (nullable text)
CertificationExpiryAuditLogger.logTransition() inserts a row for every status transition attempted (success or failure)
CertificationExpiryAuditLogger.logReminderDispatch() inserts a row for every notification dispatch attempted (success or failure)
All log writes are batched: audit rows are collected in memory during the run and inserted via a single bulk INSERT at the end, not one INSERT per event
A method AuditLogRepository.getRunSummary(runId) returns all entries for a given run ID, ordered by timestamp ascending
A method AuditLogRepository.getMentorAuditHistory(mentorId, {DateTime? since}) returns all audit entries for a specific mentor for coordinator review
Both query methods are paginated (limit/offset) and documented per task-010 requirements
Audit log rows are never updated or deleted by application code. The append-only contract is enforced via RLS (no UPDATE or DELETE policy for any application role)
If the bulk INSERT of audit rows fails, the failure is logged to stderr/console and does NOT cause the scheduler run to be marked FAILED — audit write failure is non-fatal
Run summary includes aggregate counts: total_transitioned, total_reminders_sent, total_failures
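
The query and summary criteria above can be illustrated with a small in-memory sketch. It is not the Supabase-backed repository: `countEvents`, `mentorHistoryPage`, and `RunCounts` are hypothetical names, and two behaviours are assumptions the criteria do not state, namely that `since` is inclusive and that mentor history is returned oldest first.

```dart
// Hypothetical in-memory stand-in; the real repository would query Supabase.
enum AuditEventType {
  statusTransitioned,
  reminderDispatched,
  transitionFailed,
  reminderFailed,
}

class AuditLogEntry {
  const AuditLogEntry({
    required this.runId,
    required this.mentorId,
    required this.eventType,
    required this.timestamp,
  });

  final String runId;
  final String mentorId;
  final AuditEventType eventType;
  final DateTime timestamp;
}

class RunCounts {
  const RunCounts(
    this.totalTransitioned,
    this.totalRemindersSent,
    this.totalFailures,
  );

  final int totalTransitioned;
  final int totalRemindersSent;
  final int totalFailures;
}

/// Counts successful transitions, successful reminders, and failures of
/// either kind.
RunCounts countEvents(Iterable<AuditLogEntry> entries) {
  var transitioned = 0, reminders = 0, failures = 0;
  for (final e in entries) {
    switch (e.eventType) {
      case AuditEventType.statusTransitioned:
        transitioned++;
        break;
      case AuditEventType.reminderDispatched:
        reminders++;
        break;
      case AuditEventType.transitionFailed:
      case AuditEventType.reminderFailed:
        failures++;
        break;
    }
  }
  return RunCounts(transitioned, reminders, failures);
}

/// One page of a mentor's history: optional inclusive `since` filter
/// (assumption), ascending by timestamp (assumption), then limit/offset.
List<AuditLogEntry> mentorHistoryPage(
  List<AuditLogEntry> all,
  String mentorId, {
  DateTime? since,
  int limit = 50,
  int offset = 0,
}) {
  final matches = all
      .where((e) => e.mentorId == mentorId)
      .where((e) => since == null || !e.timestamp.isBefore(since))
      .toList()
    ..sort((a, b) => a.timestamp.compareTo(b.timestamp));
  return matches.skip(offset).take(limit).toList();
}
```

A pure function like `countEvents` can be unit-tested without a mocked client, which keeps the aggregate logic separate from the query code.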

Technical Requirements

frameworks
Dart
Riverpod (AuditLogRepository as a provider)
Supabase Dart client
apis
Supabase PostgREST bulk INSERT (upsert with ignore-on-conflict for idempotency; this only deduplicates if rows carry a client-generated id or another unique key to conflict on)
certification_expiry_audit_log Supabase table
data models
certification_expiry_audit_log (id, run_id, mentor_id, certification_id, event_type, old_status, new_status, threshold_days, timestamp, error_message)
AuditLogEntry (Dart model)
RunAuditSummary (run_id, started_at, finished_at, total_transitioned, total_reminders_sent, total_failures, entries: List<AuditLogEntry>)
performance requirements
Audit log rows must be bulk-inserted in a single Supabase call at end of run — not streamed per event
getMentorAuditHistory must use limit/offset pagination with a default page size of 50
Index on (mentor_id, timestamp) and (run_id) for efficient coordinator queries
security requirements
Audit log table must have RLS policy: service-role can INSERT and SELECT; coordinator role can SELECT only rows where mentor is in their organisation; no DELETE or UPDATE for any application role
error_message column must not contain stack traces with internal implementation details in production — truncate to 500 chars and strip file paths
Mentor IDs in audit log are UUIDs only — no names or PII beyond what is already stored in the referenced rows
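
The `error_message` requirement above (cap at 500 characters, strip file paths) could be met with a helper along these lines. This is a sketch: `sanitizeErrorMessage` is a hypothetical name, and the path patterns are assumed heuristics that deliberately over-match rather than an exhaustive list.

```dart
/// Hypothetical helper: keeps the first line of an error, replaces anything
/// that looks like a file location, and caps the result at [maxLength].
String sanitizeErrorMessage(Object error, {int maxLength = 500}) {
  // Keep only the first line so multi-line stack traces are dropped.
  final firstLine = error.toString().split('\n').first.trim();

  // Assumed patterns: package:/file: URIs, Windows drive paths, and
  // slash-separated paths with at least two segments plus optional :line:col.
  final pathPattern = RegExp(
    r'(?:package:|file://)\S+|[A-Za-z]:\\\S+|/(?:[\w.\-]+/)+[\w.\-]+(?::\d+)*',
  );
  final stripped = firstLine.replaceAll(pathPattern, '<path>');

  return stripped.length <= maxLength
      ? stripped
      : stripped.substring(0, maxLength);
}
```

Applying the helper at the point where an entry is created, rather than at insert time, means the in-memory buffer never holds unsanitised text.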

Execution Context

Execution Tier
Tier 4

Tier 4 - 323 tasks

Can start after Tier 3 completes

Implementation Notes

Use an in-memory list (`List<AuditLogEntry> _pendingEntries = []`) as a buffer within the CertificationExpiryChecker run. After all transitions and dispatch attempts complete, pass the buffer to AuditLogRepository.bulkInsert(). This pattern decouples the hot loop from DB I/O. The AuditLogEntry Dart model should use `fromTransitionResult` and `fromReminderResult` factory constructors that accept the result objects from tasks 003 and 004, making the mapping explicit and testable.
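
A minimal sketch of the buffer-then-flush pattern, including the non-fatal failure handling from the acceptance criteria. `BufferedAuditLogger` and the `BulkInsert` callback are hypothetical; in the real service the callback would presumably be AuditLogRepository.bulkInsert(), and rows are plain maps here only to keep the example self-contained.

```dart
import 'dart:io';

/// Hypothetical signature for the single bulk write at the end of a run.
typedef BulkInsert = Future<void> Function(List<Map<String, Object?>> rows);

class BufferedAuditLogger {
  BufferedAuditLogger(this._bulkInsert);

  final BulkInsert _bulkInsert;
  final List<Map<String, Object?>> _pendingEntries = [];

  int get pendingCount => _pendingEntries.length;

  /// Buffers a row in memory; no I/O happens here.
  void add(Map<String, Object?> row) => _pendingEntries.add(row);

  /// Writes all buffered rows in one call. Returns false on failure instead
  /// of throwing, so the caller does not mark the run as FAILED.
  Future<bool> flush() async {
    if (_pendingEntries.isEmpty) return true;
    final batch = List<Map<String, Object?>>.of(_pendingEntries);
    _pendingEntries.clear();
    try {
      await _bulkInsert(batch);
      return true;
    } catch (error) {
      // Deliberately broad catch: audit write failure is non-fatal.
      stderr.writeln(
        'Audit bulk insert failed for ${batch.length} rows: $error',
      );
      return false;
    }
  }
}
```

Returning a bool rather than rethrowing lets the checker record the audit failure in its own run result if it wants to, without changing the run status.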

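For the Supabase bulk insert, use `.from('certification_expiry_audit_log').insert(entries.map((e) => e.toJson()).toList())`. Supabase accepts a list of maps and inserts them in a single HTTP call. Define `event_type` as a Dart enum whose serialised values match the Postgres CHECK constraint exactly. Note that `.name` returns the Dart identifier verbatim, so lowerCamelCase values such as `statusTransitioned` would not match `status_transitioned`; either name the enum values in snake_case or map each value to its database string explicitly. For the append-only contract, enable RLS on the table and define no permissive DELETE or UPDATE policy, since Postgres denies any command that has no applicable policy. A policy such as `CREATE POLICY no_delete ON certification_expiry_audit_log FOR DELETE USING (false);` is permissive by default, so it does not block deletes if another DELETE policy is added later; declare it `AS RESTRICTIVE` if an explicit guard is wanted.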
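
Because Dart's `.name` returns the identifier as written, one option for matching the snake_case `event_type` values is an enhanced enum (Dart 2.17+) that carries its database string. The sketch below pairs it with a `toJson()` for the table's columns. `wireValue` and `fromWire` are hypothetical names, and omitting `id` assumes the column has a database-side default.

```dart
/// Event types with an explicit database string, so Dart naming stays
/// idiomatic while the serialised value matches the table.
enum AuditEventType {
  statusTransitioned('status_transitioned'),
  reminderDispatched('reminder_dispatched'),
  transitionFailed('transition_failed'),
  reminderFailed('reminder_failed');

  const AuditEventType(this.wireValue);

  /// Value stored in the `event_type` column.
  final String wireValue;

  static AuditEventType fromWire(String value) => values.firstWhere(
        (e) => e.wireValue == value,
        orElse: () =>
            throw ArgumentError.value(value, 'value', 'Unknown event_type'),
      );
}

class AuditLogEntry {
  const AuditLogEntry({
    required this.runId,
    required this.mentorId,
    required this.certificationId,
    required this.eventType,
    required this.timestamp,
    this.oldStatus,
    this.newStatus,
    this.thresholdDays,
    this.errorMessage,
  });

  final String runId;
  final String mentorId;
  final String certificationId;
  final AuditEventType eventType;
  final DateTime timestamp;
  final String? oldStatus;
  final String? newStatus;
  final int? thresholdDays;
  final String? errorMessage;

  /// Assumes `id` is generated by a column default, so it is omitted here.
  Map<String, Object?> toJson() => {
        'run_id': runId,
        'mentor_id': mentorId,
        'certification_id': certificationId,
        'event_type': eventType.wireValue,
        'old_status': oldStatus,
        'new_status': newStatus,
        'threshold_days': thresholdDays,
        'timestamp': timestamp.toUtc().toIso8601String(),
        'error_message': errorMessage,
      };
}
```

Every row emits the same set of keys, including nulls, which keeps the list of maps uniform for a bulk insert.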

Testing Requirements

Unit tests (flutter_test) with mocked Supabase client:
(1) After a run with 5 transitions and 3 reminders, verify that 8 AuditLogEntry objects are passed to the bulk insert method.
(2) Bulk insert failure is caught and does not propagate as an exception from the checker.
(3) getRunSummary(runId) returns entries sorted by timestamp.
(4) getMentorAuditHistory with the since parameter filters by timestamp correctly.

Integration test against local Supabase: run a full expiry scan with known fixture data, query the audit log table, and assert correct event_type values and row count.

RLS test: create a coordinator-role JWT for org A and assert that it cannot SELECT audit rows for mentors in org B.

Component
Certification Expiry Checker Service
service high
Epic Risks (4)
high impact medium prob technical

The nightly expiry checker may run multiple times due to scheduler retries or infrastructure issues, causing duplicate auto-transitions and duplicate coordinator notifications that erode trust in the notification system.

Mitigation & Contingency

Mitigation: Implement idempotency via a unique constraint on (mentor_id, threshold_day, certification_expiry_date) in the cert_expiry_reminders table. Auto-transitions should be wrapped in a Postgres RPC that checks current status before applying, making repeated invocations safe.

Contingency: Add a compensation query in the reconciliation log that detects duplicate log entries for the same certification period and alerts the operations team for manual review within 24 hours.

high impact medium prob integration

The HLF Dynamics portal API may have eventual-consistency behaviour or rate limits that cause website listing updates to lag behind status changes, leaving expired mentors visible on the public website for an unacceptable window.

Mitigation & Contingency

Mitigation: Design the sync service to be triggered immediately on status transitions (event-driven via database webhook) in addition to the nightly batch run. Implement a reconciliation job that verifies sync state against app state and re-triggers any divergent records.

Contingency: If real-time sync cannot be guaranteed, implement a manual 'force sync' action in the coordinator dashboard so coordinators can trigger an immediate re-sync for urgent cases. Document the expected sync lag in coordinator onboarding materials.

medium impact medium prob scope

Stakeholder requests to extend the expiry checker to handle additional certification types, grace periods, or organisation-specific threshold configurations may significantly increase scope beyond what is designed here, delaying delivery.

Mitigation & Contingency

Mitigation: Parameterise threshold day values (30, 14, 7) via configuration repository rather than hard-coding them, enabling per-organisation customisation without code changes. Document that grace period logic and additional cert types are out of scope for this epic and require a dedicated follow-up.
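
As a rough illustration of parameterised thresholds, the sketch below keeps the HLF defaults (30, 14, 7) in one place and allows a per-organisation override. `ThresholdConfig` and its members are hypothetical, the configuration is held in memory rather than loaded from a repository, and firing only when the days remaining exactly equal a threshold is an assumption about the checker's behaviour.

```dart
/// Hypothetical threshold configuration with optional per-organisation
/// overrides; a real implementation would load this from a repository.
class ThresholdConfig {
  const ThresholdConfig({
    this.defaults = const [30, 14, 7],
    this.perOrganisation = const {},
  });

  final List<int> defaults;
  final Map<String, List<int>> perOrganisation;

  List<int> thresholdsFor(String organisationId) =>
      perOrganisation[organisationId] ?? defaults;

  /// Returns the threshold that is due today, or null if none matches.
  /// Dates are compared as whole UTC calendar days.
  int? dueThreshold(String organisationId, DateTime expiry, DateTime today) {
    final daysLeft = DateTime.utc(expiry.year, expiry.month, expiry.day)
        .difference(DateTime.utc(today.year, today.month, today.day))
        .inDays;
    return thresholdsFor(organisationId).contains(daysLeft) ? daysLeft : null;
  }
}
```

With this shape, the hard-coded-first contingency amounts to constructing `const ThresholdConfig()` and leaving `perOrganisation` empty until the override feature is enabled.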

Contingency: Deliver the feature with hard-coded HLF-standard thresholds first and introduce the configuration repository as a follow-up task in the next sprint, using a feature flag to enable per-org threshold overrides.

high impact low prob security

Dynamics portal API credentials stored as environment secrets in Supabase Edge Function configuration may be rotated or invalidated by HLF IT without notice, causing silent sync failures that go undetected for multiple days.

Mitigation & Contingency

Mitigation: Implement credential health-check calls on each scheduler run and emit an immediate alert on auth failure rather than only alerting after N consecutive failures. Document the credential rotation procedure with HLF IT and establish a rotation notification protocol.

Contingency: Maintain a break-glass manual sync script accessible to HLF administrators that can re-execute the Dynamics sync with newly provided credentials while the automated system is restored.