Add audit logging for all automated transitions

epic-peer-mentor-pause-management-automated-expiry-task-005 — Instrument the CertificationExpiryChecker to write a structured audit log entry for every automated status transition and every reminder notification dispatched. Log entries must include mentor ID, certification ID, old status, new status, threshold triggered, timestamp, and run ID. Persist logs to the designated audit table and expose a query method for coordinator review.

High priority, medium complexity, backend, pending, backend specialist, Tier 4

Acceptance Criteria

A `certification_expiry_audit_log` table exists in Supabase with columns: id (UUID), run_id (UUID), mentor_id (UUID), certification_id (UUID), event_type (enum: status_transitioned | reminder_dispatched | transition_failed | reminder_failed), old_status (nullable text), new_status (nullable text), threshold_days (nullable int), timestamp (timestamptz), error_message (nullable text)

CertificationExpiryAuditLogger.logTransition() inserts a row for every status transition attempted (success or failure)

CertificationExpiryAuditLogger.logReminderDispatch() inserts a row for every notification dispatch attempted (success or failure)

All log writes are batched: audit rows are collected in memory during the run and inserted via a single bulk INSERT at the end, not one INSERT per event

A method AuditLogRepository.getRunSummary(runId) returns all entries for a given run ID, ordered by timestamp ascending

A method AuditLogRepository.getMentorAuditHistory(mentorId, {DateTime? since}) returns all audit entries for a specific mentor for coordinator review

Both query methods are paginated (limit/offset) and documented per task-010 requirements

Audit log rows are never deleted by application code — append-only contract is enforced via RLS (no DELETE policy for application role)

If the bulk INSERT of audit rows fails, the failure is logged to stderr/console and does NOT cause the scheduler run to be marked FAILED — audit write failure is non-fatal

Run summary includes aggregate counts: total_transitioned, total_reminders_sent, total_failures

Technical Requirements

frameworks

Dart

Riverpod (AuditLogRepository as a provider)

Supabase Dart client

apis

Supabase PostgREST bulk INSERT (upsert with ignore on conflict for idempotency)

certification_expiry_audit_log Supabase table

data models

certification_expiry_audit_log (id, run_id, mentor_id, certification_id, event_type, old_status, new_status, threshold_days, timestamp, error_message)

AuditLogEntry (Dart model)

RunAuditSummary (run_id, started_at, finished_at, total_transitioned, total_reminders_sent, total_failures, entries: List<AuditLogEntry>)
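The aggregate counts on RunAuditSummary can be derived from the entries of a run rather than tracked separately. The following is a minimal sketch only: `RunCounts` and `countEvents` are hypothetical names, and it assumes that both `transition_failed` and `reminder_failed` count towards total_failures, which the requirements do not state explicitly.

```dart
/// Aggregate counts for one scheduler run (hypothetical helper type).
class RunCounts {
  const RunCounts({
    required this.totalTransitioned,
    required this.totalRemindersSent,
    required this.totalFailures,
  });

  final int totalTransitioned;
  final int totalRemindersSent;
  final int totalFailures;
}

/// Tallies event_type values as stored in certification_expiry_audit_log.
/// Assumption: both *_failed event types count towards total_failures.
RunCounts countEvents(Iterable<String> eventTypes) {
  var transitioned = 0;
  var reminders = 0;
  var failures = 0;
  for (final type in eventTypes) {
    switch (type) {
      case 'status_transitioned':
        transitioned++;
        break;
      case 'reminder_dispatched':
        reminders++;
        break;
      case 'transition_failed':
      case 'reminder_failed':
        failures++;
        break;
      default:
        // Fail loudly on values outside the documented event_type set.
        throw ArgumentError.value(type, 'eventTypes', 'unknown event_type');
    }
  }
  return RunCounts(
    totalTransitioned: transitioned,
    totalRemindersSent: reminders,
    totalFailures: failures,
  );
}
```

Computing the counts from the returned rows keeps the summary consistent with the entries list by construction; if runs become large enough that fetching all rows is costly, a database-side aggregate would be the alternative.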

performance requirements

Audit log rows must be bulk-inserted in a single Supabase call at end of run — not streamed per event

getMentorAuditHistory must use limit/offset pagination with a default page size of 50

Index on (mentor_id, timestamp) and (run_id) for efficient coordinator queries
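If the paginated queries are built with the Supabase Dart client's `.range(from, to)`, note that both bounds are inclusive, so a limit/offset pair has to be converted before the call. A small sketch of that conversion, using the default page size of 50 from the requirement above; `PageRange` and `pageRange` are hypothetical names introduced for illustration.

```dart
/// Inclusive first and last row index for one page.
class PageRange {
  const PageRange(this.from, this.to);

  final int from;
  final int to;
}

const int defaultPageSize = 50;

/// Converts limit/offset into an inclusive row range.
PageRange pageRange({int limit = defaultPageSize, int offset = 0}) {
  if (limit <= 0) {
    throw ArgumentError.value(limit, 'limit', 'must be positive');
  }
  if (offset < 0) {
    throw ArgumentError.value(offset, 'offset', 'must not be negative');
  }
  // The upper bound is inclusive, hence the -1.
  return PageRange(offset, offset + limit - 1);
}
```

For example, the third page at the default size is `pageRange(offset: 100)`, which covers rows 100 to 149.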

security requirements

Audit log table must have RLS policy: service-role can INSERT and SELECT; coordinator role can SELECT only rows where mentor is in their organisation; no DELETE or UPDATE for any application role

error_message column must not contain stack traces with internal implementation details in production — truncate to 500 chars and strip file paths

Mentor IDs in audit log are UUIDs only — no names or PII beyond what is already stored in the referenced rows
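The error_message rule above (strip file paths, truncate to 500 characters) could be sketched as a single pure function. This is an illustration under stated assumptions, not a vetted sanitiser: `sanitizeErrorMessage` is a hypothetical name, the `<path>` placeholder is arbitrary, and the pattern is deliberately broad, so it redacts any whitespace-delimited token containing `/` or `\`, including URLs.

```dart
const int maxErrorMessageLength = 500;

/// Matches any whitespace-delimited token containing a path separator, e.g.
/// `/srv/app/lib/foo.dart:42`, `(package:app/src/foo.dart:10:3)` or
/// `C:\app\foo.dart`. Deliberately broad: URLs are redacted too.
final RegExp _pathLikeToken = RegExp(r'\S*[/\\]\S*');

/// Returns a copy of [raw] that is safe to store in error_message.
String sanitizeErrorMessage(String raw) {
  // Redact before truncating so a cut can never leave half a path behind.
  final redacted = raw.replaceAll(_pathLikeToken, '<path>');
  if (redacted.length <= maxErrorMessageLength) return redacted;

  var end = maxErrorMessageLength;
  // String.length counts UTF-16 code units; avoid ending on a lone
  // high surrogate, which would produce an invalid string.
  final last = redacted.codeUnitAt(end - 1);
  if ((last & 0xFC00) == 0xD800) end--;
  return redacted.substring(0, end);
}
```

Redacting first and truncating second is the important ordering. Stack frame lines would still keep their method names under this sketch, so dropping everything after the first line may also be worth considering for production.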

Execution Context

Execution Tier


Tier 4 - 323 tasks

Can start after Tier 3 completes


Implementation Notes

Use an in-memory list (`final List<AuditLogEntry> _pendingEntries = []`) as a buffer within the CertificationExpiryChecker run. After all transitions and dispatch attempts complete, pass the buffer to AuditLogRepository.bulkInsert(). This pattern decouples the hot loop from DB I/O. The AuditLogEntry Dart model should use `fromTransitionResult` and `fromReminderResult` factory constructors that accept the result objects from tasks 003 and 004, making the mapping explicit and testable.
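The buffer-then-flush pattern described above might look roughly like the sketch below. Several things here are assumptions rather than part of the spec: `AuditLogBuffer`, `BulkInsert` and `wireName` are hypothetical names, the `id` column is assumed to be generated by a database default, and the factory constructors are left out because the result types from tasks 003 and 004 are not defined in this task. The enum uses explicit wire values (enhanced enums, Dart 2.17+) so that Dart-style identifiers still serialise to the snake_case values of the event_type column.

```dart
import 'dart:io' show stderr;

/// Event types with explicit wire values matching the event_type column.
enum AuditEventType {
  statusTransitioned('status_transitioned'),
  reminderDispatched('reminder_dispatched'),
  transitionFailed('transition_failed'),
  reminderFailed('reminder_failed');

  const AuditEventType(this.wireName);
  final String wireName;
}

class AuditLogEntry {
  const AuditLogEntry({
    required this.runId,
    required this.mentorId,
    required this.certificationId,
    required this.eventType,
    required this.timestamp,
    this.oldStatus,
    this.newStatus,
    this.thresholdDays,
    this.errorMessage,
  });

  final String runId;
  final String mentorId;
  final String certificationId;
  final AuditEventType eventType;
  final DateTime timestamp;
  final String? oldStatus;
  final String? newStatus;
  final int? thresholdDays;
  final String? errorMessage;

  /// Row shape for the bulk insert. `id` is omitted on the assumption that
  /// the table generates it with a column default.
  Map<String, dynamic> toJson() => {
        'run_id': runId,
        'mentor_id': mentorId,
        'certification_id': certificationId,
        'event_type': eventType.wireName,
        'old_status': oldStatus,
        'new_status': newStatus,
        'threshold_days': thresholdDays,
        'timestamp': timestamp.toUtc().toIso8601String(),
        'error_message': errorMessage,
      };
}

/// Hypothetical seam for the repository call; lets tests pass in a fake.
typedef BulkInsert = Future<void> Function(List<Map<String, dynamic>> rows);

class AuditLogBuffer {
  AuditLogBuffer(this._bulkInsert);

  final BulkInsert _bulkInsert;
  final List<AuditLogEntry> _pendingEntries = [];

  int get pendingCount => _pendingEntries.length;

  /// Called from the hot loop; performs no I/O.
  void add(AuditLogEntry entry) => _pendingEntries.add(entry);

  /// Writes all buffered entries in one call. Returns false on failure
  /// instead of throwing, so the scheduler run is not marked FAILED.
  Future<bool> flush() async {
    if (_pendingEntries.isEmpty) return true;
    try {
      await _bulkInsert(_pendingEntries.map((e) => e.toJson()).toList());
      _pendingEntries.clear();
      return true;
    } catch (error) {
      stderr.writeln('Audit bulk insert failed (non-fatal): $error');
      return false;
    }
  }
}
```

In a run, the checker would call `add` once per transition or dispatch attempt and `await buffer.flush()` once at the end, passing a closure that wraps the repository's bulk insert. Injecting that closure also makes the "8 entries passed to bulk insert" and "failure does not propagate" unit tests straightforward to write with a fake.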

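For the Supabase bulk insert, use `.from('certification_expiry_audit_log').insert(entries.map((e) => e.toJson()).toList())`; Supabase accepts a list of maps and inserts them in a single HTTP call. To get the idempotent behaviour listed under apis, one option is to generate each row's `id` client-side and call `.upsert(rows, onConflict: 'id', ignoreDuplicates: true)` instead, so that a retried write skips rows that already exist. Define `event_type` as a Dart enum serialised via `.name` to match the values allowed by the Postgres column; because `.name` returns the identifier verbatim, either name the enum values exactly as the database values (`status_transitioned` and so on) or map them explicitly. For the append-only contract, enable RLS on the table and define no DELETE or UPDATE policy for application roles: with RLS enabled, Postgres denies any command that no policy allows. If an explicit guard is wanted, make it restrictive, for example `CREATE POLICY no_delete ON certification_expiry_audit_log AS RESTRICTIVE FOR DELETE USING (false);`, because a permissive `USING (false)` policy is OR-ed with other permissive policies and cannot override them. Note that Supabase's `service_role` key bypasses RLS, so the append-only guarantee covers application roles only.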

Testing Requirements

Unit tests (flutter_test) with mocked Supabase client:

(1) after a run with 5 transitions and 3 reminders, verify 8 AuditLogEntry objects are passed to the bulk insert method

(2) bulk insert failure is caught and does not propagate as an exception from the checker

(3) getRunSummary(runId) returns entries sorted by timestamp

(4) getMentorAuditHistory with since parameter filters by timestamp correctly

Integration test against local Supabase: run a full expiry scan with known fixture data, query audit log table, assert correct event_type values and row count.

RLS test: create a coordinator-role JWT for org A, assert it cannot SELECT audit rows for mentors in org B.

Component

Certification Expiry Checker Service

service high

Dependencies (1)

Add reminder notification dispatch logic to the CertificationExpiryChecker for the 30-day, 14-day, and 7-day warning thresholds. For each mentor whose certification falls within a threshold window, dispatch a coordinator reminder via the PauseNotificationService. Ensure idempotency so repeated nightly runs do not send duplicate reminders for the same threshold and mentor. (epic-peer-mentor-pause-management-automated-expiry-task-004)

Epic Risks (4)

High impact, medium probability, technical

The nightly expiry checker may run multiple times due to scheduler retries or infrastructure issues, causing duplicate auto-transitions and duplicate coordinator notifications that erode trust in the notification system.

Mitigation & Contingency

Mitigation: Implement idempotency via a unique constraint on (mentor_id, threshold_day, certification_expiry_date) in the cert_expiry_reminders table. Auto-transitions should be wrapped in a Postgres RPC that checks current status before applying, making repeated invocations safe.

Contingency: Add a compensation query in the reconciliation log that detects duplicate log entries for the same certification period and alerts the operations team for manual review within 24 hours.

High impact, medium probability, integration

The HLF Dynamics portal API may have eventual-consistency behaviour or rate limits that cause website listing updates to lag behind status changes, leaving expired mentors visible on the public website for an unacceptable window.

Mitigation & Contingency

Mitigation: Design the sync service to be triggered immediately on status transitions (event-driven via database webhook) in addition to the nightly batch run. Implement a reconciliation job that verifies sync state against app state and re-triggers any divergent records.

Contingency: If real-time sync cannot be guaranteed, implement a manual 'force sync' action in the coordinator dashboard so coordinators can trigger an immediate re-sync for urgent cases. Document the expected sync lag in coordinator onboarding materials.

Medium impact, medium probability, scope

Stakeholder requests to extend the expiry checker to handle additional certification types, grace periods, or organisation-specific threshold configurations may significantly increase scope beyond what is designed here, delaying delivery.

Mitigation & Contingency

Mitigation: Parameterise threshold day values (30, 14, 7) via configuration repository rather than hard-coding them, enabling per-organisation customisation without code changes. Document that grace period logic and additional cert types are out of scope for this epic and require a dedicated follow-up.

Contingency: Deliver the feature with hard-coded HLF-standard thresholds first and introduce the configuration repository as a follow-up task in the next sprint, using a feature flag to enable per-org threshold overrides.

High impact, low probability, security

Dynamics portal API credentials stored as environment secrets in Supabase Edge Function configuration may be rotated or invalidated by HLF IT without notice, causing silent sync failures that go undetected for multiple days.

Mitigation & Contingency

Mitigation: Implement credential health-check calls on each scheduler run and emit an immediate alert on auth failure rather than only alerting after N consecutive failures. Document the credential rotation procedure with HLF IT and establish a rotation notification protocol.

Contingency: Maintain a break-glass manual sync script accessible to HLF administrators that can re-execute the Dynamics sync with newly provided credentials while the automated system is restored.
