epic-certification-management-automation-task-005 - Implementation Task | Likepersonsapp

high priority, low complexity, backend, pending, backend specialist, Tier 4

Acceptance Criteria

All error paths in the cron function emit structured JSON log entries with fields: timestamp (ISO 8601), severity (ERROR|WARN|INFO), cron_run_id (UUID), error_code, message, context object (affected certification_id, peer_mentor_id if applicable), and stack_trace for unexpected exceptions

Service call failures (Supabase RPC errors, network timeouts) are caught and logged with the error response body and HTTP status code included in the context field

Database errors include the failed SQL operation name and relevant record identifiers in the log context

A monitoring query runs at the end of each cron execution and selects all certifications where expiry_date < now() AND status != 'paused' AND no idempotency record exists, returning the count and the IDs of the matching certifications

If the monitoring query returns one or more missed transitions, a WARN-level log entry is written with the list of affected certification IDs and a remediation instruction string

The monitoring report is written to a dedicated Supabase table (cron_audit_log) with columns: run_id, run_at, missed_pause_count, missed_ids (jsonb), resolved (boolean default false)

Unexpected exceptions (non-Supabase errors) are caught by a top-level try/catch, logged at ERROR severity with full stack trace, and do not crash the cron process silently

Log output is valid JSON on every line — no mixed plain-text lines that would break log aggregation parsers

Unit test verifies that a simulated service failure produces a log entry matching the expected JSON schema
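The log schema in the criteria above can be checked mechanically. The following is a minimal sketch of such a check, not the project's actual validator: the function name `validateLogLine`, the problem strings, and the exact timestamp and UUID patterns are assumptions made for illustration.

```typescript
// Hypothetical validator for the log schema described in the acceptance criteria.
const SEVERITIES = new Set<string>(["ERROR", "WARN", "INFO"]);
const REQUIRED_FIELDS = ["timestamp", "severity", "cron_run_id", "error_code", "message", "context"];
// Assumed formats: ISO 8601 date-time with a zone designator, and a canonical UUID.
const ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns a list of problems; an empty list means the line conforms.
function validateLogLine(line: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return ["line is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["line is not a JSON object"];
  }
  const entry = parsed as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    if (!(field in entry)) problems.push(`missing field: ${field}`);
  }
  const severity = entry["severity"];
  if (typeof severity !== "string" || !SEVERITIES.has(severity)) {
    problems.push("severity must be ERROR, WARN or INFO");
  }
  const timestamp = entry["timestamp"];
  if (typeof timestamp !== "string" || !ISO_8601.test(timestamp)) {
    problems.push("timestamp must be ISO 8601");
  }
  const runId = entry["cron_run_id"];
  if (typeof runId !== "string" || !UUID.test(runId)) {
    problems.push("cron_run_id must be a UUID");
  }
  const context = entry["context"];
  if (typeof context !== "object" || context === null || Array.isArray(context)) {
    problems.push("context must be an object");
  }
  return problems;
}
```

A unit test can feed every captured log line through a check like this and assert that the returned list is empty, which covers both the "valid JSON on every line" criterion and the required-field criterion.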

Technical Requirements

frameworks

Supabase Edge Functions (Deno/TypeScript)

Supabase CLI

apis

Supabase PostgREST RPC

Supabase cron_audit_log table insert

data models

certifications

certification_idempotency_log

cron_audit_log

performance requirements

Monitoring query must complete within 2 seconds on datasets up to 10,000 certifications — add index on (expiry_date, status) if not present

Logging must be non-blocking — errors in the logging layer itself must never prevent the cron from completing its primary work

Total overhead of error logging and monitoring query must not exceed 500ms per cron run
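One way to meet the requirement that logging failures never interrupt the cron is to wrap the write path so that it cannot throw. The sketch below assumes a hypothetical `makeSafeLogger` helper and an injected `sink` function; neither name comes from the task description.

```typescript
// A sink is whatever finally writes the line, e.g. console.log for stdout.
type Sink = (line: string) => void;

// Wraps a sink so that serialisation errors (circular references, BigInt values)
// and write errors are swallowed instead of propagating into the cron logic.
// Returns true when the line was written and false when it was dropped.
function makeSafeLogger(sink: Sink): (entry: Record<string, unknown>) => boolean {
  return (entry) => {
    try {
      sink(JSON.stringify(entry));
      return true;
    } catch {
      // Deliberately ignored: the primary cron work must continue.
      return false;
    }
  };
}
```

Returning a boolean rather than throwing lets the caller count dropped lines if it wants to, without adding a failure mode of its own.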

security requirements

Log entries must never include full PII (names, national identity numbers) — only system IDs (certification_id, peer_mentor_id UUIDs)

cron_audit_log table must have Row Level Security enabled, readable only by service_role and ops-admin role

Stack traces written to logs must be sanitised to remove any embedded secret values or connection strings
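Stack-trace sanitisation could be approached with a small set of redaction rules applied before the trace is logged. The patterns below are illustrative assumptions (credential-bearing URLs, JWT-shaped tokens, and `key=value` pairs with secret-looking keys) and would need to be extended to match the secrets this project actually handles.

```typescript
// Each rule pairs a pattern with its replacement. The list is an assumption,
// not an exhaustive catalogue of secret formats.
const REDACTION_RULES: Array<[RegExp, string]> = [
  // URLs that embed credentials, such as postgres://user:password@host/db
  [/\b[a-z][a-z0-9+.-]*:\/\/[^\s:@/]+:[^\s@/]+@[^\s)]+/gi, "[REDACTED_CONNECTION_STRING]"],
  // JWT-shaped tokens: three base64url segments starting with "eyJ"
  [/\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, "[REDACTED_TOKEN]"],
  // key=value or key: value pairs whose key looks like a secret
  [/\b(password|secret|token|api[_-]?key)\s*[=:]\s*[^\s,;)]+/gi, "$1=[REDACTED]"],
];

// Applies every rule in order and returns the sanitised trace.
function sanitiseStackTrace(stack: string): string {
  return REDACTION_RULES.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    stack,
  );
}
```

Ordinary file locations in stack frames (for example `file:///src/cron.ts:10:5`) carry no `user:password@` part, so the first rule leaves them intact.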

Execution Context

Execution Tier

Tier 4

Tier 4 - 323 tasks

Can start after Tier 3 completes


Implementation Notes

Introduce a structured logger helper (e.g. `logEvent(severity, code, message, context)`) at the top of the cron module that serialises to JSON and writes to stdout; Supabase Edge Functions stream stdout to the platform log aggregator. Use a UUID v4 generated once per cron invocation as `cron_run_id` so all log lines from a single run can be correlated. For the monitoring query, use a single Supabase RPC call (e.g. `rpc('get_missed_auto_pauses')`) rather than a raw query from the Edge Function: this keeps the SQL in a versioned migration file and avoids string interpolation. Insert the audit row at the very end of the cron, after all processing, using an upsert keyed on `run_id` so a partial re-run does not create duplicates. Keep the `resolved` column for the ops team to manually acknowledge remediations without deleting rows. Do not throw from the monitoring section: wrap it in try/catch and log a WARN if the monitoring query itself fails, so a broken audit query does not mask the primary cron outcome.
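The notes above might translate into something like the following sketch. It is not the project's implementation: `createLogger`, `runMonitoring`, the error codes, the remediation text, and the assumption that the RPC returns a plain array of certification IDs are all hypothetical. The RPC call is injected as a function so the sketch does not depend on a Supabase client.

```typescript
type Severity = "ERROR" | "WARN" | "INFO";
type Context = Record<string, unknown>;

// Builds a logEvent function bound to one cron run. The run ID is passed in;
// the caller would generate it once per invocation (for example with crypto.randomUUID()).
function createLogger(cronRunId: string, write: (line: string) => void = console.log) {
  return function logEvent(
    severity: Severity,
    code: string,
    message: string,
    context: Context = {},
    error?: Error,
  ): void {
    const entry = {
      timestamp: new Date().toISOString(),
      severity,
      cron_run_id: cronRunId,
      error_code: code,
      message,
      context,
      // stack_trace is only attached when an exception is supplied.
      ...(error?.stack ? { stack_trace: error.stack } : {}),
    };
    write(JSON.stringify(entry));
  };
}

// Assumed result shape, modelled on the { data, error } pair that supabase-js returns.
type RpcResult = { data: string[] | null; error: { message: string } | null };

// Runs the monitoring query without ever throwing. Returns the missed IDs,
// or null when the monitoring query itself failed.
async function runMonitoring(
  fetchMissed: () => Promise<RpcResult>,
  logEvent: ReturnType<typeof createLogger>,
): Promise<string[] | null> {
  try {
    const { data, error } = await fetchMissed();
    if (error) throw new Error(error.message);
    const missedIds = data ?? [];
    if (missedIds.length > 0) {
      logEvent("WARN", "MISSED_AUTO_PAUSE", "Expired certifications were not auto-paused", {
        missed_ids: missedIds,
        remediation: "Review the listed certifications and pause them manually",
      });
    }
    return missedIds;
  } catch (err) {
    logEvent("WARN", "MONITORING_QUERY_FAILED", "Monitoring query failed", {
      reason: err instanceof Error ? err.message : String(err),
    });
    return null;
  }
}
```

In the Edge Function, `fetchMissed` would wrap the real `rpc('get_missed_auto_pauses')` call and map its result to the assumed shape. The audit-row upsert is left out because it depends on the client setup.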

Testing Requirements

Write unit tests using Deno's built-in test runner (or Jest if the Edge Function project uses it).

Test 1: mock a Supabase client that throws a PostgREST error on the service call; assert the emitted log object matches the expected JSON schema (all required fields present, severity=ERROR, error_code populated).

Test 2: seed an in-memory dataset with two expired-but-not-paused certifications; run the monitoring query function; assert the returned object contains missed_pause_count=2 and the correct IDs.

Test 3: simulate a clean run with no missed transitions; assert no WARN log is emitted and cron_audit_log row has missed_pause_count=0.

Test 4: inject an unexpected runtime exception; assert top-level catch produces an ERROR log with stack_trace field present.

Achieve 100% branch coverage on the logging and monitoring modules.
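As an illustration of Tests 2 and 3, the sketch below runs an in-memory stand-in for the monitoring query against a seeded dataset. The function `findMissedAutoPauses`, the `active` status value, and the sample IDs are assumptions; in the project these checks would be wrapped in the chosen test runner's test cases and the real query would run in the database.

```typescript
interface Certification {
  certification_id: string;
  expiry_date: string; // ISO 8601
  status: string;
}

// In-memory equivalent of the monitoring condition: expired, not paused,
// and no idempotency record for the certification.
function findMissedAutoPauses(certs: Certification[], processedIds: Set<string>, now: Date) {
  const missed_ids = certs
    .filter((c) =>
      new Date(c.expiry_date).getTime() < now.getTime() &&
      c.status !== "paused" &&
      !processedIds.has(c.certification_id)
    )
    .map((c) => c.certification_id);
  return { missed_pause_count: missed_ids.length, missed_ids };
}

// Minimal assertion helper so the sketch runs without a test framework.
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`${label}: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
  }
}

const now = new Date("2024-06-01T00:00:00Z");
const seeded: Certification[] = [
  { certification_id: "cert-a", expiry_date: "2024-05-01T00:00:00Z", status: "active" },
  { certification_id: "cert-b", expiry_date: "2024-05-15T00:00:00Z", status: "active" },
  { certification_id: "cert-c", expiry_date: "2024-05-20T00:00:00Z", status: "paused" },
  { certification_id: "cert-d", expiry_date: "2024-07-01T00:00:00Z", status: "active" },
];

// Test 2: two expired, unpaused certifications without idempotency records.
const missedReport = findMissedAutoPauses(seeded, new Set<string>(), now);
assertEqual(missedReport.missed_pause_count, 2, "missed count");
assertEqual(missedReport.missed_ids, ["cert-a", "cert-b"], "missed ids");

// Test 3: a clean run, where both expired certifications already have idempotency records.
const cleanReport = findMissedAutoPauses(seeded, new Set<string>(["cert-a", "cert-b"]), now);
assertEqual(cleanReport.missed_pause_count, 0, "clean run count");
```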

Component

Certification Expiry Nightly Cron Job

infrastructure medium

Dependencies (1)

epic-certification-management-automation-task-004: Implement idempotency protection so the cron function does not send duplicate reminders or trigger duplicate auto-pauses if it is invoked more than once within a 24-hour window. Use a processed-runs log table or a date-keyed lock record in Supabase to track each action already taken for a given certification and run date.
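The date-keyed lock record mentioned in this dependency could be modelled roughly as follows. This is only a sketch under stated assumptions: the action names, the key format, and the in-memory `IdempotencyLog` class are hypothetical stand-ins for the `certification_idempotency_log` table. Keying on the UTC calendar day is also a simplification of the "24-hour window" wording.

```typescript
// Hypothetical action names for the two side effects the cron can trigger.
type CronAction = "reminder" | "auto_pause";

// Builds a key that is identical for every invocation on the same UTC day.
function idempotencyKey(certificationId: string, action: CronAction, runDate: Date): string {
  const day = runDate.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return `${certificationId}:${action}:${day}`;
}

// In-memory stand-in for the idempotency table.
class IdempotencyLog {
  private seen = new Set<string>();

  // Returns true the first time a key is claimed and false on every repeat.
  claim(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}
```

With a real table, the same effect would typically come from a unique constraint on the key plus an insert that is treated as "already processed" when it conflicts. The monitoring query in this task then treats the absence of such a record as a missed transition.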

Epic Risks (2)

medium impact, low probability, technical

Supabase Edge Functions can have cold-start latency that causes the nightly cron to time out when processing large cohorts of expiring certifications, resulting in partial reminder dispatches.

Mitigation & Contingency

Mitigation: Batch the cron processing in chunks of 50 mentors per iteration. Use pagination with a cursor to resume processing if the function is re-invoked. Keep total invocation time well under the Edge Function timeout limit.
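The chunking and cursor idea in this mitigation can be sketched with keyset pagination over sorted mentor IDs. The helpers `nextChunk` and `processAll` are hypothetical, and the in-memory array stands in for a database query that would order by ID and filter on `id > cursor`.

```typescript
interface Page {
  ids: string[];
  nextCursor: string | null; // null when there is nothing left to process
}

// Returns up to `size` IDs that sort strictly after `cursor`.
function nextChunk(sortedIds: string[], cursor: string | null, size = 50): Page {
  const start = cursor === null ? 0 : sortedIds.findIndex((id) => id > cursor);
  if (start === -1) return { ids: [], nextCursor: null };
  const ids = sortedIds.slice(start, start + size);
  const hasMore = start + size < sortedIds.length;
  return { ids, nextCursor: hasMore ? ids[ids.length - 1] : null };
}

// Processes every chunk from `startCursor` onwards and returns the chunk count.
// Persisting the cursor after each chunk would let a re-invoked function resume.
function processAll(
  sortedIds: string[],
  handle: (ids: string[]) => void,
  startCursor: string | null = null,
  size = 50,
): number {
  let cursor = startCursor;
  let chunks = 0;
  do {
    const page = nextChunk(sortedIds, cursor, size);
    if (page.ids.length === 0) break;
    handle(page.ids);
    chunks++;
    cursor = page.nextCursor;
  } while (cursor !== null);
  return chunks;
}
```

Using the last processed ID as the cursor, rather than a numeric offset, means rows inserted or removed between invocations do not shift the resume point.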

Contingency: If timeouts occur in production, split the cron into two separate functions: one for reminders and one for auto-pauses, each with its own schedule offset to reduce peak load.

low impact, medium probability, technical

Certification BLoC covers three distinct workflows (view, renew, enrol) which may lead to an overly complex state machine that is hard to test and maintain, particularly when error states from multiple concurrent operations need to be differentiated in the UI.

Mitigation & Contingency

Mitigation: Use separate sealed state classes per workflow (CertificationViewState, RenewalState, EnrolmentState) composed into a single BLoC state wrapper. Follow the existing BLoC patterns established in the codebase for consistency.

Contingency: If the BLoC grows too complex, split into two BLoCs: CertificationBLoC (view/load) and CertificationActionBLoC (mutations), connected via a shared stream.


Structured Error Logging and Missed Auto-Pause Monitoring
