High priority, low complexity, backend, pending, backend specialist, Tier 9

Acceptance Criteria

Each scheduler run emits a 'scheduler_run_started' log entry at invocation with: invocation_id (UUID), triggered_by ('pg_cron'), timestamp (ISO 8601 UTC)
Each scheduler run emits a 'scheduler_run_completed' log entry at completion with: invocation_id, total_active_mentors, total_evaluated, total_dispatched, total_suppressed, total_errors, duration_ms, completed (boolean), pages_processed (integer)
Each per-mentor error emits a 'mentor_evaluation_error' log entry with: invocation_id, peer_mentor_id (UUID), error_message (sanitized, no PII), error_type (string classification), and does NOT abort the batch
If the scheduler exits due to timeout guard, a 'scheduler_run_partial' log entry is emitted before return with remaining_unprocessed_count
All log entries use the same StructuredLogger from task-010 (imported and reused) — no separate logging implementation
Log entries include `component: 'scenario-scheduler'` field to distinguish scheduler logs from trigger engine logs in Supabase log streaming
Unexpected top-level errors (e.g., database unavailable at startup) produce a 'scheduler_run_failed' log entry with error details before the function returns 500
Error classification distinguishes at minimum: 'network_error', 'timeout_error', 'evaluation_logic_error', 'database_error' for per-mentor failures
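The acceptance criteria above imply a small log-entry contract. The sketch below is one way to express it in TypeScript; the event and field names come from the criteria, but the exact `StructuredLogger` interface lives in task-010 and may differ.

```typescript
// Sketch of the log entry contract implied by the acceptance criteria.
// Names follow the criteria above; the real logger interface is in task-010.

type SchedulerEvent =
  | 'scheduler_run_started'
  | 'scheduler_run_completed'
  | 'scheduler_run_partial'
  | 'scheduler_run_failed'
  | 'mentor_evaluation_error';

interface BaseEntry {
  event: SchedulerEvent;
  component: 'scenario-scheduler';
  invocation_id: string; // UUID, fresh per run
  timestamp: string;     // ISO 8601 UTC
}

interface RunCompletedEntry extends BaseEntry {
  event: 'scheduler_run_completed';
  total_active_mentors: number;
  total_evaluated: number;
  total_dispatched: number;
  total_suppressed: number;
  total_errors: number;
  duration_ms: number;
  completed: boolean;
  pages_processed: number;
}

// Hypothetical helper that fills the fields every entry shares.
function baseEntry(event: SchedulerEvent, invocationId: string): BaseEntry {
  return {
    event,
    component: 'scenario-scheduler',
    invocation_id: invocationId,
    timestamp: new Date().toISOString(),
  };
}
```

Narrowing `event` to a literal in each entry interface lets the compiler reject a completed-run entry that is missing a counter field.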

Technical Requirements

Frameworks
Supabase Edge Functions (Deno)
APIs
Supabase log streaming (NDJSON stdout)
Performance requirements
Per-mentor error logging must not add more than 1ms overhead per mentor in the batch loop
Aggregate counters (total_evaluated, total_dispatched, etc.) maintained in memory as integers — no database writes per counter increment
Security requirements
Per-mentor error messages sanitized before logging — strip any string that matches UUID patterns from error.message to avoid logging peer_mentor_id twice (it's already a separate field)
No raw exception stack traces from trigger engine responses included in scheduler logs — log only classification and sanitized message
invocation_id generated fresh each run using crypto.randomUUID() — not derived from any user data
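The UUID-stripping rule above can be implemented as a small pure function. This is a minimal sketch, assuming the standard 8-4-4-4-12 hex UUID format; the replacement token `[uuid]` is an illustrative choice, not specified by the ticket.

```typescript
// Hypothetical sanitizer: strip UUID-shaped substrings from error messages
// before logging, since peer_mentor_id is already a separate structured field.
const UUID_PATTERN =
  /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

function sanitizeErrorMessage(message: string): string {
  // Replace rather than delete, so log readers can see a UUID was present.
  return message.replace(UUID_PATTERN, '[uuid]');
}
```

A single precompiled regex keeps the per-mentor overhead well under the 1ms budget.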

Execution Context

Execution Tier
Tier 9 (22 tasks)

Can start after Tier 8 completes

Implementation Notes

Import and instantiate `StructuredLogger` from the shared logger module built in task-010, passing `{ component: 'scenario-scheduler', invocation_id: crypto.randomUUID() }` as context at function startup — this context is merged into every log entry automatically. Maintain a `BatchStats` object `{ totalActive: 0, evaluated: 0, dispatched: 0, suppressed: 0, errors: 0, pagesProcessed: 0 }` updated synchronously after each mentor evaluation result. Wrap the per-mentor trigger engine call in `try/catch` and classify errors by checking `error instanceof` patterns or inspecting HTTP response status codes from the trigger engine. The `error_type` classification should be an exhaustive string union type in TypeScript to ensure all cases are handled.
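The `BatchStats` shape and the exhaustive `error_type` union described above might look like the following. The classification heuristics (error names, message patterns) are assumptions for illustration; the real rules depend on what the trigger engine actually throws.

```typescript
// Sketch of the batch-loop bookkeeping and error classification.
// The exhaustive union forces the compiler to flag any unhandled category.
type ErrorType =
  | 'network_error'
  | 'timeout_error'
  | 'evaluation_logic_error'
  | 'database_error';

interface BatchStats {
  totalActive: number;
  evaluated: number;
  dispatched: number;
  suppressed: number;
  errors: number;
  pagesProcessed: number;
}

function classifyError(error: unknown): ErrorType {
  if (error instanceof Error) {
    // AbortError/TimeoutError names are what aborted fetch calls surface as.
    if (error.name === 'AbortError' || error.name === 'TimeoutError') {
      return 'timeout_error';
    }
    // fetch() network failures surface as TypeError in Deno.
    if (error.name === 'TypeError') return 'network_error';
    // Assumed heuristic: database driver errors mention postgres/PostgREST.
    if (/database|postgres|pgrst/i.test(error.message)) return 'database_error';
  }
  return 'evaluation_logic_error';
}
```

In the batch loop, `stats.errors++` plus one logger call per failure keeps counter maintenance in memory, as the performance requirement demands.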

Log the 'scheduler_run_completed' entry inside a `finally` block so it always fires even if the timeout guard triggers an early return — use a flag to distinguish normal completion from partial completion.
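The finally-block pattern can be sketched as follows. The in-memory `entries` array stands in for the task-010 `StructuredLogger`, and the mentor evaluation body is elided; only the control flow is the point here.

```typescript
// Sketch: the completion entry always fires via `finally`, and a `completed`
// flag distinguishes normal runs from timeout-guard early returns.
const entries: Record<string, unknown>[] = [];
const logger = { info: (e: Record<string, unknown>) => entries.push(e) };

function runScheduler(mentors: string[], deadlineMs: number): void {
  const start = Date.now();
  let completed = true;
  let processed = 0;
  try {
    for (const _mentor of mentors) {
      if (Date.now() - start > deadlineMs) {
        completed = false;
        logger.info({
          event: 'scheduler_run_partial',
          remaining_unprocessed_count: mentors.length - processed,
        });
        return; // early exit; the finally block still runs
      }
      processed++; // per-mentor evaluation would happen here
    }
  } finally {
    logger.info({
      event: 'scheduler_run_completed',
      total_evaluated: processed,
      completed,
      duration_ms: Date.now() - start,
    });
  }
}
```

Because `finally` runs on every exit path, a top-level throw would also emit the completion entry; the separate 'scheduler_run_failed' entry for startup errors should be handled in the outer catch before returning 500.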

Testing Requirements

Unit tests for scheduler logging: (1) scheduler_run_started emitted at invocation with correct fields, (2) scheduler_run_completed emitted at end with accurate counts after processing mock mentor list, (3) per-mentor error caught, logged as 'mentor_evaluation_error', and batch continues to next mentor, (4) timeout guard emits 'scheduler_run_partial' with correct remaining count, (5) top-level failure emits 'scheduler_run_failed'. Verify all log entries are valid NDJSON by parsing captured stdout in tests. Verify 'component' field present in all entries. Verify no PII appears in any log output.

Reuse the StructuredLogger unit tests from task-010 to confirm logger is being reused, not reimplemented.
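The NDJSON-validity check in the tests above might be implemented as a helper like this; the captured-stdout shape is an assumption about how the test harness intercepts log output.

```typescript
// Sketch of the test-side NDJSON validation: every captured stdout line must
// parse as JSON and carry the shared component field.
function validateNdjson(stdout: string): Record<string, unknown>[] {
  return stdout
    .trim()
    .split('\n')
    .map((line) => {
      const entry = JSON.parse(line) as Record<string, unknown>;
      if (entry.component !== 'scenario-scheduler') {
        throw new Error(`missing component field in: ${line}`);
      }
      return entry;
    });
}
```

The same helper is a natural place to add the PII check, e.g. rejecting any line that matches the UUID pattern outside the `invocation_id` and `peer_mentor_id` fields.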

Component
Scenario Edge Function Scheduler
Infrastructure, medium
Epic Risks (3)
High impact, medium probability, technical

The scenario-edge-function-scheduler must evaluate all active peer mentors within the 30-second Supabase Edge Function timeout. For large organisations, a sequential evaluation loop may exceed this limit, causing partial runs and missed notifications.

Mitigation & Contingency

Mitigation: Design the trigger engine to batch mentor evaluations using database-side SQL queries (bulk inactivity check via a single query rather than per-mentor calls), and add a performance test against 500 mentors during development. Document the evaluated mentor count per scenario type in scenario-evaluation-config to allow selective scenario execution per run.

Contingency: If single-run execution is insufficient, split evaluation into per-scenario-type scheduled functions (inactivity check, milestone check, expiry check) on separate cron schedules, dividing the computational load across multiple invocations.

High impact, low probability, technical

A race condition between concurrent scheduler invocations or retried cron triggers could cause the same scenario notification to be dispatched multiple times to a mentor, severely degrading trust in the feature.

Mitigation & Contingency

Mitigation: Implement cooldown enforcement using a database-level upsert with a unique constraint on (user_id, scenario_type, cooldown_window_start) so that a second invocation within the same window is rejected at the persistence layer rather than the application layer.

Contingency: Add an idempotency key derived from (user_id, scenario_type, evaluation_date) to the notification record insert; if a duplicate key violation is caught, log it as a warning and skip dispatch without error.
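The duplicate-key contingency can be sketched independently of the database client: catch Postgres's unique-violation SQLSTATE (`23505`) and treat it as "already dispatched". The insert function and key format below are placeholders, not the final schema.

```typescript
// Sketch of the idempotency contingency, assuming a SQLSTATE-23505 unique
// violation signals a concurrent invocation already dispatched this scenario.
interface InsertError { code: string; message: string }

function dispatchOnce(
  insert: (idempotencyKey: string) => InsertError | null,
  userId: string,
  scenarioType: string,
  evaluationDate: string,
): 'dispatched' | 'duplicate_skipped' {
  // Hypothetical key shape derived from (user_id, scenario_type, evaluation_date).
  const idempotencyKey = `${userId}:${scenarioType}:${evaluationDate}`;
  const error = insert(idempotencyKey);
  if (error === null) return 'dispatched';
  if (error.code === '23505') return 'duplicate_skipped'; // log as warning, skip dispatch
  throw new Error(error.message); // anything else is a real failure
}
```

Pushing dedup into a unique constraint means the race is resolved by the database regardless of how many scheduler invocations run concurrently.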

Medium impact, medium probability, integration

The trigger engine queries peer mentor activity history across potentially multiple organisations and chapters. RLS policies configured for app-user roles may block the Edge Function's service-role queries, or query performance may degrade on large activity tables.

Mitigation & Contingency

Mitigation: Confirm the Edge Function runs with the Supabase service role key (bypassing RLS) and add composite indexes on (user_id, activity_date) to the activity tables before implementing the inactivity detection query.

Contingency: If service-role access is restricted by organisational policy, implement a dedicated database function (SECURITY DEFINER) that performs the inactivity aggregation and is callable by the Edge Function with limited scope.