Add structured logging and observability to trigger engine
epic-scenario-push-engagement-core-engine-task-010 — Instrument the Scenario Trigger Engine Edge Function with structured JSON logging: log each evaluation run with peer_mentor_id, scenario_type, evaluation_result (triggered/suppressed/opted_out/cooldown), and relevant metric values. Include error logging with stack traces for unexpected failures. Ensure logs are compatible with Supabase log streaming for operational monitoring.
Acceptance Criteria
Technical Requirements
Execution Context
Tier 6 - 158 tasks
Can start after Tier 5 completes
Implementation Notes
Build a minimal `StructuredLogger` class with a single `log(entry: LogEntry)` method. Define `LogEntry` as a TypeScript interface with all required fields typed — this prevents accidentally adding untyped fields that might contain PII. The class constructor accepts a `context` object (function_name, invocation_id) merged into every entry automatically so callers don't need to repeat it. Use `console.log(JSON.stringify(entry))` — Supabase Edge Functions capture stdout as log output.
Do not use a third-party logging library: Deno can import npm packages, but every added dependency increases cold-start time, and `console.log` with `JSON.stringify` covers everything this logger needs. The `log_version` field enables a future migration to a richer schema without breaking existing Supabase log queries. Treat the logger as infrastructure: keep it in a separate `logger.ts` file imported by the trigger engine, not inlined.
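The notes above could be sketched roughly as follows. This is a minimal illustration, not the final `logger.ts`: the `level`, `metrics`, `error_message`, `error_stack` and `timestamp` field names, the `LOG_VERSION` value, the injectable `sink` parameter and the example function name are all assumptions added for the sketch. Only `peer_mentor_id`, `scenario_type`, `evaluation_result`, `function_name`, `invocation_id` and `log_version` come from the task description.

```typescript
// Allowed outcomes of one evaluation, as listed in the task description.
type EvaluationResult = "triggered" | "suppressed" | "opted_out" | "cooldown";

// Context merged into every entry so callers do not repeat it.
interface LogContext {
  function_name: string;
  invocation_id: string;
}

// Every field is explicitly typed; there is no index signature, so
// object literals with unknown fields are rejected at compile time.
interface LogEntry {
  level: "info" | "warn" | "error"; // assumed field
  peer_mentor_id?: string;
  scenario_type?: string;
  evaluation_result?: EvaluationResult;
  metrics?: Record<string, number>; // assumed name for "relevant metric values"
  error_message?: string; // assumed field
  error_stack?: string; // assumed field
}

const LOG_VERSION = 1; // assumed starting value

class StructuredLogger {
  private readonly context: LogContext;
  private readonly sink: (line: string) => void;

  // `sink` defaults to console.log; it is injectable only to make tests easy.
  constructor(
    context: LogContext,
    sink: (line: string) => void = (line) => console.log(line),
  ) {
    this.context = context;
    this.sink = sink;
  }

  log(entry: LogEntry): void {
    try {
      const record = {
        log_version: LOG_VERSION,
        timestamp: new Date().toISOString(),
        ...this.context,
        ...entry,
      };
      // One JSON object per line (NDJSON) on stdout.
      this.sink(JSON.stringify(record));
    } catch {
      // A logging failure must never break the evaluation run.
    }
  }
}

// Usage: one logger per invocation, one entry per evaluation.
const logger = new StructuredLogger({
  function_name: "scenario-trigger-engine", // hypothetical name
  invocation_id: "example-invocation-id",
});
logger.log({
  level: "info",
  peer_mentor_id: "mentor-123",
  scenario_type: "inactivity",
  evaluation_result: "suppressed",
  metrics: { days_inactive: 3 },
});
```

Wrapping the whole body of `log` in `try`/`catch` also covers serialization errors such as circular structures, not only a throwing `console.log`.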
Testing Requirements
Unit tests for the logger module: (1) evaluation_result enum values serialize correctly to expected string literals, (2) PII scrubber removes email patterns from arbitrary log payloads, (3) stack traces are truncated at 2000 characters, (4) logger failure (console.log throws) does not propagate exception to caller. Integration test: run a full evaluation cycle against mock dependencies and capture stdout; parse NDJSON output and assert each entry is valid JSON with required fields present. Verify no PII appears in any log output by running a regex scan over captured logs in the test.
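Unit tests (2) and (3) presuppose two small helpers. A possible shape for them is sketched below. The names `scrubPii` and `truncateStack`, the `[redacted-email]` placeholder and the email regex are illustrative assumptions; the regex is a pragmatic pattern, not a full RFC 5322 matcher, and only string values (not object keys) are scrubbed.

```typescript
// Pragmatic email pattern (assumption): good enough for log scrubbing,
// not a complete address grammar.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Limit taken from the testing requirements.
const MAX_STACK_LENGTH = 2000;

// Recursively replaces email-like substrings in every string value
// of an arbitrary JSON-like payload. Non-string leaves pass through.
function scrubPii(value: unknown): unknown {
  if (typeof value === "string") {
    return value.replace(EMAIL_PATTERN, "[redacted-email]");
  }
  if (Array.isArray(value)) {
    return value.map((item) => scrubPii(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = scrubPii(inner);
    }
    return out;
  }
  return value;
}

// Caps a stack trace at MAX_STACK_LENGTH characters.
function truncateStack(stack: string | undefined): string | undefined {
  if (stack === undefined) return undefined;
  return stack.length > MAX_STACK_LENGTH ? stack.slice(0, MAX_STACK_LENGTH) : stack;
}
```

In the logger, both helpers would run on an entry before it is serialized, so the same code path is exercised by the unit tests and by the regex scan in the integration test.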
The scenario-edge-function-scheduler must evaluate all active peer mentors within the 30-second Supabase Edge Function timeout. For large organisations, a sequential evaluation loop may exceed this limit, causing partial runs and missed notifications.
Mitigation & Contingency
Mitigation: Design the trigger engine to batch mentor evaluations using database-side SQL queries (bulk inactivity check via a single query rather than per-mentor calls), and add a performance test against 500 mentors during development. Document the evaluated mentor count per scenario type in scenario-evaluation-config to allow selective scenario execution per run.
Contingency: If single-run execution is insufficient, split evaluation into per-scenario-type scheduled functions (inactivity check, milestone check, expiry check) on separate cron schedules, dividing the computational load across multiple invocations.
A race condition between concurrent scheduler invocations or retried cron triggers could cause the same scenario notification to be dispatched multiple times to a mentor, severely degrading trust in the feature.
Mitigation & Contingency
Mitigation: Implement cooldown enforcement using a database-level upsert with a unique constraint on (user_id, scenario_type, cooldown_window_start) so that a second invocation within the same window is rejected at the persistence layer rather than the application layer.
Contingency: Add an idempotency key derived from (user_id, scenario_type, evaluation_date) to the notification record insert; if a duplicate key violation is caught, log it as a warning and skip dispatch without error.
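The contingency could look roughly like the sketch below. It assumes a hypothetical `insert` callback that reports the database error code instead of throwing, and a key format of `user:scenario:YYYY-MM-DD` using the UTC calendar day; the record shape and all function names are placeholders. The code `23505` is PostgreSQL's SQLSTATE for a unique-constraint violation.

```typescript
// Hypothetical record shape; the real notification table is not specified here.
interface NotificationRecord {
  idempotency_key: string;
  user_id: string;
  scenario_type: string;
}

// Hypothetical insert result: errorCode is null on success.
interface InsertOutcome {
  errorCode: string | null;
}

type InsertFn = (record: NotificationRecord) => Promise<InsertOutcome>;

const UNIQUE_VIOLATION = "23505"; // PostgreSQL unique_violation

// Derives the key from (user_id, scenario_type, evaluation_date).
// Assumption: evaluation_date is the UTC calendar day.
function buildIdempotencyKey(userId: string, scenarioType: string, evaluationDate: Date): string {
  const day = evaluationDate.toISOString().slice(0, 10);
  return `${userId}:${scenarioType}:${day}`;
}

// Inserts the notification record first and dispatches only if the insert
// succeeded, so a concurrent or retried run cannot dispatch twice.
async function dispatchOnce(
  userId: string,
  scenarioType: string,
  evaluationDate: Date,
  insert: InsertFn,
  dispatch: () => Promise<void>,
  warn: (message: string) => void = (message) => console.warn(message),
): Promise<"dispatched" | "skipped_duplicate"> {
  const record: NotificationRecord = {
    idempotency_key: buildIdempotencyKey(userId, scenarioType, evaluationDate),
    user_id: userId,
    scenario_type: scenarioType,
  };
  const outcome = await insert(record);
  if (outcome.errorCode === null) {
    await dispatch();
    return "dispatched";
  }
  if (outcome.errorCode === UNIQUE_VIOLATION) {
    // Duplicate key: another invocation already claimed this notification.
    warn(`duplicate notification skipped: ${record.idempotency_key}`);
    return "skipped_duplicate";
  }
  throw new Error(`notification insert failed with code ${outcome.errorCode}`);
}
```

Inserting before dispatching means a crash between the two steps loses a notification instead of duplicating it, which matches the stated priority of never sending the same scenario twice.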
The trigger engine queries peer mentor activity history across potentially multiple organisations and chapters. RLS policies configured for app-user roles may block the Edge Function's service-role queries, or query performance may degrade on large activity tables.
Mitigation & Contingency
Mitigation: Confirm the Edge Function runs with the Supabase service role key (bypassing RLS) and add composite indexes on (user_id, activity_date) to the activity tables before implementing the inactivity detection query.
Contingency: If service-role access is restricted by organisational policy, implement a dedicated database function (SECURITY DEFINER) that performs the inactivity aggregation and is callable by the Edge Function with limited scope.