Integrate kill-switch feature flag check at function startup
epic-scenario-based-follow-up-prompts-infrastructure-task-003: Before invoking the Scenario Prompt Scheduler Service, query the feature flag store for the scenario-follow-up-prompts kill switch. If the flag is disabled, log a structured JSON entry with reason 'kill_switch_active' and return HTTP 200 without performing any work. This gives operators a way to switch the feature off after launch, so that it cannot overwhelm peer mentors with cognitive disabilities.
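The gate described above can be sketched as a small wrapper around the scheduler call. This is a minimal illustration, not the actual function: handleInvocation, the injected isEnabled and runScheduler callbacks, the HandlerResult shape, and the event name in the log entry are all hypothetical. Only the reason value 'kill_switch_active' and the HTTP 200 status come from the task description.

```typescript
// Framework-neutral result; in an Edge Function this would presumably be
// converted into a Response with the same status and JSON body.
interface HandlerResult {
  status: number;
  body: { ran: boolean; reason?: string };
}

// Hypothetical stand-ins, injected so the gate can be exercised in isolation.
type FlagCheck = () => Promise<boolean>;
type RunScheduler = () => Promise<void>;

async function handleInvocation(
  isEnabled: FlagCheck,
  runScheduler: RunScheduler,
  log: (line: string) => void = console.log,
): Promise<HandlerResult> {
  // The flag is checked before anything else so a disabled flag does no work.
  const enabled = await isEnabled();

  if (!enabled) {
    // Structured JSON log entry; the "event" name is an assumption.
    log(
      JSON.stringify({
        event: "scenario_follow_up_prompts_skipped",
        reason: "kill_switch_active",
      }),
    );
    // HTTP 200: a deliberate skip is not an error from the caller's view.
    return { status: 200, body: { ran: false, reason: "kill_switch_active" } };
  }

  await runScheduler();
  return { status: 200, body: { ran: true } };
}
```

Injecting the flag check and the scheduler keeps the ordering guarantee easy to test: a test can pass a scheduler stub that records calls and assert it was never invoked when the flag is off.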
Acceptance Criteria
Technical Requirements
Execution Context
Tier 2 - 518 tasks
Can start after Tier 1 completes
Implementation Notes
Implement checkKillSwitch as an async function that performs a single .select('enabled').eq('flag_name', config.killSwitchFlagName).single() query and returns true only when the row exists and enabled is true. Wrap the entire call in a try/catch; any exception returns false (fail-safe). supabase-js normally reports query failures through the returned error field rather than by throwing, so a non-null error must also return false. Do not use .maybeSingle() here: .single() reports a missing row as an error, which fits the rule that a missing row is treated as disabled. The fail-safe behaviour (default to disabled on error) is a deliberate accessibility and wellbeing decision documented in the original requirements: the feature could overwhelm peer mentors with cognitive disabilities if it fires uncontrolled.
Document this rationale in a comment above the error handler. Avoid caching the flag value in module scope: Edge Function instances can be reused across invocations, so a cached enabled=true could keep the feature running after the kill switch has been toggled off.
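A minimal sketch of checkKillSwitch following these notes is shown here. To stay self-contained it does not import supabase-js; the FlagStoreClient, FlagQuery and FlagQueryResult interfaces are hypothetical structural stand-ins for the parts of the client the query touches, and the table name feature_flags is an assumption, since the task names only the flag_name and enabled columns.

```typescript
// Hypothetical structural types modelling only .from/.select/.eq/.single.
interface FlagQueryResult {
  data: { enabled: boolean } | null;
  error: { message: string } | null;
}

interface FlagQuery {
  select(columns: string): FlagQuery;
  eq(column: string, value: string): FlagQuery;
  single(): Promise<FlagQueryResult>;
}

interface FlagStoreClient {
  from(table: string): FlagQuery;
}

// Assumed table name; the task does not say where flags are stored.
const FLAG_TABLE = "feature_flags";

async function checkKillSwitch(
  client: FlagStoreClient,
  flagName: string,
  log: (entry: Record<string, unknown>) => void = (entry) =>
    console.log(JSON.stringify(entry)),
): Promise<boolean> {
  // No module-scope cache: the flag is read on every invocation so that
  // toggling it off takes effect even on a reused function instance.
  try {
    const { data, error } = await client
      .from(FLAG_TABLE)
      .select("enabled")
      .eq("flag_name", flagName)
      .single();

    // A returned error (including "no row found") or empty data means disabled.
    if (error || !data) {
      return false;
    }
    return data.enabled === true;
  } catch (err) {
    // Fail-safe by design: if the flag cannot be read, treat the feature as
    // disabled. Firing uncontrolled could overwhelm peer mentors with
    // cognitive disabilities, so an unknown state must never mean "on".
    log({ event: "kill_switch_check_failed", error: String(err) });
    return false;
  }
}
```

Passing the client and flag name as parameters, rather than reading them from module state, also makes the four unit-test cases straightforward to set up with a mock client.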
Testing Requirements
Deno unit tests for checkKillSwitch(). Use a mock Supabase client. Test: (1) flag row exists with enabled=true → returns true, (2) flag row exists with enabled=false → returns false, (3) flag row absent (empty result) → returns false, (4) Supabase client throws network error → returns false and logs kill_switch_check_failed.
Integration test: deploy function to local emulator with flag disabled, send POST with valid cron secret, verify response body has reason='kill_switch_active' and no database mutations occurred (query the DB to confirm no scheduler side effects).
Supabase Edge Functions on Deno can have cold-start latency of 500ms–2s. If the evaluation window contains many activities (e.g., post-holiday catch-up), the function may approach the 60-second invocation timeout before completing all evaluations.
Mitigation & Contingency
Mitigation: Implement pagination in the activity fetch query with a configurable page size; process pages sequentially and commit history records per page so partial runs are recoverable on the next invocation.
Contingency: If timeout remains an issue at scale, split the evaluation into per-chapter invocations triggered by a fan-out pattern using Supabase Realtime or a lightweight queue.
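The pagination mitigation above could look roughly like the following sketch. Everything named here is hypothetical: the Activity shape, the fetchPage and commitPage callbacks, and the deadlineMs budget, which is an added assumption for stopping cleanly before the invocation timeout. It also assumes activities are fetched in a stable order by id, so the last id of a page can serve as the cursor for the next one.

```typescript
// Hypothetical shape; the task does not define the activity schema.
interface Activity {
  id: string;
}

// fetchPage returns up to `limit` activities ordered by id, after `cursor`.
type FetchPage = (cursor: string | null, limit: number) => Promise<Activity[]>;
// commitPage evaluates one page and writes its history records.
type CommitPage = (page: Activity[]) => Promise<void>;

interface RunSummary {
  processed: number;
  completed: boolean; // false when the run stopped early at the deadline
  cursor: string | null; // last committed id, usable as a resume point
}

async function evaluateInPages(
  fetchPage: FetchPage,
  commitPage: CommitPage,
  pageSize: number,
  deadlineMs: number,
  now: () => number = Date.now,
): Promise<RunSummary> {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }

  let cursor: string | null = null;
  let processed = 0;

  for (;;) {
    // Stop between pages so every committed page stays fully committed.
    if (now() >= deadlineMs) {
      return { processed, completed: false, cursor };
    }

    const page: Activity[] = await fetchPage(cursor, pageSize);
    if (page.length > 0) {
      // Pages are processed sequentially; history is committed per page.
      await commitPage(page);
      processed += page.length;
      cursor = page[page.length - 1].id;
    }

    // A short (or empty) page means there is nothing left to fetch.
    if (page.length < pageSize) {
      return { processed, completed: true, cursor };
    }
  }
}
```

Because each page is committed before the next is fetched, a run that stops at the deadline leaves earlier pages recorded, and the next invocation only needs to pick up the remainder.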
Supabase cron triggers (pg_cron or Edge Function schedules) may miss invocations during platform maintenance windows, causing evaluation gaps that delay time-sensitive prompts beyond their intended delivery window.
Mitigation & Contingency
Mitigation: Configure the look-back window to be 2× the cron interval (e.g., 2-hour look-back for hourly cron) so a single missed invocation does not result in missed prompts; log each run's look-back range for auditability.
Contingency: If missed invocations are detected via monitoring alerts, implement a manual re-trigger endpoint accessible to admins that runs the evaluation for a specified time range.
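As a worked example of the look-back mitigation above, the window can be derived from the cron interval and logged for each run. The function names, the LookBackWindow shape and the log event name are hypothetical; only the 2× multiplier and the hourly example come from the text.

```typescript
interface LookBackWindow {
  from: Date;
  to: Date;
}

// Computes the evaluation window ending at `now`. With the default
// multiplier of 2, one missed invocation is still covered by the next run.
function computeLookBackWindow(
  now: Date,
  cronIntervalMs: number,
  multiplier = 2,
): LookBackWindow {
  if (!(cronIntervalMs > 0)) {
    throw new RangeError("cronIntervalMs must be positive");
  }
  if (!(multiplier >= 1)) {
    throw new RangeError("multiplier must be at least 1");
  }
  return {
    from: new Date(now.getTime() - cronIntervalMs * multiplier),
    to: now,
  };
}

// Builds the audit log line for a run; the event name is an assumption.
function lookBackLogEntry(window: LookBackWindow): string {
  return JSON.stringify({
    event: "evaluation_look_back",
    from: window.from.toISOString(),
    to: window.to.toISOString(),
  });
}

// Hourly cron with the default 2x multiplier gives a 2-hour window.
const HOUR_MS = 60 * 60 * 1000;
const exampleWindow = computeLookBackWindow(
  new Date("2024-01-01T12:00:00.000Z"),
  HOUR_MS,
);
console.log(lookBackLogEntry(exampleWindow));
```

Consecutive windows overlap by design, so the same activity can appear in two runs; the evaluation therefore needs to be idempotent, which this sketch does not address.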