critical priority | low complexity | integration | pending | integration specialist | Tier 2

Acceptance Criteria

Before any scheduler logic is invoked, the function queries the feature flag store for the flag named by KILL_SWITCH_FLAG_NAME (from EdgeFunctionConfig)
If the flag is disabled (or absent), the function returns HTTP 200 immediately with a JSON body: { "status": "skipped", "reason": "kill_switch_active", "invocation_id": "..." }
If the flag is disabled, a structured log entry is emitted with fields: level='info', reason='kill_switch_active', flag_name, invocation_id — no error-level log (this is expected behaviour, not an error)
If the feature flag store is unreachable (network error), the function defaults to DISABLED behaviour (fail-safe) — logs reason='kill_switch_check_failed' and returns HTTP 200 without processing
If the flag is enabled, execution continues normally to the scheduler invocation with no additional latency beyond the flag check query
The feature flag check is implemented as a separate checkKillSwitch(config, supabase): Promise<boolean> function — not inlined in index.ts
The flag check adds no more than 200ms to the function's total execution time under normal database latency
Unit tests cover: flag enabled (returns true, scheduler runs), flag disabled (returns false, scheduler skipped), flag store unreachable (returns false, scheduler skipped)
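
The criteria above amount to a gate placed in front of the scheduler. A minimal sketch of that control flow follows. It assumes a hypothetical handleInvocation wrapper, a plain HandlerResult object standing in for the HTTP Response, an injected log function, and a "completed" status for the enabled path. None of those names come from the requirements; only checkKillSwitch, the flag name setting, and the skipped-response fields do.

```typescript
// Hypothetical shapes for illustration; only killSwitchFlagName (from
// KILL_SWITCH_FLAG_NAME), checkKillSwitch, and the skipped-response fields
// are taken from the acceptance criteria.
interface EdgeFunctionConfig {
  killSwitchFlagName: string;
}

interface HandlerResult {
  status: number;
  body: Record<string, string>;
}

interface HandlerDeps<Client> {
  config: EdgeFunctionConfig;
  supabase: Client;
  invocationId: string;
  checkKillSwitch: (config: EdgeFunctionConfig, supabase: Client) => Promise<boolean>;
  runScheduler: (supabase: Client) => Promise<void>;
  log: (entry: Record<string, string>) => void;
}

async function handleInvocation<Client>(deps: HandlerDeps<Client>): Promise<HandlerResult> {
  const { config, supabase, invocationId } = deps;

  // The flag check runs before any scheduler logic is invoked.
  const enabled = await deps.checkKillSwitch(config, supabase);

  if (!enabled) {
    // Expected behaviour, so this is logged at info level, not error.
    deps.log({
      level: "info",
      reason: "kill_switch_active",
      flag_name: config.killSwitchFlagName,
      invocation_id: invocationId,
    });
    return {
      status: 200,
      body: { status: "skipped", reason: "kill_switch_active", invocation_id: invocationId },
    };
  }

  await deps.runScheduler(supabase);
  // "completed" is an assumed status value; the criteria only define the skipped body.
  return { status: 200, body: { status: "completed", invocation_id: invocationId } };
}
```

Because checkKillSwitch returns only a boolean, this handler cannot tell a disabled flag from a failed check. In this sketch the kill_switch_check_failed entry would therefore have to be emitted inside checkKillSwitch itself, and a failed check would additionally produce the kill_switch_active entry shown here.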

Technical Requirements

frameworks
Supabase Edge Functions (Deno runtime)
TypeScript (strict mode)
@supabase/supabase-js
apis
Supabase PostgreSQL (feature_flags table or equivalent) via service role client
performance requirements
Kill-switch query must complete in under 200ms (single indexed row lookup)
Fail-safe default (disabled) must resolve immediately on network error, with no retry loop and no backoff
security requirements
Feature flag query uses the service role client — flag cannot be bypassed by a low-privilege JWT
Kill-switch state is not cached between invocations — each execution performs a fresh query to ensure immediate effect when toggled
Fail-safe default (disabled on error) prioritises peer mentor wellbeing over feature availability — critical for users with cognitive disabilities

Execution Context

Execution Tier
Tier 2

Tier 2 - 518 tasks

Can start after Tier 1 completes

Implementation Notes

Implement checkKillSwitch as an async function that performs a single .select('enabled').eq('flag_name', config.killSwitchFlagName).single() query. Wrap the entire call in a try/catch so that any exception returns false (fail-safe), and also check the error field of the result: by default supabase-js reports query failures through the returned { data, error } object rather than by throwing, so a try/catch alone would not cover them. Do not use .maybeSingle() here; with .single() a missing row comes back as an error, and that is treated as disabled. The fail-safe behaviour (default to disabled on error) is a deliberate accessibility and wellbeing decision documented in the original requirements: the feature could overwhelm peer mentors with cognitive disabilities if it fires uncontrolled.

Document this rationale in a comment above the error handler. Avoid caching the flag value in module scope — since Edge Function instances can be reused across invocations, a cached enabled=true could prevent a kill-switch from taking effect immediately after being toggled off.
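
The notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the definitive implementation: the FlagStoreClient and FlagQuery interfaces are hypothetical structural stand-ins for the supabase-js client, the third log parameter is an addition for testability (the specified signature has only config and supabase), the 'warn' level is a guess, and the PGRST116 code for a missing row under .single() should be verified against the PostgREST version in use.

```typescript
interface EdgeFunctionConfig {
  killSwitchFlagName: string;
}

// Hypothetical structural types covering only the calls made below.
interface FlagQueryResult {
  data: { enabled: boolean } | null;
  error: { message: string; code?: string } | null;
}
interface FlagQuery {
  select(columns: string): FlagQuery;
  eq(column: string, value: string): FlagQuery;
  single(): PromiseLike<FlagQueryResult>;
}
interface FlagStoreClient {
  from(table: string): FlagQuery;
}

type LogFn = (entry: Record<string, unknown>) => void;

async function checkKillSwitch(
  config: EdgeFunctionConfig,
  supabase: FlagStoreClient,
  log: LogFn = (entry) => console.log(JSON.stringify(entry)), // assumed extra parameter
): Promise<boolean> {
  const failed = (detail: string) =>
    log({
      level: "warn", // assumed level; the requirements do not specify one
      reason: "kill_switch_check_failed",
      flag_name: config.killSwitchFlagName,
      detail,
    });

  try {
    // Fresh query on every invocation; nothing is cached in module scope.
    const { data, error } = await supabase
      .from("feature_flags")
      .select("enabled")
      .eq("flag_name", config.killSwitchFlagName)
      .single();

    if (error) {
      // Assumption: PGRST116 means "no row" under .single(). A missing row is
      // simply disabled; any other returned error counts as a failed check.
      if (error.code !== "PGRST116") failed(error.message);
      return false;
    }
    return data?.enabled === true;
  } catch (err) {
    // Fail-safe rationale: defaulting to disabled is a deliberate wellbeing
    // decision. If this feature fires uncontrolled it could overwhelm peer
    // mentors with cognitive disabilities, so any error means "do not run".
    failed(String(err));
    return false;
  }
}
```

There is no retry in either error path, so a failed check resolves to false on the first failure.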

Testing Requirements

Deno unit tests for checkKillSwitch(). Use a mock Supabase client. Test: (1) flag row exists with enabled=true → returns true, (2) flag row exists with enabled=false → returns false, (3) flag row absent (empty result) → returns false, (4) Supabase client throws network error → returns false and logs kill_switch_check_failed. Integration test: deploy function to local emulator with flag disabled, send POST with valid cron secret, verify response body has reason='kill_switch_active' and no database mutations occurred (query the DB to confirm no scheduler side effects).
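
One way to set up the mock client and the four unit-test cases is shown below, as a framework-agnostic sketch. The structural types, createMockClient, killSwitchCases, and runKillSwitchCases are all hypothetical names; in the actual suite each case would more likely be registered with Deno.test and use an assertion library. The "row absent" case is modelled as a returned error because that is how .single() reports an empty result, and the PGRST116 code is an assumption to verify.

```typescript
// Minimal structural types (assumptions) covering only the calls the
// kill-switch query makes; the real client comes from @supabase/supabase-js.
interface FlagResult {
  data: { enabled: boolean } | null;
  error: { message: string; code?: string } | null;
}
interface FlagQuery {
  select(columns: string): FlagQuery;
  eq(column: string, value: string): FlagQuery;
  single(): PromiseLike<FlagResult>;
}
interface FlagStoreClient {
  from(table: string): FlagQuery;
}
type KillSwitchCheck = (
  config: { killSwitchFlagName: string },
  supabase: FlagStoreClient,
) => Promise<boolean>;

// Builds a mock whose .single() resolves with `outcome`, or rejects when
// `outcome` is an Error (simulating a network failure).
function createMockClient(outcome: FlagResult | Error): FlagStoreClient {
  const query: FlagQuery = {
    select: () => query,
    eq: () => query,
    single: () =>
      outcome instanceof Error ? Promise.reject(outcome) : Promise.resolve(outcome),
  };
  return { from: () => query };
}

// The four unit-test cases listed above, expressed as data.
const killSwitchCases: { name: string; outcome: FlagResult | Error; expected: boolean }[] = [
  { name: "row exists, enabled=true", outcome: { data: { enabled: true }, error: null }, expected: true },
  { name: "row exists, enabled=false", outcome: { data: { enabled: false }, error: null }, expected: false },
  { name: "row absent", outcome: { data: null, error: { message: "no rows", code: "PGRST116" } }, expected: false },
  { name: "network error thrown", outcome: new Error("network unreachable"), expected: false },
];

// Runs every case against `check` and returns the names of the failing cases.
async function runKillSwitchCases(check: KillSwitchCheck): Promise<string[]> {
  const failures: string[] = [];
  for (const c of killSwitchCases) {
    const actual = await check({ killSwitchFlagName: "example_flag" }, createMockClient(c.outcome));
    if (actual !== c.expected) failures.push(c.name);
  }
  return failures;
}
```

This sketch checks return values only. Asserting that case (4) logs kill_switch_check_failed would additionally need a logger spy passed to, or captured around, the function under test.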

Component
Scenario Evaluation Edge Function
infrastructure medium
Epic Risks (2)
medium impact | low probability | technical

Supabase Edge Functions on Deno can have cold-start latency of 500ms–2s. If the evaluation window contains many activities (e.g., post-holiday catch-up), the function may approach the 60-second invocation timeout before completing all evaluations.

Mitigation & Contingency

Mitigation: Implement pagination in the activity fetch query with a configurable page size; process pages sequentially and commit history records per page so partial runs are recoverable on the next invocation.
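
A minimal sketch of the page-by-page loop described in this mitigation, assuming hypothetical fetchPage and evaluateAndCommit callbacks and an optional hasTimeLeft guard for stopping before the invocation timeout. It uses offset pagination, which assumes a stable ordering and that committing a page does not remove rows from the set being paged; keyset pagination would be the alternative if that does not hold.

```typescript
interface Activity {
  id: string;
}

// Hypothetical callbacks: fetchPage reads one page of activities in a stable
// order; evaluateAndCommit evaluates that page and writes its history records.
type FetchPage = (offset: number, limit: number) => Promise<Activity[]>;
type EvaluateAndCommit = (page: Activity[]) => Promise<void>;

async function evaluateInPages(
  fetchPage: FetchPage,
  evaluateAndCommit: EvaluateAndCommit,
  pageSize: number,
  hasTimeLeft: () => boolean = () => true,
): Promise<number> {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }

  let offset = 0;
  let processed = 0;

  // Pages are processed sequentially; each page is committed before the next
  // is fetched, so an interrupted run leaves the completed pages recorded.
  while (hasTimeLeft()) {
    const page = await fetchPage(offset, pageSize);
    if (page.length === 0) break;

    await evaluateAndCommit(page);
    processed += page.length;

    if (page.length < pageSize) break; // short page: nothing further to fetch
    offset += pageSize;
  }

  return processed;
}
```

The return value is the number of activities committed in this run, which could be logged so that a partial run is visible before the next invocation picks up the remainder.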

Contingency: If timeout remains an issue at scale, split the evaluation into per-chapter invocations triggered by a fan-out pattern using Supabase Realtime or a lightweight queue.

medium impact | low probability | dependency

Supabase cron triggers (pg_cron or Edge Function schedules) may miss invocations during platform maintenance windows, causing evaluation gaps that delay time-sensitive prompts beyond their intended delivery window.

Mitigation & Contingency

Mitigation: Configure the look-back window to be 2× the cron interval (e.g., 2-hour look-back for hourly cron) so a single missed invocation does not result in missed prompts; log each run's look-back range for auditability.
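
The look-back arithmetic can be sketched as below. computeLookBackRange, lookBackLogEntry, and the lookback_from / lookback_to field names are hypothetical; the only figures taken from the mitigation are the 2× multiplier and the hourly example.

```typescript
interface LookBackRange {
  from: Date;
  to: Date;
}

// Returns the window [now - multiplier * interval, now]. With the default
// multiplier of 2, one missed invocation is still covered by the next run.
function computeLookBackRange(
  now: Date,
  cronIntervalMs: number,
  multiplier: number = 2,
): LookBackRange {
  if (cronIntervalMs <= 0 || multiplier < 1) {
    throw new RangeError("cronIntervalMs must be > 0 and multiplier must be >= 1");
  }
  return {
    from: new Date(now.getTime() - cronIntervalMs * multiplier),
    to: new Date(now.getTime()),
  };
}

// Builds the per-run audit log fields (field names are assumptions).
function lookBackLogEntry(range: LookBackRange): Record<string, string> {
  return {
    lookback_from: range.from.toISOString(),
    lookback_to: range.to.toISOString(),
  };
}

// Hourly cron: a run at 12:00 UTC looks back to 10:00 UTC.
const exampleRange = computeLookBackRange(
  new Date("2024-01-01T12:00:00.000Z"),
  60 * 60 * 1000,
);
```

Consecutive windows overlap by one interval under this scheme, so most activities fall inside two runs. The evaluation presumably needs to be idempotent, for example by consulting the history records, so that a prompt is not issued twice.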

Contingency: If missed invocations are detected via monitoring alerts, implement a manual re-trigger endpoint accessible to admins that runs the evaluation for a specified time range.