Write Integration Tests for Summary Pipeline
epic-periodic-summaries-core-logic-task-013 — Implement end-to-end integration tests covering the full summary generation pipeline: period boundary detection, activity aggregation, outlier classification, summary persistence, cache population, and notification dispatch. Tests must cover idempotency, organisation isolation, and year-over-year delta correctness.
Acceptance Criteria
Technical Requirements
Execution Context
Tier 5 - 253 tasks
Can start after Tier 4 completes
Implementation Notes
Structure the test suite as a `test/integration/summary_pipeline/` directory with one file per concern (e.g. `idempotency_test.dart`, `organisation_isolation_test.dart`, `outlier_classification_test.dart`). Create a shared `SummaryPipelineTestFixture` class with `setUp()` / `tearDown()` methods that seed and clean the local Supabase instance. For date injection, add a `clock` parameter to all date-sensitive functions (see the `clock` package pattern in Dart).
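As a rough illustration of the clock-injection idea, here is a minimal TypeScript sketch of the kind that could sit on the Edge Function side; the Dart service layer would follow the same shape with the `clock` package. The names `Clock`, `Period`, and `currentHalfYear` are hypothetical, and the half-year period type is assumed only because the notes refer to H1 periods.

```typescript
// Hypothetical names: Clock, Period, and currentHalfYear are illustrative only.
type Clock = () => Date;

interface Period {
  label: string; // e.g. "H1 2025"
  start: Date;   // inclusive, UTC
  end: Date;     // exclusive, UTC
}

// Resolves the half-year period containing the instant returned by `clock`.
// The function never reads the system time itself, so tests can pin the date.
function currentHalfYear(clock: Clock): Period {
  const now = clock();
  const year = now.getUTCFullYear();
  const firstHalf = now.getUTCMonth() < 6; // months are 0-based: 0..5 = Jan..Jun
  const start = new Date(Date.UTC(year, firstHalf ? 0 : 6, 1));
  const end = new Date(Date.UTC(firstHalf ? year : year + 1, firstHalf ? 6 : 0, 1));
  return { label: `${firstHalf ? "H1" : "H2"} ${year}`, start, end };
}

// A test passes a fixed clock; production code would pass () => new Date().
const fixedClock: Clock = () => new Date("2025-06-30T23:59:59Z");
const period = currentHalfYear(fixedClock);
console.log(period.label, period.start.toISOString(), period.end.toISOString());
// prints: H1 2025 2025-01-01T00:00:00.000Z 2025-07-01T00:00:00.000Z
```

Pinning the clock one second either side of a boundary (30 June 23:59:59 and 1 July 00:00:00 UTC) is a cheap way to exercise the period boundary detection without waiting for a real rollover.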
The year-over-year delta test requires seeding two periods of data: seed a completed H1 2024 summary, then trigger H1 2025 generation and assert that the delta field is populated. For organisation isolation, run generation for two organisations concurrently using `Future.wait` and verify via direct DB queries that neither organisation's records contain rows belonging to the other. Document the `supabase start` and seed commands in a `Makefile` target `make test-integration` for CI reproducibility.
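The shape of the organisation isolation check can be sketched in TypeScript, where `Promise.all` plays the role of Dart's `Future.wait`. Everything here is a stand-in: `SummaryRow`, `generateForOrg`, and the in-memory `store` are hypothetical, and a real test would run the actual pipeline and query the local Supabase database instead.

```typescript
// Hypothetical in-memory stand-ins; a real test would query the local DB.
interface SummaryRow {
  organisationId: string;
  userId: string;
  sessions: number;
}

interface UserActivity {
  userId: string;
  sessions: number;
}

// Fake generation run: yields to the event loop between writes so that two
// concurrent runs interleave, then stores one summary row per user.
async function generateForOrg(
  store: SummaryRow[],
  organisationId: string,
  users: UserActivity[],
): Promise<void> {
  for (const user of users) {
    await Promise.resolve();
    store.push({ organisationId, userId: user.userId, sessions: user.sessions });
  }
}

async function runIsolationCheck(): Promise<boolean> {
  const store: SummaryRow[] = [];
  await Promise.all([
    generateForOrg(store, "org-a", [
      { userId: "a-1", sessions: 3 },
      { userId: "a-2", sessions: 7 },
    ]),
    generateForOrg(store, "org-b", [{ userId: "b-1", sessions: 5 }]),
  ]);
  // Stand-in for a direct per-organisation DB query.
  const orgA = store.filter((row) => row.organisationId === "org-a");
  const orgB = store.filter((row) => row.organisationId === "org-b");
  const noLeak =
    orgA.every((row) => row.userId.startsWith("a-")) &&
    orgB.every((row) => row.userId.startsWith("b-"));
  return noLeak && orgA.length === 2 && orgB.length === 1;
}

runIsolationCheck().then((ok) => console.log("isolated:", ok));
```

Asserting both the absence of foreign rows and the expected row count per organisation catches two different failures: leakage across organisations and silently dropped writes.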
Testing Requirements
Integration tests are written with Deno test (for the Edge Function pipeline) and flutter_test (for the Dart-side service layer). Use a `TestDataSeeder` helper class that inserts deterministic activity records, org configs, and peer mentor profiles before each test and cleans up afterwards. Group tests into describe blocks (`group` in flutter_test): `PipelineHappyPath`, `IdempotencyGuard`, `OrganisationIsolation`, `OutlierClassification`, `YearOverYearDelta`, `NotificationDispatch`, `FailureIsolation`. Use dependency injection to pass a mock date into period boundary detection; do not rely on `DateTime.now()` (or `new Date()` on the Deno side).
Use Supabase local dev (`supabase start`) as the test database — document setup steps in the test README. Assert DB state directly via SQL after each test rather than only asserting return values.
Supabase pg_cron or Edge Function retries could trigger multiple concurrent generation runs for the same period and organisation, producing duplicate summaries and sending multiple push notifications to users — a serious UX regression.
Mitigation & Contingency
Mitigation: Implement a database-level run-lock using an INSERT … ON CONFLICT DO NOTHING pattern keyed on (organisation_id, period_type, period_start). Only the first successful insert proceeds; subsequent attempts read the existing lock and exit early. Test with concurrent invocations in a Deno test suite.
Contingency: If duplicate summaries are detected post-deployment, add a deduplication cleanup job that removes all but the most recent summary per (user_id, period_type, period_start) and sends a corrective push notification.
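The run-lock mitigation above might look roughly like the following TypeScript sketch. The table name `summary_run_locks` and all identifiers are assumptions, and the in-memory class only imitates the unique constraint so the control flow can be shown without a database.

```typescript
// SQL shape of the lock insert; table and column names are assumptions.
// With a unique index on the three key columns, RETURNING yields a row only
// when the insert happened, and no rows when the conflict suppressed it.
const ACQUIRE_LOCK_SQL = `
  INSERT INTO summary_run_locks (organisation_id, period_type, period_start)
  VALUES ($1, $2, $3)
  ON CONFLICT (organisation_id, period_type, period_start) DO NOTHING
  RETURNING organisation_id`;

interface RunKey {
  organisationId: string;
  periodType: string;
  periodStart: string; // ISO date, e.g. "2025-01-01"
}

// In-memory stand-in for the unique constraint (illustration only).
class InMemoryLockTable {
  private readonly keys = new Set<string>();

  // Mirrors the INSERT: true when a row was inserted (lock acquired),
  // false when an identical key already exists.
  tryInsert(key: RunKey): boolean {
    const composite = [key.organisationId, key.periodType, key.periodStart].join("|");
    if (this.keys.has(composite)) return false;
    this.keys.add(composite);
    return true;
  }
}

function runGeneration(
  locks: InMemoryLockTable,
  key: RunKey,
  generate: () => void,
): "generated" | "skipped" {
  if (!locks.tryInsert(key)) return "skipped"; // another run owns this period
  generate();
  return "generated";
}

// Three attempts for the same organisation and period: only the first generates.
const locks = new InMemoryLockTable();
const key: RunKey = { organisationId: "org-a", periodType: "half_year", periodStart: "2025-01-01" };
let runs = 0;
const results = [1, 2, 3].map(() => runGeneration(locks, key, () => { runs += 1; }));
console.log(results, runs);
// results: ["generated", "skipped", "skipped"], runs: 1
```

An idempotency test against the real database would fire the invocations concurrently and then assert, by direct query, that exactly one summary and one notification exist for the key.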
FCM and APNs have different payload structures and size limits. An oversized or malformed payload could cause silent notification drops on iOS or delivery failures on Android, meaning mentors never learn their summary is ready.
Mitigation & Contingency
Mitigation: Build the PushNotificationDispatcher with separate FCM and APNs payload constructors, enforce a 256-byte body limit on the preview text, and run integration tests against the Firebase Emulator and a test APNs sandbox.
Contingency: Fall back to a generic 'Your periodic summary is ready' message if personalised preview text construction fails, ensuring delivery even when the personalisation pipeline encounters an error.
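A minimal sketch of the payload handling described above, assuming the 256-byte limit is measured in UTF-8 bytes. The function names are hypothetical; the FCM shape follows the HTTP v1 `message.notification` layout and the APNs shape follows the `aps.alert` dictionary.

```typescript
const MAX_PREVIEW_BYTES = 256; // limit stated in the mitigation
const FALLBACK_BODY = "Your periodic summary is ready";

// UTF-8 size of a single code point, computed without any runtime API.
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Truncates to at most maxBytes of UTF-8 without splitting a code point.
function truncateUtf8(text: string, maxBytes: number): string {
  let out = "";
  let used = 0;
  for (const ch of text) { // iterates by code point, not UTF-16 unit
    const size = utf8Size(ch.codePointAt(0) ?? 0);
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}

// Builds the preview body, falling back to the generic message when
// personalisation throws or produces nothing.
function safeBody(buildPreview: () => string): string {
  try {
    const preview = truncateUtf8(buildPreview(), MAX_PREVIEW_BYTES);
    return preview.length > 0 ? preview : FALLBACK_BODY;
  } catch {
    return FALLBACK_BODY;
  }
}

// Separate constructors per platform, as the mitigation suggests.
function buildFcmMessage(token: string, title: string, body: string) {
  return { message: { token, notification: { title, body } } };
}

function buildApnsPayload(title: string, body: string) {
  return { aps: { alert: { title, body } } };
}

const body = safeBody(() => { throw new Error("personalisation failed"); });
console.log(JSON.stringify(buildApnsPayload("Summary", body)));
// prints: {"aps":{"alert":{"title":"Summary","body":"Your periodic summary is ready"}}}
```

Truncating by code point rather than by string length matters for names and emoji, where one visible character can occupy two to four bytes.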
Outlier thresholds that are too tight will flag most mentors as outliers (alert fatigue for coordinators), while thresholds that are too loose will miss genuinely underactive mentors — directly undermining HLF's follow-up goal.
Mitigation & Contingency
Mitigation: Implement thresholds as configurable per-organisation database settings rather than hardcoded constants. Provide sensible defaults (underactive < 2 sessions/period, overloaded > 20 sessions/period) and document the tuning process for coordinators in the admin portal.
Contingency: If coordinators report threshold miscalibration after launch, expose a threshold configuration UI in the coordinator admin screen and allow real-time threshold adjustment without requiring a code deployment.
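The configurable thresholds could be sketched as follows. The type and function names are hypothetical; the defaults are the ones given in the mitigation (underactive below 2 sessions per period, overloaded above 20), and the per-organisation overrides stand in for values read from the database.

```typescript
interface OutlierThresholds {
  underactiveBelow: number; // sessions per period
  overloadedAbove: number;  // sessions per period
}

type OutlierClass = "underactive" | "normal" | "overloaded";

// Defaults from the mitigation; organisations may override either value.
const DEFAULT_THRESHOLDS: OutlierThresholds = { underactiveBelow: 2, overloadedAbove: 20 };

function classify(
  sessions: number,
  orgOverrides: Partial<OutlierThresholds> = {},
): OutlierClass {
  // Missing override values fall back to the defaults.
  const underactiveBelow = orgOverrides.underactiveBelow ?? DEFAULT_THRESHOLDS.underactiveBelow;
  const overloadedAbove = orgOverrides.overloadedAbove ?? DEFAULT_THRESHOLDS.overloadedAbove;
  if (sessions < underactiveBelow) return "underactive";
  if (sessions > overloadedAbove) return "overloaded";
  return "normal";
}

console.log(classify(1), classify(2), classify(20), classify(21));
// prints: underactive normal normal overloaded
console.log(classify(4, { underactiveBelow: 5 }));
// prints: underactive
```

Both comparisons are strict, so the boundary values 2 and 20 classify as normal; the `OutlierClassification` tests should pin those boundaries explicitly.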
The app may not have 12 months of historical activity data for all organisations at launch, making year-over-year comparison impossible for most users and rendering the comparison widget empty, which could disappoint users expecting Wrapped-style insights.
Mitigation & Contingency
Mitigation: Design the generation service to gracefully handle missing prior-year data by setting the yoy_delta field to null rather than zero. The UI must treat null as 'no comparison available' with appropriate placeholder copy rather than showing a misleading 0% delta.
Contingency: If historical data import from legacy Excel/Word sources becomes feasible, add a one-time backfill Edge Function that populates prior-year activity records from imported spreadsheets. Until then, explicitly communicate the data-availability limitation in the first summary each user receives.
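The null-versus-zero distinction from the mitigation can be illustrated with a short TypeScript sketch. Two assumptions are made here that the source does not state: the delta is taken to be a relative change, and a zero prior-year baseline is also treated as "no comparison". The function names and placeholder copy are hypothetical.

```typescript
// Returns the relative change versus the prior-year period, or null when
// there is no usable baseline. Relative change is an assumed definition.
function yoyDelta(current: number, priorYear: number | null | undefined): number | null {
  if (priorYear === null || priorYear === undefined) return null; // no prior-year summary
  if (priorYear === 0) return null; // assumption: a zero baseline gives no meaningful ratio
  return (current - priorYear) / priorYear;
}

// UI-side handling: null renders placeholder copy, never "0%".
function comparisonCopy(delta: number | null): string {
  if (delta === null) return "No comparison available yet";
  const pct = Math.round(delta * 100);
  return `${pct >= 0 ? "+" : ""}${pct}% vs. last year`;
}

console.log(comparisonCopy(yoyDelta(15, 10)));
// prints: +50% vs. last year
console.log(comparisonCopy(yoyDelta(15, null)));
// prints: No comparison available yet
```

A `YearOverYearDelta` test would then cover three cases: prior-year data present, prior-year data absent, and identical activity in both years (a genuine delta of 0, which must not be confused with null).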