high priority | high complexity | integration | pending | integration specialist | Tier 7

Acceptance Criteria

Integration tests connect to the Supabase staging environment (URL and anon key read from environment variables — never hardcoded)
Seed data for at least two organizations (e.g., org_A and org_B) is inserted in setUp() and deleted in tearDown() to keep staging clean
fetchRawActivities for org_A returns only org_A records — org_B records are absent from the result set
fetchRawActivities for org_B returns only org_B records — org_A records are absent
fetchParticipantCount returns the exact seeded count for the test period
fetchGeographicDistribution returns municipality-level codes (not GPS coordinates) for seeded records
saveMetricSnapshot writes to Supabase; a second repository instance (fresh constructor) calling getMetricSnapshot returns the same data
Multi-org isolation: simultaneous queries for org_A and org_B (using separate JWT contexts) return disjoint result sets
All RPC function signatures discovered or confirmed during testing are documented in an INTEGRATION_TEST_NOTES.md file in the test directory
Integration tests are tagged with @Tags(['integration']) so they can be excluded from CI unit-test runs
Tests pass consistently across three consecutive runs against staging (no flakiness from timing or state)

Technical Requirements

Frameworks:
Flutter
Dart
flutter_test

APIs:
Supabase PostgreSQL 15 RPC
Supabase Auth (staging)
Supabase REST API (staging)
PostGIS Spatial Extension

Data models:
activity
annual_summary
contact
assignment
bufdir_export_audit_log

Performance requirements:
Each RPC call must complete within 8 seconds on staging (allows for cold-start latency)
Test suite total execution under 3 minutes

Security requirements:
Staging Supabase URL and anon key loaded from CI environment variables (SUPABASE_STAGING_URL, SUPABASE_STAGING_ANON_KEY), never from source code
Seed data must use fictional UUIDs and non-PII test values (e.g., 'Test Contact A', not real names)
tearDown() must always clean up seed data even when tests fail; use a try/finally pattern
Test user JWTs generated from the staging auth service must have a short expiry (1 hour max)
Cross-org isolation test must use two separate SupabaseClient instances with different JWT contexts to properly simulate multi-tenant requests
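
The environment-variable and cleanup requirements above could be met roughly as follows. This is a minimal sketch, assuming the tests run on the host VM so that Platform.environment is available (a device run would need another mechanism such as --dart-define). StagingConfig and runWithCleanup are hypothetical names introduced for illustration, not existing project code.

```dart
import 'dart:io' show Platform;

/// Hypothetical holder for the staging connection settings.
class StagingConfig {
  const StagingConfig({required this.url, required this.anonKey});

  final String url;
  final String anonKey;

  /// Reads both values from [env] (defaults to the process environment) and
  /// fails fast when either is missing, before any test talks to staging.
  factory StagingConfig.fromEnvironment([Map<String, String>? env]) {
    final source = env ?? Platform.environment;

    String read(String key) {
      final value = source[key];
      if (value == null || value.isEmpty) {
        throw StateError('Missing environment variable: $key');
      }
      return value;
    }

    return StagingConfig(
      url: read('SUPABASE_STAGING_URL'),
      anonKey: read('SUPABASE_STAGING_ANON_KEY'),
    );
  }
}

/// Runs [body] and always awaits [cleanup], even when [body] throws.
Future<T> runWithCleanup<T>({
  required Future<T> Function() body,
  required Future<void> Function() cleanup,
}) async {
  try {
    return await body();
  } finally {
    await cleanup();
  }
}

Future<void> main() async {
  // Placeholder values only; real values come from the CI environment.
  final config = StagingConfig.fromEnvironment({
    'SUPABASE_STAGING_URL': 'https://example.invalid',
    'SUPABASE_STAGING_ANON_KEY': 'placeholder-key',
  });

  var cleaned = false;
  try {
    await runWithCleanup<void>(
      body: () async {
        throw StateError('simulated test failure');
      },
      cleanup: () async {
        cleaned = true;
      },
    );
  } on StateError {
    // The failure still propagates; cleanup has already run at this point.
  }
  print('${config.url} cleaned=$cleaned');
}
```

In the real suite, the cleanup callback would presumably delegate to the seed-data cleanup helper, and the config would be built once in the shared setup.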

Execution Context

Execution Tier: Tier 7 (84 tasks)

Can start after Tier 6 completes

Implementation Notes

Create a StagingTestHelper class with static methods insertSeedOrg(), insertSeedActivities(orgId, count), insertSeedContacts(), and cleanupSeedData(orgIds) to keep test setup readable. For multi-org JWT testing, use Supabase's signInWithPassword for two different test user accounts (one per org) and run queries with each client in parallel using Future.wait(). If the staging RPC functions are not yet deployed when this task starts, mock the RPC layer in a subset of tests and document which assertions are blocked on RPC deployment. Use expect(..., completion(...)) for async assertions.

Never use sleep() — use await and proper async patterns.
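
The parallel multi-org check described above might look like the sketch below. To stay runnable without staging, each org's query is represented by a plain function; in the real test each fetcher would wrap a repository backed by its own signed-in SupabaseClient. The checkIsolation helper and the column names id and organization_id are assumptions for illustration.

```dart
/// A fetcher stands in for a repository call made through one org's client.
typedef ActivityFetcher = Future<List<Map<String, Object?>>> Function();

/// Runs both fetchers concurrently and returns a list of problems found.
/// An empty list means each result set contains only its own org's rows
/// and the two sets share no record ids.
Future<List<String>> checkIsolation({
  required String orgA,
  required ActivityFetcher fetchForA,
  required String orgB,
  required ActivityFetcher fetchForB,
}) async {
  // Future.wait starts both queries before awaiting either result.
  final results = await Future.wait([fetchForA(), fetchForB()]);
  final rowsA = results[0];
  final rowsB = results[1];
  final problems = <String>[];

  for (final row in rowsA) {
    if (row['organization_id'] != orgA) {
      problems.add('$orgA result contains foreign row ${row['id']}');
    }
  }
  for (final row in rowsB) {
    if (row['organization_id'] != orgB) {
      problems.add('$orgB result contains foreign row ${row['id']}');
    }
  }

  final idsA = rowsA.map((row) => row['id']).toSet();
  final idsB = rowsB.map((row) => row['id']).toSet();
  final shared = idsA.intersection(idsB);
  if (shared.isNotEmpty) {
    problems.add('Result sets share ids: $shared');
  }
  return problems;
}

Future<void> main() async {
  // In-memory fakes; the real fetchers would query staging.
  final problems = await checkIsolation(
    orgA: 'org_A',
    fetchForA: () async => [
      {'id': 'a-1', 'organization_id': 'org_A'},
    ],
    orgB: 'org_B',
    fetchForB: () async => [
      {'id': 'b-1', 'organization_id': 'org_B'},
    ],
  );
  print(problems.isEmpty ? 'isolated' : problems.join('\n'));
}
```

Returning a list of problems instead of a boolean keeps the failure message useful when a row-level security policy leaks records.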

Testing Requirements

Integration tests live in integration_test/data/bufdir_metrics_integration_test.dart. Use flutter_test's TestWidgetsFlutterBinding.ensureInitialized() for async setup. Wrap the tests in group('Integration - BufdirMetricsRepository', ...) and tag individual tests with tags: ['integration', 'supabase', 'rls']. The CI pipeline must have a separate job that runs these tests against the staging environment only, with the SUPABASE_STAGING_URL env var set; the unit test job must not require this variable.
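
As an illustration of the grouping and tagging described above, the test file could be laid out like this. It is a sketch only: fakeFetchParticipantCount is a hypothetical stand-in for the real repository call, and the 8-second timeout mirrors the per-RPC budget in the performance requirements. The unnamed library directive assumes a recent Dart SDK (2.19 or later).

```dart
@Tags(['integration'])
library;

import 'package:flutter_test/flutter_test.dart';

/// Stand-in for a repository call; the real test would call
/// BufdirMetricsRepository against staging instead.
Future<int> fakeFetchParticipantCount() async => 3;

void main() {
  TestWidgetsFlutterBinding.ensureInitialized();

  group('Integration - BufdirMetricsRepository', () {
    test(
      'fetchParticipantCount returns the seeded count',
      () {
        // completion() waits for the future and matches its resolved value.
        expect(fakeFetchParticipantCount(), completion(equals(3)));
      },
      tags: ['integration', 'supabase', 'rls'],
      timeout: const Timeout(Duration(seconds: 8)),
    );
  });
}
```

With this layout, the unit-test job can run `flutter test --exclude-tags integration`, while the staging job runs `flutter test --tags integration` on the integration file. One caveat worth checking: under `flutter test`, the test binding replaces the HTTP client with one that answers every request with status 400, so real staging calls may require resetting `HttpOverrides.global` to null after the binding is initialized.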

Document pass/fail for each RPC function in INTEGRATION_TEST_NOTES.md alongside the function signature (name, parameters, return shape).

Component

Aggregation Query Builder
data, high

Epic Risks (2)

high impact, medium probability, technical

Supabase RPC functions return JSON with PostgreSQL numeric types (bigint, numeric) that do not map cleanly to Dart int/double. Silent truncation or JSON parsing errors could corrupt participant counts in the final Bufdir submission without any runtime exception.

Mitigation & Contingency

Mitigation: Define explicit Dart fromJson factories for all RPC result models with type-safe parsing and assertion checks. Add a contract test that compares raw RPC JSON output against expected Dart model values using a known seed dataset.
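
A type-safe fromJson factory of the kind described in the mitigation could be sketched as follows. The model name ParticipantCountResult and the JSON keys organization_id and participant_count are assumptions, since the actual RPC return shape has not been confirmed. The parser accepts an integer, an integral double, or a numeric string, and rejects anything else instead of truncating.

```dart
import 'dart:convert';

/// Parses a JSON value that must represent a whole number.
/// Throws a FormatException instead of silently truncating.
int parseCount(Object? value, String field) {
  if (value is int) return value;
  if (value is double) {
    if (value.isFinite && value == value.truncateToDouble()) {
      return value.toInt();
    }
    throw FormatException('$field is not a whole number: $value');
  }
  if (value is String) {
    final parsed = int.tryParse(value);
    if (parsed != null) return parsed;
  }
  throw FormatException('$field has an unexpected value: $value');
}

/// Hypothetical result model for a participant-count RPC.
class ParticipantCountResult {
  const ParticipantCountResult({
    required this.organizationId,
    required this.participantCount,
  });

  final String organizationId;
  final int participantCount;

  factory ParticipantCountResult.fromJson(Map<String, dynamic> json) {
    final orgId = json['organization_id'];
    if (orgId is! String) {
      throw FormatException('organization_id must be a string: $orgId');
    }
    final count = parseCount(json['participant_count'], 'participant_count');
    if (count < 0) {
      throw FormatException('participant_count is negative: $count');
    }
    return ParticipantCountResult(
      organizationId: orgId,
      participantCount: count,
    );
  }
}

void main() {
  // A count delivered as a string is still parsed exactly.
  final raw = '{"organization_id":"org_A","participant_count":"42"}';
  final result = ParticipantCountResult.fromJson(
    jsonDecode(raw) as Map<String, dynamic>,
  );
  print(result.participantCount);

  // A fractional count is rejected instead of being truncated to 41.
  try {
    ParticipantCountResult.fromJson({
      'organization_id': 'org_A',
      'participant_count': 41.5,
    });
  } on FormatException catch (e) {
    print('rejected: ${e.message}');
  }
}
```

The contract test mentioned in the mitigation could feed raw RPC output for the known seed dataset through this factory and compare the parsed values with the seeded counts.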

Contingency: If type mismatches are found in production metrics, expose a validation endpoint in BufdirMetricsRepository that re-fetches and compares raw RPC output against the persisted snapshot, flagging any discrepancies before export proceeds.
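
The comparison step of such a validation endpoint might be as simple as the sketch below. findDiscrepancies is a hypothetical helper and the metric names are invented for the example; it assumes both the fresh RPC output and the persisted snapshot have already been parsed into numeric maps.

```dart
/// Returns one description per metric whose fresh RPC value differs from
/// the persisted snapshot, including metrics present on only one side.
List<String> findDiscrepancies(
  Map<String, num> fresh,
  Map<String, num> snapshot,
) {
  final keys = {...fresh.keys, ...snapshot.keys}.toList()..sort();
  return [
    for (final key in keys)
      if (fresh[key] != snapshot[key])
        '$key: rpc=${fresh[key]} snapshot=${snapshot[key]}',
  ];
}

void main() {
  final issues = findDiscrepancies(
    {'participant_count': 42, 'activity_count': 10},
    {'participant_count': 40, 'activity_count': 10},
  );
  // Export would be blocked while this list is non-empty.
  issues.forEach(print);
}
```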

medium impact, high probability, scope

Persisted metric snapshots can become stale if additional activities are registered after the snapshot is saved but before the export is finalized. Coordinators might unknowingly export data that does not reflect the latest activity registrations.

Mitigation & Contingency

Mitigation: Store a snapshot_generated_at timestamp and a record_count_at_generation field in the snapshot. When the coordinator views cached results, compare the current activity count for the period against the snapshot value and display a 'Data updated since last aggregation — re-run?' warning if counts differ.

Contingency: Add a mandatory staleness check before the export confirmation dialog can proceed: if the snapshot is more than 24 hours old or the record count has changed, require the coordinator to re-run aggregation before the export button is enabled.
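
The staleness rule from the mitigation and contingency can be expressed as a single predicate. This is a sketch under assumptions: MetricSnapshotMeta and requiresReaggregation are hypothetical names, and the fields mirror the proposed snapshot_generated_at and record_count_at_generation columns.

```dart
/// Hypothetical view of the staleness-related fields stored with a snapshot.
class MetricSnapshotMeta {
  const MetricSnapshotMeta({
    required this.snapshotGeneratedAt,
    required this.recordCountAtGeneration,
  });

  final DateTime snapshotGeneratedAt;
  final int recordCountAtGeneration;
}

/// Returns true when the snapshot is older than [maxAge] or the activity
/// count for the period has changed since the snapshot was generated.
bool requiresReaggregation(
  MetricSnapshotMeta snapshot, {
  required int currentRecordCount,
  required DateTime now,
  Duration maxAge = const Duration(hours: 24),
}) {
  final tooOld = now.difference(snapshot.snapshotGeneratedAt) > maxAge;
  final countChanged = currentRecordCount != snapshot.recordCountAtGeneration;
  return tooOld || countChanged;
}

void main() {
  final snapshot = MetricSnapshotMeta(
    snapshotGeneratedAt: DateTime.utc(2024, 1, 1, 12),
    recordCountAtGeneration: 120,
  );

  // Six hours later, same count: the export button may be enabled.
  print(requiresReaggregation(
    snapshot,
    currentRecordCount: 120,
    now: DateTime.utc(2024, 1, 1, 18),
  ));

  // One new activity was registered: aggregation must be re-run.
  print(requiresReaggregation(
    snapshot,
    currentRecordCount: 121,
    now: DateTime.utc(2024, 1, 1, 18),
  ));
}
```

Passing the current time in as a parameter keeps the check deterministic in tests, with no dependence on the wall clock.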