critical priority | medium complexity | testing | pending | testing specialist | Tier 3

Acceptance Criteria

Integration test suite runs against a dedicated test Supabase project with seeded data for at least 3 distinct organizations
After invoking SupabaseRLSTenantConfigurator with org_A, all subsequent queries return ONLY records belonging to org_A
Queries executed before invoking the configurator return an empty result set or throw an unauthorized error — never records from another org
Switching configurator from org_A to org_B causes all subsequent queries to return only org_B records; org_A records are inaccessible
Direct table scans on activities, contacts, user_roles, and user_stories tables all respect tenant scope
Attempting to read a record by explicit ID from another tenant returns null or 403 — no data leakage
Test assertions explicitly verify that cross-tenant IDs seeded in the test database are NOT present in any query result
All tests pass in CI with no flakiness across 3 consecutive runs
Test names clearly document the GDPR scenario being validated (e.g., 'org_B user cannot read org_A activity records')

Technical Requirements

Frameworks: flutter_test, Riverpod, Supabase Flutter SDK
APIs: Supabase REST API, Supabase RLS policies
Data models: organizations, activities, user_roles, contacts, user_stories
Performance requirements:
Each integration test must complete within 5 seconds against the test Supabase project
Test suite total runtime must not exceed 3 minutes
Security requirements:
Test Supabase project must use production-identical RLS policies — no policy relaxation for tests
Test credentials must be stored in environment variables, never hardcoded
Seeded PII in the test database must use obviously fake data (e.g., 'Test User 1') to avoid accidental GDPR exposure
Cross-tenant leakage of any field constitutes a critical test failure — assertions must check full record content, not just IDs
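
One way to satisfy the environment-variable requirement is to read credentials once and fail fast when any are missing. The sketch below is a minimal illustration under stated assumptions: the class name `TestSupabaseCredentials` and the variable names `TEST_SUPABASE_URL`, `TEST_SUPABASE_ANON_KEY`, and `TEST_SUPABASE_SERVICE_ROLE_KEY` are hypothetical, not project conventions. It also assumes the suite runs on a host where `Platform.environment` is available.

```dart
import 'dart:io' show Platform;

/// Hypothetical holder for test-project credentials. The field names and
/// environment variable names are assumptions made for this sketch.
class TestSupabaseCredentials {
  const TestSupabaseCredentials({
    required this.url,
    required this.anonKey,
    required this.serviceRoleKey,
  });

  final String url;
  final String anonKey;
  final String serviceRoleKey;

  /// Reads credentials from [env] (defaults to the process environment) and
  /// throws a [StateError] when any variable is missing or blank.
  factory TestSupabaseCredentials.fromEnvironment([Map<String, String>? env]) {
    final source = env ?? Platform.environment;

    String readRequired(String name) {
      final value = source[name];
      if (value == null || value.trim().isEmpty) {
        throw StateError('Missing required environment variable: $name');
      }
      return value;
    }

    return TestSupabaseCredentials(
      url: readRequired('TEST_SUPABASE_URL'),
      anonKey: readRequired('TEST_SUPABASE_ANON_KEY'),
      serviceRoleKey: readRequired('TEST_SUPABASE_SERVICE_ROLE_KEY'),
    );
  }
}
```

Accepting an optional map keeps the loader checkable without touching the real environment, and the early `StateError` surfaces a misconfigured CI job before any query runs.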

Execution Context

Execution Tier
Tier 3 - 413 tasks

Can start after Tier 2 completes

Implementation Notes

The test Supabase project should mirror production RLS policies exactly; use a migration script or snapshot to keep them in sync. Seed data with the Supabase service-role key (which bypasses RLS) to set up cross-tenant records, then switch to user-scoped JWTs to execute test queries. Use flutter_test's `setUpAll`/`tearDownAll` for seeding and cleanup, and a per-test `setUp` to reset session state. To test scope switching, call `SupabaseRLSTenantConfigurator.configure(orgId)` and assert, then call it again with a different orgId and re-assert within the same test; do not rely on a single configurator call per test file.
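
The configure, assert, reconfigure, re-assert sequence could be factored into a reusable helper. The sketch below is a minimal illustration, not the project's implementation: `checkScopeSwitch`, the function-typed `configure` and `fetchAll` seams, and the `id` and `org_id` column names are all assumptions. In the real suite `configure` would wrap `SupabaseRLSTenantConfigurator.configure(orgId)` and `fetchAll` would query the test Supabase project.

```dart
typedef Row = Map<String, Object?>;

/// Configures each org in [orgIds] in turn and checks that every row visible
/// in [table] belongs to the org configured at that moment.
/// Returns one message per violation; an empty list means no leak was seen.
Future<List<String>> checkScopeSwitch({
  required Future<void> Function(String orgId) configure,
  required Future<List<Row>> Function(String table) fetchAll,
  required String table,
  required List<String> orgIds,
}) async {
  final violations = <String>[];
  for (final orgId in orgIds) {
    await configure(orgId);
    final rows = await fetchAll(table);
    if (rows.isEmpty) {
      // An empty result would make the tenant check pass vacuously.
      violations.add('$table: no rows visible for $orgId (seed missing?)');
    }
    for (final row in rows) {
      if (row['org_id'] != orgId) {
        violations.add(
          '$table: row ${row['id']} of ${row['org_id']} visible to $orgId',
        );
      }
    }
  }
  return violations;
}
```

A test would then pass `['org_A', 'org_B']` and expect the returned list to be empty. Treating an empty result set as a violation guards against a suite that passes only because nothing was seeded.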

Consider extracting a `TestSupabaseFixture` helper class that encapsulates seeding, JWT generation, and teardown to keep individual tests concise. GDPR non-leakage assertions should apply the `expect(result.any((r) => r.orgId != expectedOrgId), isFalse)` pattern to every returned collection, and should also compare full record content against the seeded cross-tenant data rather than checking IDs alone.
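
A non-leakage check that looks at whole records, not only the org ID, might look like the following. This is a sketch under assumptions: `findTenantLeaks` is a hypothetical helper, rows are assumed to arrive as maps with `id` and `org_id` columns, and `foreignMarkers` is assumed to hold strings that appear only in other tenants' seed data.

```dart
/// Returns a description of every cross-tenant leak found in [rows].
/// An empty list means no leak was detected.
List<String> findTenantLeaks(
  Iterable<Map<String, Object?>> rows, {
  required String expectedOrgId,
  required Set<String> foreignIds,
  required Set<String> foreignMarkers,
  String orgIdColumn = 'org_id',
}) {
  final leaks = <String>[];
  for (final row in rows) {
    final id = row['id'];
    // 1. The row must belong to the expected tenant.
    if (row[orgIdColumn] != expectedOrgId) {
      leaks.add('row $id belongs to ${row[orgIdColumn]}');
    }
    // 2. Seeded cross-tenant IDs must never appear in a result.
    if (foreignIds.contains(id)) {
      leaks.add('row $id is a seeded cross-tenant ID');
    }
    // 3. No field may carry content seeded for another tenant.
    for (final entry in row.entries) {
      final value = entry.value;
      if (value is String && foreignMarkers.any((m) => value.contains(m))) {
        leaks.add('row $id column ${entry.key} contains foreign seed data');
      }
    }
  }
  return leaks;
}
```

Inside a flutter_test case this could be asserted as `expect(findTenantLeaks(...), isEmpty)`, so a failure prints which row and column leaked.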

Testing Requirements

Integration tests only — no unit mocks for the Supabase client in this suite. Use a real test Supabase project (separate from staging/production) with deterministic seed data covering at least 3 organizations. Seed script must be idempotent (re-runnable without duplicates). Cover: (1) single-org scope after configurator invocation, (2) scope switch between orgs, (3) explicit cross-tenant ID lookup returning null/error, (4) all primary tables verified for RLS enforcement (activities, contacts, user_roles, user_stories, org_feature_flags).
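
Idempotent seeding usually comes down to deterministic primary keys plus an upsert. The sketch below illustrates the idea with hypothetical names (`buildActivitySeed`, `upsertById`) and an assumed `id`/`org_id`/`title` row shape; the real schema may use UUIDs and different columns. The in-memory map only stands in for the table so the example runs on its own. Against the test project, the same rows would be written with an upsert through a service-role client.

```dart
/// Builds the same rows on every call. IDs are derived from the org and the
/// index rather than generated randomly, so re-running cannot create new rows.
List<Map<String, Object?>> buildActivitySeed({
  List<String> orgIds = const ['org_A', 'org_B', 'org_C'],
  int perOrg = 2,
}) {
  return [
    for (final orgId in orgIds)
      for (var i = 1; i <= perOrg; i++)
        {
          'id': '$orgId-activity-$i',
          'org_id': orgId,
          // Obviously fake content, per the security requirements.
          'title': 'Test Activity $i ($orgId)',
        },
  ];
}

/// In-memory stand-in for an upsert keyed on `id`: existing rows are
/// overwritten, new rows are added, and nothing is duplicated.
void upsertById(
  Map<String, Map<String, Object?>> table,
  List<Map<String, Object?>> rows,
) {
  for (final row in rows) {
    table[row['id'] as String] = row;
  }
}
```

Running the seed twice against the same table leaves six rows across three organizations, which is the property the seed script needs.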

Each test must independently set up its JWT/session state — no shared state between tests. Run the suite in CI on every PR touching authentication, RLS configuration, or database access layers.

Component
Supabase RLS Tenant Scope Configurator
infrastructure medium
Epic Risks (3)
High impact, medium probability, technical

iOS Keychain and Android Keystore have meaningfully different failure modes and permission models. The secure storage plugin may throw platform-specific exceptions (e.g., biometric enrollment required, Keystore wipe after device re-enrolment) that crash higher-level flows if not caught at the adapter boundary.

Mitigation & Contingency

Mitigation: Wrap all storage plugin calls in try/catch at the adapter layer and return a typed StorageResult<T> instead of throwing. Write integration tests that run on device simulators/emulators for both platforms in CI using Fastlane. Document the exception matrix during the spike.

Contingency: If a platform-specific failure cannot be handled gracefully, fall back to in-memory-only storage for the current session and surface a non-blocking warning to the user; log the event for investigation.
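
The typed-result idea from the mitigation can be sketched with a Dart 3 sealed class. Only the name `StorageResult<T>` comes from the text above; `StorageSuccess`, `StorageFailure`, and `guardStorage` are hypothetical names chosen for this illustration, and the actual adapter may classify failures more finely.

```dart
/// Result type returned by the storage adapter instead of throwing.
sealed class StorageResult<T> {
  const StorageResult();
}

final class StorageSuccess<T> extends StorageResult<T> {
  const StorageSuccess(this.value);
  final T value;
}

final class StorageFailure<T> extends StorageResult<T> {
  const StorageFailure(this.error, [this.stackTrace]);
  final Object error;
  final StackTrace? stackTrace;
}

/// Runs [operation] and converts anything it throws into a [StorageFailure],
/// so platform-specific exceptions stop at the adapter boundary.
Future<StorageResult<T>> guardStorage<T>(
  Future<T> Function() operation,
) async {
  try {
    return StorageSuccess<T>(await operation());
  } catch (error, stackTrace) {
    return StorageFailure<T>(error, stackTrace);
  }
}
```

Because the class is sealed, a `switch` over a `StorageResult` must handle both cases, which is where a caller could apply the in-memory fallback described in the contingency.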

High impact, medium probability, integration

Setting a session-level Postgres variable (app.current_org_id) via a Supabase RPC only scopes queries if the RLS policies on every tenant-scoped table reference that variable. If the Supabase project schema has not yet defined these policies, the configurator will set the variable but queries will return unfiltered data, giving a false sense of security.

Mitigation & Contingency

Mitigation: Include a smoke-test RPC in the SupabaseRLSTenantConfigurator that verifies the variable is readable from a policy-scoped query before marking setup as complete. Coordinate with the database migration task to ensure RLS policies reference app.current_org_id before the configurator is shipped.

Contingency: If RLS policies are not in place at integration time, gate all data-fetching components behind a runtime check in SupabaseRLSTenantConfigurator.isRlsScopeVerified(); block data access and surface a developer warning until policies are confirmed.
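
The runtime gate described in the contingency might be shaped as follows. This is a sketch under assumptions, not the configurator's actual design: `RlsScopeGate`, `RlsScopeNotVerifiedError`, and `guarded` are hypothetical names, and the injected `verify` callback stands in for the smoke-test RPC mentioned in the mitigation.

```dart
/// Thrown when data access is attempted before the RLS scope is verified.
class RlsScopeNotVerifiedError extends StateError {
  RlsScopeNotVerifiedError()
      : super('RLS tenant scope has not been verified; data access is blocked.');
}

/// Hypothetical gate that blocks data fetches until a smoke test confirms
/// that RLS policies honour the configured org.
class RlsScopeGate {
  RlsScopeGate(this._verify);

  /// Stand-in for the smoke-test RPC; returns true when scope is enforced.
  final Future<bool> Function(String orgId) _verify;

  bool _verified = false;

  bool get isRlsScopeVerified => _verified;

  Future<void> configure(String orgId) async {
    // Never carry verification over from a previously configured org.
    _verified = false;
    _verified = await _verify(orgId);
  }

  /// Runs [fetch] only when the scope has been verified.
  Future<T> guarded<T>(Future<T> Function() fetch) async {
    if (!_verified) throw RlsScopeNotVerifiedError();
    return fetch();
  }
}
```

Resetting the flag at the start of `configure` means a failed verification after an org switch leaves the gate closed, not open from the previous org.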

Medium impact, medium probability, technical

Fetching feature flags from Supabase on every cold start adds network latency before the first branded screen renders. On slow connections this may cause a perceptible blank-screen gap or cause the app to render with default (unflagged) state before flags arrive.

Mitigation & Contingency

Mitigation: Persist the last-known flag set to disk in the FeatureFlagProvider and serve stale-while-revalidate on startup. Gate flag refresh behind a configurable TTL (default 15 minutes) so network calls are not made on every launch.

Contingency: If stale flags cause a feature to appear that should be hidden, add a post-load re-evaluation pass that reconciles the live flag set with the rendered widget tree and triggers a targeted rebuild where needed.
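
The stale-while-revalidate behaviour with a TTL can be sketched in a few lines. The names `CachedFlags`, `shouldRefresh`, and `loadFlags` are hypothetical, flags are assumed to be simple booleans, and disk persistence is abstracted behind a `persist` callback; how FeatureFlagProvider actually stores flags is not specified here.

```dart
import 'dart:async' show unawaited;

/// Last-known flag set together with the time it was fetched.
class CachedFlags {
  const CachedFlags(this.flags, this.fetchedAt);
  final Map<String, bool> flags;
  final DateTime fetchedAt;
}

/// True when there is no cached set or it is at least [ttl] old.
bool shouldRefresh(
  CachedFlags? cached,
  DateTime now, {
  Duration ttl = const Duration(minutes: 15),
}) {
  if (cached == null) return true;
  return now.difference(cached.fetchedAt) >= ttl;
}

/// Serves cached flags immediately and refreshes in the background when the
/// cache is stale. Only a first launch with no cache waits for the network.
Future<Map<String, bool>> loadFlags({
  required CachedFlags? cached,
  required Future<Map<String, bool>> Function() fetchRemote,
  required void Function(CachedFlags fresh) persist,
  required DateTime now,
  Duration ttl = const Duration(minutes: 15),
}) async {
  if (cached == null) {
    final flags = await fetchRemote();
    persist(CachedFlags(flags, now));
    return flags;
  }
  if (shouldRefresh(cached, now, ttl: ttl)) {
    Future<void> refresh() async {
      try {
        persist(CachedFlags(await fetchRemote(), now));
      } catch (_) {
        // Keep serving the stale set if the background refresh fails.
      }
    }

    unawaited(refresh());
  }
  return cached.flags;
}
```

Passing `now` in explicitly keeps the TTL decision deterministic in tests. The refreshed set only reaches the UI on a later read, which is the gap the re-evaluation pass in the contingency is meant to close.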