high priority | medium complexity | testing | pending | testing specialist | Tier 4

Acceptance Criteria

Version comparison tests cover all three branches: appVersion == minAppVersion (enabled), appVersion < minAppVersion (disabled), appVersion > minAppVersion (enabled)
Activation date boundary tests verify: flag disabled when now < activationDate, enabled when now == activationDate (inclusive), enabled when now > activationDate
Percentage rollout tests confirm determinism: same orgId always produces same enabled/disabled result across 1000 repeated calls with no randomness drift
Percentage rollout distribution test validates that 1000 distinct orgIds produce an enabled share within ±5 percentage points of the configured rollout percentage
FeatureFlagRepository cache-hit test verifies no Supabase call is made when a valid cached entry exists within TTL window
FeatureFlagRepository cache-miss test verifies Supabase is called and result is stored in cache when no cached entry exists or TTL is expired
RLS isolation test uses a mock Supabase client that returns rows belonging to a different orgId and asserts the repository returns an empty/default result
Offline fallback test simulates network exception from Supabase mock and asserts the repository returns last-cached state without throwing
Offline fallback test with empty cache asserts safe defaults (all flags disabled) are returned without throwing
All test files follow flutter_test conventions and run in under 2 seconds total
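
The time-dependent criteria above (inclusive activation date, cache hit and miss around the TTL) are easiest to test when the code never reads the wall clock itself. The following is a minimal sketch under that assumption; `Clock`, `isActivated`, and `TtlCache` are hypothetical names, and treating an entry as expired exactly at the TTL is an assumption the real repository may not share.

```dart
/// Hypothetical clock signature: returns the current instant.
typedef Clock = DateTime Function();

/// Inclusive activation check: enabled when now >= activationDate.
bool isActivated(DateTime now, DateTime activationDate) =>
    !now.isBefore(activationDate);

/// Minimal in-memory TTL cache; all names here are illustrative.
class TtlCache<V> {
  TtlCache(this.ttl, this.clock);

  final Duration ttl;
  final Clock clock;
  final Map<String, _Entry<V>> _entries = {};

  /// Stores [value] and stamps it with the injected clock's current time.
  void put(String key, V value) {
    _entries[key] = _Entry(value, clock());
  }

  /// Returns the cached value, or null on a miss or an expired entry.
  V? get(String key) {
    final entry = _entries[key];
    if (entry == null) return null;
    final age = clock().difference(entry.storedAt);
    // Assumption: an entry counts as expired exactly at the TTL boundary.
    if (age >= ttl) return null;
    return entry.value;
  }
}

class _Entry<V> {
  _Entry(this.value, this.storedAt);
  final V value;
  final DateTime storedAt;
}
```

A test can pass a closure over a mutable `DateTime` as the clock and advance it by the TTL to cross the boundary without any real delay.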

Technical Requirements

frameworks
flutter_test
mockito or mocktail for Dart mocking
apis
Supabase client mock interface
data models
activity_type
performance requirements
Full test suite completes in under 2 seconds
No real network calls — all Supabase interactions mocked
security requirements
Tests must verify that cross-org data cannot leak through the repository layer
Mock Supabase client must simulate RLS rejection scenarios, not just empty result sets
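
One way to cover both foreign rows and outright rejection without a network is a hand-written fake behind a narrow interface. This is only a sketch: `FlagRow`, `FlagSource`, `RlsDeniedException`, `FakeFlagSource`, and the repository shape shown here are all hypothetical stand-ins, not the real Supabase client API or the project's actual classes. With mocktail, the same behavior would be stubbed on a mock instead.

```dart
/// Hypothetical row shape; field names are assumptions, not the real schema.
class FlagRow {
  const FlagRow(this.organizationId, this.key, this.enabled);
  final String organizationId;
  final String key;
  final bool enabled;
}

/// Hypothetical seam the repository depends on instead of the Supabase client.
abstract class FlagSource {
  Future<List<FlagRow>> fetch(String orgId);
}

/// Stand-in error for a policy rejection (not a Supabase type).
class RlsDeniedException implements Exception {}

/// Hand-written fake: throws [error] if set, otherwise returns [rows].
class FakeFlagSource implements FlagSource {
  FakeFlagSource({this.rows = const [], this.error});
  List<FlagRow> rows;
  Object? error;
  int calls = 0;

  @override
  Future<List<FlagRow>> fetch(String orgId) async {
    calls++;
    final e = error;
    if (e != null) throw e;
    return rows;
  }
}

class FeatureFlagRepository {
  FeatureFlagRepository(this._source);
  final FlagSource _source;
  final Map<String, Map<String, bool>> _lastGood = {};

  /// Never throws: falls back to the last good state, else all-disabled ({}).
  Future<Map<String, bool>> flagsFor(String orgId) async {
    try {
      final rows = await _source.fetch(orgId);
      // Application-layer guard: drop rows owned by another organization.
      final flags = {
        for (final r in rows)
          if (r.organizationId == orgId) r.key: r.enabled,
      };
      _lastGood[orgId] = flags;
      return flags;
    } catch (_) {
      // Rejection or network failure: serve the last cached state if any.
      return _lastGood[orgId] ?? const {};
    }
  }
}
```

Setting `error` on the fake after one successful fetch exercises the last-cached fallback. Constructing it with `error` already set exercises the empty-cache default.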

Execution Context

Execution Tier
Tier 4 - 323 tasks

Can start after Tier 3 completes

Implementation Notes

For version comparison, split the semantic version on '.' and compare the segments as integer tuples. Do not use lexicographic string comparison: '10' < '9' as strings, so '2.10.0' would sort before '2.9.0'.

For percentage rollout determinism, use a stable hash of orgId modulo 100, e.g., FNV-1a or CRC32. Note that dart:convert does not provide CRC32, so it would have to come from a package or a hand-written implementation. String.hashCode is not documented as stable across platforms or SDK versions, so it should not be used for this. Document the chosen algorithm in a comment, since changing it would invalidate existing rollout assignments.

For cache TTL boundary testing, inject a Clock abstraction into the repository rather than calling DateTime.now() directly, so tests can advance time without real delays.

When mocking Supabase for RLS tests, have the mock return rows with a different organization_id and assert the repository filters them out. This tests the application-layer guard that complements database-level RLS.
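
As an illustration of the stable-hash note, here is a minimal sketch of 32-bit FNV-1a bucketing. The function names (`fnv1a32`, `rolloutBucket`, `isInRollout`) are hypothetical, and treating a bucket as enabled when it is below the percentage is one possible convention, not necessarily the one the evaluator uses.

```dart
import 'dart:convert';

/// 32-bit FNV-1a over the UTF-8 bytes of [input].
/// Assumes native 64-bit ints (Dart VM / AOT); on the web the multiplication
/// can exceed 2^53 and lose precision, so it would need to be done differently.
int fnv1a32(String input) {
  var hash = 0x811c9dc5; // FNV offset basis
  for (final byte in utf8.encode(input)) {
    hash ^= byte;
    hash = (hash * 0x01000193) & 0xffffffff; // FNV prime, masked to 32 bits
  }
  return hash;
}

/// Maps an organization to a stable bucket in 0..99.
/// Changing the hash would reshuffle every existing rollout assignment.
int rolloutBucket(String orgId) => fnv1a32(orgId) % 100;

/// Enabled when the bucket falls below the configured percentage (0..100).
bool isInRollout(String orgId, int percentage) =>
    rolloutBucket(orgId) < percentage;
```

Because the function is pure, the determinism criterion reduces to calling `isInRollout` repeatedly with the same orgId and comparing against the first result.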

Testing Requirements

Unit tests only; no integration or widget tests in this task. Use flutter_test with mocktail (preferred over mockito because it works under null safety without code generation). Organize tests into three top-level `group()` blocks: RolloutEvaluatorTest, FeatureFlagRepositoryCacheTest, FeatureFlagRepositoryIsolationTest. Each group must have a `setUp` that resets shared state.

Aim for 100% branch coverage on RolloutEvaluator and 90%+ line coverage on FeatureFlagRepository. Run with `flutter test --coverage` and verify coverage report before marking complete.

Component
Rollout Condition Evaluator
service low
Epic Risks (3)
high impact | medium probability | security

Supabase RLS policies for organization_configs may have gaps that allow cross-organization reads if the JWT claim for organization_id is absent or malformed, leading to data leakage between tenants.

Mitigation & Contingency

Mitigation: Implement RLS policies using auth.uid() joined against a memberships table to derive organization_id rather than trusting a client-supplied claim. Write integration tests that simulate a cross-org read attempt and assert it returns zero rows.

Contingency: If a gap is discovered post-launch, immediately disable the affected RLS policy, roll back the migration, and re-implement with a parameterized policy tested against all organization fixture data.

medium impact | medium probability | technical

Dart has no built-in semantic version comparison. A naive string comparison (e.g., '2.10.0' sorts before '2.9.0' lexicographically) would cause the rollout evaluator to produce incorrect eligibility results for organizations on different app versions.

Mitigation & Contingency

Mitigation: Use the pub.dev `pub_semver` package or implement a proper three-segment integer comparison. Add parameterized unit tests covering 20+ version pairs including double-digit minor/patch segments.
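
If the hand-written route is chosen, a three-segment integer comparison might look like the sketch below. The function names are hypothetical, and the sketch assumes plain `major.minor.patch` strings with no pre-release or build suffixes, which `pub_semver` would handle.

```dart
/// Parses 'major.minor.patch' into three integers.
/// Assumes plain numeric segments; suffixes such as '-beta' are not handled.
List<int> parseVersion(String version) {
  final parts = version.split('.');
  if (parts.length != 3) {
    throw FormatException('Expected major.minor.patch', version);
  }
  return parts.map(int.parse).toList();
}

/// Negative, zero, or positive, in the style of Comparable.compareTo.
int compareVersions(String a, String b) {
  final left = parseVersion(a);
  final right = parseVersion(b);
  for (var i = 0; i < 3; i++) {
    final diff = left[i].compareTo(right[i]);
    if (diff != 0) return diff;
  }
  return 0;
}

/// True when appVersion >= minAppVersion (equal versions are eligible).
bool meetsMinVersion(String appVersion, String minAppVersion) =>
    compareVersions(appVersion, minAppVersion) >= 0;
```

The parameterized tests can then iterate over a list of (appVersion, minAppVersion, expected) triples, including pairs such as '2.10.0' vs '2.9.0' where string comparison gives the wrong answer.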

Contingency: If incorrect comparison is discovered in production, push a hotfix with corrected comparison logic and temporarily disable phase-gated flags until all affected organizations have updated to the corrected version.

medium impact | low probability | technical

A persistent local cache written to shared_preferences or Hive could become corrupted or deserialize incorrectly after an app update changes the FeatureFlag schema, causing startup crashes or all flags defaulting to disabled.

Mitigation & Contingency

Mitigation: Wrap all cache reads in try/catch with explicit fallback to the all-disabled default map. Version the cache key (e.g., `feature_flags_v2_{orgId}`) so schema changes automatically invalidate old entries.
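
A guarded read along these lines could be sketched as follows. It assumes the cache stores a JSON object mapping flag names to booleans, which the ticket does not specify. `cacheKey` and `decodeCachedFlags` are hypothetical helpers, and the raw string is passed in so the sketch stays independent of shared_preferences or Hive.

```dart
import 'dart:convert';

/// Hypothetical key builder following the `feature_flags_v2_{orgId}` example.
String cacheKey(String orgId, {int schemaVersion = 2}) =>
    'feature_flags_v${schemaVersion}_$orgId';

/// Decodes a cached JSON object of flag name -> bool.
/// Missing, malformed, or unexpected payloads yield the all-disabled
/// default (an empty map); non-boolean values are dropped.
Map<String, bool> decodeCachedFlags(String? raw) {
  if (raw == null) return const {};
  try {
    final decoded = jsonDecode(raw);
    if (decoded is! Map<String, dynamic>) return const {};
    return {
      for (final entry in decoded.entries)
        if (entry.value is bool) entry.key: entry.value as bool,
    };
  } on FormatException {
    // jsonDecode signals invalid JSON with FormatException.
    return const {};
  }
}
```

Bumping `schemaVersion` changes the key, so entries written under the old schema are simply never read again.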

Contingency: If cache corruption is detected in a release, publish an app update that clears the versioned cache key on first launch and re-fetches from Supabase.