Priority: high · Complexity: high · Category: testing · Status: pending · Assignee: testing specialist · Tier 6

Acceptance Criteria

Test dataset: 1,400 chapter nodes, 9 regions, 1 national node, seeded into the Supabase test instance using a generated fixture script
Test: full national rollup (all 1,400 chapters) completes in under 2,000ms p95 measured over 10 consecutive cold-start runs
Test: cached national rollup responds in under 500ms p95 measured over 10 consecutive warm runs
Test: cache hit rate exceeds 80% under a simulated read-heavy workload (100 sequential reads, 10 writes randomly interspersed)
Test: Bufdir breakdown output for a known 50-chapter subset matches hand-computed reference values to exact integer counts (zero tolerance for counting errors)
Test: participant deduplication is verified — a contact participating in 3 chapters under the same region is counted once in the regional total and once in the national total
Test: concurrent stress — 20 simultaneous activity write requests trigger cache invalidation without race conditions; final aggregation values are consistent with all 20 activities reflected
Test: after stress run, a final rollup query returns correct totals matching the sum of all inserted activities
Test: rolling window filter (last 30 days, last 90 days, full year) returns correct subsets with matching counts against direct SQL count queries
Test: each performance test logs its measured latency to a structured JSON output file for CI trend tracking
All tests pass on the Supabase test instance with the same schema version as production
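
The cache hit-rate criterion can be sanity-checked with a small model of the workload before running it against the real cache. The sketch below (TypeScript, illustrative only — the real test exercises the aggregation cache itself) assumes write-through invalidation: each interspersed write invalidates the cache, the next read is a miss that repopulates it, and every other read hits.

```typescript
// Illustrative model of the read-heavy workload criterion: 100 sequential
// reads with 10 writes randomly interspersed. Each write invalidates the
// cache; the next read is a miss that repopulates it.
function simulateHitRate(reads: number, writes: number, seed = 42): number {
  // Deterministic pseudo-random generator so the simulation is repeatable.
  let state = seed;
  const rand = () => {
    state = (state * 1103515245 + 12345) % 2 ** 31;
    return state / 2 ** 31;
  };

  // Pick the read indices after which a write (invalidation) occurs.
  const writePositions = new Set<number>();
  while (writePositions.size < writes) {
    writePositions.add(Math.floor(rand() * reads));
  }

  let cacheWarm = false;
  let hits = 0;
  for (let i = 0; i < reads; i++) {
    if (cacheWarm) hits++;
    cacheWarm = true; // the read (hit or miss) leaves the cache populated
    if (writePositions.has(i)) cacheWarm = false; // write invalidates
  }
  return hits / reads;
}
```

Under this model at most 11 of 100 reads can miss (one cold read plus up to 10 invalidations), so the >80% threshold is achievable with headroom; a measured rate below it would indicate over-invalidation.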

Technical Requirements

Frameworks
flutter_test
Supabase PostgreSQL 15
Supabase Edge Functions (Deno)
APIs
Internal: HierarchyAggregationService rollup API (task-012 endpoint)
Supabase PostgREST: bulk INSERT for activity fixture seeding
Supabase PostgREST: direct COUNT queries for reference value computation
Internal: AggregationCacheInvalidationService (task-013)
Data Models
activity
annual_summary
bufdir_column_schema
bufdir_export_audit_log
contact
Performance Requirements
Full national rollup: under 2,000ms p95 (cold cache)
Cached rollup: under 500ms p95 (warm cache)
Cache hit rate: >80% under simulated read-heavy workload
Stress test: 20 concurrent writes must all be reflected in final aggregation with zero data loss
Security Requirements
Performance tests run against a dedicated test Supabase instance — never production
Test fixture data uses synthetic non-personal UUIDs — no real participant data
Service role key used only for fixture seeding; all aggregation API calls use coordinator JWT to test realistic access patterns
Performance test results logged locally only — not transmitted externally

Execution Context

Execution Tier
Tier 6 (158 tasks)

Can start after Tier 5 completes

Implementation Notes

The fixture generator should create a realistic NHF-shaped hierarchy: 1 national node → 9 regional nodes → ~155 chapters per region (1,400 total). Distribute activities using a power-law distribution (most chapters have 1-5 activities, a few have 50+) to simulate real usage. For the deduplication correctness test, manually construct a small 5-chapter scenario where 3 contacts participate in overlapping chapters and verify the expected deduplicated count by hand before encoding it as the expected value. For the concurrent write stress test, ensure each Future in Future.wait() uses an independent Supabase client connection to avoid shared state.
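
The power-law distribution described above can be produced by inverse-transform sampling. A minimal sketch (TypeScript for illustration; the real FixtureGenerator is Dart, and all names here are hypothetical):

```typescript
// Generate an NHF-shaped fixture: chapters distributed round-robin across
// 9 regions, with Pareto-like activity counts (most chapters 1-5
// activities, a long tail with 50+).
interface ChapterFixture {
  regionIndex: number;
  chapterId: string;
  activityCount: number;
}

function generateFixture(totalChapters = 1400, regions = 9): ChapterFixture[] {
  let state = 7; // deterministic PRNG so fixtures are reproducible
  const rand = () => {
    state = (state * 48271) % 2147483647;
    return state / 2147483647;
  };

  const chapters: ChapterFixture[] = [];
  for (let i = 0; i < totalChapters; i++) {
    // Inverse-transform sampling of a heavy-tailed distribution:
    // roughly 3/4 of chapters get <=5 activities, a few percent get 50+.
    const u = rand();
    const activityCount = Math.min(
      200,
      Math.max(1, Math.floor(1 / Math.pow(1 - u, 1.2))),
    );
    chapters.push({
      regionIndex: i % regions, // round-robin: ~155 chapters per region
      chapterId: `chapter-${i}`,
      activityCount,
    });
  }
  return chapters;
}
```

Seeding the PRNG keeps runs comparable across CI executions, which matters once latency trends are tracked over time.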

The cache invalidation correctness check must run after all 20 writes are confirmed complete (await Future.wait completion) before calling the rollup API — this prevents a race between the last write and the verification read. Consider adding a 100ms settle delay after Future.wait to allow async Realtime invalidation propagation before reading cache state.
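
The ordering described above (all writes complete → settle delay → verification read) can be sketched as follows. This is TypeScript for illustration — the Dart equivalent is `await Future.wait(...)` followed by `await Future.delayed(const Duration(milliseconds: 100))` — and `insertActivity`/`fetchRollup` are stand-ins for the real client calls:

```typescript
// Stress-test ordering: fire 20 concurrent writes, await all of them,
// allow a short settle delay for async invalidation propagation, then
// perform the verification read.
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function stressThenVerify(
  insertActivity: (i: number) => Promise<void>,
  fetchRollup: () => Promise<number>,
  writeCount = 20,
): Promise<number> {
  // All writes must be confirmed complete before verification — this
  // prevents a race between the last write and the verification read.
  await Promise.all(
    Array.from({ length: writeCount }, (_, i) => insertActivity(i)),
  );
  await sleep(100); // settle delay for Realtime invalidation propagation
  return fetchRollup();
}
```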

Testing Requirements

Test file: integration_test/hierarchy_aggregation_performance_test.dart. Use a FixtureGenerator class to programmatically generate a 1,400-chapter hierarchy tree with random activity distributions and seed it into the test instance in setUpAll. Measure latency with a Stopwatch around each aggregation call. For the p95 figure, run each scenario 10 times, sort the latencies ascending, discard the single worst run, and take the second-highest value (the 9th of 10) as the p95 proxy.
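
The p95 proxy and p50 computations are small enough to state exactly (TypeScript for illustration; the in-test versions would be Dart):

```typescript
// With only 10 samples, a practical p95 proxy is the second-highest
// latency: sort ascending and discard the single worst run.
function p95Proxy(latenciesMs: number[]): number {
  if (latenciesMs.length < 2) throw new Error("need at least 2 samples");
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return sorted[sorted.length - 2]; // second-highest value
}

// Upper median as p50 for an even sample count.
function p50(latenciesMs: number[]): number {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}
```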

For concurrency stress test: use Future.wait(List.generate(20, (i) => insertActivity())) then verify final rollup. For correctness test: pre-compute a reference value by running a direct SQL COUNT(DISTINCT contact_id) query against the test instance for the known subset, then compare to the aggregation API output and assert equality. Export latency measurements to test_results/perf_results.json with timestamp, scenario name, p50, p95 fields for CI integration. Mark any test exceeding its latency bound as a test failure (not just a warning).
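
A possible shape for one line of test_results/perf_results.json, matching the fields named above (timestamp, scenario name, p50, p95); the field names are assumptions, not an agreed schema:

```typescript
// Hypothetical record shape for the CI latency trend log.
interface PerfRecord {
  timestamp: string; // ISO-8601
  scenario: string;
  p50_ms: number;
  p95_ms: number;
}

function toPerfLine(
  scenario: string,
  p50Ms: number,
  p95Ms: number,
  now: Date,
): string {
  const record: PerfRecord = {
    timestamp: now.toISOString(),
    scenario,
    p50_ms: p50Ms,
    p95_ms: p95Ms,
  };
  return JSON.stringify(record);
}
```

One JSON object per scenario keeps the file trivially parseable by whatever CI step plots the trend.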

Component
Hierarchy Aggregation Service
service · high
Epic Risks (3)
High impact · medium probability · technical

Recursive aggregation queries across three hierarchy levels (national → region → chapter) with 1,400 leaf nodes may be too slow for real-time dashboard requests, exceeding the 200ms target and causing spinner timeouts.

Mitigation & Contingency

Mitigation: Implement aggregation as a Supabase RPC using a single recursive CTE rather than multiple round-trip queries. Pre-compute aggregations nightly via a scheduled Edge Function and cache results. For real-time needs, aggregate only the immediate subtree on demand.

Contingency: Surface a 'Refreshing...' indicator and serve stale cached aggregations immediately. Queue an async recalculation and push updated data via Supabase Realtime when ready, avoiding blocking the admin dashboard.
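
The stale-serving contingency amounts to a stale-while-revalidate read path. A minimal sketch (TypeScript; names are illustrative, not the real service API — the real refresh notification would arrive via Supabase Realtime rather than an in-process promise):

```typescript
// Serve the cached aggregation immediately with a stale flag (the UI
// shows 'Refreshing...'), and trigger one background recalculation.
class StaleWhileRefreshCache {
  private cached: number | undefined;
  private refreshing = false;

  constructor(private recompute: () => Promise<number>) {}

  async read(): Promise<{ value: number; stale: boolean }> {
    const current = this.cached;
    if (current === undefined) {
      // Cold start: nothing to serve, so block on the first computation.
      const fresh = await this.recompute();
      this.cached = fresh;
      return { value: fresh, stale: false };
    }
    if (!this.refreshing) {
      this.refreshing = true;
      // Serve the stale value now; refresh in the background.
      this.recompute().then((v) => {
        this.cached = v;
        this.refreshing = false;
      });
    }
    return { value: current, stale: true };
  }
}
```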

Medium impact · medium probability · scope

The 5-chapter limit and primary-assignment constraint are NHF-specific. Applying these rules globally may break HLF and Blindeforbundet configurations where different limits apply, requiring per-organization configuration that was not initially scoped.

Mitigation & Contingency

Mitigation: Make the maximum assignment count a configurable value stored in the organization's feature-flag or settings table rather than a hardcoded constant. Design the assignment service to read this limit at runtime per organization.

Contingency: Default the limit to a high value (e.g., 100) for organizations other than NHF, effectively making it non-restrictive, while keeping the enforcement logic intact for when per-org configuration is fully implemented.
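
The mitigation and contingency combine into one lookup: read the per-organization limit at runtime and fall back to a non-restrictive default. A sketch (TypeScript; the settings source and field name are assumptions):

```typescript
// Per-organization assignment limit with a non-restrictive default for
// organizations (e.g. HLF, Blindeforbundet) that have not configured one.
const DEFAULT_MAX_ASSIGNMENTS = 100;

function maxAssignmentsFor(
  orgSettings: Map<string, { maxChapterAssignments?: number }>,
  orgId: string,
): number {
  return orgSettings.get(orgId)?.maxChapterAssignments ?? DEFAULT_MAX_ASSIGNMENTS;
}

// Enforcement logic stays intact regardless of which limit applies.
function canAssign(currentCount: number, limit: number): boolean {
  return currentCount < limit;
}
```

Keeping the enforcement path identical for all organizations means enabling a strict limit later is purely a settings change, not a code change.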

Medium impact · low probability · technical

The searchable parent dropdown in HierarchyNodeEditor must search across up to 1,400 units efficiently. Client-side filtering of the full hierarchy may be slow; server-side search adds complexity and latency.

Mitigation & Contingency

Mitigation: Use the in-memory hierarchy cache as the search corpus — since the cache already holds the flat unit list, client-side filtering with a debounced input is sufficient and avoids extra Supabase calls. Pre-build a search index on cache load.
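
The mitigation above is essentially a pre-lowercased index plus debounced substring filtering. A sketch (TypeScript for illustration; names are hypothetical, and the Flutter widget would wire the debounce into the text field's onChanged):

```typescript
// Client-side search over the cached flat unit list (~1,400 entries).
interface UnitEntry { id: string; name: string; }

// Build the search index once, at cache load, so each keystroke only
// does a substring scan over pre-lowercased keys.
function buildIndex(units: UnitEntry[]): { unit: UnitEntry; key: string }[] {
  return units.map((unit) => ({ unit, key: unit.name.toLowerCase() }));
}

function searchUnits(
  index: { unit: UnitEntry; key: string }[],
  query: string,
): UnitEntry[] {
  const q = query.toLowerCase();
  return index.filter((e) => e.key.includes(q)).map((e) => e.unit);
}

// Debounce so only the final keystroke in a burst triggers filtering.
function debounce<A extends unknown[]>(fn: (...args: A) => void, ms: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}
```

A linear scan over 1,400 pre-lowercased strings is well under a millisecond, so no fancier index structure should be needed at this corpus size.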

Contingency: Cap the dropdown to showing the 50 most recently accessed units by default, with a 'search all' option that triggers a server-side full-text query. This keeps the common case fast while supporting edge cases.