critical priority high complexity testing pending testing specialist Tier 8

Acceptance Criteria

End-to-end test suite covers the full BufdirAggregationService pipeline from raw data ingestion to final Bufdir-compliant payload generation
Multi-org isolation is verified: seeded data from Org A must never appear in aggregation results for Org B across all pipeline stages
Proxy deduplication test: a participant appearing under multiple proxies within the same org is counted exactly once in the final participant count
Multi-chapter deduplication test: a participant registered in multiple chapters of the same org is deduplicated correctly at the aggregation layer
Geographic deduplication test: participants spanning multiple geographic zones are deduplicated and their geographic distribution is accurately computed
Category mapping correctness: all activity categories are mapped to their corresponding Bufdir category codes without loss or misclassification
Historical re-aggregation idempotency: running the full pipeline twice over the same historical dataset produces byte-identical output payloads
Output payload schema validation: every generated payload passes JSON schema validation against the official Bufdir submission format specification
Tests use realistic seeded data representative of production volumes (at minimum 3 orgs, 100+ activities, 50+ participants, 5+ geographic zones)
All tests pass in CI without touching a production or staging Supabase instance, by using the local Supabase emulator or a dedicated test project
Test execution time for the full suite is under 5 minutes
Test failures produce human-readable diagnostic output identifying which pipeline stage produced incorrect results

Technical Requirements

frameworks
flutter_test
Supabase local emulator or test project for integration layer
BLoC test utilities for state verification
apis
BufdirAggregationService internal API
AggregationQueryBuilder RPC wrappers
BufdirMetricsRepository interface
Supabase RPC functions (activity_counts, contact_counts, event_counts, geographic_distribution, participant_deduplication)
data models
ActivityRecord
EventRecord
ContactRecord
MetricSnapshot
GeographicDistributionResult
ParticipantCount
BufdirSubmissionPayload
performance requirements
Full integration test suite completes in under 5 minutes
Each individual test case completes in under 30 seconds
Seeded dataset must cover at minimum 100 activities and 50 participants to be representative
security requirements
Test credentials and Supabase keys must not be committed to source control — use environment variables or a local emulator with no real data
Test database must be isolated from production and staging environments
Seeded test data must not contain real personal information — use synthetic names, postcodes, and identifiers

Execution Context

Execution Tier
Tier 8 - 48 tasks

Can start after Tier 7 completes

Implementation Notes

Structure tests in a dedicated `integration_test/bufdir/` directory, separate from unit tests, to enable selective CI execution. Use a shared test fixture class (e.g., `BufdirTestFixtures`) that seeds the Supabase emulator with known datasets before each test group and tears them down afterwards. Design seed data as Dart constants so test assertions can reference exact expected values rather than computed ranges. For idempotency testing, capture the MetricSnapshot output after the first run and compare it field by field with the output of the second run using a custom Matcher; because the acceptance criteria require byte-identical payloads, also compare the serialized payload bytes.
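The idempotency comparison described above could be backed by a small helper like the following. This is a minimal sketch using only `dart:convert`; the names `canonicalize`, `payloadBytes`, and `diffPaths` are hypothetical and not part of the existing codebase. It assumes snapshots and payloads are available as JSON-like maps. Note that `canonicalize` sorts map keys, so the byte comparison becomes insensitive to key order; if the service is expected to emit a stable key order itself, compare the raw `jsonEncode` output instead.

```dart
import 'dart:convert';

/// Rebuilds [value] with map keys in sorted order so that jsonEncode
/// produces the same string regardless of insertion order.
Object? canonicalize(Object? value) {
  if (value is Map) {
    final keys = value.keys.map((k) => k.toString()).toList()..sort();
    return {for (final k in keys) k: canonicalize(value[k])};
  }
  if (value is List) {
    return value.map(canonicalize).toList();
  }
  return value;
}

/// UTF-8 bytes of the canonical JSON form, for byte-level comparison.
List<int> payloadBytes(Object? payload) =>
    utf8.encode(jsonEncode(canonicalize(payload)));

/// Returns the JSON paths at which [first] and [second] differ.
/// An empty list means the two values are deeply equal.
List<String> diffPaths(Object? first, Object? second, [String path = r'$']) {
  if (first is Map && second is Map) {
    final keys = <Object?>{...first.keys, ...second.keys};
    return [
      for (final key in keys)
        if (!first.containsKey(key) || !second.containsKey(key))
          '$path.$key'
        else
          ...diffPaths(first[key], second[key], '$path.$key'),
    ];
  }
  if (first is List && second is List) {
    if (first.length != second.length) return ['$path.length'];
    return [
      for (var i = 0; i < first.length; i++)
        ...diffPaths(first[i], second[i], '$path[$i]'),
    ];
  }
  return first == second ? const [] : [path];
}
```

A custom Matcher can wrap `diffPaths` and report the returned paths in its mismatch description, which also helps satisfy the requirement that failures identify where the output diverged.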

For schema conformance, encode the official Bufdir JSON schema as a Dart map literal or load it from a test asset file, then validate each payload field. Avoid `sleep()` or time-based waits — use `await` with explicit completion signals from the service. Tag all tests with `@Tags(['integration', 'bufdir'])` so they can be excluded from fast unit test runs in CI. This task is the final quality gate for the entire aggregation epic; if any test fails, the pipeline is not production-ready.
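As an illustration of the map-literal approach to schema conformance, the sketch below validates a payload against a small subset of JSON Schema keywords (`type`, `required`, `properties`, `items`). It is not a full JSON Schema implementation, and `exampleSchema` is a hypothetical placeholder: the real field names and structure must come from the official Bufdir submission format specification.

```dart
/// Hypothetical placeholder schema, NOT the official Bufdir schema.
const Map<String, Object?> exampleSchema = {
  'type': 'object',
  'required': ['org_id', 'participant_count', 'categories'],
  'properties': {
    'org_id': {'type': 'string'},
    'participant_count': {'type': 'integer'},
    'categories': {
      'type': 'array',
      'items': {
        'type': 'object',
        'required': ['code', 'count'],
        'properties': {
          'code': {'type': 'string'},
          'count': {'type': 'integer'},
        },
      },
    },
  },
};

bool _matchesType(Object? value, String type) {
  switch (type) {
    case 'object':
      return value is Map;
    case 'array':
      return value is List;
    case 'string':
      return value is String;
    case 'integer':
      return value is int;
    case 'number':
      return value is num;
    case 'boolean':
      return value is bool;
    default:
      return false;
  }
}

/// Returns one message per violation; an empty list means the value conforms
/// to the supported subset of keywords in [schema].
List<String> validate(Object? value, Map<String, Object?> schema,
    [String path = r'$']) {
  final type = schema['type'] as String?;
  if (type != null && !_matchesType(value, type)) {
    return ['$path: expected $type'];
  }
  final errors = <String>[];
  if (value is Map) {
    final required = (schema['required'] as List?) ?? const [];
    for (final key in required) {
      if (!value.containsKey(key)) {
        errors.add('$path.$key: missing required field');
      }
    }
    final properties = (schema['properties'] as Map?) ?? const {};
    for (final entry in properties.entries) {
      if (value.containsKey(entry.key)) {
        final sub = (entry.value as Map).cast<String, Object?>();
        errors.addAll(validate(value[entry.key], sub, '$path.${entry.key}'));
      }
    }
  }
  if (value is List && schema['items'] is Map) {
    final items = (schema['items'] as Map).cast<String, Object?>();
    for (var i = 0; i < value.length; i++) {
      errors.addAll(validate(value[i], items, '$path[$i]'));
    }
  }
  return errors;
}
```

Returning a list of path-prefixed messages rather than a boolean keeps test failures readable when a payload violates several constraints at once.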

Testing Requirements

This task IS the testing deliverable. The test suite must include:
(1) Integration tests that exercise the full BufdirAggregationService pipeline end-to-end using a real (emulated) Supabase backend, not mocks.
(2) Parameterized test cases for each deduplication strategy (proxy, multi-chapter, geographic) with clearly labeled seed data sets.
(3) Schema conformance tests using JSON schema validation or an equivalent typed assertion library against the official Bufdir submission format.
(4) Idempotency tests that run the aggregation pipeline twice and assert output equality using deep equality checks.
(5) Negative tests verifying that cross-org data leakage does not occur under any seed data configuration.

Test coverage target: 100% of BufdirAggregationService public methods exercised via integration path. No mocking of the data layer is permitted in these tests; mocks are reserved for unit tests in earlier tasks.
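The parameterized deduplication cases and the cross-org leakage check could be organised as constant seed tables, roughly as sketched below. All type and function names here (`SeedRow`, `DedupCase`, `referenceCount`, `foreignOrgIds`) are hypothetical. `referenceCount` is only an independent oracle for sanity-checking the seed constants; in the real suite each case would be seeded into the emulator and `expectedCount` compared against the output of BufdirAggregationService.

```dart
/// One seeded registration row. Only the fields needed for the
/// deduplication scenarios are modelled here.
class SeedRow {
  const SeedRow(this.orgId, this.participantId,
      {this.proxyId, this.chapterId, this.zone});
  final String orgId;
  final String participantId;
  final String? proxyId, chapterId, zone;
}

/// A labelled, table-driven deduplication scenario.
class DedupCase {
  const DedupCase(this.label, this.orgId, this.rows, this.expectedCount);
  final String label;
  final String orgId;
  final List<SeedRow> rows;
  final int expectedCount;
}

const dedupCases = [
  DedupCase('proxy', 'org-a', [
    SeedRow('org-a', 'p-1', proxyId: 'proxy-1'),
    SeedRow('org-a', 'p-1', proxyId: 'proxy-2'),
    SeedRow('org-b', 'p-9', proxyId: 'proxy-1'),
  ], 1),
  DedupCase('multi-chapter', 'org-a', [
    SeedRow('org-a', 'p-2', chapterId: 'ch-1'),
    SeedRow('org-a', 'p-2', chapterId: 'ch-2'),
    SeedRow('org-a', 'p-3', chapterId: 'ch-1'),
  ], 2),
  DedupCase('geographic', 'org-b', [
    SeedRow('org-b', 'p-4', zone: 'zone-1'),
    SeedRow('org-b', 'p-4', zone: 'zone-2'),
    SeedRow('org-b', 'p-5', zone: 'zone-3'),
  ], 2),
];

/// Independent oracle: distinct participants within one org.
int referenceCount(List<SeedRow> rows, String orgId) => rows
    .where((r) => r.orgId == orgId)
    .map((r) => r.participantId)
    .toSet()
    .length;

/// Org IDs present in [resultRows] that do not belong to [orgId].
/// A non-empty set indicates cross-org leakage.
Set<String> foreignOrgIds(Iterable<SeedRow> resultRows, String orgId) =>
    resultRows.map((r) => r.orgId).where((id) => id != orgId).toSet();
```

Keeping the label on each case makes it straightforward to include it in the test description, so a failing run names the deduplication strategy that broke.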

Component
Bufdir Aggregation Service
service high
Epic Risks (4)
high impact high prob integration

NHF members can belong to up to 5 local chapters. When a participant has activities registered under different chapter IDs within the same reporting period, deduplication requires a reliable cross-chapter identity key. If national IDs are absent for some members (a known data quality issue in NHF's systems), the deduplication service may fail to identify duplicates, resulting in inflated counts submitted to Bufdir.

Mitigation & Contingency

Mitigation: Implement a multi-attribute identity matching strategy: primary match on national_id, fallback to (full_name + birth_year + municipality) composite key. Expose a low-confidence match list in DeduplicationAnomalyReport that coordinators can review and manually resolve before submission.
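The matching strategy in this mitigation might look roughly like the sketch below. The `Participant`, `IdentityKey`, and `DedupResult` types and the normalisation rules are assumptions made for illustration. One limitation of this simple version: a person who has a national_id on one record but not on another produces two different keys and is not merged, so such pairs would still need to surface through the anomaly report.

```dart
class Participant {
  const Participant({
    required this.fullName,
    required this.birthYear,
    required this.municipality,
    this.nationalId,
  });
  final String fullName;
  final int birthYear;
  final String municipality;
  final String? nationalId;
}

class IdentityKey {
  const IdentityKey(this.value, {required this.lowConfidence});
  final String value;

  /// True when the key was built from the composite fallback.
  final bool lowConfidence;
}

/// Primary match on national_id; fallback to
/// (full_name + birth_year + municipality) when it is absent.
IdentityKey identityKeyFor(Participant p) {
  final nationalId = p.nationalId?.trim();
  if (nationalId != null && nationalId.isNotEmpty) {
    return IdentityKey('nid:$nationalId', lowConfidence: false);
  }
  final name = p.fullName.trim().toLowerCase().replaceAll(RegExp(r'\s+'), ' ');
  final municipality = p.municipality.trim().toLowerCase();
  return IdentityKey('composite:$name|${p.birthYear}|$municipality',
      lowConfidence: true);
}

class DedupResult {
  const DedupResult(this.uniqueCount, this.lowConfidenceKeys);
  final int uniqueCount;

  /// Composite keys that merged two or more records; candidates for
  /// manual review.
  final List<String> lowConfidenceKeys;
}

DedupResult deduplicate(Iterable<Participant> participants) {
  final seen = <String>{};
  final lowConfidence = <String>{};
  for (final p in participants) {
    final key = identityKeyFor(p);
    final isDuplicate = !seen.add(key.value);
    if (isDuplicate && key.lowConfidence) lowConfidence.add(key.value);
  }
  return DedupResult(seen.length, lowConfidence.toList());
}
```

Seed data for the multi-chapter tests should include both records with and without national_id so that both branches of the key function are exercised.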

Contingency: If identity data quality is too poor for reliable automated deduplication for specific organisations, add an organisation-level config flag that disables cross-chapter deduplication for that org and requires coordinators to manually review the anomaly report before submitting.

high impact medium prob integration

The geographic distribution algorithm must resolve NHF's hierarchy of 1,400 local chapters to regional aggregates. If the organizational unit hierarchy in the database is incomplete (missing parent-child relationships for some chapters), the geographic service will silently drop activities from unmapped chapters, producing an understated geographic breakdown.

Mitigation & Contingency

Mitigation: Add a hierarchy completeness validation step in GeographicDistributionService that counts activities without a resolvable region assignment and surfaces them as an 'unmapped_activities' field in the distribution result. Block export if unmapped_activities > 0.
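A hierarchy completeness check along these lines is sketched below. It assumes the organizational hierarchy can be loaded as a child-to-parent map and that the set of region-level unit IDs is known; `resolveRegion`, `distribute`, and `DistributionResult` are hypothetical names, not the actual GeographicDistributionService API.

```dart
/// Walks parent links from [chapterId] until it reaches a unit in
/// [regionIds]. Returns null when the chain is broken or cyclic.
String? resolveRegion(String chapterId, Map<String, String?> parentOf,
    Set<String> regionIds) {
  final visited = <String>{};
  String? current = chapterId;
  while (current != null && visited.add(current)) {
    if (regionIds.contains(current)) return current;
    current = parentOf[current];
  }
  return null;
}

class DistributionResult {
  const DistributionResult(this.byRegion, this.unmappedActivities);
  final Map<String, int> byRegion;

  /// Activities whose chapter could not be resolved to a region.
  final int unmappedActivities;

  /// Mirrors the rule "block export if unmapped_activities > 0".
  bool get exportBlocked => unmappedActivities > 0;
}

/// Counts activities per region; [activityChapterIds] holds one chapter ID
/// per activity.
DistributionResult distribute(Iterable<String> activityChapterIds,
    Map<String, String?> parentOf, Set<String> regionIds) {
  final byRegion = <String, int>{};
  var unmapped = 0;
  for (final chapterId in activityChapterIds) {
    final region = resolveRegion(chapterId, parentOf, regionIds);
    if (region == null) {
      unmapped++;
    } else {
      byRegion[region] = (byRegion[region] ?? 0) + 1;
    }
  }
  return DistributionResult(byRegion, unmapped);
}
```

The integration suite can seed at least one chapter with a missing parent link and assert that it is counted in the unmapped total rather than silently dropped.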

Contingency: Provide a 'national' fallback bucket for activities from chapters with no region assignment, clearly labelled in the preview screen so coordinators are alerted to fix the org hierarchy data before re-running aggregation.

high impact low prob technical

BufdirAggregationService orchestrates four dependent services. If one service (e.g., GeographicDistributionService) throws mid-pipeline, the partially assembled metrics payload may be silently cached or returned as if complete, resulting in a Bufdir submission missing the geographic breakdown section.

Mitigation & Contingency

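Mitigation: Implement the orchestrator as a transactional pipeline using a Result type pattern: each stage returns Either<AggregationError, PartialResult>, and the orchestrator proceeds to the next stage only if the previous one succeeded. Dart has no built-in Either or Result type, so this would come from a package such as fpdart or dartz, or from a small sealed class hierarchy. The final payload is assembled and persisted only when all stages return success.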
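A fail-fast orchestrator in this style is sketched below using Dart 3 sealed classes in place of a package-provided Either. The stage names and the `StageResult`, `Stage`, and `runPipeline` identifiers are hypothetical, and stages are synchronous here for brevity; real stages would most likely return Futures.

```dart
/// Outcome of one pipeline stage (requires Dart 3 for sealed classes).
sealed class StageResult<T> {
  const StageResult();
}

final class StageSuccess<T> extends StageResult<T> {
  const StageSuccess(this.value);
  final T value;
}

final class StageFailure<T> extends StageResult<T> {
  const StageFailure(this.stage, this.message);
  final String stage;
  final String message;
}

typedef Stage = StageResult<Map<String, Object?>> Function();

/// Converts a thrown exception into a StageFailure tagged with the stage name.
StageResult<Map<String, Object?>> _guard(String name, Stage stage) {
  try {
    return stage();
  } on Exception catch (e) {
    return StageFailure(name, '$e');
  }
}

/// Runs [stages] in insertion order and stops at the first failure.
/// The merged payload is only returned when every stage succeeded.
StageResult<Map<String, Object?>> runPipeline(Map<String, Stage> stages) {
  final payload = <String, Object?>{};
  for (final entry in stages.entries) {
    final result = _guard(entry.key, entry.value);
    switch (result) {
      case StageSuccess(:final value):
        payload.addAll(value);
      case StageFailure():
        return result; // No partial payload escapes the orchestrator.
    }
  }
  return StageSuccess(payload);
}
```

Because the failure carries the stage name, a test can assert on it directly, and the same value can drive the stage-specific message in the progress indicator.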

Contingency: If a partial failure state reaches the UI, the AggregationProgressIndicator must display a specific stage failure message with a retry option that re-runs only the failed stage rather than the full pipeline.

medium impact medium prob scope

Internal activity types that have no corresponding Bufdir category in the mapping configuration will cause the aggregation to silently exclude those activities from the final counts. Coordinators may not notice the omission until Bufdir queries why submission totals are lower than expected.

Mitigation & Contingency

Mitigation: BufdirAggregationService must produce an unmapped_activity_types list as part of its output. If any internal activity types are unmapped, display a blocking warning in the AggregationSummaryWidget listing the unmapped types before allowing the coordinator to proceed to export.
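The unmapped-type detection in this mitigation could be sketched as follows. The mapping is assumed to be a plain map from internal activity type to Bufdir category code; `CategoryAggregation` and `aggregateByCategory` are hypothetical names, and any category codes used with it in tests are synthetic rather than real Bufdir codes.

```dart
class CategoryAggregation {
  const CategoryAggregation(this.countsByBufdirCode, this.unmappedActivityTypes);
  final Map<String, int> countsByBufdirCode;

  /// Internal activity types with no entry in the mapping configuration,
  /// sorted for stable output.
  final List<String> unmappedActivityTypes;

  /// A non-empty unmapped list should block export until resolved.
  bool get blocksExport => unmappedActivityTypes.isNotEmpty;
}

/// Counts activities per Bufdir code and records, instead of silently
/// dropping, any activity type missing from [mapping].
CategoryAggregation aggregateByCategory(
    Iterable<String> activityTypes, Map<String, String> mapping) {
  final counts = <String, int>{};
  final unmapped = <String>{};
  for (final type in activityTypes) {
    final code = mapping[type];
    if (code == null) {
      unmapped.add(type);
      continue;
    }
    counts.update(code, (n) => n + 1, ifAbsent: () => 1);
  }
  return CategoryAggregation(counts, unmapped.toList()..sort());
}
```

A category mapping test can then assert both that the mapped counts sum to the number of mapped activities and that the unmapped list matches the deliberately unmapped types in the seed data.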

Contingency: Allow coordinators to temporarily assign unmapped activity types to a Bufdir 'other' catch-all category as an emergency workaround, with an audit flag indicating manual override was applied for that submission.