Widget and golden tests for ClaimStatusAuditTimeline

epic-expense-approval-workflow-core-logic-task-016 — Write flutter_test widget tests for ClaimStatusAuditTimeline covering: rendering with a full multi-event history, rendering the empty state when no events exist, loading state display, error state display, correct Norwegian timezone conversion for a known UTC timestamp, screen reader semantics verification for actor/role/timestamp fields, and golden image tests for the standard multi-event layout.

high priority · medium complexity · testing · pending · testing specialist · Tier 8

Acceptance Criteria

Test 1 — multi-event render: pump `ClaimStatusAuditTimeline` with 3 seeded `ClaimEvent` records; assert `find.text(actorDisplayName)` finds all 3 actors, all 3 role badges render, all 3 formatted timestamps are present

Test 2 — empty state: pump with an empty `List<ClaimEvent>`; assert the 'No activity recorded yet' empty state message is visible and no event rows are present

Test 3 — loading state: pump the composing widget (that includes the Riverpod provider) with a mock repository that delays; assert a `CircularProgressIndicator` is visible during loading

Test 4 — error state: pump with a mock repository that throws; assert an error message widget is visible and no event rows are rendered

Test 5 — timezone conversion: provide a single event with `created_at` = 2025-02-14T08:30:00Z; assert the rendered timestamp contains '09:30' (CET, UTC+1) and NOT '08:30'

Test 6 — CEST timezone: provide `created_at` = 2025-07-01T12:00:00Z; assert rendered time is '14:00' (CEST, UTC+2)

Test 7 — semantics: use `SemanticsController` to find the semantics node for a timestamp; assert `label` contains the full absolute date-time string in Norwegian format

Test 8 — actor role semantics: assert each role badge has a `Semantics.label` that describes the role (e.g., 'Peer mentor')

Test 9 — coordinator comment present: assert comment text is visible when `coordinator_comment` is non-null

Test 10 — coordinator comment absent: assert comment widget is not in the tree when `coordinator_comment` is null

Golden test 1: multi-event layout (3 events, mixed roles) — golden file committed to `test/goldens/claim_status_audit_timeline_multi_event.png`

Golden test 2: empty state layout — golden file committed to `test/goldens/claim_status_audit_timeline_empty.png`

All tests pass with a plain `flutter test` run against the committed goldens (generated once with `flutter test --update-goldens`), and the images are stable across CI runs (use a fixed screen size: 390×844 logical pixels)
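The expected values in Tests 5 and 6 follow from the EU daylight saving rule (summer time runs from the last Sunday of March to the last Sunday of October, switching at 01:00 UTC). The sketch below re-derives them in plain Dart so that fixture timestamps can be sanity-checked by hand. The helper names are hypothetical and this is not the real `ClaimTimestampFormatter`, which is expected to rely on a proper timezone database.

```dart
/// Last Sunday of [month] in [year] at 01:00 UTC, the instant at which
/// EU daylight saving time switches.
DateTime lastSundayAt0100Utc(int year, int month) {
  // Day 0 of the following month is the last day of this month.
  final lastDay = DateTime.utc(year, month + 1, 0);
  // DateTime.weekday: Monday = 1 ... Sunday = 7, so Sunday maps to 0 days back.
  final daysBack = lastDay.weekday % 7;
  return DateTime.utc(year, month, lastDay.day - daysBack, 1);
}

/// Hypothetical helper: UTC offset of Europe/Oslo at the instant [utc].
Duration osloOffset(DateTime utc) {
  final start = lastSundayAt0100Utc(utc.year, DateTime.march);
  final end = lastSundayAt0100Utc(utc.year, DateTime.october);
  final isSummerTime = !utc.isBefore(start) && utc.isBefore(end);
  return Duration(hours: isSummerTime ? 2 : 1);
}

/// Expected HH:mm wall-clock time in Oslo for an ISO-8601 UTC timestamp.
String expectedOsloHhMm(String isoUtc) {
  final utc = DateTime.parse(isoUtc).toUtc();
  final local = utc.add(osloOffset(utc));
  String two(int n) => n.toString().padLeft(2, '0');
  return '${two(local.hour)}:${two(local.minute)}';
}

void main() {
  print(expectedOsloHhMm('2025-02-14T08:30:00Z')); // 09:30 (CET, UTC+1)
  print(expectedOsloHhMm('2025-07-01T12:00:00Z')); // 14:00 (CEST, UTC+2)
}
```

Choosing one fixture on each side of the switch, as the criteria do, is what catches a formatter that applies a fixed +1 offset all year.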

Technical Requirements

frameworks

flutter_test (built-in Flutter testing framework)

mocktail (for mocking ClaimEventsRepository)

flutter_riverpod (ProviderScope override for mock injection)

apis

ClaimEventsRepository (mocked)

ClaimTimestampFormatter (real implementation — do not mock)

data models

ClaimEvent (test fixtures with deterministic created_at values)

ClaimEventStateTransition (all transition types should appear in at least one test)

performance requirements

Full test suite completes in under 30 seconds

Golden tests use a fixed device pixel ratio of 1.0 for reproducibility across machines

security requirements

Test fixtures must not contain real user data — use clearly fake names like 'Test User', 'Test Coordinator'

ui components

ClaimStatusAuditTimeline (the widget under test)

MockClaimEventsRepository (test double)

Execution Context

Execution Tier

Tier 8

Tier 8 - 48 tasks

Can start after Tier 7 completes


Implementation Notes

Organize the test file in three sections with comments: `// --- Pure Widget Tests ---`, `// --- Riverpod Integration Tests ---`, `// --- Golden Tests ---`. Create a `_buildFixtures()` helper that returns a deterministic list of `ClaimEvent` objects with fixed, known timestamps so that timezone assertions are reproducible. For semantics tests, call `tester.getSemantics(find.byType(AccessibleTimestamp))` and assert on the `label` of the returned `SemanticsNode`. For golden tests, set an explicit surface size with `tester.binding.setSurfaceSize(const Size(390, 844))` and reset it with `setSurfaceSize(null)` in `tearDown` (or `addTearDown`) to avoid cross-test contamination.
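A minimal sketch of the semantics assertion described above. `AccessibleTimestampStandIn` is a hypothetical stand-in for the real `AccessibleTimestamp` widget, whose constructor is not shown in this task, and the Norwegian label string is illustrative rather than the formatter's confirmed output.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

/// Hypothetical stand-in: shows a short time, announces a full date-time.
class AccessibleTimestampStandIn extends StatelessWidget {
  const AccessibleTimestampStandIn({
    super.key,
    required this.display,
    required this.spoken,
  });

  final String display;
  final String spoken;

  @override
  Widget build(BuildContext context) {
    return Semantics(
      label: spoken,
      // Drop the child's own semantics so only `spoken` is announced.
      excludeSemantics: true,
      child: Text(display),
    );
  }
}

void main() {
  testWidgets('timestamp exposes an absolute date-time label', (tester) async {
    await tester.pumpWidget(
      const MaterialApp(
        home: Scaffold(
          body: AccessibleTimestampStandIn(
            display: '09:30',
            spoken: '14. februar 2025 kl. 09:30', // illustrative format
          ),
        ),
      ),
    );

    // getSemantics returns the SemanticsNode for the matched widget.
    final node = tester.getSemantics(find.byType(AccessibleTimestampStandIn));
    expect(node.label, contains('14. februar 2025'));
    expect(node.label, contains('09:30'));
  });
}
```

Asserting with `contains` rather than strict equality keeps the test from breaking if the real widget merges extra text into the same node.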

If the golden test is flaky due to font rendering differences between platforms, use `fontFamily: 'Ahem'` in the test theme to neutralize font rendering variance. Ensure the golden PNG files are tracked in git (not in .gitignore).
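The golden setup from these notes might look like the following sketch. The `Scaffold` body is a placeholder for `ClaimStatusAuditTimeline(events: fixtures)`, and the relative golden path assumes the test file lives in `test/widgets/` as specified under Testing Requirements. `tester.view` requires a reasonably recent Flutter release.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  // --- Golden Tests ---
  testWidgets('golden: multi-event layout', (tester) async {
    // Fixed logical size and pixel ratio so every machine renders the same image.
    await tester.binding.setSurfaceSize(const Size(390, 844));
    tester.view.devicePixelRatio = 1.0;
    addTearDown(() async {
      // Restore defaults so later tests are not affected.
      await tester.binding.setSurfaceSize(null);
      tester.view.resetDevicePixelRatio();
    });

    await tester.pumpWidget(
      const MaterialApp(
        debugShowCheckedModeBanner: false,
        home: Scaffold(
          // Placeholder: the real test would pump the timeline with fixtures.
          body: Center(child: Text('timeline placeholder')),
        ),
      ),
    );

    // The path is resolved relative to the directory of this test file.
    await expectLater(
      find.byType(Scaffold),
      matchesGoldenFile(
        '../goldens/claim_status_audit_timeline_multi_event.png',
      ),
    );
  });
}
```

On the first run the PNG does not exist yet, so the test only passes after `flutter test --update-goldens` has written it.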

Testing Requirements

All tests are in a single file `test/widgets/claim_status_audit_timeline_test.dart`. Each test uses `testWidgets()`. For Riverpod-dependent tests (loading, error), wrap with `ProviderScope(overrides: [...])`. For pure widget tests (multi-event, empty), call `ClaimStatusAuditTimeline(events: fixtures)` directly without Riverpod.
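As an illustration of the `ProviderScope` override pattern for the loading and error tests, here is a sketch written against flutter_riverpod 2.x and mocktail. The repository method `fetchEvents`, both providers, the `TimelineHost` widget and the error text are assumptions made up for this example; the real names in the codebase may differ.

```dart
import 'dart:async';

import 'package:flutter/material.dart';
import 'package:flutter_riverpod/flutter_riverpod.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:mocktail/mocktail.dart';

// Hypothetical shapes, for illustration only.
abstract class ClaimEventsRepository {
  Future<List<String>> fetchEvents(String claimId);
}

class MockClaimEventsRepository extends Mock
    implements ClaimEventsRepository {}

final claimEventsRepositoryProvider = Provider<ClaimEventsRepository>(
  (ref) => throw UnimplementedError('override in tests'),
);

final claimEventsProvider = FutureProvider.family<List<String>, String>(
  (ref, claimId) =>
      ref.watch(claimEventsRepositoryProvider).fetchEvents(claimId),
);

class TimelineHost extends ConsumerWidget {
  const TimelineHost({super.key});

  @override
  Widget build(BuildContext context, WidgetRef ref) {
    return ref.watch(claimEventsProvider('claim-1')).when(
          data: (events) => Text('${events.length} events'),
          loading: () => const CircularProgressIndicator(),
          error: (error, _) => const Text('Could not load activity'),
        );
  }
}

Widget host(ClaimEventsRepository repo) => ProviderScope(
      overrides: [claimEventsRepositoryProvider.overrideWithValue(repo)],
      child: const MaterialApp(home: Scaffold(body: TimelineHost())),
    );

void main() {
  // --- Riverpod Integration Tests ---
  testWidgets('shows a spinner while loading', (tester) async {
    final repo = MockClaimEventsRepository();
    // A Completer keeps the request pending without leaving a timer behind.
    final pending = Completer<List<String>>();
    when(() => repo.fetchEvents(any())).thenAnswer((_) => pending.future);

    await tester.pumpWidget(host(repo));
    expect(find.byType(CircularProgressIndicator), findsOneWidget);

    pending.complete(const []); // let the request finish before teardown
    await tester.pump();
  });

  testWidgets('shows an error message when the repository throws',
      (tester) async {
    final repo = MockClaimEventsRepository();
    when(() => repo.fetchEvents(any()))
        .thenAnswer((_) async => throw Exception('boom'));

    await tester.pumpWidget(host(repo));
    await tester.pump(); // rebuild after the future fails

    expect(find.text('Could not load activity'), findsOneWidget);
    expect(find.byType(CircularProgressIndicator), findsNothing);
  });
}
```

A `Completer` is used instead of `Future.delayed` because a delayed future leaves a pending timer, which `testWidgets` reports as a failure when the test ends.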

Call `TestWidgetsFlutterBinding.ensureInitialized()` in `setUpAll()`, then initialize the timezone database and the nb_NO locale data there as well; the binding call alone loads neither (for intl, locale data is loaded with `initializeDateFormatting('nb_NO')`). Golden tests use the `matchesGoldenFile()` matcher. Regenerate goldens with `flutter test --update-goldens` and commit the resulting PNG files. The golden images must be regenerated whenever the design tokens or layout change.
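A sketch of the one-time setup, assuming the project uses the `timezone` package for Europe/Oslo conversion. That package is not named in this task and is an assumption here; intl is taken from the dependency description.

```dart
import 'package:flutter_test/flutter_test.dart';
import 'package:intl/date_symbol_data_local.dart';
import 'package:intl/intl.dart';
import 'package:timezone/data/latest.dart' as tzdata;
import 'package:timezone/timezone.dart' as tz;

void main() {
  setUpAll(() async {
    TestWidgetsFlutterBinding.ensureInitialized();
    // Load nb_NO date symbols for intl; without this, DateFormat throws.
    await initializeDateFormatting('nb_NO');
    // Load the IANA timezone database (assumes the `timezone` package).
    tzdata.initializeTimeZones();
  });

  test('setup smoke check: UTC instant converts to Oslo wall-clock time', () {
    final oslo = tz.getLocation('Europe/Oslo');
    final local = tz.TZDateTime.from(DateTime.utc(2025, 7, 1, 12), oslo);
    // An explicit pattern avoids depending on locale-specific separators.
    expect(DateFormat('HH:mm', 'nb_NO').format(local), '14:00');
  });
}
```

A smoke check like this fails fast with a clear message if either initialization step is missing, instead of surfacing later as a confusing widget-test failure.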

Component

Claim Status Audit Timeline

ui low

Dependencies (1)

epic-expense-approval-workflow-core-logic-task-015: Ensure all timestamps in ClaimStatusAuditTimeline are converted from UTC to Europe/Oslo timezone before display. Use the intl package with nb_NO locale for date and time formatting. Display relative time (e.g. '2 hours ago') for events within the last 24 hours and absolute date-time for older events. Add WCAG 2.2 AA accessible semantics labels for each timestamp.

Epic Risks (3)

high impact · high probability · technical

The ThresholdEvaluationService is described as shared Dart logic used both client-side and in the Edge Function. Supabase Edge Functions run Deno/TypeScript, not Dart, meaning the threshold logic must be maintained in two languages and can diverge, causing the server to reject legitimate client submissions.

Mitigation & Contingency

Mitigation: Implement the threshold logic as a single TypeScript module in the Edge Function and call it via a thin Dart HTTP client wrapper for client-side preview feedback only. The server is always authoritative; the client version is purely for UX (showing the user whether their claim will auto-approve before they submit).

Contingency: If dual-language maintenance is unavoidable, create a shared golden test file (JSON fixtures with inputs and expected outputs) that is run against both implementations in CI to detect divergence immediately.
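The shared-fixture idea could be sketched as follows on the Dart side. The fixture format, the amounts, the result strings and the inclusive-boundary rule are all hypothetical; the actual threshold semantics are not specified in this task.

```dart
import 'dart:convert';

// Hypothetical fixture file content; the same JSON would be read by the
// TypeScript test suite.
const fixturesJson = '''
[
  {"name": "below threshold", "totalNok": 120, "thresholdNok": 500, "expected": "auto_approve"},
  {"name": "at threshold", "totalNok": 500, "thresholdNok": 500, "expected": "auto_approve"},
  {"name": "above threshold", "totalNok": 501, "thresholdNok": 500, "expected": "manual_review"}
]
''';

typedef Evaluate = String Function(num total, num threshold);

/// Assumed rule: totals up to and including the threshold auto-approve.
String evaluateInDart(num total, num threshold) =>
    total <= threshold ? 'auto_approve' : 'manual_review';

/// Names of the fixtures where [impl] disagrees with the expected output.
List<String> divergences(String json, Evaluate impl) {
  final cases = (jsonDecode(json) as List).cast<Map<String, dynamic>>();
  return [
    for (final c in cases)
      if (impl(c['totalNok'] as num, c['thresholdNok'] as num) !=
          c['expected'])
        c['name'] as String,
  ];
}

void main() {
  print(divergences(fixturesJson, evaluateInDart)); // []
  // An implementation that treats the boundary as exclusive is caught:
  print(divergences(
    fixturesJson,
    (total, threshold) => total < threshold ? 'auto_approve' : 'manual_review',
  )); // [at threshold]
}
```

Boundary cases such as "at threshold" are where two hand-maintained implementations are most likely to drift, so they are worth including explicitly.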

medium impact · medium probability · technical

A peer mentor could double-tap the submit button or a network retry could trigger a duplicate submission, causing the ApprovalWorkflowService to attempt two concurrent state transitions from draft→submitted for the same claim, potentially resulting in two audit events or conflicting statuses.

Mitigation & Contingency

Mitigation: Implement idempotency in the ApprovalWorkflowService using a database-level unique constraint on (claim_id, from_status, to_status) per transition, combined with a UI-level submission lock (disable button after first tap until response returns).

Contingency: Add a deduplication check at the start of every state transition method that returns the existing state if an identical transition is already in progress or completed within the last 10 seconds.
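A minimal in-memory sketch of such a deduplication check. `TransitionDeduplicator` is a hypothetical name, and a production version would consult the database rather than a local map, since duplicate requests may reach different server instances.

```dart
/// Hypothetical guard that rejects an identical transition seen within [window].
class TransitionDeduplicator {
  TransitionDeduplicator({
    this.window = const Duration(seconds: 10),
    DateTime Function()? clock,
  }) : _clock = clock ?? DateTime.now;

  final Duration window;
  final DateTime Function() _clock; // injectable so tests control time
  final Map<String, DateTime> _lastSeen = {};

  /// Returns true when the same (claim, from, to) transition was recorded
  /// less than [window] ago. Otherwise records this attempt and returns false.
  bool isDuplicate(String claimId, String from, String to) {
    final key = '$claimId|$from|$to';
    final now = _clock();
    final previous = _lastSeen[key];
    if (previous != null && now.difference(previous) < window) {
      return true;
    }
    _lastSeen[key] = now;
    return false;
  }
}

void main() {
  var now = DateTime.utc(2025, 1, 1, 12);
  final guard = TransitionDeduplicator(clock: () => now);

  print(guard.isDuplicate('claim-1', 'draft', 'submitted')); // false
  now = now.add(const Duration(seconds: 3));
  print(guard.isDuplicate('claim-1', 'draft', 'submitted')); // true
  now = now.add(const Duration(seconds: 11));
  print(guard.isDuplicate('claim-1', 'draft', 'submitted')); // false
}
```

Injecting the clock keeps the 10-second window testable without real delays, in the same spirit as the deterministic `created_at` fixtures used for the timeline tests.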

high impact · medium probability · scope

Claims with multiple expense lines (e.g., mileage + parking) must have their combined total evaluated against the threshold. If individual lines are added asynchronously or the evaluation runs before all lines are persisted, the auto-approval decision may be computed on an incomplete set of expense lines.

Mitigation & Contingency

Mitigation: The Edge Function always fetches all expense lines from the database (not from the client payload) before computing the threshold decision. Define a clear claim submission contract that requires all expense lines to be persisted before the submit action is called.

Contingency: Add a validation step in ApprovalWorkflowService that counts expected vs. persisted expense lines before allowing the transition, returning a validation error if lines are missing.
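The count check could be as small as the sketch below. The function name, the parameters and the message wording are hypothetical, and where the expected count comes from is left to the submission contract.

```dart
/// Hypothetical validation: returns an error message when expense lines are
/// missing, or null when the claim may proceed to the transition.
String? validateExpenseLines({
  required int expectedCount,
  required int persistedCount,
}) {
  if (persistedCount < expectedCount) {
    final missing = expectedCount - persistedCount;
    return 'Cannot submit: $missing of $expectedCount expense lines '
        'are not persisted yet';
  }
  return null;
}

void main() {
  // Mileage and parking expected, only mileage persisted so far.
  print(validateExpenseLines(expectedCount: 2, persistedCount: 1));
  // Cannot submit: 1 of 2 expense lines are not persisted yet
  print(validateExpenseLines(expectedCount: 2, persistedCount: 2)); // null
}
```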
