Priority: critical · Complexity: high · Type: testing · Status: pending · Specialist: testing · Tier: 8

Acceptance Criteria

Scenario 1 — Successful Norwegian transcription: service emits 2–4 partials then one final with transcript 'Jeg trenger hjelp' and confidence ≥ 0.7
Scenario 2 — Mid-session network drop: service emits partial events, then bridge fires network error, service emits SpeechRecognitionErrorEvent(networkTimeout), stream closes, state returns to idle
Scenario 3 — Engine unavailable on cold start: startListening() emits SpeechRecognitionErrorEvent(engineUnavailable), no partial events are emitted, and state remains idle afterwards
Scenario 4 — Back-to-back session requests: first session completes normally; second startListening() call after stream completes succeeds without error; second session emits its own event sequence independently
All scenarios validate complete event ordering using emitsInOrder() or StreamQueue step assertions
Stream done event (close) is verified for all scenarios — no scenarios leave an open stream
FakeSpeechApiBridge is a hand-written fake (not a mockito mock) that drives realistic async callback timing using Future.microtask() or Timer
Tests run entirely within flutter_test — no device or simulator required
Each scenario is an independent test case with fresh service and bridge instances
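The acceptance criteria above can be sketched as a flutter_test case using emitsInOrder. This is a sketch only — the event subtype names (PartialTranscriptEvent, FinalTranscriptEvent) and the constructor shapes of the service and fake bridge are assumptions, not confirmed API:

```dart
import 'package:flutter_test/flutter_test.dart';

// Assumed names: PartialTranscriptEvent, FinalTranscriptEvent, and the
// constructors of SpeechRecognitionService / FakeNativeSpeechApiBridge.
void main() {
  test('Scenario 1: successful Norwegian transcription', () async {
    final bridge = FakeNativeSpeechApiBridge()
      ..configureScenario(const FakeSessionScenario(
        partialTranscripts: ['Jeg', 'Jeg trenger', 'Jeg trenger hjelp'],
        finalTranscript: 'Jeg trenger hjelp',
        confidence: 0.92,
      ));
    final service = SpeechRecognitionService(bridge: bridge);

    await expectLater(
      service.startListening(),
      emitsInOrder([
        isA<PartialTranscriptEvent>(),
        isA<PartialTranscriptEvent>(),
        isA<PartialTranscriptEvent>(),
        isA<FinalTranscriptEvent>()
            .having((e) => e.transcript, 'transcript', 'Jeg trenger hjelp')
            .having((e) => e.confidence, 'confidence',
                greaterThanOrEqualTo(0.7)),
        emitsDone, // acceptance criterion: no scenario leaves an open stream
      ]),
    );
  });
}
```

Ending the matcher list with emitsDone bakes the "no open streams" criterion into every scenario rather than checking it separately.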

Technical Requirements

frameworks
flutter_test
async (StreamQueue)
fake_async (for timing simulation)
apis
FakeNativeSpeechApiBridge (hand-written test double)
SpeechRecognitionService (system under test)
StreamQueue for ordered assertion
data models
FakeSessionScenario (data class driving fake bridge behavior)
All SpeechRecognitionEvent subtypes
performance requirements
All integration test scenarios complete in under 10 seconds total
No real network calls — all bridge I/O simulated
security requirements
Norwegian test phrases must be generic and non-personally-identifying
No Supabase credentials or real auth tokens in test fixtures

Execution Context

Execution Tier
Tier 8

Tier 8 - 48 tasks

Can start after Tier 7 completes

Implementation Notes

The key difference from the unit tests (task-008) is that FakeNativeSpeechApiBridge here is a fully implemented test double (not a mockito mock) that drives callback sequences asynchronously. Implement FakeNativeSpeechApiBridge with a configureScenario(FakeSessionScenario) method. FakeSessionScenario is a simple data class: {List<String> partialTranscripts, String finalTranscript, double confidence, SpeechErrorCode? errorAfterNthPartial}.
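A minimal sketch of that data class, with field names and types taken from the notes above (the element type of partialTranscripts is assumed to be String):

```dart
class FakeSessionScenario {
  final List<String> partialTranscripts;
  final String finalTranscript;
  final double confidence;

  /// Per the task notes: when non-null, the bridge fires this error code
  /// after emitting the configured partials instead of delivering a final
  /// result.
  final SpeechErrorCode? errorAfterNthPartial;

  const FakeSessionScenario({
    required this.partialTranscripts,
    required this.finalTranscript,
    this.confidence = 1.0,
    this.errorAfterNthPartial,
  });
}
```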

The bridge fires callbacks through a chain of microtasks, e.g. Future.microtask(() => _onPartialResult(partials[0])) followed by another microtask for partials[1], and so on. For the back-to-back session test: after the first session's stream closes, call service.startListening() again and verify the second session's stream is a fresh Stream with its own events — confirm that no events from session 1 appear on session 2's stream. The back-to-back test is the most important for real-world use: Blindeforbundet users may start multiple sessions per home-visit report, and HLF users may dictate multiple paragraphs separately.
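One way the fake bridge might replay a scenario: each callback lands on its own microtask turn, mimicking asynchronous platform-channel delivery. Here _onPartialResult, _onFinalResult, _onError, and _onDone stand for the callbacks the service registers on the bridge — the names are assumptions:

```dart
Future<void> _playScenario(FakeSessionScenario s) async {
  // Deliver each partial on its own microtask turn.
  for (final partial in s.partialTranscripts) {
    await Future<void>.microtask(() => _onPartialResult(partial));
  }
  final error = s.errorAfterNthPartial;
  if (error != null) {
    // Error scenarios: fire the configured error instead of a final result.
    await Future<void>.microtask(() => _onError(error));
  } else {
    await Future<void>.microtask(
        () => _onFinalResult(s.finalTranscript, s.confidence));
  }
  // Every path closes the stream, matching the acceptance criteria.
  await Future<void>.microtask(_onDone);
}
```

Awaiting each microtask keeps delivery strictly ordered while still forcing the service to handle genuinely asynchronous callbacks.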

Testing Requirements

Integration tests in speech_recognition_service_integration_test.dart. Use a hand-written FakeNativeSpeechApiBridge that accepts a scenario configuration and fires callbacks in realistic async sequences using Future.microtask() chains. Use StreamQueue from the async package for step-by-step stream assertion. Scenarios must use realistic Norwegian text ('jeg trenger hjelp', 'kan du hjelpe meg') to validate encoding/locale handling.

Run as part of flutter test — no widget test environment needed (pure Dart layer). Verify the stream closes after each scenario using expectLater(stream, emitsDone).
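For step-by-step assertion, StreamQueue from package:async pulls one event at a time, which reads more naturally than a single matcher list when intermediate checks are needed. Event type names remain assumptions, as above:

```dart
import 'package:async/async.dart';
import 'package:flutter_test/flutter_test.dart';

// Inside a test body, with `service` already configured:
final events =
    StreamQueue<SpeechRecognitionEvent>(service.startListening());

expect(await events.next, isA<PartialTranscriptEvent>());
expect(await events.next, isA<FinalTranscriptEvent>());
expect(await events.hasNext, isFalse); // stream closed, nothing pending
await events.cancel();
```

`hasNext` resolving to false doubles as the stream-closure check when a scenario is asserted step by step.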

Component
Speech Recognition Service
Type: service · Priority: high
Epic Risks (2)
Impact: medium · Probability: medium · Category: technical

The speech_to_text Flutter package delegates accuracy entirely to the OS-native engine. Norwegian accuracy for domain-specific vocabulary (medical terms, organisation names, accessibility terminology) may fall below the 85% acceptance threshold on older devices or in noisy environments, causing user frustration and manual correction overhead that negates the time saving.

Mitigation & Contingency

Mitigation: Configure the SpeechRecognitionService with Norwegian as the explicit locale and test against a representative corpus of peer mentoring vocabulary on target devices. Expose locale switching so users can fall back between Bokmål and Nynorsk. Clearly set user expectations in the UI that transcription is a starting point for editing, not a finished product.

Contingency: If accuracy is consistently below threshold on specific device/OS combinations, add a device-capability check that hides the dictation button with an explanatory message rather than offering a degraded experience. Document affected device models for QA and org contacts.

Impact: medium · Probability: low · Category: dependency

The speech_to_text Flutter package is a third-party dependency that may introduce breaking API changes or deprecations on major version upgrades, requiring rework of SpeechRecognitionService when Flutter or platform OS versions are updated.

Mitigation & Contingency

Mitigation: Wrap all speech_to_text API calls behind the SpeechRecognitionService interface so that package changes are isolated to one file. Pin the package version in pubspec.yaml and review changelogs before any upgrade. Write integration tests that exercise the package contract so regressions are caught immediately.
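The isolation described above can be sketched as a single bridge file that is the only importer of speech_to_text, so a breaking package change (or a full plugin replacement) is contained there. The interface method names are assumptions; the stt calls follow the package's documented API surface:

```dart
import 'package:speech_to_text/speech_to_text.dart' as stt;

/// The contract SpeechRecognitionService depends on (names assumed).
abstract class NativeSpeechApiBridge {
  Future<bool> initialize();
  void startListening(
      void Function(String transcript, double confidence) onResult);
  Future<void> stop();
}

/// The only class that touches the third-party package directly.
class SpeechToTextBridge implements NativeSpeechApiBridge {
  final stt.SpeechToText _engine = stt.SpeechToText();

  @override
  Future<bool> initialize() => _engine.initialize();

  @override
  void startListening(void Function(String, double) onResult) {
    _engine.listen(
      localeId: 'nb_NO', // explicit Norwegian Bokmål locale
      onResult: (r) => onResult(r.recognizedWords, r.confidence),
    );
  }

  @override
  Future<void> stop() => _engine.stop();
}
```

Because the service only sees NativeSpeechApiBridge, both the mockito mocks of task-008 and the hand-written fake of this task implement the same contract as the real bridge.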

Contingency: If the package is abandoned or has unresolvable issues, NativeSpeechApiBridge already provides the platform-channel abstraction needed to implement a direct plugin replacement with minimal changes to SpeechRecognitionService.