Priority: Critical | Complexity: Medium | Type: Infrastructure | Status: Pending | Assignee: Infrastructure Specialist | Tier: 1

Acceptance Criteria

IosSpeechApiBridge extends NativeSpeechApiBridge and overrides all required methods
NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription keys are present in Info.plist with Norwegian-language descriptions
requestPermission() calls SFSpeechRecognizer.requestAuthorization and AVAudioSession permission and maps both results to SpeechPermissionResult
startRecognition() configures AVAudioSession with .record category and .default mode before creating SFSpeechAudioBufferRecognitionRequest
Partial results are delivered via EventChannel to onPartial callback with each intermediate hypothesis
Final result is delivered via EventChannel to onFinal callback when recognition task completes
stopRecognition() ends the audio engine, calls recognitionTask.finish(), and deactivates AVAudioSession
Locale nb-NO is passed to SFSpeechRecognizer(locale: Locale(identifier: 'nb-NO')) and falls back to device locale if nb-NO is unavailable
isAvailable() returns false when SFSpeechRecognizer.isAvailable is false or when running on simulator without audio input
All SpeechError cases are mapped: network error → .networkUnavailable, no speech → .noSpeechDetected, auth denied → .permissionDenied
No audio session resources remain active after stopRecognition() completes
Plugin compiles and runs on iOS 15+ physical device and TestFlight build without crashes
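The dual-permission criterion above (SFSpeechRecognizer authorization plus AVAudioSession record permission, mapped into a single result) can be sketched as follows. The `.granted`/`.denied` cases of SpeechPermissionResult are assumptions for illustration; the real bridge type may differ:

```swift
import Speech
import AVFoundation

// Illustrative stand-in for the bridge's SpeechPermissionResult type.
enum SpeechPermissionResult { case granted, denied }

func requestPermission(completion: @escaping (SpeechPermissionResult) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized else {
            DispatchQueue.main.async { completion(.denied) }
            return
        }
        // Both permissions must be granted before recognition can start.
        AVAudioSession.sharedInstance().requestRecordPermission { granted in
            // Flutter channel replies must happen on the main thread.
            DispatchQueue.main.async { completion(granted ? .granted : .denied) }
        }
    }
}
```

Mapping both OS prompts into one result keeps the Dart side simple: a single denied outcome surfaces SpeechError.permissionDenied regardless of which prompt the user declined.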

Technical Requirements

Frameworks
Flutter
Dart
Swift
SFSpeechRecognizer
AVFoundation

APIs
Flutter MethodChannel
Flutter EventChannel
SFSpeechRecognizer API
AVAudioSession API
SFSpeechAudioBufferRecognitionRequest

Data models
NativeSpeechApiBridge
SpeechPermissionResult
SpeechRecognitionEvent
SpeechError

Performance requirements
Partial results must be delivered within 300ms of being produced by SFSpeechRecognizer
AVAudioSession must be activated and ready before the first audio buffer is submitted — no dropped initial words
stopRecognition() must complete and release resources within 500ms

Security requirements
AVAudioSession must be deactivated immediately after stopRecognition() to prevent background audio capture
Raw audio buffers from AVAudioEngine must never be forwarded over MethodChannel — only transcribed text
Permission denial must be handled gracefully without crashing — surface SpeechError.permissionDenied
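One way to enforce the "only transcribed text crosses the channel" requirement is to funnel every outbound event through a single helper that accepts only strings, so raw AVAudioEngine buffers have no path to Dart. A minimal sketch; the `sink` parameter is assumed to be the FlutterEventSink captured by the plugin's stream handler:

```swift
import Flutter

// Only transcribed text and error codes leave the native side —
// there is intentionally no overload that accepts audio buffers.
func emit(_ sink: FlutterEventSink?, type: String, text: String?, errorCode: String? = nil) {
    // Flutter channel calls must be made on the main thread.
    DispatchQueue.main.async {
        sink?([
            "type": type,              // "partial" | "final" | "error"
            "text": text as Any,
            "errorCode": errorCode as Any,
        ])
    }
}
```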

Execution Context

Execution Tier
Tier 1 (540 tasks)

Can start after Tier 0 completes

Implementation Notes

Structure the Swift code as a FlutterPlugin in ios/Classes/SpeechPlugin.swift. Use a single MethodChannel ('com.eircodex.speech/methods') for requestPermission, isAvailable, startRecognition, stopRecognition commands, and a separate EventChannel ('com.eircodex.speech/events') for streaming partial and final results as JSON-encoded maps {type: 'partial'|'final'|'error', text: String?, errorCode: String?}. Ensure AVAudioEngine is stopped before calling recognitionTask.cancel() — reversing this order causes an AVAudioSession error on some iOS versions. Use DispatchQueue.main.async for all Flutter channel calls to avoid threading violations.
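The channel wiring and stop ordering described above can be sketched as a plugin skeleton. Channel names follow the notes; the stop path is abbreviated to show the required order (engine first, then task, then session deactivation). This is a sketch under those assumptions, not a complete implementation:

```swift
import Flutter
import Speech
import AVFoundation

public class SpeechPlugin: NSObject, FlutterPlugin, FlutterStreamHandler {
    private var eventSink: FlutterEventSink?
    private let audioEngine = AVAudioEngine()
    private var recognitionTask: SFSpeechRecognitionTask?

    public static func register(with registrar: FlutterPluginRegistrar) {
        let instance = SpeechPlugin()
        let methods = FlutterMethodChannel(name: "com.eircodex.speech/methods",
                                           binaryMessenger: registrar.messenger())
        registrar.addMethodCallDelegate(instance, channel: methods)
        let events = FlutterEventChannel(name: "com.eircodex.speech/events",
                                         binaryMessenger: registrar.messenger())
        events.setStreamHandler(instance)
    }

    public func handle(_ call: FlutterMethodCall, result: @escaping FlutterResult) {
        switch call.method {
        case "stopRecognition":
            // Order matters: stop the engine BEFORE ending the task —
            // the reverse order triggers AVAudioSession errors on some iOS versions.
            audioEngine.stop()
            audioEngine.inputNode.removeTap(onBus: 0)
            recognitionTask?.finish()
            try? AVAudioSession.sharedInstance()
                .setActive(false, options: .notifyOthersOnDeactivation)
            result(nil)
        default:
            result(FlutterMethodNotImplemented)
        }
    }

    public func onListen(withArguments arguments: Any?,
                         eventSink events: @escaping FlutterEventSink) -> FlutterError? {
        eventSink = events
        return nil
    }

    public func onCancel(withArguments arguments: Any?) -> FlutterError? {
        eventSink = nil
        return nil
    }
}
```

The remaining methods (requestPermission, isAvailable, startRecognition) hang off the same switch; keeping all of them on one MethodChannel matches the single-channel design in the notes.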

For nb-NO locale fallback: check SFSpeechRecognizer.supportedLocales() and log a warning if nb-NO is absent. Handle the case where recognition task produces a final result with isFinal=true inside the result handler rather than only in the completion handler — both paths must trigger the final callback.
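Both notes above can be sketched in a few lines. The locale comparison assumes supportedLocales() uses `nb-NO`-style identifiers (it may need normalization in practice), and the `onEvent` callback is a stand-in for the plugin's real event emitter:

```swift
import Speech

// Prefer nb-NO; warn and fall back to the device locale when absent.
func makeRecognizer() -> SFSpeechRecognizer? {
    let preferred = Locale(identifier: "nb-NO")
    if SFSpeechRecognizer.supportedLocales().contains(preferred) {
        return SFSpeechRecognizer(locale: preferred)
    }
    NSLog("Speech: nb-NO not in supportedLocales(), falling back to device locale")
    return SFSpeechRecognizer(locale: Locale.current)
}

// Final results can arrive in the result handler with result.isFinal == true,
// not only via task completion — both paths must fire the final callback.
func startTask(recognizer: SFSpeechRecognizer,
               request: SFSpeechAudioBufferRecognitionRequest,
               onEvent: @escaping (_ type: String, _ text: String?) -> Void)
               -> SFSpeechRecognitionTask {
    return recognizer.recognitionTask(with: request) { result, error in
        if let result = result {
            onEvent(result.isFinal ? "final" : "partial",
                    result.bestTranscription.formattedString)
        }
        if error != nil {
            onEvent("error", nil)
        }
    }
}
```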

Testing Requirements

Test on a physical iOS device with TestFlight distribution — SFSpeechRecognizer does not function on iOS Simulator. Write flutter_test integration tests using mock MethodChannel handlers that simulate: (1) permission granted response, (2) permission denied response, (3) a sequence of partial events followed by a final event, (4) a network error mid-recognition, (5) stopRecognition called before any result arrives. Verify that AVAudioSession is deactivated after stop by asserting no further EventChannel events arrive. Manual QA must include Norwegian speech input with nb-NO locale on a real device to confirm accurate recognition.

Component
Native Speech API Bridge (infrastructure, medium complexity)
Epic Risks (3)
Risk 1: Technical (high impact, medium probability)

iOS 15 on-device speech recognition has a 1-minute session limit and requires network fallback for longer sessions. Peer mentor way-forward dictation may routinely exceed this limit, causing silent truncation of transcribed content without user feedback.

Mitigation & Contingency

Mitigation: Implement session-chunking logic in NativeSpeechApiBridge that automatically restarts recognition before the limit is reached, preserving continuity via partial concatenation. Document the iOS 15 vs iOS 16 on-device recognition behaviour difference in code comments.
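The session-chunking idea can be sketched as a small controller that restarts recognition before the iOS 15 one-minute limit and concatenates chunk results. The 55-second margin, class name, and hook closures are all assumptions for illustration, not part of the spec:

```swift
import Foundation

// Restarts recognition shortly before the on-device session limit,
// preserving continuity by concatenating each chunk's final text.
final class ChunkedRecognitionController {
    private var accumulated = ""
    private var restartTimer: Timer?
    var startChunk: (() -> Void)?   // hooks into the real bridge
    var stopChunk: (() -> Void)?

    func begin() {
        startChunk?()
        // 55 s leaves headroom under the ~60 s iOS 15 session limit.
        restartTimer = Timer.scheduledTimer(withTimeInterval: 55, repeats: true) { [weak self] _ in
            guard let self = self else { return }
            self.stopChunk?()    // flush the current chunk's final result
            self.startChunk?()   // immediately open the next session
        }
    }

    func chunkFinished(text: String) {
        accumulated += (accumulated.isEmpty ? "" : " ") + text
    }

    func end() -> String {
        restartTimer?.invalidate()
        stopChunk?()
        return accumulated
    }
}
```

Persisting each chunk as it finishes (per the contingency below) means a crash mid-dictation loses at most one chunk.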

Contingency: If chunking causes user-visible interruptions, surface a non-blocking informational banner on iOS 15 devices informing users that very long dictation sessions may need to be broken into segments, and use PartialTranscriptionRepository to persist each chunk immediately.

Risk 2: Scope (high impact, medium probability)

On iOS, speech recognition permission can only be requested once. If the user denies the permission, the app cannot re-request it. A poor first-impression permission flow will permanently disable dictation for those users, impacting the Blindeforbundet blind-user base who rely on dictation most.

Mitigation & Contingency

Mitigation: Design the NativeSpeechApiBridge permission flow to show a clear pre-permission rationale screen before the OS dialog. Implement a graceful degradation path that hides the microphone button and shows a settings deep-link when permission is permanently denied.

Contingency: If users have already denied permission before the rationale screen is added, provide a settings deep-link in DictationScopeGuard's denial message directing users to iOS Settings > Privacy > Speech Recognition to re-enable manually.
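The settings deep-link in both the mitigation and contingency paths is a one-liner on the native side; a minimal sketch (the function name is illustrative):

```swift
import UIKit

// Graceful degradation when speech permission is permanently denied:
// deep-link to this app's page in iOS Settings, where the user can
// re-enable Speech Recognition and Microphone access manually.
func openAppSettings() {
    guard let url = URL(string: UIApplication.openSettingsURLString),
          UIApplication.shared.canOpenURL(url) else { return }
    UIApplication.shared.open(url)
}
```

openSettingsURLString lands on the app-specific settings page rather than the top-level Privacy menu, which shortens the manual re-enable path described above.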

Risk 3: Integration (medium impact, low probability)

The approved field IDs and screen routes configuration in DictationScopeGuard may fall out of sync with the actual report form schema as new fields are added by org administrators, silently blocking dictation on legitimately approved fields.

Mitigation & Contingency

Mitigation: Source the approved field configuration from the same org-field-config-loader used by the report form, rather than a hardcoded list. Add a developer-time assertion that logs a warning when a dictation-eligible field type is rendered but not in the approved routes map.

Contingency: Provide a runtime override mechanism in the scope guard that coordinators or admins can use to temporarily whitelist a field ID while the config is updated, with an automatic expiry.