Priority: critical · Complexity: medium · Category: infrastructure · Status: pending · Owner: infrastructure specialist · Tier: 1

Acceptance Criteria

AndroidSpeechApiBridge extends NativeSpeechApiBridge and overrides all required methods
RECORD_AUDIO permission is declared in AndroidManifest.xml and requested at runtime via ActivityResultContracts.RequestPermission
requestPermission() returns SpeechPermissionResult.granted when RECORD_AUDIO is granted, .denied when the user declines, and .permanentlyDenied when the denial is permanent (shouldShowRequestPermissionRationale returns false after the request)
startRecognition() creates a RecognizerIntent with ACTION_RECOGNIZE_SPEECH, EXTRA_LANGUAGE set to nb-NO, and PARTIAL_RESULTS set to true
Partial results are extracted from onPartialResults bundle key RESULTS_RECOGNITION and delivered to onPartial callback
UNSTABLE_TEXT key from onPartialResults bundle is used for in-progress hypothesis text when available
Final results are extracted from onResults RESULTS_RECOGNITION[0] and delivered to onFinal callback
onError RecognitionListener callback maps SpeechRecognizer error codes to SpeechError enum values: ERROR_NETWORK → networkUnavailable, ERROR_NO_MATCH → noSpeechDetected, ERROR_INSUFFICIENT_PERMISSIONS → permissionDenied, ERROR_AUDIO → audioSessionError, all others → unknown
stopRecognition() calls SpeechRecognizer.stopListening() and destroys the recognizer instance
isAvailable() calls SpeechRecognizer.isRecognitionAvailable(context) and returns the result
No SpeechRecognizer instance remains after stopRecognition() — destroy() is called to prevent resource leak
Plugin compiles and runs on Android API 26+ without crashes
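The recognizer wiring implied by the criteria above can be sketched as follows. The intent extras, bundle keys, and SpeechError mappings come from this spec; the AndroidSpeechRecognizer class name and the onPartial/onFinal/onMappedError callback signatures are placeholders for illustration, not the real NativeSpeechApiBridge contract.

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

class AndroidSpeechRecognizer(
    private val context: Context,
    private val onPartial: (String) -> Unit,
    private val onFinal: (String) -> Unit,
    private val onMappedError: (String) -> Unit, // SpeechError enum value name
) {
    private var recognizer: SpeechRecognizer? = null

    // Must be called on the main thread: SpeechRecognizer requires it.
    fun startRecognition() {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
            putExtra(RecognizerIntent.EXTRA_LANGUAGE, "nb-NO")
            putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
            putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1)
        }
        recognizer = SpeechRecognizer.createSpeechRecognizer(context).also {
            it.setRecognitionListener(listener)
            it.startListening(intent)
        }
    }

    fun stopRecognition() {
        recognizer?.stopListening()
        recognizer?.destroy() // release the microphone; no instance may survive stop
        recognizer = null
    }

    private val listener = object : RecognitionListener {
        override fun onPartialResults(partialResults: Bundle?) {
            partialResults?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onPartial)
        }

        override fun onResults(results: Bundle?) {
            results?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onFinal)
        }

        override fun onError(error: Int) = onMappedError(
            when (error) {
                SpeechRecognizer.ERROR_NETWORK -> "networkUnavailable"
                SpeechRecognizer.ERROR_NO_MATCH -> "noSpeechDetected"
                SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "permissionDenied"
                SpeechRecognizer.ERROR_AUDIO -> "audioSessionError"
                else -> "unknown"
            }
        )

        // Remaining RecognitionListener members are no-ops in this sketch.
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    }
}
```

isAvailable() reduces to a one-liner over the same API: SpeechRecognizer.isRecognitionAvailable(context).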

Technical Requirements

Frameworks
Flutter
Dart
Kotlin
Android SDK
APIs
Flutter MethodChannel
Flutter EventChannel
android.speech.SpeechRecognizer
android.speech.RecognizerIntent
android.speech.RecognitionListener
ActivityResultContracts.RequestPermission
Data Models
NativeSpeechApiBridge
SpeechPermissionResult
SpeechRecognitionEvent
SpeechError
Performance Requirements
Partial results must be forwarded to Flutter within 200ms of arrival in onPartialResults
SpeechRecognizer.destroy() must be called within 1 second of stopRecognition() to release microphone resource
Recognition intent must start within 500ms of startRecognition() being called
Security Requirements
RECORD_AUDIO permission must be checked before every startRecognition() call — do not assume it persists
Audio data must never be written to disk or forwarded over the channel — only transcribed text strings
SpeechRecognizer.destroy() must always be called on stop to prevent background microphone access
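The check-before-every-start requirement is cheap to enforce with a guard at the top of each start call. A minimal sketch, assuming AndroidX core is on the classpath; hasRecordAudioPermission is a hypothetical helper name:

```kotlin
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import androidx.core.content.ContextCompat

// Re-check RECORD_AUDIO immediately before each startRecognition() call:
// the user may have revoked it in Settings since the previous session.
fun hasRecordAudioPermission(context: Context): Boolean =
    ContextCompat.checkSelfPermission(context, Manifest.permission.RECORD_AUDIO) ==
        PackageManager.PERMISSION_GRANTED
```

startRecognition() would call this first and report SpeechError.permissionDenied over the channel instead of starting when it returns false.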

Execution Context

Execution Tier
Tier 1

Tier 1 - 540 tasks

Can start after Tier 0 completes

Implementation Notes

Implement in android/src/main/kotlin/SpeechPlugin.kt as a FlutterPlugin. Use the same MethodChannel and EventChannel names as the iOS implementation for symmetry. SpeechRecognizer must be created on the main thread — use mainLooper when initializing. For permission requests: since ActivityResultContracts requires an Activity, store a reference to the FlutterActivity and use ActivityCompat.requestPermissions with a request code, handling the result in onRequestPermissionsResult forwarded from the Activity.
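The ActivityCompat flow described above can be sketched like this. The request code and function names are placeholders; the granted/denied/permanentlyDenied strings mirror the SpeechPermissionResult values from this spec:

```kotlin
import android.Manifest
import android.app.Activity
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat

// Hypothetical request code owned by the plugin.
private const val RECORD_AUDIO_REQUEST = 7001

fun requestRecordAudio(activity: Activity) {
    ActivityCompat.requestPermissions(
        activity, arrayOf(Manifest.permission.RECORD_AUDIO), RECORD_AUDIO_REQUEST)
}

// Called from the Activity's onRequestPermissionsResult, forwarded to the plugin.
// Returns the SpeechPermissionResult value name, or null if the code is not ours.
fun handlePermissionResult(
    activity: Activity,
    requestCode: Int,
    grantResults: IntArray,
): String? {
    if (requestCode != RECORD_AUDIO_REQUEST) return null
    return when {
        grantResults.firstOrNull() == PackageManager.PERMISSION_GRANTED -> "granted"
        // Denied and the rationale will no longer be shown: "Don't ask again".
        !ActivityCompat.shouldShowRequestPermissionRationale(
            activity, Manifest.permission.RECORD_AUDIO) -> "permanentlyDenied"
        else -> "denied"
    }
}
```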

Alternatively, use the newer ActivityPluginBinding approach with ActivityResultLauncher. The UNSTABLE_TEXT key is only available in API 23+ — add a null check. Be aware that Android's built-in SpeechRecognizer has a ~60-second timeout and will auto-stop — handle this in onEndOfSpeech and treat it as a final result. For nb-NO: EXTRA_LANGUAGE='nb-NO', EXTRA_LANGUAGE_MODEL=LANGUAGE_MODEL_FREE_FORM, EXTRA_MAX_RESULTS=1.
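The UNSTABLE_TEXT null check can be folded into a small helper. extractPartialText is a hypothetical name; the string key is an undocumented extra populated by Google's recognition service, not an SDK constant, so the fallback path matters:

```kotlin
import android.os.Bundle
import android.speech.SpeechRecognizer

// Undocumented key used by Google's recognizer for in-progress hypothesis text.
private const val EXTRA_UNSTABLE_TEXT = "android.speech.extra.UNSTABLE_TEXT"

// Combine the stable prefix with the unstable hypothesis; degrades to stable
// text alone when the key is absent (older API levels or other engines).
fun extractPartialText(partialResults: Bundle): String {
    val stable = partialResults.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
        ?.firstOrNull().orEmpty()
    val unstable = partialResults.getStringArrayList(EXTRA_UNSTABLE_TEXT)
        ?.firstOrNull().orEmpty()
    return stable + unstable
}
```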

Testing Requirements

Write flutter_test integration tests using mock MethodChannel handlers that simulate all RecognitionListener callback paths: (1) onReadyForSpeech → onPartialResults with nb-NO text → onResults (happy path), (2) onError with ERROR_NETWORK, (3) onError with ERROR_NO_MATCH, (4) onError with ERROR_INSUFFICIENT_PERMISSIONS, (5) stopRecognition called during active recognition. Verify via mock call-log assertions that destroy() is called after stop. Run on an Android API 30+ emulator with Google Play services so recognition is available. Manual QA on a physical device with Norwegian speech input must confirm nb-NO locale accuracy.

Component
Native Speech API Bridge
Category: infrastructure · Complexity: medium
Epic Risks (3)
Impact: high · Probability: medium · Type: technical

iOS 15 on-device speech recognition has a 1-minute session limit and requires network fallback for longer sessions. Peer mentor way-forward dictation may routinely exceed this limit, causing silent truncation of transcribed content without user feedback.

Mitigation & Contingency

Mitigation: Implement session-chunking logic in NativeSpeechApiBridge that automatically restarts recognition before the limit is reached, preserving continuity via partial concatenation. Document the iOS 15 vs iOS 16 on-device recognition behaviour difference in code comments.

Contingency: If chunking causes user-visible interruptions, surface a non-blocking informational banner on iOS 15 devices informing users that very long dictation sessions may need to be broken into segments, and use PartialTranscriptionRepository to persist each chunk immediately.

Impact: high · Probability: medium · Type: scope

On iOS, speech recognition permission can only be requested once. If the user denies the permission, the app cannot re-request it. A poor first-impression permission flow will permanently disable dictation for those users, impacting the Blindeforbundet blind-user base who rely on dictation most.

Mitigation & Contingency

Mitigation: Design the NativeSpeechApiBridge permission flow to show a clear pre-permission rationale screen before the OS dialog. Implement a graceful degradation path that hides the microphone button and shows a settings deep-link when permission is permanently denied.

Contingency: If users have already denied permission before the rationale screen is added, provide a settings deep-link in DictationScopeGuard's denial message directing users to iOS Settings > Privacy > Speech Recognition to re-enable manually.

Impact: medium · Probability: low · Type: integration

The approved field IDs and screen routes configuration in DictationScopeGuard may fall out of sync with the actual report form schema as new fields are added by org administrators, silently blocking dictation on legitimately approved fields.

Mitigation & Contingency

Mitigation: Source the approved field configuration from the same org-field-config-loader used by the report form, rather than a hardcoded list. Add a developer-time assertion that logs a warning when a dictation-eligible field type is rendered but not in the approved routes map.

Contingency: Provide a runtime override mechanism in the scope guard that coordinators or admins can use to temporarily whitelist a field ID while the config is updated, with an automatic expiry.