speech_to_text Flutter Package (iOS SFSpeechRecognizer / Android SpeechRecognizer)
Third Party Library Integration by Flutter Community
Description
The speech_to_text Flutter package wraps iOS SFSpeechRecognizer and Android SpeechRecognizer to enable hands-free report dictation for peer mentors. Dictation is used post-session for writing activity notes and report fields, not during the session itself: recording during sensitive conversations was explicitly rejected by Norges Blindeforbund. This reduces the reporting barrier for users with motor impairments.
Detailed Analysis
Speech-to-text dictation directly addresses a key accessibility requirement identified by Norges Blindeforbund: reducing the reporting burden for peer mentors with motor impairments or visual disabilities. By enabling hands-free dictation for post-session activity notes and report fields, the app lowers the barrier to accurate and timely reporting — improving data quality for coordinators and reducing the risk of late or incomplete session records. Critically, dictation is scoped exclusively to post-session reporting; Norges Blindeforbund explicitly rejected microphone use during sensitive peer conversations, and the DictationScopeGuard component enforces this boundary technically. The integration uses iOS SFSpeechRecognizer and Android SpeechRecognizer via the open-source speech_to_text Flutter package, meaning there is no per-use licensing cost.
On-device recognition is prioritised, so audio never leaves the device for the majority of users, addressing privacy concerns without additional infrastructure. This is a zero-cost, high-accessibility-impact capability with a strong ethical foundation aligned to HLF's mission.
The speech-to-text integration spans nine components including the SpeechToTextAdapter, DictationMicrophoneButton, TranscriptionPreviewField, TranscriptionStateManager, DictationScopeGuard, and NativeSpeechApiBridge. Integration requires platform-specific permission declarations: NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription in iOS Info.plist, and RECORD_AUDIO in the Android Manifest. These must be reviewed and approved as part of App Store and Play Store submission. Key test scenarios include: permission grant and denial flows on both platforms, on-device recognition availability on iOS 13+ and Android 5.0+, partial transcription behaviour when a user navigates away mid-dictation, and DictationScopeGuard enforcement ensuring dictation cannot be activated during active mentor conversations.
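For reference, the permission declarations described above take the following form (the usage-description strings are illustrative placeholders, not the app's actual copy):

```xml
<!-- iOS: ios/Runner/Info.plist -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>Used to transcribe your dictated post-session report notes.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Used to record your voice while you dictate report notes.</string>
```

```xml
<!-- Android: android/app/src/main/AndroidManifest.xml -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

Both stores review these strings at submission time, so they should explain the post-session-only scope in user-facing language.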
No external credentials or accounts are required, so setup complexity is low compared to the other integrations. Locale configuration for Norwegian (nb_NO) should be validated against both platform speech engines. Ongoing maintenance is minimal: both platform SDKs are OS-managed, and the Flutter package is community-maintained with stable versioning.
The integration wraps iOS SFSpeechRecognizer and Android SpeechRecognizer via the speech_to_text Flutter package (^6.0.0). No API keys or server-side credentials are required — authentication is entirely via OS-level permission grants (NSSpeechRecognitionUsageDescription, NSMicrophoneUsageDescription on iOS; RECORD_AUDIO on Android). On-device recognition is preferred via iOS on-device mode (iOS 13+) and Android SpeechRecognizer, avoiding audio upload to third-party servers. The NativeSpeechApiBridge abstracts platform differences; TranscriptionStateManager handles recognition lifecycle including partial result streaming (< 500ms first partial result target), silence timeout, and graceful interruption.
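As a sketch of how the NativeSpeechApiBridge might drive the package (the `SpeechToText` calls follow the speech_to_text ~6.x API and should be checked against the pinned version; everything else, including the function name and callback wiring, is an assumption):

```dart
import 'package:speech_to_text/speech_to_text.dart';

final SpeechToText _speech = SpeechToText();

/// Hypothetical entry point used by the dictation UI.
Future<bool> startDictation(void Function(String) onTranscript) async {
  // initialize() triggers the OS permission prompts on first use and
  // returns false if recognition is unavailable on this device.
  final available = await _speech.initialize();
  if (!available) return false; // caller hides the dictation button

  // Validate that the device's engine actually offers Norwegian.
  final locales = await _speech.locales();
  final hasNb = locales.any((l) => l.localeId.startsWith('nb'));
  if (!hasNb) return false; // or fall back to the device default locale

  await _speech.listen(
    localeId: 'nb_NO',
    partialResults: true, // stream partials toward the < 500ms target
    onDevice: true,       // prefer on-device recognition for privacy
    onResult: (result) => onTranscript(result.recognizedWords),
  );
  return true;
}
```

The `onDevice: true` flag requests on-device recognition where the platform supports it (iOS 13+), which is what keeps audio off third-party servers for the majority of users.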
The DictationScopeGuard is a critical safety component that prevents the microphone from activating during active peer conversations — enforced at the service layer. When speech recognition is unavailable on a device, the SpeechToTextFieldOverlay hides the dictation button entirely rather than showing an error. Partial transcriptions are preserved if the user navigates away mid-dictation. Locale is configurable to nb_NO for Norwegian.
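The source does not show DictationScopeGuard's implementation; a minimal sketch of the service-layer check it is described as enforcing might look like this (the session states and method names are assumptions for illustration):

```dart
/// Hypothetical session states the app might track.
enum SessionState { idle, activeConversation, postSessionReporting }

class DictationScopeGuard {
  SessionState _state = SessionState.idle;

  void updateState(SessionState state) => _state = state;

  /// Dictation is permitted only outside active peer conversations.
  bool get dictationAllowed => _state != SessionState.activeConversation;

  /// Service-layer gate: dictation callers must pass through here
  /// before any call to the speech recognizer is made.
  void assertDictationAllowed() {
    if (!dictationAllowed) {
      throw StateError('Dictation is blocked during active conversations');
    }
  }
}
```

Enforcing the rule in the service layer, rather than only hiding the button in the UI, means no code path can reach the recognizer during a conversation even if a screen forgets the check.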
The one-minute limit on iOS server-based recognition is avoided by preferring on-device recognition; for longer dictations, session-restart logic should still be implemented. Screen reader announcements on recording start and stop are required for accessibility compliance.
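One way to implement the session-restart logic mentioned above is to re-invoke `listen()` whenever a recognition segment ends while the user is still dictating, accumulating final results across segments. This is a hedged sketch: the `'done'` status string matches the package's documented status callback values, but the class and its wiring are assumptions.

```dart
import 'package:speech_to_text/speech_to_text.dart';

/// Hypothetical wrapper for dictations longer than one segment.
class LongDictationSession {
  final SpeechToText speech;
  final StringBuffer _transcript = StringBuffer();
  bool _userStillDictating = false;

  LongDictationSession(this.speech);

  Future<void> start() async {
    _userStillDictating = true;
    await speech.initialize(onStatus: _onStatus);
    await _listenSegment();
  }

  Future<void> _listenSegment() => speech.listen(
        localeId: 'nb_NO',
        onDevice: true,
        onResult: (r) {
          // Only append final results; partials are shown in the UI.
          if (r.finalResult) _transcript.write('${r.recognizedWords} ');
        },
      );

  void _onStatus(String status) {
    // When a segment ends (recognizer limit or silence timeout),
    // immediately start a new one if the user has not stopped.
    if (status == 'done' && _userStillDictating) _listenSegment();
  }

  Future<String> stop() async {
    _userStillDictating = false;
    await speech.stop();
    return _transcript.toString().trim();
  }
}
```

A brief gap between segments is unavoidable with this approach, so the UI should keep the partial transcript visible so users can see nothing was lost.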
Using Components (9)
Dependencies (4)
Authentication
| Type | None |
| Requirements | NSSpeechRecognitionUsageDescription in iOS Info.plist, NSMicrophoneUsageDescription in iOS Info.plist, RECORD_AUDIO permission in Android Manifest |
| Scopes | microphone, speech_recognition |
Configuration
Error Handling
Monitoring
Performance
| Latency | < 500ms for first partial result display |
| Availability | On-device recognition preferred for offline availability |
Cost Implications
| Pricing Model | Free open-source package; iOS on-device recognition is free |