Implement cursor-position merging of final transcription

epic-speech-to-text-input-user-interface-task-007 — Add the cursor-position-aware merge logic to TranscriptionPreviewField. When a final transcription result arrives, insert the confirmed text at the current cursor position in the existing field content, preserving all pre-existing text before and after the insertion point. Dismiss the preview area after merge. Handle edge cases: cursor at start, cursor at end, text selected (replace selection with transcription).

high priority | medium complexity | frontend | pending | frontend specialist | Tier 1

Acceptance Criteria

When a final transcription result arrives and cursor is mid-text, the confirmed text is inserted at the cursor position — text before cursor is preserved, text after cursor is preserved

When cursor is at the start of the field (offset 0), confirmed text is prepended and the existing text follows it unchanged

When cursor is at the end of the field, confirmed text is appended

When there is an active text selection (selection.start != selection.end), the confirmed text replaces the selected range — no selected text remains after merge

After merge, cursor is placed immediately after the last inserted character (after the trailing space, if one was added)

Preview area is dismissed (hidden, zero height) synchronously with merge

A single trailing space is appended after inserted text when the character immediately following the insertion point is neither whitespace nor punctuation, to prevent word collision; no space is added when the insertion point is at the end of the field

Merge operation is undoable via the platform's standard undo gesture (Ctrl+Z / device shake) — the TextEditingController value before merge is pushed to the undo history

No partial transcription text is ever committed to the field — only final results trigger a merge

Merge completes within one frame — no async gaps between final result arrival and field update

Widget test covers all five cursor/selection scenarios: start, end, mid-text, selection with text, empty field

Technical Requirements

frameworks

Flutter

Riverpod

apis

TextEditingController (value, selection, text)

TranscriptionStateManager provider (final result stream)

Flutter UndoHistoryController (Flutter 3.7+)

performance requirements

Merge must complete within a single frame — no await calls in the merge code path

TextEditingController.value set exactly once per merge to avoid double-rebuild
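
One way to meet both performance requirements is to do all the work inside a single listener callback that contains no `await` and assigns `controller.value` once. The sketch below is illustrative only: the state classes, the `TranscriptionStateManager` notifier, the `MergeOnFinalResult` widget and the injected `merge` callback are hypothetical stand-ins, since the real provider's shape is not specified here. It assumes Riverpod 2.x or later (`Notifier`, `ref.listenManual`).

```dart
import 'package:flutter/material.dart';
import 'package:flutter_riverpod/flutter_riverpod.dart';

// Hypothetical state model; the real TranscriptionStateManager may differ.
abstract class TranscriptionState {
  const TranscriptionState();
}

class TranscriptionIdle extends TranscriptionState {
  const TranscriptionIdle();
}

class TranscriptionPartial extends TranscriptionState {
  const TranscriptionPartial(this.text);
  final String text;
}

class TranscriptionComplete extends TranscriptionState {
  const TranscriptionComplete(this.text);
  final String text;
}

// Hypothetical notifier standing in for the real state manager.
class TranscriptionStateManager extends Notifier<TranscriptionState> {
  @override
  TranscriptionState build() => const TranscriptionIdle();

  void emit(TranscriptionState next) => state = next;
}

final transcriptionStateManagerProvider =
    NotifierProvider<TranscriptionStateManager, TranscriptionState>(
  TranscriptionStateManager.new,
);

// The merge itself is injected so this sketch stays independent of it.
typedef MergeFn = TextEditingValue Function(
    TextEditingValue current, String finalText);

class MergeOnFinalResult extends ConsumerStatefulWidget {
  const MergeOnFinalResult(
      {super.key, required this.controller, required this.merge});

  final TextEditingController controller;
  final MergeFn merge;

  @override
  ConsumerState<MergeOnFinalResult> createState() =>
      _MergeOnFinalResultState();
}

class _MergeOnFinalResultState extends ConsumerState<MergeOnFinalResult> {
  late final ProviderSubscription<TranscriptionState> _subscription;

  @override
  void initState() {
    super.initState();
    _subscription =
        ref.listenManual(transcriptionStateManagerProvider, (previous, next) {
      // Partial results are ignored; only a final result reaches the field.
      if (next is TranscriptionComplete) {
        // No await anywhere in this path, and exactly one assignment.
        widget.controller.value =
            widget.merge(widget.controller.value, next.text);
      }
    });
  }

  @override
  void dispose() {
    _subscription.close();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) =>
      TextField(controller: widget.controller);
}
```

Injecting `merge` keeps the widget thin and lets the merge rules be unit tested without pumping a widget tree.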

security requirements

Final transcription text must be sanitized to remove any null bytes or control characters before insertion into the field

Merged text must not be auto-submitted to any backend — it remains local field content until user explicitly saves
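
The sanitization requirement could be met with a small pure function along these lines. This is a sketch under assumptions: the name `sanitizeTranscript` and the `keepNewlines` flag are hypothetical, and "control characters" is read here as the C0 range (U+0000 to U+001F), DEL (U+007F) and the C1 range (U+0080 to U+009F). Whether line breaks should survive in multi-line fields is not stated in the requirement, so it is left as an opt-in.

```dart
/// Removes null bytes and control characters from [raw].
///
/// Assumption: "control character" means C0, DEL and C1. Newlines are
/// stripped too unless [keepNewlines] is true.
String sanitizeTranscript(String raw, {bool keepNewlines = false}) {
  final out = StringBuffer();
  // Iterate by code point so characters outside the BMP (e.g. emoji)
  // are copied whole rather than as separate surrogate halves.
  for (final rune in raw.runes) {
    final isControl = rune <= 0x1F || (rune >= 0x7F && rune <= 0x9F);
    final isAllowedNewline = keepNewlines && rune == 0x0A;
    if (isControl && !isAllowedNewline) continue;
    out.writeCharCode(rune);
  }
  return out.toString();
}

void main() {
  print(sanitizeTranscript('he\u0000llo\u0007 world')); // hello world
  print(sanitizeTranscript('one\ntwo')); // onetwo
  print(sanitizeTranscript('one\ntwo', keepNewlines: true) == 'one\ntwo'); // true
}
```

Running the sanitizer before the merge keeps the merge function itself free of input-cleaning concerns.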

ui components

TranscriptionPreviewField (component 658)

TextEditingController

UndoHistoryController (optional, for undo stack integration)

Execution Context

Execution Tier

Tier 1

Tier 1 - 540 tasks

Can start after Tier 0 completes

Implementation Notes

Implement the merge logic as a pure static method for testability: it takes `TextEditingValue current` and `String finalText` and returns a new `TextEditingValue`. Inside: `final before = current.text.substring(0, current.selection.start); final after = current.text.substring(current.selection.end); final merged = before + finalText + trailingSpace + after; final newOffset = before.length + finalText.length + trailingSpace.length;` Guard against an invalid selection first: a controller whose field has not yet been focused can report `selection.start == -1`, which makes `substring` throw, so treat `!current.selection.isValid` as cursor-at-end. Then set `controller.value = TextEditingValue(text: merged, selection: TextSelection.collapsed(offset: newOffset))` exactly once. For undo support, avoid assigning `undoController.value = UndoHistoryValue(canUndo: true)` by hand: as far as the API documents it, that value reports whether undo is available rather than pushing an entry onto the stack. Instead, pass an `UndoHistoryController` to the field (Flutter 3.7+) and confirm in a widget test that the programmatic `controller.value` change is recorded and that undo restores the pre-merge value; do not assume this without the test. Listen for final results with `ref.listenManual(transcriptionStateManagerProvider, (prev, next) { if (next is TranscriptionComplete) _mergeFinalText(next.text); })` in `initState`, keep the returned `ProviderSubscription`, and close it in `dispose`; plain `ref.listen` is meant to be called from `build`.
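
A minimal sketch of the pure merge function described above, using the `TextEditingValue`-based signature. The set of characters in `_noSpaceBefore` is an assumption (the task does not define "punctuation" precisely), as is the choice to treat an invalid selection as cursor-at-end and to leave the value untouched when the transcription is empty.

```dart
import 'package:flutter/services.dart'; // TextEditingValue, TextSelection

// Assumed set: whitespace and common punctuation that should not be
// preceded by an automatically inserted space.
const _noSpaceBefore = ' \t\n.,;:!?)]}"\'';

/// Inserts [finalText] at the cursor of [current], replacing any selection.
TextEditingValue mergeAtCursor(TextEditingValue current, String finalText) {
  if (finalText.isEmpty) return current;

  final text = current.text;
  final selection = current.selection;
  // An invalid selection (offset -1) is treated as cursor-at-end (assumption).
  final start = selection.isValid ? selection.start : text.length;
  final end = selection.isValid ? selection.end : text.length;

  final before = text.substring(0, start);
  final after = text.substring(end);

  // Add one space only when the next character would collide with the
  // inserted word; never at the end of the field.
  final needsSpace = after.isNotEmpty && !_noSpaceBefore.contains(after[0]);
  final inserted = needsSpace ? '$finalText ' : finalText;

  return TextEditingValue(
    text: before + inserted + after,
    // Cursor lands after the inserted text, including the trailing space.
    selection: TextSelection.collapsed(offset: before.length + inserted.length),
  );
}
```

For example, merging `'big'` into `'Hello world'` with the cursor at offset 6 yields `'Hello big world'` with the cursor at offset 10. Offsets are UTF-16 code units, as elsewhere in `TextEditingValue`, so the function relies on the incoming selection sitting on a valid character boundary.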

Testing Requirements

Unit tests: Extract the merge logic into a pure function `mergeAtCursor(String existingText, TextSelection selection, String insertText) -> TextEditingValue` (or the equivalent `TextEditingValue`-based signature from the Implementation Notes; pick one and use it consistently) and test all five cursor scenarios: empty field, cursor at start, cursor at end, cursor mid-text, active selection. Assert output text and cursor position for each. Assert trailing space logic (space added when next char is a letter, not added when next char is punctuation or end of string).

Widget tests: Wire TranscriptionPreviewField with a mock provider that emits a final result; assert TextEditingController.text and selection.baseOffset after merge. Assert preview area is gone after merge. Assert undo restores pre-merge state.

Manual tests: Dictate into a field with existing text at various cursor positions; verify no text loss. Test undo on both iOS (shake to undo) and Android.
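
The unit-test scenarios could be laid out as a table so each case reads as one line. The sketch below is a suggestion rather than the project's test file: `MergeCase`, `mergeCases` and `runMergeScenarios` are hypothetical names, and the function under test is passed in using the three-argument signature given in the unit-test description.

```dart
import 'package:flutter/services.dart';
import 'package:flutter_test/flutter_test.dart';

typedef MergeFn = TextEditingValue Function(
    String existingText, TextSelection selection, String insertText);

class MergeCase {
  const MergeCase(this.name, this.text, this.selection, this.insert,
      this.expectedText, this.expectedOffset);

  final String name;
  final String text;
  final TextSelection selection;
  final String insert;
  final String expectedText;
  final int expectedOffset;
}

const mergeCases = <MergeCase>[
  MergeCase('empty field', '', TextSelection.collapsed(offset: 0),
      'hello', 'hello', 5),
  MergeCase('cursor at start', 'world', TextSelection.collapsed(offset: 0),
      'hello', 'hello world', 6),
  MergeCase('cursor at end', 'hello ', TextSelection.collapsed(offset: 6),
      'world', 'hello world', 11),
  MergeCase('cursor mid-text', 'Hello world',
      TextSelection.collapsed(offset: 6), 'big', 'Hello big world', 10),
  MergeCase('active selection', 'Hello cruel world',
      TextSelection(baseOffset: 6, extentOffset: 11),
      'kind', 'Hello kind world', 10),
  MergeCase('no space before punctuation', 'Dear !',
      TextSelection.collapsed(offset: 5), 'John', 'Dear John!', 9),
];

/// Registers one test per scenario against the supplied [merge] function.
void runMergeScenarios(MergeFn merge) {
  for (final c in mergeCases) {
    test(c.name, () {
      final result = merge(c.text, c.selection, c.insert);
      expect(result.text, c.expectedText);
      expect(result.selection.isCollapsed, isTrue);
      expect(result.selection.baseOffset, c.expectedOffset);
    });
  }
}
```

Call `runMergeScenarios` from the test file's `main` with the real implementation. If that implementation takes a `TextEditingValue` instead, a one-line adapter that builds the value from `existingText` and `selection` is enough.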

Component

Transcription Preview Field

ui | medium

Dependencies (1)

epic-speech-to-text-input-user-interface-task-006: Create the TranscriptionPreviewField Flutter widget that wraps a standard TextField and displays live partial transcription results in a visually distinct preview area below the cursor. The widget must observe TranscriptionStateManager for partial results and update the preview text in real time. Support all standard OS text editing gestures (tap, long-press, drag-to-select) on the underlying field.

Epic Risks (3)

medium impact | medium probability | technical

Merging dictated text at the current cursor position in a TextField that already contains user-typed content is non-trivial in Flutter — TextEditingController cursor offsets can behave unexpectedly with IME composition, emoji, or RTL characters, potentially corrupting the user's existing notes.

Mitigation & Contingency

Mitigation: Implement the merge logic using TextEditingController.value replacement with explicit selection range calculation rather than direct text manipulation. Write targeted widget tests covering edge cases: cursor at start, cursor at end, cursor mid-word, existing content with emoji, and content that was modified during an active partial-results stream.

Contingency: If cursor-position merging proves too fragile for the initial release, scope the merge behaviour to always append dictated text at the end of the existing field content and add the cursor-position insertion as a follow-on task after the feature is in TestFlight with real user feedback.

high impact | medium probability | technical

VoiceOver on iOS and TalkBack on Android handle rapid sequential live region announcements differently. If recording start, partial-result, and recording-stop announcements arrive within a short window, they may queue, overlap, or be dropped, leaving screen reader users without critical state information.

Mitigation & Contingency

Mitigation: Implement announcement queuing in AccessibilityLiveRegionAnnouncer with a minimum inter-announcement delay and priority ordering (assertive recording start/stop always takes precedence over polite partial-result updates). Test announcement behaviour on physical iOS and Android devices with VoiceOver/TalkBack enabled as part of the acceptance test plan.

Contingency: If platform differences make reliable queuing impossible, reduce partial-result announcements to a single 'transcription updating' message with debouncing, preserving the critical start/stop announcements. Coordinate with the screen-reader-support feature team to leverage the existing SemanticsServiceFacade patterns already established in the codebase.

medium impact | low probability | integration

The DictationMicrophoneButton must integrate with the dynamic-field-renderer which generates form fields from org-specific schemas at runtime. If the renderer does not expose a stable field metadata API for dictation eligibility checks, the scope guard and button visibility logic will require invasive changes to the report form architecture.

Mitigation & Contingency

Mitigation: Coordinate with the post-session report feature team early in the epic to confirm that dynamic-field-renderer exposes a field metadata interface including field type and sensitivity flags. Add a dictation_eligible flag to the field schema that the renderer passes to DictationMicrophoneButton as a constructor parameter.

Contingency: If the renderer cannot be modified without breaking changes, implement dictation eligibility as a separate lookup against org-field-config-loader using the field key as the lookup identifier, bypassing the renderer integration and keeping the dictation components fully decoupled from the report form architecture.
