epic-organizational-hierarchy-management-duplicate-detection-task-006 - Implementation Task | Likepersonsapp

high priority high complexity infrastructure pending infrastructure specialist Tier 4

Acceptance Criteria

Trigger `after_activity_insert_check_duplicates` fires AFTER INSERT on activities FOR EACH ROW

Trigger function calls `get_cross_hierarchy_activity_candidates(NEW.id, config.time_window_hours)` to get candidates

For each candidate with similarity_score >= config.similarity_threshold, inserts a row into suspected_duplicates with detection_method = 'trigger'

Canonical pair ordering enforced: activity_id_a = LEAST(NEW.id, candidate_id), activity_id_b = GREATEST(NEW.id, candidate_id)

Uses ON CONFLICT DO NOTHING on suspected_duplicates insert to safely handle race conditions (e.g., two activities inserted nearly simultaneously)

Reads organization config from duplicate_detection_config; if no config exists for the org, uses hardcoded defaults (24h window, 0.8 threshold)

Trigger execution time must not significantly delay the original INSERT — if the trigger function raises an exception, the exception is caught and logged to a `trigger_errors` log table rather than rolling back the activity insert

Trigger is disabled by default for organizations that have opted out (is_active = false in duplicate_detection_config)

Migration includes `CREATE OR REPLACE FUNCTION` and `CREATE TRIGGER IF NOT EXISTS` (or DROP/CREATE pattern for idempotency)

Technical Requirements

frameworks

PostgreSQL 15+ PL/pgSQL triggers

apis

get_cross_hierarchy_activity_candidates()

calculate_activity_similarity()

duplicate_detection_config table

data models

activities

suspected_duplicates

duplicate_detection_config

performance requirements

Trigger must complete in under 200ms for p95 to avoid noticeable latency on activity submission from Flutter

If no config exists for the org (first insert), the config lookup must short-circuit to defaults without a table scan

security requirements

Trigger function runs with SECURITY DEFINER to access suspected_duplicates and config tables regardless of the inserting user's permissions

SECURITY DEFINER function must be owned by a dedicated service role, not a superuser

Exceptions caught inside trigger must log minimal info (activity id, error message) — no PII in trigger_errors table

Execution Context

Execution Tier

Tier 4

Tier 4 - 323 tasks

Can start after Tier 3 completes

View Full Execution Plan

Implementation Notes

Structure the trigger function as: (1) DECLARE config record; (2) SELECT config into variable, using COALESCE for defaults; (3) early RETURN NEW if is_active = false; (4) wrap the candidate loop in a BEGIN...EXCEPTION WHEN OTHERS THEN block that inserts into trigger_errors; (5) FOR candidate IN SELECT * FROM get_cross_hierarchy_activity_candidates(NEW.id, config.time_window_hours) LOOP — check candidate.similarity_score >= config.similarity_threshold before inserting. Keep the trigger function focused and short — all heavy logic lives in `get_cross_hierarchy_activity_candidates()` and `calculate_activity_similarity()`. Consider adding a `pg_notify('duplicate_detected', json_build_object('org_id', NEW.organization_id, 'pair_id', inserted_id)::text)` call after each insert so Flutter clients subscribed to Supabase Realtime receive live admin queue updates without polling.

Testing Requirements

Integration tests (run against a real Supabase test project or local Supabase CLI): (1) insert two activities for the same participant in the same org from different units within the time window → verify a 'pending' row appears in suspected_duplicates within 1 second; (2) insert two activities for different participants → no suspected_duplicates row; (3) insert activity for an org with is_active = false → no suspected_duplicates row; (4) simulate trigger function throwing an exception (e.g. by temporarily dropping a dependency) → verify the activity INSERT still succeeds and an error row is written to trigger_errors; (5) concurrent inserts of two near-duplicate activities → only one suspected_duplicates row (ON CONFLICT DO NOTHING test). Flutter widget test: after submitting an activity via the UI, check the suspected_duplicates stream in the admin BLoC fires.

Component

Duplicate Activity Detector

infrastructure high

Dependencies (1)

Implement a SQL query or Postgres function that retrieves activities submitted by coordinators at different unit levels within the same organization hierarchy for the same participant. Must join organization_units and unit_assignments to identify activities registered under different chapter headings for the same physical event. epic-organizational-hierarchy-management-duplicate-detection-task-005

Epic Risks (3)

medium impact high prob technical

Fingerprint-based similarity matching may produce high false-positive rates for common activity types (e.g., weekly group sessions with the same participants), causing alert fatigue among coordinators and undermining trust in the detection system.

Mitigation & Contingency

Mitigation: Start with conservative, high-confidence thresholds (exact peer mentor match + same date + same activity type) before adding looser fuzzy matching. Allow NHF administrators to tune thresholds based on observed false-positive rates. Log all detection decisions for retrospective threshold calibration.

Contingency: Introduce a snooze mechanism allowing coordinators to dismiss false positives for a configurable period. Track dismissal rates per activity type and automatically raise the similarity threshold for activity types with high dismissal rates.

medium impact medium prob technical

A database trigger on the activities insert path adds synchronous overhead to every activity registration. For HLF peer mentors with 380 annual registrations or coordinators doing bulk proxy registration, this could create perceptible latency or lock contention.

Mitigation & Contingency

Mitigation: Implement the trigger as a DEFERRED constraint trigger (fires after the transaction commits) or replace it with a LISTEN/NOTIFY pattern that queues detection work asynchronously via an Edge Function, completely decoupling detection from the registration write path.

Contingency: Disable the synchronous trigger entirely and rely solely on the scheduled Edge Function for batch detection. Accept a detection delay of up to the scheduling interval (e.g., 15 minutes) in exchange for zero impact on registration latency.

medium impact medium prob dependency

The duplicate detection logic must be validated and approved by NHF before go-live, including agreement on threshold values and the review workflow. NHF stakeholder availability for sign-off may delay this epic's release independently of technical readiness.

Mitigation & Contingency

Mitigation: Gate the feature behind the NHF-specific feature flag so technical deployment can proceed independently of business approval. Involve an NHF administrator in threshold calibration sessions during QA, reducing the formal sign-off surface to policy and workflow rather than technical details.

Contingency: Release the detection system in 'silent mode' — flagging duplicates internally without surfacing notifications to coordinators — until NHF approves the workflow. Use the silent period to collect real data on false-positive rates and refine thresholds before activating notifications.

Quick Links

All Tasks Execution Plan

Implement Supabase trigger for new activities

Acceptance Criteria

Technical Requirements

Execution Context

Implementation Notes

Testing Requirements