Proxy Duplicate Detector
Component Detail
Description
Checks each target peer mentor in a bulk or proxy session against existing activity records to detect potential duplicates before insertion. A duplicate is defined as the same peer_mentor_id, activity_type, and date within the same coordinator's current session, aligning with NHF's duplicate-detection requirements.
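The three-field duplicate definition above can be sketched as a composite key. This is only an illustration: the `DuplicateKeyFields` shape, the `duplicateKey` and `toCalendarDay` helpers, and the assumption that dates are ISO 8601 strings compared at calendar-day granularity are all hypothetical and not taken from the NHF codebase.

```typescript
// Hypothetical shape: only the three fields named in the duplicate definition.
interface DuplicateKeyFields {
  peerMentorId: string;
  activityType: string;
  date: string; // assumed ISO 8601, e.g. "2024-03-01" or "2024-03-01T09:30:00Z"
}

// Assumption: duplicates are compared per calendar day, so any time-of-day
// suffix is dropped. This deliberately ignores time zone conversion.
function toCalendarDay(isoDate: string): string {
  return isoDate.slice(0, 10);
}

// Serialising the fields as a JSON array avoids the collisions a plain
// delimiter could cause if an ID or type ever contained that delimiter.
function duplicateKey(fields: DuplicateKeyFields): string {
  return JSON.stringify([
    fields.peerMentorId,
    fields.activityType,
    toCalendarDay(fields.date),
  ]);
}
```

Two records are then potential duplicates when their keys are equal. Scoping the comparison to the coordinator's current session is left to the caller in this sketch.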
proxy-duplicate-detector
Summaries
The Proxy Duplicate Detector safeguards NHF's activity data from accidental duplication, a compliance and reporting risk that would undermine the accuracy of program metrics used for funding and evaluation purposes. By automatically checking each mentor in a bulk submission against existing records before any data is written, the system surfaces conflicts to the coordinator in real time rather than after the fact. This prevents inflated activity counts, reduces the administrative effort of cleaning up erroneous records, and supports the organization's commitment to accurate reporting. Coordinators retain override capability for legitimate edge cases, balancing data quality enforcement with operational flexibility and ensuring no valid activity is blocked unnecessarily.
The Proxy Duplicate Detector is a medium-complexity service component with a single dependency on the proxy-activity-repository, making it a relatively self-contained deliverable. However, its correctness is critical to the bulk registration flow: incorrect duplicate flagging, whether false positives or missed duplicates, directly affects coordinator trust and data quality.
Testing must cover the three-field duplicate key (peer_mentor_id, activity_type, date) across varying batch sizes, including edge cases such as the same mentor appearing twice in one submission. The override flow also requires coordination with the UI to confirm the interaction contract. This component should be completed and validated before the Bulk Registration Service integration begins, because it sits on the critical dependency path for that higher-complexity feature.
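One way to exercise the "same mentor appearing twice in one submission" edge case is to check the incoming batch for repeats before any repository lookup. Because a bulk submission shares one activity type and date, a repeated mentor ID is already a duplicate within the batch. The helper name `findRepeatedMentorIds` is hypothetical; the source does not say where this check lives.

```typescript
// Returns each mentor ID that occurs more than once in the batch,
// in the order in which its first repeat was encountered.
function findRepeatedMentorIds(mentorIds: readonly string[]): string[] {
  const seen = new Set<string>();
  const repeated = new Set<string>();
  for (const id of mentorIds) {
    if (seen.has(id)) {
      repeated.add(id);
    } else {
      seen.add(id);
    }
  }
  return Array.from(repeated);
}
```

A test suite could feed batches of different sizes through a helper like this and assert that repeated IDs are surfaced as warnings rather than silently inserted twice.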
The Proxy Duplicate Detector queries the proxy-activity-repository for each candidate mentor ID in the incoming batch, applying a composite key check of peer_mentor_id + activity_type + activity_date to identify conflicts with existing records. checkForDuplicates() is the primary entry point for bulk flows; it calls isDuplicate() once per mentor and aggregates the results into a structured warning list, which getDuplicateWarnings() returns. The overrideDuplicate() method mutates local state so that the coordinator can proceed past a flagged mentor, and clearDuplicateState() resets all flags between sessions. Because the detector runs in both mobile and backend contexts, the query logic must be environment-agnostic and rely on the repository abstraction rather than on direct Supabase calls.
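A minimal sketch of how the five methods might fit together. The `ActivityRepository` interface and its `exists` method, the `DuplicateWarning` and `ActivityData` shapes, and every return type are assumptions made for illustration; the source names only the methods and their parameters. The repository is synchronous here for brevity, whereas a real one would probably return Promises.

```typescript
// Hypothetical abstraction over the proxy-activity-repository.
interface ActivityRepository {
  exists(mentorId: string, activityType: string, date: string): boolean;
}

// Assumed shapes for the warning list and the activityData argument.
interface DuplicateWarning {
  mentorId: string;
  activityType: string;
  date: string;
}

interface ActivityData {
  activityType: string;
  activityDate: string;
}

class ProxyDuplicateDetector {
  private readonly repo: ActivityRepository;
  private flagged = new Map<string, DuplicateWarning>();
  private overridden = new Set<string>();

  constructor(repo: ActivityRepository) {
    this.repo = repo;
  }

  isDuplicate(mentorId: string, activityType: string, date: string): boolean {
    return this.repo.exists(mentorId, activityType, date);
  }

  // Entry point for bulk flows: refreshes the flag for every mentor in the
  // batch, then returns the warnings that are still outstanding.
  checkForDuplicates(
    mentorIds: readonly string[],
    activityType: string,
    activityDate: string,
  ): DuplicateWarning[] {
    for (const id of mentorIds) {
      if (this.isDuplicate(id, activityType, activityDate)) {
        this.flagged.set(id, { mentorId: id, activityType, date: activityDate });
      } else {
        this.flagged.delete(id);
      }
    }
    return this.getDuplicateWarnings(mentorIds, { activityType, activityDate });
  }

  // Reads local state only: flagged mentors that match the given activity
  // and have not been overridden by the coordinator.
  getDuplicateWarnings(
    mentorIds: readonly string[],
    activityData: ActivityData,
  ): DuplicateWarning[] {
    const warnings: DuplicateWarning[] = [];
    for (const id of mentorIds) {
      const warning = this.flagged.get(id);
      if (
        warning !== undefined &&
        !this.overridden.has(id) &&
        warning.activityType === activityData.activityType &&
        warning.date === activityData.activityDate
      ) {
        warnings.push(warning);
      }
    }
    return warnings;
  }

  overrideDuplicate(mentorId: string): void {
    this.overridden.add(mentorId);
  }

  clearDuplicateState(): void {
    this.flagged.clear();
    this.overridden.clear();
  }
}
```

Because the detector only sees the `ActivityRepository` interface, the same class could be backed by a Supabase-based repository on the backend and by a different implementation on mobile, which is the environment-agnostic property described above.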
For performance with large mentor batches, consider batching the underlying existence queries rather than issuing one query per mentor ID.
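As a rough illustration of the batching suggestion, the sketch below de-duplicates the mentor IDs, splits them into fixed-size groups, and issues one lookup per group instead of one per mentor. The `BatchActivityRepository` interface, its `findMentorsWithActivity` method, and the default group size of 100 are hypothetical; a concrete repository might translate each group into a single set-membership query. The lookup is synchronous here only to keep the example short.

```typescript
// Hypothetical batched lookup: returns the subset of mentorIds that already
// have an activity with the given type and date.
interface BatchActivityRepository {
  findMentorsWithActivity(
    mentorIds: readonly string[],
    activityType: string,
    date: string,
  ): string[];
}

// Splits items into consecutive groups of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

function findDuplicateMentorIds(
  repo: BatchActivityRepository,
  mentorIds: readonly string[],
  activityType: string,
  date: string,
  batchSize: number = 100,
): Set<string> {
  // Repeated IDs in the batch would only waste query capacity.
  const uniqueIds = Array.from(new Set(mentorIds));
  const duplicates = new Set<string>();
  for (const group of chunk(uniqueIds, batchSize)) {
    for (const id of repo.findMentorsWithActivity(group, activityType, date)) {
      duplicates.add(id);
    }
  }
  return duplicates;
}
```

With this shape, the number of repository calls grows with the number of groups rather than the number of mentors, and the group size can be tuned to whatever limit the underlying store imposes.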
Responsibilities
- Query existing activities for each mentor ID in the candidate batch
- Flag records matching peer_mentor_id + activity_type + date as potential duplicates
- Return structured list of duplicate warnings per mentor for UI display
- Allow coordinator to override and proceed after review
Interfaces
checkForDuplicates(mentorIds, activityType, activityDate)
isDuplicate(mentorId, activityType, date)
getDuplicateWarnings(mentorIds, activityData)
overrideDuplicate(mentorId)
clearDuplicateState()
Relationships
Dependents (2)
Components that depend on this component