critical priority medium complexity testing pending testing specialist Tier 3

Acceptance Criteria

Test suite runs end-to-end against `supabase start` local instance without any mocks or stubs for Supabase internals
Scenario 1 (valid match): POST with a claim amount that the server computes as 'auto_approved' while the client submits 'auto_approved' → assert HTTP 200, response body { approved_path: 'auto_approved', authoritative: true }
Scenario 2 (tampered submission): POST with an amount the server computes as 'manual_review' while the client submits 'auto_approved' → assert HTTP 422, response body contains error code 'CLIENT_DATA_TAMPERED_OR_STALE'
Scenario 3 (exact threshold boundary, amount == threshold): assert the server-computed path matches the boundary rule and HTTP 200 is returned when the client submits the correct boundary result
Scenario 4 (amount one unit below threshold): verify the auto-approval path is returned
Scenario 5 (amount one unit above threshold): verify the manual_review path is returned
Scenario 6 (missing claim_id field): assert HTTP 400 with a field-level validation error
Scenario 7 (missing amount field): assert HTTP 400
Scenario 8 (missing client_approval_path field): assert HTTP 400
Scenario 9 (no Authorization header): assert HTTP 401
Scenario 10 (expired JWT): assert HTTP 401
Scenario 11 (valid tampered submission generates an audit log row): after a 422 response, query the audit log table and assert a row exists with the correct claim_id, client_path, and event_type='APPROVAL_PATH_MISMATCH'
All tests are idempotent: each test seeds its own data and cleans up after itself
CI pipeline step runs `supabase db reset` before the test suite to ensure clean state
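
The response contract implied by Scenarios 1 to 8 can be sketched as a pure function, which may help when writing the expected values for each assertion. This is only an illustration: the `THRESHOLD` value, the inclusive boundary rule (amount equal to the threshold auto-approves), the `error`/`fields` body keys, and the names `computeApprovalPath` and `evaluateSubmission` are assumptions, not taken from the Edge Function itself.

```typescript
// Hypothetical model of the contract in Scenarios 1-8. Not the real Edge Function.
type ApprovalPath = "auto_approved" | "manual_review";

interface ValidationResult {
  status: 200 | 400 | 422;
  body: Record<string, unknown>;
}

// Assumed threshold, expressed in minor currency units.
const THRESHOLD = 1000;

// Assumed boundary rule: an amount at or below the threshold auto-approves.
function computeApprovalPath(amount: number, threshold: number = THRESHOLD): ApprovalPath {
  return amount <= threshold ? "auto_approved" : "manual_review";
}

function evaluateSubmission(payload: Record<string, unknown>): ValidationResult {
  // Scenarios 6-8: every required field must be present.
  const required = ["claim_id", "amount", "client_approval_path"];
  const missing = required.filter((f) => payload[f] === undefined || payload[f] === null);
  if (missing.length > 0) {
    return { status: 400, body: { error: "VALIDATION_ERROR", fields: missing } };
  }

  const amount = payload["amount"];
  if (typeof amount !== "number") {
    return { status: 400, body: { error: "VALIDATION_ERROR", fields: ["amount"] } };
  }

  // Scenario 2: the server-computed path is authoritative.
  const serverPath = computeApprovalPath(amount);
  if (payload["client_approval_path"] !== serverPath) {
    return { status: 422, body: { error: "CLIENT_DATA_TAMPERED_OR_STALE" } };
  }

  // Scenarios 1, 3, 4, 5: client and server agree.
  return { status: 200, body: { approved_path: serverPath, authoritative: true } };
}
```

Under these assumptions, an amount of 999 or 1000 submitted as 'auto_approved' yields 200, while 1001 submitted as 'auto_approved' yields 422. If the real boundary rule is exclusive, only the comparison in `computeApprovalPath` changes.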

Technical Requirements

frameworks
Deno test runner (built-in, no external framework needed)
Supabase CLI (supabase start / supabase functions serve) for local instance
supabase-js v2 for audit log assertions
apis
ThresholdValidationEdgeFunction HTTP endpoint (local: http://localhost:54321/functions/v1/threshold-validation)
Supabase Admin API (service role key) for test data seeding and audit log assertions
data models
ClaimExpense (claim_id, amount, currency)
ApprovalMismatchAuditEvent (for post-test assertion queries)
performance requirements
Full test suite completes in under 60 seconds on CI
Each individual test completes in under 5 seconds
security requirements
Service role key used only in test helpers for seeding/assertions; never in the request under test
Test JWTs generated with short expiry (60 seconds) to test expiry scenario safely
Local Supabase instance must not be accessible outside localhost during tests
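
Two of the security requirements above can be partly enforced by a guard that test helpers run before sending a request. The sketch below is a hypothetical helper (`checkRequestIsSafe` and `OutgoingRequest` are illustrative names). It only checks that the test targets a loopback address and that no header carries the service role key; it does not control how the Supabase instance itself binds to the network, which remains a matter of local configuration.

```typescript
// Hypothetical pre-flight guard for requests sent by the tests.
interface OutgoingRequest {
  url: string;
  headers: Record<string, string>;
}

// Returns a list of problems; an empty list means the request passed both checks.
function checkRequestIsSafe(req: OutgoingRequest, serviceRoleKey: string): string[] {
  const problems: string[] = [];

  // The request under test should only ever target the local instance.
  const host = new URL(req.url).hostname;
  if (host !== "localhost" && host !== "127.0.0.1") {
    problems.push(`non-local host: ${host}`);
  }

  // The service role key belongs in seeding/assertion helpers, never in the request under test.
  for (const name of Object.keys(req.headers)) {
    const value = req.headers[name] ?? "";
    if (serviceRoleKey.length > 0 && value.includes(serviceRoleKey)) {
      problems.push(`service role key found in header: ${name}`);
    }
  }
  return problems;
}
```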

Execution Context

Execution Tier
Tier 3

Tier 3 - 413 tasks

Can start after Tier 2 completes

Implementation Notes

Use `fetch()` directly against the local Edge Function URL; do not use the Supabase JS client to invoke functions, as it abstracts HTTP details needed for status code assertions. Structure each test as: (1) seed minimal required data, (2) build request payload, (3) call fetch(), (4) assert status + body, (5) optionally assert side effects (audit log). For JWT generation in tests, use `supabase.auth.signInWithPassword()` with a seeded test user rather than crafting raw JWTs, since this tests the real auth flow. Keep threshold value in a shared constant imported by both the Edge Function and the test file to avoid magic numbers.
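
Steps (2) to (4) of that structure might look like the following sketch. The helper names (`buildValidationRequest`, `callFunction`) and the `FetchLike` type are hypothetical; the URL is the local endpoint listed under Technical Requirements. The fetch implementation is passed in as a parameter so the request builder can be checked without a running instance.

```typescript
// Local endpoint from the Technical Requirements section.
const FUNCTION_URL = "http://localhost:54321/functions/v1/threshold-validation";

// Fields are optional so the missing-field scenarios can omit them.
interface ThresholdPayload {
  claim_id?: string;
  amount?: number;
  client_approval_path?: string;
}

interface RequestSpec {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal shape of fetch() that this sketch relies on.
type FetchLike = (
  url: string,
  init: RequestSpec,
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Step 2: build the request. Omitting the JWT models the "no Authorization header" scenario.
function buildValidationRequest(
  payload: ThresholdPayload,
  jwt?: string,
): { url: string; init: RequestSpec } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (jwt !== undefined) {
    headers["Authorization"] = `Bearer ${jwt}`;
  }
  return { url: FUNCTION_URL, init: { method: "POST", headers, body: JSON.stringify(payload) } };
}

// Steps 3-4: call the function and return the raw status and parsed body for assertions.
async function callFunction(
  fetchImpl: FetchLike,
  payload: ThresholdPayload,
  jwt?: string,
): Promise<{ status: number; body: unknown }> {
  const { url, init } = buildValidationRequest(payload, jwt);
  const res = await fetchImpl(url, init);
  return { status: res.status, body: await res.json() };
}
```

In a test, the global `fetch` would be passed as `fetchImpl`, for example `await callFunction(fetch, payload, jwt)`, followed by assertions on `status` and `body`.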

Document the `supabase start` prerequisite prominently at the top of the test file. Add the test command to package.json or Makefile so CI can run it with a single command.

Testing Requirements

These ARE the tests for this task. The test file is the deliverable. Structure tests using Deno's `Deno.test()` with a `beforeAll` hook that calls `supabase start` and an `afterAll` hook that tears down seeded data. Use a `createTestUser()` helper that creates a Supabase auth user and returns a valid JWT for authenticated scenarios.

Use a `createExpiredJwt()` helper for the 401 expiry scenario. After each tamper scenario, use the service role client to query the audit log and assert the row; this verifies both the 422 response AND the audit trail in a single test. All helpers must be in a `test_helpers.ts` file, not inlined.
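
The audit-trail assertion can be kept separate from the query itself, so the matching logic lives in `test_helpers.ts` and stays easy to check. The sketch below assumes the audit rows expose the three columns named in Scenario 11; the `AuditRow` type and `findMismatchRow` helper are hypothetical, as is the table name in the comment.

```typescript
// Assumed row shape, using the column names from Scenario 11.
interface AuditRow {
  claim_id: string;
  client_path: string;
  event_type: string;
}

// Returns the audit row recording the mismatch, or undefined if none exists.
function findMismatchRow(
  rows: AuditRow[],
  claimId: string,
  clientPath: string,
): AuditRow | undefined {
  return rows.find(
    (r) =>
      r.claim_id === claimId &&
      r.client_path === clientPath &&
      r.event_type === "APPROVAL_PATH_MISMATCH",
  );
}

// In a real test the rows would come from the service role client, for example:
//   const { data } = await admin.from("approval_mismatch_audit_events") // hypothetical table name
//     .select("*").eq("claim_id", claimId);
//   const row = findMismatchRow(data ?? [], claimId, "auto_approved");
```

A test for a tamper scenario would assert the 422 status first and then assert that `findMismatchRow` returns a row.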

Component
Threshold Validation Supabase Edge Function
infrastructure medium
Epic Risks (3)
high impact high prob technical

The ThresholdEvaluationService is described as shared Dart logic used both client-side and in the Edge Function. Supabase Edge Functions run Deno/TypeScript, not Dart, meaning the threshold logic must be maintained in two languages and can diverge, causing the server to reject legitimate client submissions.

Mitigation & Contingency

Mitigation: Implement the threshold logic as a single TypeScript module in the Edge Function and call it via a thin Dart HTTP client wrapper for client-side preview feedback only. The server is always authoritative; the client version is purely for UX (showing the user whether their claim will auto-approve before they submit).

Contingency: If dual-language maintenance is unavoidable, create a shared golden test file (JSON fixtures with inputs and expected outputs) that is run against both implementations in CI to detect divergence immediately.
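
The golden-file contingency could be driven by a small runner like the one below, shown here for the TypeScript side only. The fixture shape (`GoldenCase`), the function names, and the sample values are illustrative assumptions; in practice the cases would be loaded from the shared JSON file and the same file would be fed to the Dart implementation.

```typescript
type ApprovalPath = "auto_approved" | "manual_review";

// Assumed fixture shape: one input pair and the expected decision.
interface GoldenCase {
  amount: number;
  threshold: number;
  expected: ApprovalPath;
}

type ThresholdImpl = (amount: number, threshold: number) => ApprovalPath;

// Returns every fixture whose expected output the implementation does not reproduce.
function findDivergences(cases: GoldenCase[], impl: ThresholdImpl): GoldenCase[] {
  return cases.filter((c) => impl(c.amount, c.threshold) !== c.expected);
}

// Illustrative fixtures, assuming an inclusive boundary rule.
const goldenCases: GoldenCase[] = [
  { amount: 999, threshold: 1000, expected: "auto_approved" },
  { amount: 1000, threshold: 1000, expected: "auto_approved" },
  { amount: 1001, threshold: 1000, expected: "manual_review" },
];

// Reference implementation matching the fixtures above.
const inclusiveImpl: ThresholdImpl = (amount, threshold) =>
  amount <= threshold ? "auto_approved" : "manual_review";
```

CI would fail the build whenever `findDivergences` returns a non-empty list for either implementation, which surfaces drift at the boundary case first.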

medium impact medium prob technical

A peer mentor could double-tap the submit button or a network retry could trigger a duplicate submission, causing the ApprovalWorkflowService to attempt two concurrent state transitions from draft→submitted for the same claim, potentially resulting in two audit events or conflicting statuses.

Mitigation & Contingency

Mitigation: Implement idempotency in the ApprovalWorkflowService using a database-level unique constraint on (claim_id, from_status, to_status) per transition, combined with a UI-level submission lock (disable button after first tap until response returns).

Contingency: Add a deduplication check at the start of every state transition method that returns the existing state if an identical transition is already in progress or completed within the last 10 seconds.
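
A minimal sketch of that deduplication check follows, assuming transitions are recorded with a timestamp in milliseconds. The `TransitionRecord` type and `findRecentDuplicate` name are hypothetical, and the sketch treats "in progress" and "completed" records the same way.

```typescript
// Assumed record of a state transition attempt.
interface TransitionRecord {
  claimId: string;
  from: string;
  to: string;
  atMs: number; // time the transition was recorded, in epoch milliseconds
}

// Window from the contingency text: 10 seconds.
const DEDUP_WINDOW_MS = 10000;

// Returns an identical transition recorded within the window, if one exists.
function findRecentDuplicate(
  history: TransitionRecord[],
  attempt: { claimId: string; from: string; to: string },
  nowMs: number,
): TransitionRecord | undefined {
  return history.find(
    (t) =>
      t.claimId === attempt.claimId &&
      t.from === attempt.from &&
      t.to === attempt.to &&
      nowMs - t.atMs >= 0 &&
      nowMs - t.atMs <= DEDUP_WINDOW_MS,
  );
}
```

A transition method would call this first and return the existing state when a duplicate is found, rather than writing a second audit event.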

high impact medium prob scope

Claims with multiple expense lines (e.g., mileage + parking) must have their combined total evaluated against the threshold. If individual lines are added asynchronously or the evaluation runs before all lines are persisted, the auto-approval decision may be computed on an incomplete set of expense lines.

Mitigation & Contingency

Mitigation: The Edge Function always fetches all expense lines from the database (not from the client payload) before computing the threshold decision. Define a clear claim submission contract that requires all expense lines to be persisted before the submit action is called.

Contingency: Add a validation step in ApprovalWorkflowService that counts expected vs. persisted expense lines before allowing the transition, returning a validation error if lines are missing.
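
The completeness check and the combined total can be sketched together. This assumes the caller knows how many expense lines to expect and that amounts are stored as integers in minor currency units; `ExpenseLine`, `checkLinesComplete`, and the error code are illustrative names, not part of ApprovalWorkflowService.

```typescript
// Assumed shape of a persisted expense line.
interface ExpenseLine {
  claim_id: string;
  amount: number; // integer, minor currency units
}

type CompletenessResult =
  | { ok: true; total: number }
  | { ok: false; error: string; missing: number };

// Compares expected vs. persisted line counts before summing the claim total.
function checkLinesComplete(expectedCount: number, persisted: ExpenseLine[]): CompletenessResult {
  if (persisted.length < expectedCount) {
    return {
      ok: false,
      error: "EXPENSE_LINES_INCOMPLETE", // hypothetical error code
      missing: expectedCount - persisted.length,
    };
  }
  // The threshold decision is made on the combined total of all lines.
  const total = persisted.reduce((sum, line) => sum + line.amount, 0);
  return { ok: true, total };
}
```

For a claim with a mileage line of 400 and a parking line of 150, the total passed to the threshold evaluation would be 550; with only one of two expected lines persisted, the transition is rejected instead.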