Troubleshooting
A diagnosis-first guide to the most common failure modes. Each section lists the symptom, the most likely cause, and the fix.
Diarization-related
"4 speakers in the transcript but the call is between 2 people"
Likely cause. Background noise (HVAC, traffic) misclassified by pyannote as additional speakers.
Fix.
- Confirm SCRYON_AUDIO_DENOISE_ENABLED=true.
- Confirm SCRYON_DIARIZATION_HINT_TWO_SPEAKERS=true and direction is set on the call.
- Inspect:
  curl -s /api/calls/$CALL/transcript | jq '.speakers | length, .speakerResolution'
- If still over-segmented, raise SCRYON_AUDIO_DENOISE_NR_DB to 16 dB and reprocess.
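The inspection step can also be scripted when many calls need checking. This is a minimal sketch, assuming the transcript JSON exposes a top-level speakers array (which the jq filter implies); the function name and the sample payload are hypothetical.

```python
import json


def extra_speakers(transcript: dict, expected: int = 2) -> int:
    """Return how many more speakers were detected than expected (0 if none)."""
    speakers = transcript.get("speakers") or []
    return max(0, len(speakers) - expected)


# Hypothetical payload shaped like the transcript endpoint's response.
sample = json.loads(
    '{"speakers": [{"label": "Speaker 1"}, {"label": "Speaker 2"},'
    ' {"label": "Speaker 3"}, {"label": "Speaker 4"}]}'
)
print(extra_speakers(sample))
```

A non-zero result on a call known to have two participants is the signal to work through the denoise settings listed in the fix.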
"Diarization succeeded but every word is attributed to Speaker 1"
Likely cause. The audio is mono with only one detectable speaker, typically a recording where one side wasn't captured.
Fix. Verify the audio yourself (ffprobe, listen to the file). If genuinely one-sided, the result is correct.
Pipeline reports PYANNOTE_FAILED_FALLBACK upload_io_http_400
Likely cause. Presigned URL upload to pyannote failed. Most often: wrong Content-Type, encoded query string, or expired URL.
Fix. This was the bug pattern fixed in PR #19/#20. If it recurs:
- Confirm Content-Type: application/octet-stream is sent on the PUT.
- Confirm DiarizationClientConfig is using EncodingMode.NONE for the upload client.
- Check the pyannote dashboard for ingress errors.
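To reproduce the upload outside the application, a request with the same two properties (explicit Content-Type, URL passed through untouched) can be built by hand. This is a sketch using only the Python standard library, not the service's actual client; the URL and the function name are hypothetical.

```python
import urllib.request


def build_presigned_put(presigned_url: str, audio: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request for a presigned upload URL."""
    # The URL is passed through verbatim. Presigned URLs typically embed a
    # signature in the query string, so re-encoding it tends to break the upload.
    return urllib.request.Request(
        presigned_url,
        data=audio,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


# Hypothetical URL; a real one is issued by the diarization provider.
url = "https://uploads.example.com/media/abc?signature=a%2Fb%3D&expires=1700000000"
req = build_presigned_put(url, b"\x00\x01")
print(req.get_method(), req.full_url == url, req.get_header("Content-type"))
```

Passing req to urllib.request.urlopen would send it. If the manual upload succeeds with a fresh URL while the pipeline still fails, the client configuration is the more likely culprit than the provider.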
Transcript-related
"Lots of repeated / nonsense words in the transcript"
Likely cause. Whisper stutter loops, phrase loops, or non-speech tags.
Fix. These are stripped by TranscriptNormalizationService (NORMALIZATION_VERSION=3). If you see them in an old call, reprocess to apply the current normalisation. For a fresh call:
- Verify pipeline.normalizationVersion >= 3 on the transcript JSON.
- If the issue persists, set LEMONFOX_LANGUAGE=en (autodetect can drift on noisy audio).
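The version check is easy to automate when deciding which old calls to reprocess. This is a minimal sketch, assuming the transcript JSON nests the field as pipeline.normalizationVersion; the function name is hypothetical, and treating a missing version as outdated is an assumption of this example.

```python
CURRENT_NORMALIZATION_VERSION = 3  # value stated in this section


def needs_reprocessing(transcript: dict) -> bool:
    """True when the transcript predates the current normalisation rules."""
    version = (transcript.get("pipeline") or {}).get("normalizationVersion")
    # Assumption: a transcript without a recorded version is treated as old.
    return version is None or version < CURRENT_NORMALIZATION_VERSION


print(needs_reprocessing({"pipeline": {"normalizationVersion": 2}}))
print(needs_reprocessing({"pipeline": {"normalizationVersion": 3}}))
```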
Names are not resolved — everyone is "Speaker 1 / Speaker 2"
Likely cause. Missing call metadata.
Fix. Check the call record:
curl -s /api/calls/$CALL | jq '{contactName, phoneNumber, direction}'
- If both contactName and the user's displayName are missing → no text resolution possible. Answering-pattern detection can still identify the CONTACT, but display names will remain generic until metadata is supplied.
- If a name is set but resolution still produces UNKNOWN, inspect speakerResolution.warnings on the transcript.
  - speaker_roles_unresolved: no transcript evidence (greeting, name mention, or answering phrase) resolved either speaker. Check that the call has audible speech in the first 20 seconds.
  - speaker_identity_ambiguous: both speakers mentioned both names; the resolver refused to guess.
  - answering_pattern_used: roles were assigned from an opening phrase ("hello?", "haan bolo", etc.); MEDIUM confidence, review if incorrect.
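The warning codes can be turned into readable hints when triaging several calls at once. This is a minimal sketch, assuming speakerResolution.warnings is a list of plain string codes (the exact shape is not confirmed here); the helper name is hypothetical and the hint text paraphrases the descriptions in this section.

```python
from typing import List

# Hints paraphrased from the warning descriptions in this section.
WARNING_HINTS = {
    "speaker_roles_unresolved": "no transcript evidence resolved either speaker; "
                                "check for audible speech in the first 20 seconds",
    "speaker_identity_ambiguous": "both speakers mentioned both names; "
                                  "the resolver refused to guess",
    "answering_pattern_used": "roles came from an opening phrase (MEDIUM confidence); "
                              "review if incorrect",
}


def explain_warnings(transcript: dict) -> List[str]:
    """Map each warning code on the transcript to a one-line hint."""
    # Assumption: warnings is a list of string codes.
    warnings = (transcript.get("speakerResolution") or {}).get("warnings") or []
    return [f"{code}: {WARNING_HINTS.get(code, 'unrecognised warning code')}"
            for code in warnings]


sample = {"speakerResolution": {"warnings": ["answering_pattern_used"]}}
for line in explain_warnings(sample):
    print(line)
```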
"Speaker 2's words attributed to Speaker 1"
Likely cause. Diarization collapsed two speakers into one, OR the answering-pattern heuristic fired on the wrong speaker (e.g. the USER opened with "hello?").
Fix.
- Inspect speakers[] on the transcript. If there's only one real speaker, diarization collapsed.
- Check speakerResolution.warnings:
  - answering_pattern_used → the resolver identified the first speaker as CONTACT based on their opening phrase. If that's wrong, the call likely starts with the USER saying "hello?" (e.g. user picked up, not the contact). This is an edge case; supply contactName and ensure the contact's name appears in the transcript for stronger evidence.
- Check whether pyannote ran: pipeline.diarizationProvider. If it's pyannote-fallback or lemonfox, pyannote failed; set PYANNOTE_ENABLED=true.
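The three checks can be combined into one pass over the transcript JSON. This is a minimal sketch, assuming the field layout named in this section (speakers, speakerResolution.warnings, pipeline.diarizationProvider) and that warnings are plain strings; the function name and the wording of the findings are hypothetical.

```python
from typing import List

FALLBACK_PROVIDERS = {"pyannote-fallback", "lemonfox"}  # values named in this section


def diagnose_misattribution(transcript: dict) -> List[str]:
    """Return the likely reasons one speaker's words landed on another."""
    findings = []
    if len(transcript.get("speakers") or []) <= 1:
        findings.append("diarization collapsed to a single speaker")
    warnings = (transcript.get("speakerResolution") or {}).get("warnings") or []
    if "answering_pattern_used" in warnings:
        findings.append("roles assigned from the opening phrase; verify who spoke first")
    provider = (transcript.get("pipeline") or {}).get("diarizationProvider")
    if provider in FALLBACK_PROVIDERS:
        findings.append(f"pyannote did not produce the result (provider={provider})")
    return findings


sample = {
    "speakers": [{"label": "Speaker 1"}],
    "speakerResolution": {"warnings": []},
    "pipeline": {"diarizationProvider": "lemonfox"},
}
print(diagnose_misattribution(sample))
```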
Voice embedding
POST /api/users/me/voice-profile returns 404
Likely cause. Feature flag is off.
Fix. Set SCRYON_VOICE_EMBEDDING_ENABLED=true and SCRYON_VOICE_EMBEDDING_PROVIDER=pyannote. The pyannote credentials are reused.
Voice match consistently returns NO_MATCH despite a profile
Likely cause. Sample quality, language mismatch, or threshold too high.
Fix.
- Re-upload a 20–30 s sample of the user speaking in the same language as their calls.
- Lower SCRYON_VOICE_EMBEDDING_MEDIUM_THRESHOLD from 0.75 to 0.65 temporarily; observe the scryon_voice_match_outcome metric.
- If still no match, the underlying voice may differ too much (different microphone, lots of background noise). Re-record.
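To see what the temporary threshold change would do, compare outcomes for a few similarity scores at both settings. This is an illustrative sketch only: the real matcher's scoring and tiering are not described here, so a single cutoff producing either MEDIUM or NO_MATCH is an assumption, and the scores are made up.

```python
from typing import List


def match_outcome(score: float, medium_threshold: float) -> str:
    """Simplified single-tier decision; the real matcher may have more tiers."""
    return "MEDIUM" if score >= medium_threshold else "NO_MATCH"


def flipped_by_lowering(scores: List[float], old: float, new: float) -> List[float]:
    """Scores that are NO_MATCH at the old threshold but match at the new one."""
    return [s for s in scores
            if match_outcome(s, old) == "NO_MATCH" and match_outcome(s, new) == "MEDIUM"]


# Hypothetical similarity scores for one user's recent calls.
scores = [0.62, 0.68, 0.71, 0.80]
print(flipped_by_lowering(scores, old=0.75, new=0.65))
```

If most recent scores sit just under the old threshold, lowering it is a reasonable stopgap; if they sit far below, a better voice sample is the more useful fix.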
Analysis
Analysis returns empty fields
Likely cause. The transcript was too short, or the LLM returned malformed JSON.
Fix.
- For calls < 15 s, this is expected — there isn't enough content.
- Otherwise, check Sentry for ScryonAnalysisParseException.
Action items lose their owner
Likely cause. The LLM emitted an owner label that didn't match any speaker.
Fix. Check the row directly:
SELECT title, owner_speaker_label, owner_speaker_id, owner_display_name
FROM action_items WHERE id = '<id>';
If owner_speaker_label is set but owner_speaker_id is null, the mapper couldn't reconcile the label with any speaker. This usually means the LLM emitted a vague label, which in turn suggests the speaker resolution itself was weak.
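To find every affected row rather than checking one ID at a time, filter on that same condition. This is a self-contained sketch that runs the query against an in-memory SQLite table so it can be tried anywhere; the column types and sample rows are assumptions, and in production the same WHERE clause would be run against the real action_items table.

```python
import sqlite3

# Rows where the LLM supplied an owner label but no speaker was matched.
UNRECONCILED_SQL = """
SELECT id, title, owner_speaker_label
FROM action_items
WHERE owner_speaker_label IS NOT NULL
  AND owner_speaker_id IS NULL
"""

conn = sqlite3.connect(":memory:")
# Assumed column types; only the column names come from this section.
conn.execute(
    "CREATE TABLE action_items (id TEXT PRIMARY KEY, title TEXT, "
    "owner_speaker_label TEXT, owner_speaker_id TEXT, owner_display_name TEXT)"
)
conn.executemany(
    "INSERT INTO action_items VALUES (?, ?, ?, ?, ?)",
    [
        ("a1", "Send quote", "Speaker 1", "spk-1", "Asha"),  # reconciled
        ("a2", "Book demo", "the client", None, None),       # label only
    ],
)
unreconciled = conn.execute(UNRECONCILED_SQL).fetchall()
print(unreconciled)
```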
Deployment
App fails to start with Flyway checksum mismatch
Likely cause. Someone edited an applied migration file.
Fix. Never edit applied migrations. Restore the file to its original content. If the change is genuinely needed, add a new migration.
As a last resort:
flyway repair -url=$DB_URL -user=$DB_USERNAME -password=$DB_PASSWORD
Only use repair if you understand the consequences. It rewrites the schema-history checksums to match local files.
App starts but /api/health returns 503
Likely cause. Postgres unreachable or under heavy load.
Fix. Check actuator/health/db (when exposed) or run a manual SELECT 1 against DB_URL.
Sentry not receiving events
Likely cause. SENTRY_DSN empty or pointed at the wrong project.
Fix. Test:
export SENTRY_DSN=...
curl "https://sentry.io/api/0/projects/<org>/<project>/keys/" -H "Authorization: DSN $SENTRY_DSN"
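Before calling the API, it can help to confirm that the DSN is well formed and see which project it targets. This is a minimal sketch relying on the standard Sentry DSN layout (scheme://public_key@host/project_id); the function name and the sample DSN are hypothetical.

```python
from urllib.parse import urlsplit


def describe_dsn(dsn: str) -> dict:
    """Split a Sentry DSN into host, public key, and project id."""
    parts = urlsplit(dsn)
    if not (parts.scheme and parts.username and parts.hostname):
        raise ValueError("DSN is empty or malformed")
    # The project id is the last path segment of the DSN.
    project_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if not project_id:
        raise ValueError("DSN has no project id")
    return {"host": parts.hostname, "public_key": parts.username, "project_id": project_id}


# Hypothetical DSN for illustration only.
info = describe_dsn("https://abc123@o0.ingest.sentry.io/42")
print(info["host"], info["project_id"])
```

Comparing the printed project id with the one shown in the Sentry project settings quickly rules the wrong-project case in or out.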
Where to look first when something is wrong
- event=PIPELINE logs for the failing callId.
- The transcript JSON's speakerResolution.warnings.
- call_processing_events table for the call.
- Sentry for any unhandled exception.
- Prometheus scryon_calls_failed_total{reason=...} to see which stage exploded.
If none of the above show anything, capture the call ID and reach out in the #scryon-eng channel.