Transcripts

GET /api/calls/{id}/transcript

Returns the normalised, speaker-attributed transcript for a completed call.
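
As a minimal sketch of calling this endpoint, the snippet below builds the GET request with Python's standard library. The base URL `https://api.example.com`, the bearer-token `Authorization` header, and the helper name `build_transcript_request` are assumptions made for illustration; this page does not specify the host or the authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request


def build_transcript_request(base_url: str, call_id: str, token: str) -> Request:
    """Build (but do not send) a GET request for a call's transcript."""
    # Percent-encode the id so it cannot introduce extra path segments.
    url = f"{base_url.rstrip('/')}/api/calls/{quote(call_id, safe='')}/transcript"
    headers = {
        "Accept": "application/json",
        # Assumption: bearer-token auth; the scheme is not documented here.
        "Authorization": f"Bearer {token}",
    }
    return Request(url, headers=headers, method="GET")


# Hypothetical host, call id, and token, for illustration only.
request = build_transcript_request("https://api.example.com", "call-123", "TOKEN")
```

The resulting object can be passed to `urllib.request.urlopen` to perform the call; building it separately keeps URL construction easy to check without network access.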

Response — 200 OK

{
  "schemaVersion": 2,
  "callId": "f0a1d2e3-...",
  "language": "en",
  "durationSeconds": 240,
  "cleanText": "[00:00 - 00:04] Praveen: Hi Ravi, ...\n[00:04 - 00:08] Ravi: ...",
  "speakers": [
    {
      "speakerId": "spk_1",
      "sourceSpeakerId": "SPEAKER_00",
      "label": "Praveen",
      "displayName": "Praveen",
      "role": "USER",
      "confidence": "HIGH",
      "labelSource": "GREETING_MATCH",
      "voiceMatchScore": null,
      "voiceProfileMatched": null
    },
    {
      "speakerId": "spk_2",
      "sourceSpeakerId": "SPEAKER_01",
      "label": "Ravi",
      "displayName": "Ravi",
      "role": "CONTACT",
      "confidence": "MEDIUM",
      "labelSource": "BY_ELIMINATION"
    }
  ],
  "segments": [
    {
      "id": "seg_0001",
      "speaker": "Praveen",
      "speakerLabel": "Praveen",
      "speakerDisplayName": "Praveen",
      "speakerId": "spk_1",
      "sourceSpeakerId": "SPEAKER_00",
      "role": "USER",
      "startSeconds": 0.0,
      "endSeconds": 4.2,
      "text": "Hi Ravi, I'll send the revised pricing today.",
      "alignmentConfidence": "HIGH"
    }
  ],
  "speakerResolution": {
    "strategyVersion": 2,
    "usedUserDisplayName": true,
    "usedContactName": true,
    "usedDirection": true,
    "usedPhoneFallback": false,
    "voiceEmbeddingEnabled": false,
    "voiceProfileUsed": false,
    "voiceMatchStatus": "DISABLED",
    "totalSpeakerCount": 2,
    "resolvedSpeakerCount": 2,
    "warnings": []
  },
  "pipeline": {
    "transcriptionProvider": "lemonfox",
    "transcriptionModel": "whisper-1",
    "diarizationProvider": "pyannote",
    "diarizationModel": "precision-2",
    "alignmentVersion": "1.0",
    "normalizationVersion": 3,
    "status": "COMPLETED"
  },
  "createdAt": "2026-05-29T13:00:42Z"
}

Field reference

| Field | Meaning |
| --- | --- |
| schemaVersion | Currently 2. Bumped on breaking shape changes. |
| cleanText | Pretty-printed transcript suitable for direct display. |
| speakers[] | One entry per resolved speaker. See enum tables below. |
| segments[] | Time-coded segments; each refers to a speakerId. |
| speakerResolution | Privacy-safe telemetry block from SpeakerNameResolutionService. |
| pipeline | Provenance: which providers and algorithm versions produced the output. |
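
Because `schemaVersion` is bumped on breaking shape changes and each segment refers to a `speakerId`, a client may want to check the version before joining segments to speakers. The following is a sketch under those assumptions only; the function name `segments_with_speakers` and the choice to reject any version other than 2 are illustrative, not part of the API.

```python
from typing import Any

SUPPORTED_SCHEMA_VERSION = 2  # the value documented in the field reference


def segments_with_speakers(payload: dict[str, Any]) -> list[tuple[str, float, float, str]]:
    """Return (displayName, startSeconds, endSeconds, text) for each segment."""
    version = payload.get("schemaVersion")
    if version != SUPPORTED_SCHEMA_VERSION:
        # A different version signals a breaking shape change; stop early.
        raise ValueError(f"unsupported schemaVersion: {version!r}")

    # Index speakers[] by speakerId so segments can be joined to them.
    names = {
        s["speakerId"]: s.get("displayName") or s.get("label") or s["speakerId"]
        for s in payload.get("speakers", [])
    }

    rows = []
    for seg in payload.get("segments", []):
        # Fall back to the raw id if a segment names an unlisted speaker.
        name = names.get(seg["speakerId"], seg["speakerId"])
        rows.append((name, seg["startSeconds"], seg["endSeconds"], seg["text"]))
    return rows


# Trimmed-down payload based on the sample response above.
sample = {
    "schemaVersion": 2,
    "speakers": [{"speakerId": "spk_1", "displayName": "Praveen"}],
    "segments": [
        {
            "speakerId": "spk_1",
            "startSeconds": 0.0,
            "endSeconds": 4.2,
            "text": "Hi Ravi, I'll send the revised pricing today.",
        }
    ],
}
rows = segments_with_speakers(sample)
```

Joining through `speakerId` rather than the per-segment name fields keeps the display name in one place, which may be convenient if names are later corrected.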

Enums

role

| Value | Meaning |
| --- | --- |
| USER | The authenticated user (phone owner). |
| CONTACT | The other party on the call. |
| UNKNOWN | Identity could not be determined. |

confidence

| Value | Meaning |
| --- | --- |
| HIGH | Strong evidence: e.g. a greeting addressed the other party by name. |
| MEDIUM | Indirect evidence (mention asymmetry, by-elimination). |
| LOW | Default / fallback. Always paired with POSITIONAL_FALLBACK or DIARIZATION. |

labelSource

| Value | Meaning |
| --- | --- |
| DIARIZATION | No resolution applied; raw provider speaker. |
| USER_PROFILE / CONTACT_METADATA | Strict text match on the supplied names. |
| GREETING_MATCH | Name appeared inside a greeting pattern. |
| NAME_MENTION | Name appeared elsewhere in the speaker's text. |
| BY_ELIMINATION | Other speaker was resolved; this is the remaining role. |
| PHONE_FALLBACK | Contact name missing; "Contact ending NNNN" was used. |
| POSITIONAL_FALLBACK | No evidence; direction-aware appearance-order guess. |
| VOICE_EMBEDDING | Voice profile matched this speaker as the user. |
| AMBIGUOUS | Both speakers reference both names; nothing assigned. |
| MANUAL | Future user-correction endpoint. |
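
One way a client might use these enums is to flag speakers whose attribution rests on weak evidence, for example to prompt the user to confirm a name. The sketch below is an assumed client-side policy, not behaviour of the API: the helper `needs_review` and the choice of which `labelSource` values count as weak are illustrative.

```python
# Assumption: these sources indicate a guess or no resolution at all,
# based on the meanings given in the labelSource table.
WEAK_SOURCES = frozenset({"DIARIZATION", "POSITIONAL_FALLBACK", "AMBIGUOUS"})


def needs_review(speaker: dict) -> bool:
    """Return True when a speakers[] entry looks like a low-evidence label."""
    return (
        speaker.get("confidence") == "LOW"
        or speaker.get("labelSource") in WEAK_SOURCES
        or speaker.get("role") == "UNKNOWN"
    )


# Entries shaped like speakers[] in the sample response; the third is hypothetical.
speakers = [
    {"speakerId": "spk_1", "role": "USER", "confidence": "HIGH", "labelSource": "GREETING_MATCH"},
    {"speakerId": "spk_2", "role": "CONTACT", "confidence": "MEDIUM", "labelSource": "BY_ELIMINATION"},
    {"speakerId": "spk_3", "role": "UNKNOWN", "confidence": "LOW", "labelSource": "DIARIZATION"},
]
flagged = [s["speakerId"] for s in speakers if needs_review(s)]
```

Comparing against plain strings rather than a closed enum type means a value added later (such as MANUAL once it ships) does not break parsing; it is simply treated as not weak.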

Errors

| Status | code | Cause |
| --- | --- | --- |
| 404 | call_not_found | Call missing or owned by another user. |
| 422 | call_not_completed | Status is not COMPLETED. |
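
The error table can be mapped to client-side exceptions along the following lines. This is a sketch resting on one notable assumption: that the error response body is JSON with a top-level `code` field holding the value from the table. The body shape is not documented on this page, and the exception classes and `check_transcript_response` helper are hypothetical names.

```python
class CallNotFound(Exception):
    """404 call_not_found: call missing or owned by another user."""


class CallNotCompleted(Exception):
    """422 call_not_completed: call status is not COMPLETED."""


# (HTTP status, error code) pairs taken from the table above.
_ERRORS = {
    (404, "call_not_found"): CallNotFound,
    (422, "call_not_completed"): CallNotCompleted,
}


def check_transcript_response(status: int, body: dict) -> dict:
    """Return the body on 200, otherwise raise a matching exception."""
    if status == 200:
        return body
    # Assumption: the error code is exposed as body["code"].
    code = body.get("code")
    exc_type = _ERRORS.get((status, code))
    if exc_type is not None:
        raise exc_type(code)
    # Anything not listed in the table is surfaced as a generic error.
    raise RuntimeError(f"unexpected response: status={status}, code={code!r}")
```

A caller polling for a transcript might catch `CallNotCompleted` and retry later, while treating `CallNotFound` as final.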