Data model

Scryon's relational store is Postgres. The schema is owned by Flyway migrations in scryon-backend/src/main/resources/db/migration/. All tables use UUID primary keys, created_at / updated_at timestamps, and per-user scoping.

Entity-relationship diagram

Tables at a glance

| Table | Purpose | Notes |
| --- | --- | --- |
| users | Authenticated user accounts. | external_user_id (Firebase UID) is unique. |
| call_records | One row per uploaded call. | Drives the state machine. Indexed on (user_id, created_at desc). |
| call_artifacts | One row per piece of content stored in object storage. | (call_id, artifact_type) unique. |
| action_items | Extracted action items. | Owner fields capture speaker + role + display name. |
| user_voice_profiles | Optional voiceprint per user. | At most one row per user; consent_version tracks consent UX. |
| call_processing_events | Pipeline event log. | High-cardinality; retention policy enforced by sweeper. |

users

| Column | Type | Notes |
| --- | --- | --- |
| id | uuid PK | Server-generated. |
| external_user_id | text UNIQUE | Firebase UID, or local-dev. |
| email | text | Nullable; sourced from Firebase claims. |
| display_name | text | Used by the speaker resolver. |
| created_at / updated_at | timestamptz | |

call_records

| Column | Notes |
| --- | --- |
| id (PK) | The callId surfaced to clients. |
| user_id (FK) | Owner. |
| title | Free-form. |
| contact_name / contact_id / phone_number / organization | Counterparty metadata. |
| direction | INCOMING / OUTGOING / UNKNOWN. |
| recorded_at | Client-supplied; otherwise upload time. |
| duration_seconds | Best-effort. |
| status | State machine. |
| error_reason | Short opaque code on FAILED. |
| created_at / updated_at | |
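
The direction and recorded_at rules above can be sketched in Java as follows. This is a minimal illustration, not Scryon's actual code: `CallDirection.parse`, `CallRecordDefaults`, and the choice to map unrecognised input to UNKNOWN are assumptions made for the example. Only the three direction values and the upload-time fallback come from the table.

```java
import java.time.Instant;
import java.util.Locale;

// Mirrors the three documented values of call_records.direction.
enum CallDirection {
    INCOMING, OUTGOING, UNKNOWN;

    // Hypothetical helper: missing or unrecognised client input becomes UNKNOWN.
    static CallDirection parse(String raw) {
        if (raw == null) {
            return UNKNOWN;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return UNKNOWN;
        }
    }
}

// Hypothetical holder for column defaults applied before a row is written.
final class CallRecordDefaults {
    private CallRecordDefaults() {}

    // recorded_at is client-supplied; otherwise it falls back to the upload time.
    static Instant resolveRecordedAt(Instant clientSupplied, Instant uploadTime) {
        return clientSupplied != null ? clientSupplied : uploadTime;
    }
}
```

For example, `CallRecordDefaults.resolveRecordedAt(null, uploadTime)` returns `uploadTime`, so the column is populated even when the client omits the field.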

call_artifacts

| Column | Notes |
| --- | --- |
| id (PK) | |
| call_id (FK) | |
| artifact_type | Enum: TEMP_AUDIO, DIARIZATION_JSON, RAW_TRANSCRIPT_JSON, NORMALIZED_TRANSCRIPT_JSON, ANALYSIS_JSON. |
| storage_key | Logical path in object storage. See Storage layout. |
| content_type | MIME type of the bytes. |
| byte_size | Total bytes. |
| created_at | |
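
To show what the (call_id, artifact_type) uniqueness rule means in practice, here is a small in-memory sketch in Java. `ArtifactIndex` is a hypothetical stand-in for the table, not a class from the codebase, and it tracks only the storage key. The enum values are the ones listed above.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// The documented values of call_artifacts.artifact_type.
enum ArtifactType {
    TEMP_AUDIO, DIARIZATION_JSON, RAW_TRANSCRIPT_JSON, NORMALIZED_TRANSCRIPT_JSON, ANALYSIS_JSON
}

// Hypothetical in-memory model of the unique (call_id, artifact_type) pair.
final class ArtifactIndex {
    private final Map<UUID, Map<ArtifactType, String>> keysByCall = new HashMap<>();

    // Returns false and keeps the existing row if the pair is already taken,
    // which is where a real insert would hit the unique constraint.
    boolean insert(UUID callId, ArtifactType type, String storageKey) {
        Map<ArtifactType, String> perCall =
                keysByCall.computeIfAbsent(callId, id -> new EnumMap<>(ArtifactType.class));
        return perCall.putIfAbsent(type, storageKey) == null;
    }

    // Looks up the storage key for one artifact of one call, if present.
    Optional<String> storageKey(UUID callId, ArtifactType type) {
        Map<ArtifactType, String> perCall = keysByCall.get(callId);
        return perCall == null ? Optional.empty() : Optional.ofNullable(perCall.get(type));
    }
}
```

A call can therefore hold at most one artifact of each type. A second `insert` for the same call and type is rejected rather than silently replacing the first row.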

action_items

| Column | Notes |
| --- | --- |
| id (PK) | |
| call_record_id (FK) | |
| title | |
| description | |
| due_date | date, may be null. |
| priority | low / medium / high. |
| status | OPEN / DONE / SNOOZED. |
| owner_speaker_id / owner_speaker_label / owner_display_name / owner_role | Set from ActionItemOwnerMapper. |
| source_segment_ids_json | JSON array of source segment IDs. |
| source_text | Provenance for explainability. |
| created_at / updated_at / completed_at | |

user_voice_profiles

| Column | Notes |
| --- | --- |
| id (PK) | |
| user_id (FK, unique) | |
| provider | e.g. pyannote. |
| model / model_version | Provenance. |
| embedding_json | Opaque provider blob. Not a vector we own; we don't decode it. |
| consent_version | Matches SCRYON_VOICE_CONSENT_VERSION at create time. |
| sample_duration_seconds | For UX hints. |
| created_at / updated_at | |

call_processing_events

| Column | Notes |
| --- | --- |
| id (PK) | |
| call_id | FK-style, nullable for non-call events. |
| user_id | Scope. |
| stage | Enum from ProcessingStage. |
| status | STARTED, COMPLETED, FAILED, SKIPPED. |
| provider | When applicable. |
| duration_ms | Set by ProcessingEventLogger at end of stage. |
| error_code | Short opaque code. |
| error_message | Sanitized. |
| created_at | |
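
As a rough illustration of how duration_ms and status could be filled in at the end of a stage, the Java sketch below times a unit of work and produces an event. It is not the real ProcessingEventLogger: `StageTimer`, `StageEvent`, the injected clock, and the `STAGE_ERROR` code are all hypothetical. The stage is a plain string because the ProcessingStage values are not listed here.

```java
import java.util.function.LongSupplier;

// The four documented values of call_processing_events.status.
enum EventStatus { STARTED, COMPLETED, FAILED, SKIPPED }

// Subset of one call_processing_events row.
final class StageEvent {
    final String stage;
    final EventStatus status;
    final long durationMs;
    final String errorCode; // null unless status is FAILED

    StageEvent(String stage, EventStatus status, long durationMs, String errorCode) {
        this.stage = stage;
        this.status = status;
        this.durationMs = durationMs;
        this.errorCode = errorCode;
    }
}

// Hypothetical timer: measures a stage and emits its terminal event.
final class StageTimer {
    private final LongSupplier clockMs; // injected so tests can control time

    StageTimer(LongSupplier clockMs) {
        this.clockMs = clockMs;
    }

    StageEvent run(String stage, Runnable work) {
        long start = clockMs.getAsLong();
        try {
            work.run();
            return new StageEvent(stage, EventStatus.COMPLETED, clockMs.getAsLong() - start, null);
        } catch (RuntimeException e) {
            // Record only a short opaque code; the message would be sanitized separately.
            return new StageEvent(stage, EventStatus.FAILED, clockMs.getAsLong() - start, "STAGE_ERROR");
        }
    }
}
```

In real use the clock could be `System::currentTimeMillis`. Injecting it keeps the duration calculation deterministic under test.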

Conventions

  • FKs from owned children (artifacts, events, action items) to their parent are ON DELETE CASCADE.
  • No raw audio bytes are ever stored in Postgres — only object-storage keys.
  • Timestamps are UTC. Hibernate's JDBC time zone (hibernate.jdbc.time_zone) is set to UTC explicitly.
  • JSON columns use jsonb, which Postgres stores in a parsed binary form, so we can index and query them without reparsing text on each access.
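
The UTC convention can also be applied on the application side before a value reaches the database. The Java sketch below normalises a client-supplied timestamp that carries an arbitrary offset. `UtcTimestamps` is a hypothetical helper name, and whether Scryon does this in code or relies on the driver setting alone is not stated here.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper: convert any offset timestamp to the same instant in UTC.
final class UtcTimestamps {
    private UtcTimestamps() {}

    static OffsetDateTime toUtc(OffsetDateTime value) {
        // Keeps the instant, changes only the offset it is expressed in.
        return value.withOffsetSameInstant(ZoneOffset.UTC);
    }
}
```

For instance, 12:00 at +02:00 becomes 10:00 at UTC. The moment in time is unchanged and only its representation is normalised.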

See Database migrations for how the schema evolves.