Data Model

This document outlines the core domain entities mapping to the database, ensuring developers understand the relational boundaries and project-scoping invariants.

Model Diagram

(Future: Add Mermaid ER diagram for visual reference)

Per-App Model List

App | Model | Description
users | AppUser | The project’s custom user model, with profile fields and avatar metadata layered onto Django’s historical auth tables.
users | MembershipInvitation | Invitation for one email address to join a project, with a predefined role and a one-time redemption token.
projects | Project | Top-level workspace for one newsletter topic, scoped through project memberships rather than a legacy Django group.
projects | ProjectMembership | Join table assigning one user a per-project role such as admin, member, or reader.
projects | ProjectConfig | Per-project tuning values for authority weighting, decay, and topic-centroid recomputation.
projects | SourceConfig | Per-project configuration for each ingestion plugin (RSS, Reddit), including activation state and fetch configuration.
projects | BlueskyCredentials | Stored account credentials and verification state for a single project’s Bluesky plugin.
content | Content | The canonical record for ingested articles or posts, including source metadata, extracted text, relevance scoring, embeddings, and entity association.
content | UserFeedback | Explicit upvote or downvote feedback on a content item, used to capture editorial preference signals.
ingestion | IngestionRun | Audit/log record for an ingestion execution, tracking plugin, timing, item counts, status, and failure messages.
entities | Entity | A person, vendor, or organization tracked within a project to associate content with a known source or subject.
entities | EntityAuthoritySnapshot | One persisted authority-score recomputation for a tracked entity.
entities | EntityMention | A detected mention of a tracked entity inside one content item, including role and sentiment metadata.
entities | EntityCandidate | An extracted named entity awaiting acceptance, rejection, or merge into an existing tracked entity.
newsletters | IntakeAllowlist | Approved sender list for project newsletter intake, defining who can submit inbound newsletter emails.
newsletters | NewsletterIntake | Raw inbound newsletter email captured before and after extraction, holding subject, body, status, and errors.
pipeline | PipelineRun | Audit model for an execution round through LangGraph.
pipeline | SkillResult | Output record for an enrichment skill run, storing status, payload, confidence, latency, and model metadata.
pipeline | ReviewQueue | Human review item created for content needing manual judgment (e.g., borderline relevance).
trends | ThemeSuggestion | Clustered topic trends presented to human editors for newsletter inclusion.
trends | TopicCentroidSnapshot | One snapshot of the project’s feedback-weighted topic centroid and its drift metrics.
trends | OriginalContentIdea | Auto-generated suggestions for topics the editor should write original content about.
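
To make the membership row concrete, here is a minimal plain-Python sketch of how a ProjectMembership record could resolve a user's role inside one project. Only the role names (admin, member, reader) come from the model list; the field names `user_id`, `project_id`, and `role`, and the `role_for` helper, are assumptions for illustration and not the actual Django model definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Role(str, Enum):
    # Role names taken from the ProjectMembership description.
    ADMIN = "admin"
    MEMBER = "member"
    READER = "reader"


@dataclass(frozen=True)
class ProjectMembership:
    # Hypothetical field names; the real model may differ.
    user_id: int
    project_id: int
    role: Role


def role_for(
    memberships: Iterable[ProjectMembership], user_id: int, project_id: int
) -> Optional[Role]:
    """Return the user's role in the project, or None if they are not a member."""
    for membership in memberships:
        if membership.user_id == user_id and membership.project_id == project_id:
            return membership.role
    return None


# The same user can hold different roles in different projects.
memberships = [
    ProjectMembership(user_id=1, project_id=10, role=Role.ADMIN),
    ProjectMembership(user_id=1, project_id=11, role=Role.READER),
]
```

Because the role lives on the join row rather than on the user, access checks always need both the user and the project.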

Project-Scoping Invariants

By architectural rule, almost every model other than AppUser belongs to a specific Project (e.g., via project_id). The API layer enforces this scoping; see developer-guide/backend-conventions.md for the conventions.

  • Never run unbounded or project-unscoped queries from the API.
  • Every relationship that crosses between content, entities, newsletters, and pipeline models must verify that the records on both sides of the foreign key belong to the same Project.
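
The same-project rule can be sketched as a small guard that runs before two records are linked. This is an illustration in plain Python, not the project's actual enforcement code: `Scoped`, `CrossProjectError`, and `assert_same_project` are hypothetical names, and `Scoped` stands in for any project-scoped model row.

```python
from dataclasses import dataclass


class CrossProjectError(ValueError):
    """Raised when related records do not share one project."""


@dataclass(frozen=True)
class Scoped:
    # Stand-in for any project-scoped row (Content, Entity, ...).
    kind: str
    pk: int
    project_id: int


def assert_same_project(*records: Scoped) -> int:
    """Return the shared project_id, or raise if the records disagree."""
    project_ids = {record.project_id for record in records}
    if len(project_ids) != 1:
        labels = ", ".join(
            f"{r.kind}#{r.pk} (project {r.project_id})" for r in records
        )
        raise CrossProjectError(f"records span multiple projects: {labels}")
    return project_ids.pop()


content = Scoped(kind="Content", pk=1, project_id=10)
entity = Scoped(kind="Entity", pk=5, project_id=10)
foreign_entity = Scoped(kind="Entity", pk=6, project_id=11)

# Linking content to an entity in the same project passes the guard.
shared_project = assert_same_project(content, entity)
```

Calling `assert_same_project(content, foreign_entity)` raises `CrossProjectError`, which is the behavior a serializer or model validation hook would want before saving a cross-model foreign key.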

Key Indexes

  • project_id scoping indexes exist on all project-scoped models.
  • Qdrant Vector Index: Content embeddings are kept in sync with the Postgres data. Each vector is identified by a string UUID, and Content.id is carried in the Qdrant payload to link the vector back to its row. The project ID is also attached to vector payloads so that cosine-similarity searches can be restricted to a single project (tenant-safe search).
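
As a rough sketch of the payload convention described above, the following builds a Qdrant point and a project-filtered search body as plain dictionaries. The filter follows Qdrant's standard filter syntax (`must`, `key`, `match.value`); the payload key names `content_id` and `project_id`, the integer type of `content_id`, and both helper functions are assumptions, since this page does not specify them. The cosine metric itself is configured on the Qdrant collection rather than in each request.

```python
import uuid
from typing import Any, Dict, List


def build_point(
    point_id: uuid.UUID, content_id: int, project_id: int, embedding: List[float]
) -> Dict[str, Any]:
    """Build one Qdrant point: string UUID id, vector, and scoping payload."""
    return {
        "id": str(point_id),
        "vector": embedding,
        # Hypothetical payload keys linking back to Content and its Project.
        "payload": {"content_id": content_id, "project_id": project_id},
    }


def build_search_body(
    query_vector: List[float], project_id: int, limit: int = 10
) -> Dict[str, Any]:
    """Build a search body that only matches vectors from one project."""
    return {
        "vector": query_vector,
        "limit": limit,
        "filter": {
            "must": [{"key": "project_id", "match": {"value": project_id}}]
        },
    }


point = build_point(uuid.uuid4(), content_id=7, project_id=42, embedding=[0.1, 0.2])
body = build_search_body([0.1, 0.2], project_id=42, limit=5)
```

Keeping the project filter inside the helper that builds every search body means a caller cannot forget it, which is the point of attaching the project ID to the payload in the first place.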