Knowledge base

Bring your external documentation — help centers, public wikis, Notion workspaces — into Momentum. Index it once, search it semantically, and use it to ground every AI-drafted reply.

Supported source types

  • Sitemap

    Point Momentum at any /sitemap.xml. It walks the sitemap, fetches each URL, and extracts the article body using a content-aware extractor.

  • Single URL

    One-off ingestion of a single page. Good for testing or small additions.

  • Zendesk Help Center

    Uses the public, paginated /api/v2/help_center/articles.json endpoint. No auth is required for public articles.

  • Intercom Help Center

    Auto-discovers /sitemap.xml from the help-center homepage. An Intercom-specific body extractor preserves article structure.

  • Freshdesk Help Center

    Uses the same sitemap auto-discovery approach as Intercom, paired with a Freshdesk-specific body extractor.

  • Notion public

    Ingests a public Notion page or workspace via its published URL.
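
To make the sitemap-based source types concrete, here is a minimal TypeScript sketch of collecting article URLs from a sitemap document. It is not Momentum's crawler: the names extractSitemapUrls and decodeXmlEntities are hypothetical, a regex is used only to keep the sketch dependency-free (an XML parser would be more robust), and sitemap index files are not treated specially.

```typescript
// Hypothetical helper: decode the five predefined XML entities.
function decodeXmlEntities(value: string): string {
  return value
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // decode &amp; last so "&amp;lt;" becomes "&lt;", not "<"
}

// Hypothetical helper: pull every <loc> URL out of a sitemap XML string.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    const loc = match[1];
    if (loc) {
      urls.push(decodeXmlEntities(loc));
    }
  }
  // De-duplicate while preserving first-seen order.
  return urls.filter((url, index) => urls.indexOf(url) === index);
}
```

Each returned URL would then be fetched and run through the body extractor for its source type.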

How ingestion works

  1. Discover URLs. For sitemap/help-center types, Momentum walks the source to collect article URLs.
  2. Fetch and extract. Each article is downloaded and passed through content-aware extraction, which strips navigation, footers, and ads.
  3. Heuristic cleanup. Common noise patterns are removed (cookie banners, social share buttons).
  4. AI cleanup (optional). A Claude Haiku pass sharpens chunks for retrieval: it strips boilerplate, normalizes headings, and compacts verbose wording.
  5. Chunk and embed. Articles are split into chunks and embedded with OpenAI text-embedding-3-small.
  6. Store in pgvector. Chunks land in kb_chunks, scoped by workspace_id. A composite unique constraint on (workspace_id, url) keeps re-crawls idempotent.
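
The idempotency property in step 6 can be illustrated with a small in-memory stand-in for a table that has a unique (workspace_id, url) key. This is a sketch under assumptions, not the real storage layer: InMemoryKbStore, KbRow, and the content field are hypothetical, and the actual system stores rows in pgvector rather than in a Map.

```typescript
// Hypothetical row shape; only workspaceId and url mirror names from the text.
interface KbRow {
  workspaceId: string;
  url: string;
  content: string;
}

// Hypothetical in-memory stand-in for a table with a unique (workspace_id, url) key.
class InMemoryKbStore {
  private rows = new Map<string, KbRow>();

  // JSON.stringify avoids collisions such as ("a", "b/c") versus ("a/b", "c").
  private key(workspaceId: string, url: string): string {
    return JSON.stringify([workspaceId, url]);
  }

  // Insert or overwrite: re-crawling the same URL in the same workspace
  // replaces the existing row instead of adding a duplicate.
  upsert(row: KbRow): void {
    this.rows.set(this.key(row.workspaceId, row.url), row);
  }

  get(workspaceId: string, url: string): KbRow | undefined {
    return this.rows.get(this.key(workspaceId, url));
  }

  // Number of rows belonging to one workspace.
  count(workspaceId: string): number {
    let total = 0;
    this.rows.forEach((row) => {
      if (row.workspaceId === workspaceId) {
        total += 1;
      }
    });
    return total;
  }
}
```

Crawling the same URL twice in one workspace leaves a single row holding the latest content, while the same URL in a different workspace is stored separately.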

Scheduling

Each source has a schedule_interval:

  • manual: crawls only when you hit the re-crawl button
  • daily: crawls once every 24 hours
  • weekly: crawls once every 7 days

The scheduler lives inside the Hocuspocus process and ticks every 60 seconds. On each tick, sources that are due are picked up and their crawls are started asynchronously.
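
A due-check of the kind each tick performs might look like the sketch below. Everything except the three schedule_interval values is an assumption made for illustration: the KbSource shape, the lastCrawledAt field, the function names, and the choice to treat a never-crawled scheduled source as immediately due.

```typescript
type ScheduleInterval = "manual" | "daily" | "weekly";

// Hypothetical source shape; lastCrawledAt is null until the first crawl.
interface KbSource {
  id: string;
  scheduleInterval: ScheduleInterval;
  lastCrawledAt: Date | null;
}

const HOUR_MS = 60 * 60 * 1000;

// Minimum time between crawls for each scheduled interval.
const INTERVAL_MS: Record<"daily" | "weekly", number> = {
  daily: 24 * HOUR_MS,
  weekly: 7 * 24 * HOUR_MS,
};

// Returns true when a source should be crawled at time `now`.
function isDue(source: KbSource, now: Date): boolean {
  const interval = source.scheduleInterval;
  if (interval === "manual") {
    return false; // manual sources only crawl on demand
  }
  const last = source.lastCrawledAt;
  if (last === null) {
    return true; // assumption: a scheduled source that never ran is due right away
  }
  return now.getTime() - last.getTime() >= INTERVAL_MS[interval];
}

// One scheduler tick: select every source that is currently due.
function dueSources(sources: KbSource[], now: Date): KbSource[] {
  return sources.filter((source) => isDue(source, now));
}
```

A 60-second timer would call dueSources with the current time and start a crawl for each result without waiting for it to finish.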

Source health dashboard

The settings page shows, for each source:

  • Last crawl timestamp and status
  • Next scheduled crawl
  • Indexed article count
  • All-time error count
  • Manual re-crawl button

Searching

Queries go through searchKb(), which filters strictly by the caller's workspace_id. Each result includes both the matching chunk and the source article URL.
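
The workspace filter can be pictured with an in-memory sketch. This is not the real searchKb(): the function name searchKbSketch, the KbChunk and KbSearchResult shapes, the use of cosine similarity, and the default limit are all assumptions, and the production query runs against pgvector rather than an array.

```typescript
// Hypothetical chunk and result shapes for illustration only.
interface KbChunk {
  workspaceId: string;
  url: string;
  text: string;
  embedding: number[];
}

interface KbSearchResult {
  text: string;
  url: string;
  score: number;
}

// Cosine similarity between two vectors of equal length; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Rank chunks by similarity, considering only the caller's workspace.
function searchKbSketch(
  chunks: KbChunk[],
  workspaceId: string,
  queryEmbedding: number[],
  limit: number = 5
): KbSearchResult[] {
  return chunks
    .filter((chunk) => chunk.workspaceId === workspaceId) // strict tenant filter first
    .map((chunk) => ({
      text: chunk.text,
      url: chunk.url,
      score: cosineSimilarity(chunk.embedding, queryEmbedding),
    }))
    .sort((left, right) => right.score - left.score)
    .slice(0, limit);
}
```

Filtering before scoring means a chunk from another workspace can never appear in the results, however similar its embedding is to the query.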

CaseAIde automatically searches the KB before drafting a reply, surfacing the most relevant article chunks as context for the model.