# LLM / AI Integration
This document explains how CoCalc integrates large language models — provider routing, cost tracking, streaming, the Conat messaging bridge, and frontend components.
## Overview
CoCalc supports multiple LLM providers through a unified architecture:

- **Server** (`packages/server/llm/`): evaluation engine, provider routing via the Vercel AI SDK, abuse prevention, cost tracking
- **Conat bridge** (`packages/conat/llm/`): request/response messaging with streaming between frontend and server
- **Frontend** (`packages/frontend/frame-editors/llm/`, `packages/frontend/client/llm.ts`): model selector, inline assistant, cost estimation
- **Types & config** (`packages/util/db-schema/llm-utils.ts`, `packages/util/types/llm.ts`): model definitions, pricing, validation
## Supported Providers
| Provider | Model prefix | Examples |
|---|---|---|
| OpenAI | gpt- | gpt-4o-8k, gpt-5.2-8k |
| Google | gemini- | gemini-2.5-flash-8k, gemini-3-flash-preview-16k |
| Anthropic | claude- | claude-4-6-sonnet-8k, claude-3-5-sonnet |
| Mistral | mistral- | mistral-large, mistral-small |
| xAI | grok- | grok-4-1-fast-non-reasoning-16k, grok-code-fast-1-16k |
| Ollama | ollama- | User-configured local models |
| Custom OpenAI | custom_openai- | User-configured endpoints |
| User-defined | user- | user-{service}-{id} |
Default priority when auto-selecting: Google -> OpenAI -> Anthropic -> Mistral -> xAI -> Ollama -> Custom OpenAI.
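If auto-selection simply walks this priority list, the fallback can be sketched as follows (the function and list names are hypothetical, not CoCalc's actual API):

```typescript
// Hypothetical sketch of auto-selection: walk the default priority order
// and take the first provider that is actually enabled/configured.
const PROVIDER_PRIORITY = [
  "google",
  "openai",
  "anthropic",
  "mistral",
  "xai",
  "ollama",
  "custom_openai",
];

function pickProvider(enabled: Set<string>): string | undefined {
  return PROVIDER_PRIORITY.find((p) => enabled.has(p));
}
```

For example, on a server with only Anthropic and Mistral keys configured, `pickProvider` would select Anthropic, since it comes first in the priority order.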
## Server-Side Evaluation

### Entry Point

`packages/server/llm/index.ts` contains the main `evaluate()` entry point.
### AI SDK Unified Handler

`packages/server/llm/evaluate.ts` routes all providers (OpenAI, Google, Anthropic, Mistral, xAI, Ollama, Custom OpenAI) through the Vercel AI SDK.
Each provider config includes:

- `createModel()` — instantiate an AI SDK model instance
- `checkEnabled()` — verify that an API key is configured
- `canonicalModel()` — normalize the model name
- `supportsCaching` — whether the provider supports prompt caching (e.g. Anthropic)
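The four fields above can be collected into an interface. This is a hedged sketch of the shape only, with illustrative implementations; the actual types live in `evaluate.ts`:

```typescript
// Sketch of a per-provider configuration record (field names taken from
// the list above; return types are stand-ins for the AI SDK's own types).
interface ProviderConfig {
  // instantiate an AI SDK model instance for a given model name
  createModel: (model: string) => unknown;
  // verify that an API key is configured for this provider
  checkEnabled: () => boolean | Promise<boolean>;
  // normalize a model name (e.g. strip a context-window suffix)
  canonicalModel: (model: string) => string;
  // whether the provider supports prompt caching (e.g. Anthropic)
  supportsCaching: boolean;
}

// Toy example whose canonicalModel strips an "-8k"/"-16k" style suffix.
const exampleConfig: ProviderConfig = {
  createModel: (model) => ({ model }),
  checkEnabled: () => true,
  canonicalModel: (model) => model.replace(/-\d+k$/, ""),
  supportsCaching: false,
};
```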
After evaluation, provider metadata is logged for diagnostics — this includes cached token counts (Anthropic) and reasoning token counts (OpenAI, xAI).
### Streaming
Streaming uses Conat multiresponse requests:

- The frontend sends a request to `llm.account-{account_id}.api`
- The server sends chunks with incrementing sequence numbers
- The frontend reassembles them via the `stream` callback `(output: string | null) => void`; `null` signals completion
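The reassembly step above can be sketched as buffering out-of-order chunks and emitting them to the `stream` callback in sequence order (hypothetical helper names; the real Conat client handles this internally):

```typescript
// Buffer sequence-numbered chunks and deliver them in order; a null
// payload marks the end of the response, matching the callback contract.
type StreamCallback = (output: string | null) => void;

function makeReassembler(stream: StreamCallback) {
  const pending = new Map<number, string | null>();
  let next = 0;
  return (seq: number, chunk: string | null): void => {
    pending.set(seq, chunk);
    // Flush every chunk that is now contiguous with what was delivered.
    while (pending.has(next)) {
      stream(pending.get(next) as string | null);
      pending.delete(next);
      next += 1;
    }
  };
}

// Usage: chunks may arrive out of order, but the callback sees them in order.
const received: (string | null)[] = [];
const onChunk = makeReassembler((s) => received.push(s));
onChunk(1, "world");
onChunk(0, "hello ");
onChunk(2, null); // completion marker
```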
### User-Defined LLMs

`packages/server/llm/user-defined.ts` evaluates models configured by individual users.
## Reasoning & Thinking Tokens
Different providers expose reasoning and caching metadata in different ways via the AI SDK's providerMetadata:
| Provider | Reasoning tokens | Cache tokens | Notes |
|---|---|---|---|
| OpenAI | providerMetadata.openai.reasoningTokens | -- | o3-mini: ~86% of output can be reasoning tokens |
| xAI | providerMetadata.openai.reasoningTokens (OpenAI-compatible) | -- | Reasoning count may exceed output count (different methodology) |
| Anthropic | -- | providerMetadata.anthropic.cacheCreationInputTokens, cacheReadInputTokens | Extended thinking exists but not exposed as token counts |
| Google | -- | -- | Reasoning tokens included in totals but not yet exposed via AI SDK |
**Billing note:** reasoning tokens are already included in `completion_tokens`, so billing is correct even when reasoning counts are not displayed separately.
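Using the field paths from the table, a defensive extraction helper might look like this (a sketch only; the real extraction lives in `packages/server/llm/utils.ts`):

```typescript
// Pull reasoning/cache token counts out of the AI SDK's providerMetadata,
// tolerating providers that do not report them. The field paths follow
// the table above; everything else here is illustrative.
interface TokenDiagnostics {
  reasoningTokens?: number;
  cacheCreationInputTokens?: number;
  cacheReadInputTokens?: number;
}

function extractDiagnostics(
  providerMetadata: Record<string, Record<string, unknown>> | undefined,
): TokenDiagnostics {
  const openai = providerMetadata?.openai ?? {};
  const anthropic = providerMetadata?.anthropic ?? {};
  const num = (v: unknown) => (typeof v === "number" ? v : undefined);
  return {
    // OpenAI, and xAI via its OpenAI-compatible metadata, report here
    reasoningTokens: num(openai.reasoningTokens),
    // Anthropic reports prompt-caching activity
    cacheCreationInputTokens: num(anthropic.cacheCreationInputTokens),
    cacheReadInputTokens: num(anthropic.cacheReadInputTokens),
  };
}
```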
## Core Types

### ChatOptions

### ChatOutput
## Cost Tracking

### Pricing

`packages/util/db-schema/llm-utils.ts` defines per-model pricing.
### Purchase Flow

After evaluation:

1. Check `isFreeModel(model)` — free models skip charging
2. Calculate the cost: `prompt_cost * prompt_tokens + completion_cost * completion_tokens`
3. Create a purchase via `createPurchase()` with the purchase type, token counts, and tag
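The cost formula in step 2 can be sketched as follows, assuming `prompt_cost` and `completion_cost` are per-token prices (the actual units and per-model values are defined in `llm-utils.ts`):

```typescript
// Sketch of the purchase-flow cost calculation from the steps above.
// prompt_cost / completion_cost are assumed here to be USD per token.
interface ModelPricing {
  prompt_cost: number; // price per prompt (input) token
  completion_cost: number; // price per completion (output) token
}

function evaluationCost(
  pricing: ModelPricing,
  promptTokens: number,
  completionTokens: number,
): number {
  return (
    pricing.prompt_cost * promptTokens +
    pricing.completion_cost * completionTokens
  );
}
```

Note that because reasoning tokens are already folded into `completion_tokens`, this formula charges for them without any special case.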
### Free Models

Determined by `isFreeModel(model, isCoCalcCom)`:

- Ollama models (self-hosted)
- Some user-defined LLMs
- Platform-specific free tiers
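A sketch of what such a predicate might check, based purely on the categories above (the real `isFreeModel` in `llm-utils.ts` encodes the authoritative rules, which differ in detail):

```typescript
// Illustrative free-model predicate. The placeholder set stands in for
// whatever platform-specific free tier is configured.
const PLATFORM_FREE_TIER = new Set<string>();

function isFreeModelSketch(model: string, isCoCalcCom: boolean): boolean {
  // Ollama models run self-hosted, so there is nothing to charge for.
  if (model.startsWith("ollama-")) return true;
  // User-defined models call the user's own endpoint and key.
  if (model.startsWith("user-")) return true;
  // Otherwise, free only if the platform's free tier includes the model.
  return isCoCalcCom && PLATFORM_FREE_TIER.has(model);
}
```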
## Abuse Prevention

`packages/server/llm/abuse.ts` implements rate limiting and usage quotas.
Prometheus metrics: `llm_abuse_usage_global_pct` (gauge), `llm_abuse_usage_account_pct` (histogram), `llm_abuse_rejected_total` (counter).
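A hedged sketch of the kind of quota gate such a module performs (thresholds and field names invented for illustration; the real `abuse.ts` logic differs):

```typescript
// Illustrative quota gate: compare usage in the current window against
// the allowed quota and reject at or above 100%. The computed percentage
// is the kind of value fed into llm_abuse_usage_account_pct, and a
// rejection is what would increment llm_abuse_rejected_total.
interface UsageWindow {
  used: number; // usage accumulated in the current window
  quota: number; // allowed usage in the window
}

function usagePct(w: UsageWindow): number {
  return w.quota > 0 ? (100 * w.used) / w.quota : 100;
}

function checkUsage(w: UsageWindow): void {
  const pct = usagePct(w);
  if (pct >= 100) {
    throw new Error(`LLM usage quota exceeded (${pct.toFixed(0)}%)`);
  }
}
```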
## Database Schema

### `openai_chatgpt_log` Table

Despite the legacy name, this table stores interactions with all LLM providers:
| Field | Type | Description |
|---|---|---|
id | serial | Primary key |
time | timestamp | Request time |
account_id | UUID | Requesting user |
input | text | User message |
output | text | Model response |
history | jsonb | Conversation history |
model | text | Model identifier |
system | text | System prompt |
tag | text | Analytics tag ({vendor}:{category}) |
total_tokens | integer | Total tokens used |
prompt_tokens | integer | Input tokens |
total_time_s | float | Response time |
project_id | UUID | Context project |
path | text | Context file path |
### Related Tables

- `openai_embedding_log` — vector embedding usage tracking
- `openai_embedding_cache` — embedding cache (keyed by `input_sha1`)
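The `input_sha1` column name implies the cache is looked up by a hex SHA-1 digest of the embedding input; a sketch of computing that key with Node's `crypto` module (helper name hypothetical):

```typescript
import { createHash } from "node:crypto";

// Compute the hex SHA-1 cache key for an embedding input, matching the
// assumed meaning of the input_sha1 column.
function embeddingCacheKey(input: string): string {
  return createHash("sha1").update(input, "utf8").digest("hex");
}
```

Keying the cache by a digest keeps the lookup column short and indexable regardless of how large the embedded text is.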
## Conat Messaging

### Subject Pattern

### Server Registration

### Client
## Frontend Components

### LLMClient

`packages/frontend/client/llm.ts` implements the `LLMClient` class. It handles the default system prompt, locale settings, purchase permission checks, history/message truncation to fit the context window, and the Conat call.
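The truncation step can be sketched as dropping the oldest history entries until an estimated token budget is met (hypothetical helper; the real logic lives inside `LLMClient`):

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Rough estimate, roughly 4 characters per token; see the Token
// Estimation section for the same heuristic on the frontend.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

// Walk the history from newest to oldest and keep messages while they
// fit, so the most recent context always survives truncation.
function truncateHistory(history: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const t = estimateTokens(history[i].content);
    if (used + t > maxTokens) break;
    used += t;
    kept.unshift(history[i]);
  }
  return kept;
}
```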
### Model Selector

`packages/frontend/frame-editors/llm/llm-selector.tsx` — a dropdown for choosing a model. It groups models by provider, shows inline cost estimation, includes user-defined LLMs, and validates availability.
### AI Assistant Integration Points

Frame editors (`packages/frontend/frame-editors/llm/`):
| Component | Purpose |
|---|---|
llm-assistant-button.tsx | Main AI button in editor toolbar |
help-me-fix-button.tsx | Error explanation button |
help-me-fix-dialog.tsx | Full dialog for fix suggestions |
llm-query-dropdown.tsx | Quick action menu |
llm-history-selector.tsx | Previous query history |
Jupyter (`packages/frontend/jupyter/llm/`):
| Component | Purpose |
|---|---|
cell-tool.tsx | Per-cell AI assistant button |
cell-context-selector.tsx | Choose context scope |
split-cells.ts | LLM-powered cell splitting |
Chat (`packages/frontend/chat/`):
- `llm-cost-estimation.tsx` — cost display in chat messages
- Message summarization via LLM
### Token Estimation

`packages/frontend/misc/llm.ts` provides client-side token estimation utilities.
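Client-side estimates are necessarily heuristic, since no model tokenizer ships to the browser. A common approximation, shown here as a sketch rather than the module's actual formula, is roughly 4 characters per token:

```typescript
// Rough token estimate for pre-execution cost display: assume about
// 4 characters per token. This is a heuristic, not a real tokenizer,
// which is acceptable for an estimate shown before running a query.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Estimated prompt cost given a per-token price (illustrative units,
// matching the sketch in the Purchase Flow section).
function estimatePromptCost(text: string, promptCostPerToken: number): number {
  return estimateTokens(text) * promptCostPerToken;
}
```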
### Cost Estimation Component

`packages/frontend/misc/llm-cost-estimation.tsx` displays the estimated cost before execution. Free models are marked as "free to use".
## User-Defined LLMs

Users can add their own LLM endpoints.

### User-Defined LLM Hook (Frontend)

### REST API
## Server Settings
| Setting | Description |
|---|---|
default_llm | Default model (fallback: gemini-3-flash-preview-16k) |
pay_as_you_go_openai_markup_percentage | Cost markup (0-100%) |
user_defined_llm | Enable/disable user-defined LLM support |
## Key Source Files
| File | Description |
|---|---|
packages/util/db-schema/llm-utils.ts | Model definitions, pricing, validation (~66KB) |
packages/util/types/llm.ts | ChatOptions, History, ChatOutput types |
packages/util/db-schema/llm.ts | Database schema for log tables |
packages/server/llm/index.ts | Main evaluate() entry point |
packages/server/llm/evaluate.ts | AI SDK unified handler |
packages/server/llm/client.ts | Ollama / Custom OpenAI model factories |
packages/server/llm/utils.ts | Token counting, provider metadata extraction |
packages/server/llm/user-defined.ts | User-defined LLM evaluation |
packages/server/llm/abuse.ts | Rate limiting and quotas |
packages/server/llm/save-response.ts | Database persistence |
packages/conat/llm/client.ts | Frontend -> server messaging |
packages/conat/llm/server.ts | Subject routing and handling |
packages/frontend/client/llm.ts | LLMClient class |
packages/frontend/frame-editors/llm/llm-selector.tsx | Model picker |
packages/frontend/frame-editors/llm/llm-assistant-button.tsx | AI button |
packages/frontend/jupyter/llm/cell-tool.tsx | Jupyter cell assistant |
packages/frontend/misc/llm-cost-estimation.tsx | Cost display |
packages/frontend/misc/llm.ts | Token estimation utilities |
packages/next/pages/api/v2/llm/evaluate.ts | REST API endpoint |
## Tests

### Unit Tests

### Integration Tests (requires Postgres + API keys)

The suite is opt-in and skipped unless `COCALC_TEST_LLM=true`.
Required environment variables (see `packages/server/llm/test/shared.ts`):

- `COCALC_TEST_OPENAI_KEY`
- `COCALC_TEST_GOOGLE_GENAI_KEY`
- `COCALC_TEST_ANTHROPIC_KEY`
- `COCALC_TEST_MISTRAL_AI_KEY`
- `COCALC_TEST_XAI_KEY`