Jupyter Notebooks: Architecture and Integration
This document explains how CoCalc implements Jupyter notebook support — kernel management, real-time collaboration, the .sage-jupyter2 SyncDB format, code execution, ipywidgets, and the frontend rendering pipeline.
Overview
CoCalc's Jupyter integration is a multi-layered system:
Backend (packages/jupyter/): kernel lifecycle, ZMQ messaging, code execution, kernel pooling, ipynb import/export
Project daemon (packages/project/jupyter/): nbconvert, project-level API, compute server coordination
Frontend (packages/frontend/jupyter/): React UI, Redux state, output rendering, ipywidgets, collaborative editing
Conat bridge (packages/conat/): remote kernel execution, hub API for stateless execution
SyncDB Format (.sage-jupyter2)
Jupyter notebooks are stored as a SyncDB document for real-time collaboration. The synced file path is derived from the ipynb path:
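As a rough sketch of that derivation, the helper below assumes the SyncDB file is a hidden sibling of the notebook, named by prefixing a dot and appending the .sage-jupyter2 extension. The function name syncdbPathSketch and the exact naming rule are assumptions for illustration; the real helper lives in packages/util/jupyter/names.ts and may differ.

```typescript
// Hypothetical helper: maps "dir/name.ipynb" to "dir/.name.ipynb.sage-jupyter2".
// The hidden-sibling naming rule is an assumption, not confirmed by this document.
function syncdbPathSketch(ipynbPath: string): string {
  if (ipynbPath.slice(-".ipynb".length) !== ".ipynb") {
    throw new Error(`expected a path ending in .ipynb, got "${ipynbPath}"`);
  }
  const slash = ipynbPath.lastIndexOf("/");
  const dir = slash === -1 ? "" : ipynbPath.slice(0, slash + 1);
  const name = ipynbPath.slice(slash + 1);
  return `${dir}.${name}.sage-jupyter2`;
}
```

Under this assumption, work/analysis.ipynb would sync through work/.analysis.ipynb.sage-jupyter2.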
SyncDB Schema
The SyncDB uses these primary keys and string columns:
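A minimal sketch of such a schema, assuming records are keyed by the pair (type, id) and that input is the only string column, which is consistent with the record tables in this section. The names schemaSketch, primaryKeys, stringCols, and recordKey are illustrative, not CoCalc's actual option names.

```typescript
// Assumed schema: records are keyed by (type, id); "input" is merged as a string.
const schemaSketch = {
  primaryKeys: ["type", "id"],
  stringCols: ["input"],
};

// Build the lookup key for a record from its primary-key fields only.
function recordKey(record: { [field: string]: unknown }): string {
  return JSON.stringify(schemaSketch.primaryKeys.map((k) => record[k]));
}
```

Two records with the same type and id map to the same key regardless of their other fields, which is what lets concurrent edits to one cell be merged rather than duplicated.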
Record Types
Each record has a type field. The main record types:
| Type | id | Purpose |
|---|---|---|
| "cell" | cell UUID | A notebook cell (code, markdown, raw) |
| "settings" | "main" | Notebook-level metadata and kernel info |
Cell records include:
| Field | Type | Description |
|---|---|---|
| type | "cell" | Record type |
| id | string | Cell UUID |
| input | string | Cell source code (string_col; uses string merge) |
| output | object | Cell outputs (map of index → output message) |
| cell_type | "code" \| "markdown" \| "raw" | Cell type |
| pos | number | Position in cell ordering |
| exec_count | number | Execution count shown in In[N] |
| start | number | Execution start timestamp |
| end | number | Execution end timestamp |
| state | string | Execution state ("busy", "idle", "run") |
| collapsed | boolean | Whether output is collapsed |
| scrolled | boolean | Whether output is scrolled |
| tags | object | Cell tags (for nbgrader, etc.) |
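
The cell fields can be sketched as a TypeScript interface, together with a helper that orders cells by pos. The interface covers only a subset of the fields in the table; the helper name orderedCellIds and the tie-break by id are assumptions for illustration.

```typescript
// Subset of the cell record fields from the table; everything but type/id is optional here.
interface CellRecord {
  type: "cell";
  id: string;
  input?: string;
  output?: { [index: string]: unknown };
  cell_type?: "code" | "markdown" | "raw";
  pos?: number;
  exec_count?: number;
  state?: string;
}

// Return cell ids in display order: ascending pos, ties broken by id (assumed rule).
function orderedCellIds(cells: CellRecord[]): string[] {
  return cells
    .slice() // do not mutate the caller's array
    .sort((a, b) => {
      const byPos = (a.pos ?? 0) - (b.pos ?? 0);
      if (byPos !== 0) return byPos;
      return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
    })
    .map((c) => c.id);
}
```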
Settings record:
| Field | Type | Description |
|---|---|---|
| type | "settings" | Record type |
| id | "main" | Fixed ID |
| kernel | string | Kernel name (e.g., "python3") |
| metadata | object | Notebook-level metadata |
| backend_state | string | Kernel lifecycle state |
| kernel_error | string | Last kernel error message |
| trust | boolean | Whether notebook is trusted |
Kernel Management
JupyterKernel Class
packages/jupyter/kernel/kernel.ts (~1150 lines) — the core kernel wrapper.
State machine:
Key methods:
spawn(): launch kernel process, set up ZMQ sockets
execute_code(opts) → CodeExecutionEmitter: queue code for execution
kernel_info(): get kernel metadata (language, version, banner)
complete({code, cursor_pos}): tab completion
introspect({code, cursor_pos, detail_level}): docstring lookup
signal(sig): send signal (SIGINT for interrupt)
close(): shut down kernel and clean up
Events emitted:
"state": lifecycle state changes
"running" / "failed": terminal states
"shell", "iopub", "stdin": ZMQ channel messages
"closed": kernel shutdown
ZMQ Sockets
packages/jupyter/zmq/ — raw ZMQ communication with the Jupyter kernel:
| Socket | Type | Purpose |
|---|---|---|
| iopub | Subscriber | Broadcast: outputs, execution_state, display_data |
| shell | Dealer | Request/reply: execute, complete, inspect, kernel_info |
| stdin | Dealer | Input requests (Python input() function) |
| control | Dealer | Interrupt, shutdown |
Message flow for code execution:
Send execute_request on shell
Kernel broadcasts status: busy on iopub
Outputs (stream, display_data, execute_result, error) on iopub
Kernel broadcasts status: idle on iopub
execute_reply on shell with status
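
The flow above can be sketched as a fold over the messages that belong to one execute_request. The message shape is reduced to channel, msg_type, and content (real Jupyter messages also carry header, parent_header, and metadata), and the names summarizeExecution and isComplete are illustrative, not part of CoCalc's API.

```typescript
// Reduced message shape, enough to follow one execution.
interface KernelMessage {
  channel: "shell" | "iopub";
  msg_type: string;
  content: { [key: string]: unknown };
}

interface ExecutionSummary {
  outputs: KernelMessage[];
  sawBusy: boolean;
  sawIdle: boolean;
  replyStatus?: string;
}

// Fold the messages of one execution into a summary of what was observed.
function summarizeExecution(messages: KernelMessage[]): ExecutionSummary {
  const summary: ExecutionSummary = { outputs: [], sawBusy: false, sawIdle: false };
  const outputTypes = ["stream", "display_data", "execute_result", "error"];
  for (const msg of messages) {
    if (msg.channel === "iopub" && msg.msg_type === "status") {
      if (msg.content.execution_state === "busy") summary.sawBusy = true;
      if (msg.content.execution_state === "idle") summary.sawIdle = true;
    } else if (msg.channel === "iopub" && outputTypes.indexOf(msg.msg_type) !== -1) {
      summary.outputs.push(msg);
    } else if (msg.channel === "shell" && msg.msg_type === "execute_reply") {
      summary.replyStatus = String(msg.content.status);
    }
  }
  return summary;
}

// Treat the execution as finished once the kernel is idle again and the reply has arrived.
function isComplete(s: ExecutionSummary): boolean {
  return s.sawIdle && s.replyStatus !== undefined;
}
```

Waiting for both signals avoids depending on the relative arrival order of the idle status and the execute_reply, since they travel on different sockets.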
Kernel Pool
packages/jupyter/pool/pool.ts — pre-spawns kernels for faster notebook opens.
Kernels indexed by normalized options (excluding cwd and filename)
Julia kernels are excluded from pooling (resource-heavy)
Pool replenishes asynchronously after a kernel is claimed
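
A sketch of the indexing rule only, not of the pool itself. The option names other than cwd and filename, the detection of Julia by kernel name, and the function names poolKey and isPoolable are assumptions for illustration.

```typescript
interface KernelOptions {
  name: string; // kernel name, e.g. "python3"
  cwd?: string; // per-notebook, excluded from the pool key
  filename?: string; // per-notebook, excluded from the pool key
  env?: { [key: string]: string }; // hypothetical extra option, serialized as-is
}

// Normalize options into a pool key: drop per-notebook fields and sort the remaining keys.
function poolKey(opts: KernelOptions): string {
  const source = opts as unknown as Record<string, unknown>;
  const normalized: Record<string, unknown> = {};
  for (const k of Object.keys(source).sort()) {
    if (k !== "cwd" && k !== "filename") normalized[k] = source[k];
  }
  return JSON.stringify(normalized);
}

// Assumed rule: a kernel is treated as Julia when its name contains "julia".
function isPoolable(opts: KernelOptions): boolean {
  return opts.name.toLowerCase().indexOf("julia") === -1;
}
```

Because cwd and filename are dropped, a kernel spawned ahead of time can be handed to any notebook that asks for the same kernel options.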
Kernel Data
packages/jupyter/kernel/kernel-data.ts — discovers available kernelspecs:
Code Execution
CodeExecutionEmitter
packages/jupyter/execute/execute-code.ts — manages a single code execution:
Execution queue: Cells execute sequentially via _execute_code_queue. Each request is pushed to the queue, and _process_execute_code_queue() processes them one at a time.
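
A minimal sketch of one-at-a-time processing. The class ExecutionQueueSketch, its Runner callback, and the choice to record errors and continue are assumptions for illustration; they are not the actual _execute_code_queue implementation.

```typescript
type Runner = (code: string) => Promise<void>;

class ExecutionQueueSketch {
  private queue: string[] = [];
  private processing = false;
  started: string[] = []; // order in which requests began running
  errors: unknown[] = []; // failures recorded instead of thrown (assumed policy)

  constructor(private run: Runner) {}

  // Enqueue a request and start the processor if it is not already running.
  push(code: string): void {
    this.queue.push(code);
    void this.process();
  }

  // Number of requests that have not started yet.
  pending(): number {
    return this.queue.length;
  }

  private async process(): Promise<void> {
    if (this.processing) return; // another call is already draining the queue
    this.processing = true;
    try {
      while (this.queue.length > 0) {
        const code = this.queue.shift() as string;
        this.started.push(code);
        try {
          await this.run(code); // the next request waits until this one settles
        } catch (err) {
          this.errors.push(err);
        }
      }
    } finally {
      this.processing = false;
    }
  }
}
```

Pushing three requests in a row starts only the first; the other two stay pending until the runner's promise settles.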
OutputHandler
packages/jupyter/execute/output-handler.ts — processes and truncates outputs:
Enforces max_output_length and max_output_messages
When limits are exceeded, stores overflow in _more_output[cell_id]
User can fetch overflow via "More output" button → kernel.more_output(id)
Handles blob storage for large binary outputs (images, PDFs)
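
The truncation behavior can be sketched as follows. The class OutputLimiterSketch and its text-only message shape are simplifications assumed for illustration; the real handler works on full output messages and keeps overflow per cell.

```typescript
interface OutputMessage {
  text: string;
}

class OutputLimiterSketch {
  outputs: OutputMessage[] = []; // what is stored with the cell
  private overflow: OutputMessage[] = []; // held back until requested
  private totalLength = 0;
  private truncated = false;

  constructor(private maxOutputLength: number, private maxOutputMessages: number) {}

  add(msg: OutputMessage): void {
    if (!this.truncated) {
      const fits =
        this.outputs.length < this.maxOutputMessages &&
        this.totalLength + msg.text.length <= this.maxOutputLength;
      if (fits) {
        this.outputs.push(msg);
        this.totalLength += msg.text.length;
        return;
      }
      this.truncated = true; // once a limit is hit, everything later is overflow
    }
    this.overflow.push(msg);
  }

  // Hand back the overflow once, in the spirit of a "More output" request.
  moreOutput(): OutputMessage[] {
    const more = this.overflow;
    this.overflow = [];
    return more;
  }
}
```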
Blob Storage
Large binary outputs (images, PDFs, HTML) are stored as SHA1-keyed blobs in a Conat DKV (distributed key-value store), not inline in the SyncDB:
Redux State Management
Store (packages/jupyter/redux/store.ts)
Key state fields:
Actions — Three Layers
Base actions (packages/jupyter/redux/actions.ts, ~2600 lines):
Abstract base class shared by frontend and backend. Core operations:
run_code_cell(id): execute cell, update output
insert_cell(delta, id?): add cell above/below
delete_cell(id): remove cell
merge_cells(ids): merge selected cells
set_cell_type(id, type): change cell type
move_cell(old_pos, new_pos): reorder
set_kernel(name): switch kernel
process_output(content): handle kernel messages
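
Since pos is a plain number, an insert can pick a position between two neighbors without renumbering other cells. The sketch below assumes a midpoint rule and a hypothetical helper name newCellPos; the actual insert_cell logic may choose positions differently.

```typescript
// Position for a new cell placed next to the cell at `index` in an ascending position list.
// delta = -1 inserts above that cell, delta = 1 inserts below it.
function newCellPos(sortedPositions: number[], index: number, delta: -1 | 1): number {
  const here = sortedPositions[index];
  if (here === undefined) return 0; // empty notebook or out-of-range index
  const neighbor = sortedPositions[index + delta];
  if (neighbor === undefined) return here + delta; // before the first or after the last cell
  return (here + neighbor) / 2; // midpoint leaves every existing position untouched
}
```

Only the new record is written, which keeps concurrent inserts by different collaborators from rewriting each other's cells.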
Project actions (packages/jupyter/redux/project-actions.ts):
Server-side actions managing the actual kernel:
Kernel lifecycle (spawn, restart, shutdown)
Blob store management via DKV
Conat service initialization for remote execution
Cell execution queue management
nbconvert integration
Compute server coordination
Browser actions (packages/frontend/jupyter/browser-actions.ts, ~1450 lines):
UI-specific actions:
Keyboard shortcut handling
Cursor tracking (collaborative cursors via CursorManager)
Widget manager initialization
UI state (toolbar, dialogs, scroll position)
nbgrader actions
Local storage persistence
Frontend Components
Main Component
packages/frontend/jupyter/main.tsx — JupyterEditor top-level component.
Cell Rendering Pipeline
Output MIME Type Routing
packages/frontend/jupyter/output-messages/mime-types/ dispatches outputs to specialized renderers:
| MIME Type | Renderer | Notes |
|---|---|---|
| text/plain | Plain text with ANSI color support | |
| text/html | Iframe-isolated HTML | Security sandbox |
| text/markdown | Markdown renderer | |
| text/latex | MathJax rendering | |
| image/png, image/jpeg | Image component | |
| image/svg+xml | SVG renderer | |
| application/pdf | PDF viewer | |
| application/javascript | JS sandbox | |
| application/vnd.jupyter.widget-view+json | ipywidgets | |
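
An output message usually offers the same result under several MIME types, so a dispatcher has to pick one. The sketch below assumes a fixed priority list; the order shown and the names MIME_PRIORITY and pickMimeType are hypothetical, since the order CoCalc actually uses is not stated here.

```typescript
// Hypothetical priority order, richest representation first.
const MIME_PRIORITY: string[] = [
  "application/vnd.jupyter.widget-view+json",
  "text/html",
  "text/markdown",
  "text/latex",
  "image/svg+xml",
  "image/png",
  "image/jpeg",
  "application/pdf",
  "application/javascript",
  "text/plain",
];

// Choose the first supported MIME type present in an output's data bundle.
function pickMimeType(data: { [mime: string]: unknown }): string | undefined {
  for (const mime of MIME_PRIORITY) {
    if (Object.prototype.hasOwnProperty.call(data, mime)) return mime;
  }
  return undefined; // nothing renderable
}
```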
Commands
packages/frontend/jupyter/commands.ts (~1000 lines) — defines all keyboard shortcuts and menu items as a {[name]: CommandDescription} registry.
ipywidgets
Architecture
IpywidgetsState
packages/sync/editor/generic/ipywidgets-state.ts — syncs widget model state:
WidgetManager
packages/frontend/jupyter/widgets/manager.ts — manages @cocalc/widgets rendering:
Receives comm messages from kernel via IpywidgetsState
Creates widget model instances
Routes display_data messages with widget-view+json to widget renderer
Handles send_comm_message_to_kernel() for bidirectional communication
Buffer handling via setModelBuffers()
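
To illustrate the comm traffic involved, here is a minimal sketch of a model store driven by Jupyter comm messages (comm_open, comm_msg with method "update", comm_close). The class WidgetModelStoreSketch is hypothetical, and binary buffers are left out; it does not reflect how IpywidgetsState or the WidgetManager are implemented.

```typescript
type WidgetState = { [key: string]: unknown };

interface CommMessage {
  msg_type: "comm_open" | "comm_msg" | "comm_close";
  content: {
    comm_id: string;
    target_name?: string;
    data?: { method?: string; state?: WidgetState };
  };
}

class WidgetModelStoreSketch {
  private models: { [commId: string]: WidgetState } = {};

  handle(msg: CommMessage): void {
    const { comm_id, target_name, data } = msg.content;
    if (msg.msg_type === "comm_open") {
      if (target_name !== "jupyter.widget") return; // ignore comms that are not widgets
      this.models[comm_id] = { ...(data?.state ?? {}) };
    } else if (msg.msg_type === "comm_msg") {
      const model = this.models[comm_id];
      if (model !== undefined && data?.method === "update" && data.state) {
        // Partial update: merge only the keys the kernel sent.
        for (const key of Object.keys(data.state)) model[key] = data.state[key];
      }
    } else {
      delete this.models[comm_id]; // comm_close
    }
  }

  get(commId: string): WidgetState | undefined {
    return this.models[commId];
  }
}
```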
Conat Integration
Remote Kernel Execution
packages/jupyter/kernel/conat-service.ts — RPC wrapper for compute servers:
Hub API (Stateless Execution)
packages/conat/hub/api/jupyter.ts — hub-level Jupyter API:
Used by the Python API client and REST endpoints for one-off code execution without opening a full notebook session.
ipynb Import/Export
packages/jupyter/ipynb/:
import-from-ipynb.ts: IPynbImporter class parses standard .ipynb JSON into the internal SyncDB cell format
export-to-ipynb.ts: export_to_ipynb() converts the SyncDB state back to standard .ipynb format for download/interop
The project daemon periodically saves the SyncDB state to the .ipynb file on disk (autosave), and loads from .ipynb on first open.
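
The per-cell mapping can be sketched as a pair of functions between an nbformat v4 cell and the cell record fields described earlier. The names importCell and exportCell are illustrative and handle only source, execution count, and outputs; IPynbImporter and export_to_ipynb() cover far more (metadata, attachments, blobs).

```typescript
interface IpynbCell {
  cell_type: "code" | "markdown" | "raw";
  source: string | string[]; // nbformat allows a string or a list of lines
  execution_count?: number | null;
  outputs?: unknown[];
}

interface InternalCell {
  type: "cell";
  id: string;
  cell_type: "code" | "markdown" | "raw";
  input: string;
  pos: number;
  exec_count?: number;
  output?: { [index: string]: unknown };
}

// ipynb cell → internal record: join source lines, index outputs by position.
function importCell(cell: IpynbCell, id: string, pos: number): InternalCell {
  const input = typeof cell.source === "string" ? cell.source : cell.source.join("");
  const record: InternalCell = { type: "cell", id, cell_type: cell.cell_type, input, pos };
  if (cell.cell_type === "code") {
    if (typeof cell.execution_count === "number") record.exec_count = cell.execution_count;
    const output: { [index: string]: unknown } = {};
    (cell.outputs ?? []).forEach((out, i) => {
      output[String(i)] = out;
    });
    record.output = output;
  }
  return record;
}

// Internal record → ipynb cell: outputs return to an array in numeric index order.
function exportCell(record: InternalCell): IpynbCell {
  const cell: IpynbCell = { cell_type: record.cell_type, source: record.input };
  if (record.cell_type === "code") {
    cell.execution_count = record.exec_count ?? null;
    const output = record.output ?? {};
    cell.outputs = Object.keys(output)
      .sort((a, b) => Number(a) - Number(b))
      .map((k) => output[k]);
  }
  return cell;
}
```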
nbconvert
packages/project/jupyter/convert/ — wraps Jupyter's nbconvert tool:
nbgrader Integration
packages/frontend/jupyter/nbgrader/ — assignment creation and grading:
Cell metadata toolbar for marking solution/test regions
### BEGIN/END SOLUTION markers
### BEGIN/END AUTOGRADED TEST markers
Checksum validation for tamper detection
Clear solutions/hidden tests for student distribution
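
Clearing solutions for the student version can be sketched as a line filter over the cell source. The function name clearSolutions, the default stub text, and the strict error handling are assumptions for illustration; test regions would need a similar pass that is not shown.

```typescript
// Replace each BEGIN/END SOLUTION region with a stub, keeping the surrounding lines.
function clearSolutions(source: string, stub: string = "# YOUR CODE HERE"): string {
  const out: string[] = [];
  let inSolution = false;
  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "### BEGIN SOLUTION") {
      if (inSolution) throw new Error("nested BEGIN SOLUTION");
      inSolution = true;
      // Keep the marker's indentation so the stub stays valid inside a function body.
      const indent = (line.match(/^\s*/) as RegExpMatchArray)[0];
      out.push(indent + stub);
    } else if (trimmed === "### END SOLUTION") {
      if (!inSolution) throw new Error("END SOLUTION without BEGIN");
      inSolution = false;
    } else if (!inSolution) {
      out.push(line);
    }
  }
  if (inSolution) throw new Error("unterminated solution region");
  return out.join("\n");
}
```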
Key Source Files
| File | Description |
|---|---|
| packages/jupyter/kernel/kernel.ts | Core JupyterKernel class (~1150 lines) |
| packages/jupyter/kernel/launch-kernel.ts | Direct kernel spawning |
| packages/jupyter/pool/pool.ts | Kernel pool manager |
| packages/jupyter/execute/execute-code.ts | CodeExecutionEmitter |
| packages/jupyter/execute/output-handler.ts | Output processing and truncation |
| packages/jupyter/redux/actions.ts | Base JupyterActions (~2600 lines) |
| packages/jupyter/redux/store.ts | JupyterStoreState |
| packages/jupyter/redux/project-actions.ts | Project-side kernel management |
| packages/jupyter/ipynb/import-from-ipynb.ts | ipynb → SyncDB |
| packages/jupyter/ipynb/export-to-ipynb.ts | SyncDB → ipynb |
| packages/frontend/jupyter/main.tsx | JupyterEditor component |
| packages/frontend/jupyter/browser-actions.ts | Browser-side actions (~1450 lines) |
| packages/frontend/jupyter/cell-list.tsx | Cell list rendering |
| packages/frontend/jupyter/commands.ts | Keyboard/menu commands (~1000 lines) |
| packages/frontend/jupyter/output-messages/ | MIME type renderers |
| packages/frontend/jupyter/widgets/manager.ts | ipywidgets WidgetManager |
| packages/sync/editor/generic/ipywidgets-state.ts | Widget state sync |
| packages/jupyter/kernel/conat-service.ts | Remote kernel RPC |
| packages/conat/hub/api/jupyter.ts | Hub Jupyter API |
| packages/util/jupyter/names.ts | Path utilities, syncdb extensions |