Auth Rework: Standards-Based Authentication for the Agent Host Protocol
Problem
The current authentication mechanism is imperative and VS Code-specific:
The renderer discovers agents via listAgents() and checks IAgentDescriptor.requiresAuth.
It obtains a GitHub OAuth token from VS Code's built-in authentication service.
It pushes the token via setAuthToken(token), a fire-and-forget JSON-RPC notification.
The agent host fans the token out to all registered IAgent providers.
This couples the agent host to VS Code internals. An external client (CLI tool, web app, another editor) connecting over WebSocket has no way to know what authentication is required, where to get a token, or what scopes are needed. The client must have out-of-band knowledge that "this server needs a GitHub OAuth token."
Design Goals
Self-describing: The server declares its auth requirements so arbitrary clients can discover them without prior knowledge of the server's internals.
Standards-aligned: Use the semantics and vocabulary of RFC 6750 (Bearer Token Usage) and RFC 9728 (OAuth 2.0 Protected Resource Metadata) adapted for JSON-RPC.
Challenge-on-failure: When auth is missing or invalid, the server responds with a structured challenge (like WWW-Authenticate) that tells the client exactly what to do.
Transport-agnostic: Works over WebSocket JSON-RPC and MessagePort IPC alike.
Multi-provider: Supports multiple independent auth requirements (e.g. GitHub + a future enterprise IdP) each with their own scopes and authorization servers.
Non-breaking migration: Can coexist with setAuthToken during a transition period.
Relevant Standards
RFC 6750 — Bearer Token Usage
Defines how bearer tokens are transmitted (Authorization: Bearer <token>) and how servers challenge clients, through the WWW-Authenticate response header, when auth is missing or invalid.
Key error codes: invalid_request, invalid_token, insufficient_scope.
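As a point of reference for the JSON-native design later in this document, the sketch below builds an RFC 6750 challenge string. The parameter names (realm, error, error_description, scope) come from the RFC; the formatBearerChallenge helper and the BearerChallenge type are illustrative names, not part of any library.

```typescript
// Error codes defined in RFC 6750 §3.1.
type BearerError = "invalid_request" | "invalid_token" | "insufficient_scope";

interface BearerChallenge {
  realm?: string;
  error?: BearerError;        // omitted when the request carried no credentials
  errorDescription?: string;
  scope?: string;             // space-delimited list of required scopes
}

// Illustrative helper: formats the value of a WWW-Authenticate header for the
// Bearer scheme. Assumes the values contain no double quotes or backslashes,
// so no escaping is performed.
function formatBearerChallenge(c: BearerChallenge): string {
  const params: string[] = [];
  if (c.realm !== undefined) params.push(`realm="${c.realm}"`);
  if (c.error !== undefined) params.push(`error="${c.error}"`);
  if (c.errorDescription !== undefined) {
    params.push(`error_description="${c.errorDescription}"`);
  }
  if (c.scope !== undefined) params.push(`scope="${c.scope}"`);
  return params.length > 0 ? `Bearer ${params.join(", ")}` : "Bearer";
}

// Challenge for an expired token.
const expiredTokenChallenge = formatBearerChallenge({
  realm: "example",
  error: "invalid_token",
  errorDescription: "The access token expired",
});
```

The string encoding shown here is what the proposed design replaces with structured JSON fields carrying the same semantics.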
RFC 9728 — OAuth 2.0 Protected Resource Metadata
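Defines a metadata document that a protected resource publishes to describe itself, including its resource identifier, the authorization servers that can issue tokens for it, and the scopes it supports.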
Clients discover this metadata either via a well-known URL or via the resource_metadata parameter in a WWW-Authenticate challenge. This tells the client where to get a token and what scopes to request.
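A minimal sketch of such a metadata document, typed in TypeScript. The field names are taken from RFC 9728 §2 (only a subset is shown); the URLs and scope values are placeholders, and tokenRequestPlan is a hypothetical helper showing what a client derives from the document.

```typescript
// Subset of the RFC 9728 §2 metadata fields; names follow the RFC.
interface ProtectedResourceMetadata {
  resource: string;                      // the resource identifier (required)
  authorization_servers?: string[];      // issuers that can mint tokens
  scopes_supported?: string[];
  bearer_methods_supported?: string[];   // "header", "body", "query"
}

// Illustrative document; URLs and scopes are placeholders.
const exampleMetadata: ProtectedResourceMetadata = {
  resource: "https://resource.example.com",
  authorization_servers: ["https://as.example.com"],
  scopes_supported: ["read", "write"],
  bearer_methods_supported: ["header"],
};

// Hypothetical helper: what a client learns from the document, namely where
// to get a token and which scopes to ask for. Picking the first listed
// authorization server is a simplification.
function tokenRequestPlan(
  m: ProtectedResourceMetadata
): { issuer: string; scopes: string[] } | undefined {
  const servers = m.authorization_servers ?? [];
  const issuer = servers[0];
  if (issuer === undefined) return undefined; // no authorization server named
  return { issuer, scopes: m.scopes_supported ?? [] };
}
```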
Proposed Design
Overview
The authentication flow has four phases. The first three mirror the HTTP flow from RFC 9728 §5; the fourth adds server-pushed auth state notifications.
Phase 1: Discovery (in initialize response)
The initialize result is extended with a resourceMetadata field, modeled on RFC 9728 §2.
Why in initialize? RFC 9728 publishes metadata at a well-known URL. In our JSON-RPC world, the initialize handshake is the well-known endpoint — it's the first thing every client calls, and it's already where we exchange capabilities. This avoids an extra round-trip and keeps the discovery atomic.
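One possible shape for the extended result is sketched below. The names resourceMetadata, IResourceMetadata, authSchemes, IAuthScheme, scheme, schemeId, and authorizationServers appear in this document; the scopes field, the protocolVersion placeholder, and the schemesToSatisfy helper are assumptions made for illustration.

```typescript
interface IAuthScheme {
  schemeId: string;                 // correlates tokens with this requirement
  scheme: "bearer";                 // RFC 6750 bearer token
  authorizationServers: string[];   // where a client can obtain a token
  scopes: string[];                 // assumed field: scopes the token needs
}

interface IResourceMetadata {
  authSchemes: IAuthScheme[];
}

interface IInitializeResult {
  protocolVersion: number;          // assumed stand-in for the existing fields
  resourceMetadata?: IResourceMetadata;
}

// Hypothetical helper: the schemes a client must satisfy. A server that omits
// resourceMetadata is treated as requiring no authentication.
function schemesToSatisfy(result: IInitializeResult): IAuthScheme[] {
  return result.resourceMetadata?.authSchemes ?? [];
}

// Example result; the authorization server URL and scope are illustrative.
const exampleResult: IInitializeResult = {
  protocolVersion: 1,
  resourceMetadata: {
    authSchemes: [
      {
        schemeId: "github",
        scheme: "bearer",
        authorizationServers: ["https://github.com/login/oauth"],
        scopes: ["read:user"],
      },
    ],
  },
};
```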
Phase 2: Token Delivery (authenticate command)
Replace the fire-and-forget setAuthToken notification with a proper JSON-RPC request so the client gets confirmation.
This is a request (not a notification) so:
The client knows immediately if the token was accepted or rejected.
The server can validate the token before returning success.
Errors use structured challenges (see Phase 3).
The client can call authenticate multiple times (e.g. when a token is refreshed), and can authenticate for multiple scheme IDs independently.
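The difference between the two message kinds can be sketched as follows. In JSON-RPC 2.0, a request carries an id and receives a response, while a notification has no id and receives none. The field shapes of IAuthenticateParams and IAuthenticateResult, the legacy params shape, and the helper names are assumptions.

```typescript
// Assumed shapes; the document names these types but not their fields.
interface IAuthenticateParams { schemeId: string; token: string; }
interface IAuthenticateResult { authenticated: boolean; }

interface JsonRpcMessage<P> {
  jsonrpc: "2.0";
  id?: number;      // present on requests, absent on notifications
  method: string;
  params: P;
}

// New flow: a request, so the server must answer with a result or an error.
function buildAuthenticateRequest(
  id: number,
  params: IAuthenticateParams
): JsonRpcMessage<IAuthenticateParams> {
  return { jsonrpc: "2.0", id, method: "authenticate", params };
}

// Legacy flow: a notification, so the client never learns the outcome.
function buildSetAuthTokenNotification(
  token: string
): JsonRpcMessage<{ token: string }> {
  return { jsonrpc: "2.0", method: "setAuthToken", params: { token } };
}

// A message expects a response only if it carries an id.
function expectsResponse(m: JsonRpcMessage<unknown>): boolean {
  return m.id !== undefined;
}

// A refreshed token is simply a second request for the same schemeId.
const first = buildAuthenticateRequest(1, { schemeId: "github", token: "tok-a" });
const refreshed = buildAuthenticateRequest(2, { schemeId: "github", token: "tok-b" });
```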
Phase 3: Challenges on Failure
When a command fails because authentication is missing or invalid, the server returns a JSON-RPC error response whose data field carries a structured challenge (IAuthChallenge), modeled on RFC 6750 §3.
A dedicated error code (-32007 AHP_AUTH_REQUIRED) signals this is an auth error so clients can handle it programmatically without parsing the message string.
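A sketch of how a client might recognize the challenge. The error code and the IAuthChallenge name come from this document; the challenge fields and the asAuthChallenge helper are assumptions, with the error values borrowed from RFC 6750.

```typescript
const AHP_AUTH_REQUIRED = -32007;

// Assumed fields, mirroring the RFC 6750 challenge parameters.
interface IAuthChallenge {
  schemeId: string;
  // Omitted when no token was presented at all, as in RFC 6750 §3.
  error?: "invalid_request" | "invalid_token" | "insufficient_scope";
  errorDescription?: string;
  scopes?: string[];
}

interface JsonRpcError {
  code: number;
  message: string;
  data?: unknown;
}

// Hypothetical helper: returns the challenge if this is an auth error with a
// usable payload, otherwise undefined. It branches on the code, never on the
// message string.
function asAuthChallenge(err: JsonRpcError): IAuthChallenge | undefined {
  if (err.code !== AHP_AUTH_REQUIRED) return undefined;
  const data = err.data as Partial<IAuthChallenge> | undefined;
  if (!data || typeof data.schemeId !== "string") return undefined;
  return data as IAuthChallenge;
}

// Example error as it might appear in a JSON-RPC error response.
const exampleError: JsonRpcError = {
  code: AHP_AUTH_REQUIRED,
  message: "Authentication required",
  data: { schemeId: "github", error: "invalid_token", errorDescription: "Token expired" },
};
```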
Phase 4: Auth State Notifications
The server pushes auth state changes via notifications so clients know when auth expires or the required scopes change.
This replaces the implicit "push a token whenever you see an account change" model with an explicit server-driven signal.
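On the client, handling this signal could look like the sketch below. The notify/authRequired method name comes from this document; the params shape (schemeId and reason) and the AuthStateTracker class are assumptions.

```typescript
interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: unknown;
}

// Assumed params shape for notify/authRequired.
interface IAuthRequiredParams {
  schemeId: string;
  reason: "expired" | "revoked" | "requirementsChanged";
}

// Hypothetical client-side tracker of schemes that currently need a token.
class AuthStateTracker {
  private readonly pending = new Set<string>();

  // Records the scheme named by a notify/authRequired notification and
  // ignores every other notification.
  handle(n: JsonRpcNotification): void {
    if (n.method !== "notify/authRequired") return;
    const p = n.params as Partial<IAuthRequiredParams> | undefined;
    if (p && typeof p.schemeId === "string") this.pending.add(p.schemeId);
  }

  // Called after a successful authenticate request for the scheme.
  markAuthenticated(schemeId: string): void {
    this.pending.delete(schemeId);
  }

  needsAuth(): string[] {
    return Array.from(this.pending);
  }
}
```

A client would call needsAuth() after each notification and re-run its token acquisition only for the listed scheme IDs.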
Concrete Example: GitHub Copilot Auth
Server-side (CopilotAgent)
When the Copilot agent registers, it publishes an auth scheme.
The agent host aggregates auth schemes from all agents into IInitializeResult.resourceMetadata.
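The aggregation step might be sketched as follows. The getAuthSchemes method, the IAgentLike type, the deduplication rule (first declaration of a schemeId wins), and the authorization server URL and scope values are all assumptions for illustration.

```typescript
interface IAuthScheme {
  schemeId: string;
  scheme: "bearer";
  authorizationServers: string[];
  scopes: string[];               // assumed field
}

interface IResourceMetadata {
  authSchemes: IAuthScheme[];
}

// Hypothetical minimal view of an agent provider.
interface IAgentLike {
  id: string;
  getAuthSchemes(): IAuthScheme[];
}

// Illustrative agents: one that needs a GitHub token and one that needs none.
const copilotAgent: IAgentLike = {
  id: "copilot",
  getAuthSchemes: () => [
    {
      schemeId: "github",
      scheme: "bearer",
      authorizationServers: ["https://github.com/login/oauth"],
      scopes: ["read:user"],
    },
  ],
};
const mockAgent: IAgentLike = { id: "mock", getAuthSchemes: () => [] };

// Merges the schemes of all agents; if two agents declare the same schemeId,
// the first declaration is kept (an assumed policy).
function aggregateAuthSchemes(agents: IAgentLike[]): IResourceMetadata {
  const bySchemeId = new Map<string, IAuthScheme>();
  agents.forEach(agent => {
    agent.getAuthSchemes().forEach(scheme => {
      if (!bySchemeId.has(scheme.schemeId)) bySchemeId.set(scheme.schemeId, scheme);
    });
  });
  return { authSchemes: Array.from(bySchemeId.values()) };
}
```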
Client-side (VS Code renderer)
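A rough sketch of the renderer side, under stated assumptions: the renderer maps each declared scheme to one of its authentication providers, obtains a token, and sends authenticate. The provider lookup table, ISessionRequest, and the injected getToken and authenticate callbacks are hypothetical stand-ins for the real VS Code services.

```typescript
interface IAuthScheme {
  schemeId: string;
  scheme: "bearer";
  authorizationServers: string[];
  scopes: string[];               // assumed field
}

interface ISessionRequest {
  schemeId: string;
  providerId: string;
  scopes: string[];
}

// Hypothetical table from authorization server to authentication provider id.
const providerByAuthorizationServer = new Map<string, string>([
  ["https://github.com/login/oauth", "github"],
]);

// Picks, per scheme, the first authorization server the renderer has a
// provider for. Schemes with no known provider are skipped.
function toSessionRequests(schemes: IAuthScheme[]): ISessionRequest[] {
  const requests: ISessionRequest[] = [];
  schemes.forEach(scheme => {
    scheme.authorizationServers.some(server => {
      const providerId = providerByAuthorizationServer.get(server);
      if (providerId === undefined) return false;
      requests.push({ schemeId: scheme.schemeId, providerId, scopes: scheme.scopes });
      return true; // stop at the first match
    });
  });
  return requests;
}

// getToken and authenticate are injected so the sketch stays independent of
// the actual authentication service and connection classes.
async function authenticateAll(
  schemes: IAuthScheme[],
  getToken: (providerId: string, scopes: string[]) => Promise<string | undefined>,
  authenticate: (params: { schemeId: string; token: string }) => Promise<void>
): Promise<void> {
  for (const req of toSessionRequests(schemes)) {
    const token = await getToken(req.providerId, req.scopes);
    if (token !== undefined) await authenticate({ schemeId: req.schemeId, token });
  }
}
```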
Client-side (generic external client)
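A CLI tool connecting over WebSocket needs no prior knowledge of the server: it reads the auth requirements from the initialize result, obtains a token for each scheme, and calls authenticate.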
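The client-side planning step can be sketched without any transport code. Given the metadata from initialize and whatever tokens the tool already holds (for example from its configuration), the hypothetical planAuthentication helper below produces the authenticate requests to send and a human-readable hint for each scheme still lacking a token. The scopes field and all example values are assumptions.

```typescript
interface IAuthScheme {
  schemeId: string;
  scheme: "bearer";
  authorizationServers: string[];
  scopes: string[];               // assumed field
}

interface IResourceMetadata {
  authSchemes: IAuthScheme[];
}

interface IAuthenticateRequest {
  jsonrpc: "2.0";
  id: number;
  method: "authenticate";
  params: { schemeId: string; token: string };
}

// tokens maps schemeId to a token the tool already holds.
function planAuthentication(
  meta: IResourceMetadata | undefined,
  tokens: Map<string, string>,
  firstId: number
): { requests: IAuthenticateRequest[]; missing: string[] } {
  const requests: IAuthenticateRequest[] = [];
  const missing: string[] = [];
  let id = firstId;
  const schemes = meta ? meta.authSchemes : [];
  schemes.forEach(s => {
    const token = tokens.get(s.schemeId);
    if (token === undefined) {
      // Everything the user needs to know comes from the server's metadata.
      missing.push(
        `${s.schemeId}: needs a ${s.scheme} token from ` +
        `${s.authorizationServers.join(" or ")} with scopes: ${s.scopes.join(" ")}`
      );
    } else {
      requests.push({
        jsonrpc: "2.0",
        id: id++,
        method: "authenticate",
        params: { schemeId: s.schemeId, token },
      });
    }
  });
  return { requests, missing };
}
```

The tool would serialize each entry of requests onto the WebSocket and print the missing hints to the user.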
Protocol Changes Summary
New JSON-RPC request: authenticate
| Direction | Type | Params | Result |
|---|---|---|---|
| Client → Server | Request | IAuthenticateParams | IAuthenticateResult |
New JSON-RPC error code
| Code | Name | When |
|---|---|---|
| -32007 | AHP_AUTH_REQUIRED | A command failed because auth is missing or invalid |
Extended: initialize result
| Field | Type | Description |
|---|---|---|
| resourceMetadata | IResourceMetadata | Optional. Auth and resource information. |
New notification
| Type | Direction | When |
|---|---|---|
| notify/authRequired | Server → Client | Auth state changed (expired, revoked, new requirements) |
Deprecated
| Item | Replacement | Migration |
|---|---|---|
| setAuthToken notification | authenticate request | Keep accepting setAuthToken for one version, log deprecation |
| IAgentDescriptor.requiresAuth | IResourceMetadata.authSchemes | Derive from authSchemes during transition |
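During the transition, the server could route the legacy notification through the new code path. The sketch below assumes a shim that logs a deprecation warning and applies the token to every declared scheme, which mirrors the current fan-out to all providers; handleLegacySetAuthToken and its injected callbacks are hypothetical.

```typescript
interface IAuthenticateParams { schemeId: string; token: string; } // assumed shape

// Hypothetical compatibility shim for the deprecated setAuthToken notification.
// schemeIds: the IDs of all schemes the server currently declares.
// authenticate: the same handler that serves the new authenticate request.
// warn: a logging callback, injected to keep the sketch self-contained.
function handleLegacySetAuthToken(
  token: string,
  schemeIds: string[],
  authenticate: (params: IAuthenticateParams) => void,
  warn: (message: string) => void
): void {
  warn("setAuthToken is deprecated; use the authenticate request instead");
  // The legacy notification carries no schemeId, so fan out to every scheme.
  schemeIds.forEach(schemeId => authenticate({ schemeId, token }));
}
```

Because the legacy message is a notification, any validation failure inside the shim can only be logged, not reported back to the client.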
Interface Changes in agentService.ts
IAgentService
IAgent
IAgentDescriptor
requiresAuth is removed — clients discover auth requirements from IResourceMetadata instead of per-agent descriptors.
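The three interface changes might take roughly the following shape. Only the interface names, authenticate, IAuthenticateParams, IAuthenticateResult, authSchemes, and the removal of requiresAuth are taken from this document; every member signature, the getAuthSchemes method, and the deriveRequiresAuth helper are assumptions.

```typescript
interface IAuthScheme {
  schemeId: string;
  scheme: "bearer";
  authorizationServers: string[];
  scopes: string[];                         // assumed field
}
interface IResourceMetadata { authSchemes: IAuthScheme[]; }
interface IAuthenticateParams { schemeId: string; token: string; } // assumed
interface IAuthenticateResult { authenticated: boolean; }          // assumed

// Replaces the fire-and-forget setAuthToken(token) on the service.
interface IAgentService {
  authenticate(params: IAuthenticateParams): Promise<IAuthenticateResult>;
}

// Each agent declares its own requirements and accepts tokens per scheme.
interface IAgent {
  getAuthSchemes(): IAuthScheme[];          // hypothetical member
  authenticate(schemeId: string, token: string): Promise<boolean>; // hypothetical
}

// requiresAuth no longer appears here; the other fields are placeholders.
interface IAgentDescriptor {
  id: string;
  displayName: string;
}

// Transition helper: derives the legacy requiresAuth flag from authSchemes.
function deriveRequiresAuth(meta: IResourceMetadata | undefined): boolean {
  return meta !== undefined && meta.authSchemes.length > 0;
}
```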
Design Decisions
Why not WWW-Authenticate headers literally?
We're not using HTTP. Embedding RFC 6750's string-encoded header format in JSON-RPC would be awkward. Instead, we use JSON-native equivalents with the same semantics: IAuthChallenge mirrors the WWW-Authenticate parameters, and IResourceMetadata mirrors RFC 9728's metadata document.
Why in initialize and not a separate getResourceMetadata command?
Fewer round-trips. Every client calls initialize first — embedding auth requirements there means the client knows what auth is needed from the very first response. A separate command would add latency and complexity for zero benefit, since the metadata is small and always needed.
Why schemeId and not just the scheme name?
A server might need multiple bearer tokens from different authorization servers (e.g. GitHub + an enterprise IdP). The schemeId lets the client and server correlate tokens to specific requirements. It also makes authenticate calls idempotent and unambiguous.
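A minimal sketch of why keying by schemeId keeps authenticate idempotent: the server holds at most one token per scheme, so repeating a call leaves the state unchanged and a refreshed token simply overwrites the old one. The TokenStore class is a hypothetical illustration, not the proposed implementation.

```typescript
// Hypothetical server-side store: one token per schemeId.
class TokenStore {
  private readonly tokens = new Map<string, string>();

  // Last writer wins for a given schemeId; other schemes are untouched.
  set(schemeId: string, token: string): void {
    this.tokens.set(schemeId, token);
  }

  get(schemeId: string): string | undefined {
    return this.tokens.get(schemeId);
  }

  get size(): number {
    return this.tokens.size;
  }
}

const store = new TokenStore();
store.set("github", "tok-1");
store.set("github", "tok-1");      // repeated call: no change in state
store.set("github", "tok-2");      // refresh: replaces the previous token
store.set("enterprise", "tok-e");  // independent requirement
```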
Why a request instead of a notification for authenticate?
The current setAuthToken is fire-and-forget — the client has no idea if the token was accepted, expired, or for the wrong provider. Making authenticate a request with a response lets the client react immediately (retry with different scopes, prompt the user, etc.).
What about Device Code / OAuth flows that the server drives?
This proposal covers the "client already has a token" case (RFC 6750 bearer). For server-driven flows (device code, authorization code with redirect), the authorizationServers metadata tells the client which AS to talk to. The actual OAuth flow is client-side — the server just declares requirements.
A future extension could add an IAuthScheme with scheme: 'device_code' that includes a device authorization endpoint, letting the server guide the client through a device flow. This is out of scope for the initial implementation.
Migration Plan
Phase A: Add resourceMetadata to IInitializeResult and the authenticate command. Keep setAuthToken working as-is.
Phase B: Update VS Code renderer to use authenticate instead of setAuthToken. External clients can start using the new flow.
Phase C: Remove setAuthToken, requiresAuth, and the old imperative push model. Bump protocol version.
Open Questions
Token validation: Should the server validate tokens eagerly on authenticate (e.g. call a GitHub API endpoint), or defer validation to when a command actually needs it? Eager validation gives better error messages; deferred is simpler and avoids extra network calls.
Per-agent vs. global auth: The current design has one resourceMetadata for the whole server. Should auth schemes be per-agent-provider instead? Per-agent gives finer control (e.g. "Copilot needs GitHub, MockAgent needs nothing") but complicates the protocol. The current proposal uses global metadata with schemeId correlation, which the server can internally route to the right agent.
Token refresh: Should the server expose token expiry information so clients can proactively refresh, or rely on notify/authRequired to signal when a refresh is needed? Proactive refresh avoids interruptions but requires the server to parse tokens (which it shouldn't have to for opaque tokens).
Multiple tokens: Can a client authenticate multiple scheme IDs simultaneously? (Proposed: yes.) Can multiple clients each send their own token? (Proposed: yes, last-writer-wins per schemeId, which matches current behavior.)