AntonAgent
Embed AntonAgent in a Dynamic app. Two halves: a backend function that calls ctx.anton.connect, and a client that speaks the WebSocket event protocol.
Audience: LLM agents or developers wiring AntonAgent into a Dynamic app.
Architecture
┌─────────┐ ┌──────────────┐ ┌─────────────┐
│ Client │────▶│ Web PubSub │────▶│ Anton Agent │
│ Widget │◀────│ (WebSocket) │◀────│ (Server) │
└─────────┘ └──────────────┘ └─────────────┘
Bidirectional WebSocket using the json.webpubsub.azure.v1 subprotocol. No REST polling. REST is only used during setup (_me, _rt/ws, anton-connect).
Prerequisites
- Dynamic app with auth configured
- gg/rat installed (https://ggcmd.io/gg.cmd)
- Client app that can host the chat widget
Backend
Pull the typed surface and let your IDE drive:
gg rat api types
ctx.anton.* exposes connect, wake, prompt, list, close, setCredits, ask. Every option is documented inline — hover the method or Cmd/Ctrl+click into the types for defaults, enum values, the identityId vs actAs distinction, and the rest. This page does not duplicate the reference.
You need one function. Here's the minimum:
gg rat api function create http \
--file functions/anton-connect.ts \
--endpoint /anton-connect \
--access POST
import type { FunctionContext } from "./types";

export default async function (ctx: FunctionContext) {
  // Fall back to {} so an empty body yields a 400 instead of a destructuring error.
  const { connectionId, message } = (ctx.item ?? {}) as { connectionId?: string; message?: string };
  if (!connectionId) { ctx.res.status = 400; ctx.item = { error: "Missing connectionId" }; return; }
  if (!ctx.me?.id) { ctx.res.status = 401; ctx.item = { error: "Not authenticated" }; return; }

  // Key the session and the channel to the authenticated user.
  ctx.item = await ctx.anton.connect({
    identityId: ctx.me.id,
    channelName: `anton:${ctx.me.id}`,
    connectionId,
    // Optional: a pending message to re-send (see the ACK system section).
    ...(message ? { message } : {}),
  });
}
Deploy:
gg rat api deploy
# or just the function:
gg rat api function update --file functions/anton-connect.ts
For anonymous sessions, expose the function with --access GET and pick an identityId from something the client sends but can't easily forge (signed cookie, Turnstile token, Vipps phone, device fingerprint). The function is the gatekeeper for whatever identityId it forwards.
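One way to get an identityId that the client cannot easily forge is an HMAC-signed value. The following is a minimal sketch, assuming a hypothetical cookie format of `<id>.<signature>` and a secret that only the backend knows. The helper names are illustrative and are not part of the Dynamic or AntonAgent API.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helpers: neither the "<id>.<signature>" format nor these names
// come from the Dynamic or AntonAgent API.
function sign(id: string, secret: string): string {
  return createHmac("sha256", secret).update(id).digest("hex");
}

// Issue the cookie value when the anonymous session is first created.
function issueAnonCookie(id: string, secret: string): string {
  return `${id}.${sign(id, secret)}`;
}

// Return the id only if the signature matches; otherwise null.
function verifyAnonCookie(value: string, secret: string): string | null {
  const dot = value.lastIndexOf(".");
  if (dot <= 0) return null;
  const id = value.slice(0, dot);
  const given = Buffer.from(value.slice(dot + 1), "utf8");
  const expected = Buffer.from(sign(id, secret), "utf8");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? id : null;
}
```

Inside the function, the verified id would be passed as identityId, and a failed verification would be answered with a 401 before ctx.anton.connect is reached.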
Functions the agent calls must declare their inputs
The agent's tools are generated from your app's OpenAPI spec (/api/_openapi), so your REST API is the tool surface. A tool can only pass arguments that the spec declares, so the rule is:
If the agent must supply a value, that value has to be in the contract: a typed request body (POST/PUT/PATCH) or a declared parameter. Never read an agent-supplied value ad-hoc from the query string.
A GET function's generated tool only carries the auto-injected collection params (filter, select, sortBy, pageNo, pageSize, count, include, lastKey). There is no way to add a custom query param to a function's generated tool — if your function reads ctx.req.query.myArg, the agent's tool has no slot for myArg and the call fails (400, or the function runs with the value missing). It works when you call it by hand (?myArg=…) but the agent can't, because the tool generator never saw myArg.
Anything the agent needs to choose at call time belongs in a POST body instead. A body is free-form JSON, so the agent's tool can carry arbitrary fields. Read it from ctx.req.bodyJson, and make sure the verb the tool calls with is in the function's access allow-list (a GET-only allow-list rejects the agent's POST with 403).
This is deliberately not something the AntonAgent binding can paper over: it has no way to know your function needs myArg, what values are valid, or which verb to use — none of that is in the spec. Keep the contract honest and the tools generate correctly for free.
// ❌ agent can't drive this — `path` is read from the query but never declared
// GET /api/spond -> ctx.req.query.path
const path = ctx.req?.query?.path;
// ✅ agent can drive this — `path` rides in the POST body the tool exposes
// POST /api/spond (access: POST) -> ctx.req.bodyJson.path
const { path } = (ctx.req?.bodyJson ?? {}) as { path?: string };
(A read-only proxy over POST looks odd, but the POST body is just the transport that lets the agent pass an argument — your function can still do a plain GET to whatever it proxies.)
Direct completion — ctx.anton.ask
When you just need an LLM response inside a function (classification, summarization, form validation, anything one-shot), skip connect entirely and call ctx.anton.ask — no container spins up.
messages is OpenAI chat-completions shape ({role, content}[]), so anything you'd hand an OpenAI SDK call drops straight in. The response is simplified to {text, usage, ...} — no choices[] envelope, no streaming. See the JSDoc on AntonAskOptions for the rest of the knobs.
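A one-shot classification might look like the sketch below. Only the `messages` option and the `text` field of the response are taken from this page. The local `AskContext` type, the label set, and the function name are assumptions made for illustration; the real option type is AntonAskOptions.

```typescript
// Minimal local types for the sketch. The real AntonAskOptions has more
// fields; only `messages` (in) and `text` (out) are taken from this page.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
interface AskContext {
  anton: { ask(opts: { messages: ChatMessage[] }): Promise<{ text: string }> };
}

const LABELS = ["bug", "feature", "question"] as const;
type Label = (typeof LABELS)[number];

// One-shot classification: no connect, no WebSocket, no container.
async function classifyTicket(ctx: AskContext, ticket: string): Promise<Label | null> {
  const { text } = await ctx.anton.ask({
    messages: [
      {
        role: "system",
        content: `Classify the ticket as one of: ${LABELS.join(", ")}. Reply with the label only.`,
      },
      { role: "user", content: ticket },
    ],
  });
  // Model output is free text, so validate it before trusting it.
  const answer = text.trim().toLowerCase();
  const index = (LABELS as readonly string[]).indexOf(answer);
  return index >= 0 ? LABELS[index] : null;
}
```

Returning null for anything outside the label set keeps a malformed model reply from leaking into application data.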
Stats callback (optional)
Pass callbackUrl on connect. Anton POSTs JSON to it at turn lifecycle boundaries. Intended for billing/analytics on the app dev side.
URL forms accepted:
- Absolute: https://example.com/anton-stats?key=…
- Leading-slash relative: /anton-stats?key=… (resolves against your app's own Dynamic base URL). Use this to point at one of your own functions without hardcoding the host.
No auth envelope. Bake your shared secret into the URL query string. Delivery is fire-and-forget — ~2s timeout, no retries, failures logged server-side only.
event | When |
|---|---|
turn-started | Codex begins a turn — credits snapshot baseline |
turn-completed | Turn finished or failed — full stats |
Common fields on every payload: event, apiId, apiName, identityId, threadId.
turn-started:
{
"event":"turn-started",
"apiId":"…","apiName":"…","identityId":"…","threadId":"…",
"credits": 87720,
"sessionCredits": 4520
}
turn-completed:
{
"event":"turn-completed",
"apiId":"…","apiName":"…","identityId":"…","threadId":"…",
"status":"completed",
"error": null,
"credits": 87400,
"sessionCredits": 4200,
"sessionCreditsLimit": 10000,
"sessionCreditsUsed": 5800,
"turnCreditsConsumed": 320,
"turnDurationMs": 4820,
"toolCalls": [
{ "server":"parkly", "tool":"get-schema", "durationMs":120 },
{ "server":"parkly", "tool":"crt-item", "durationMs":380, "error":"…" }
]
}
Respond with a 2xx quickly. Response bodies are discarded, and a request that takes longer than about 2s is aborted.
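A receiving handler can be sketched as follows. The payload fields are copied from the examples above. The handler signature (query and body in, status code out), the in-memory `usageByIdentity` store, and all names are assumptions for illustration, not the platform's function API.

```typescript
// Payload fields are copied from the examples above; everything else here
// (handler shape, store, names) is illustrative.
interface ToolCallStat { server: string; tool: string; durationMs: number; error?: string }
interface TurnStarted { event: "turn-started"; identityId: string; threadId: string }
interface TurnCompleted {
  event: "turn-completed";
  identityId: string;
  threadId: string;
  status: string;
  turnCreditsConsumed: number;
  turnDurationMs: number;
  toolCalls: ToolCallStat[];
}
type StatsPayload = TurnStarted | TurnCompleted;

// identityId -> total credits consumed; a stand-in for a real billing store.
const usageByIdentity = new Map<string, number>();

function handleStats(query: { key?: string }, body: StatsPayload, secret: string): number {
  // There is no auth envelope, so the shared secret in the URL is the only check.
  if (query.key !== secret) return 401;
  if (body.event === "turn-completed") {
    const previous = usageByIdentity.get(body.identityId) ?? 0;
    usageByIdentity.set(body.identityId, previous + body.turnCreditsConsumed);
  }
  // Return fast: only the status code matters, the response body is dropped.
  return 204;
}
```

Because delivery is fire-and-forget with no retries, any slow work (database writes, outbound calls) is better queued than done inline.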
Client: connection flow
1. GET /api/_me → current user ID
2. GET /api/_rt/ws → WebSocket URL (one-time token)
3. new WebSocket(url, 'json.webpubsub.azure.v1')
4. Wait {type:'system', event:'connected'} → extract connectionId
5. POST /api/anton-connect with body {connectionId} → response {channel, credits}
6. Ready — send/receive via channel
All fetch calls need credentials:'include' so access/refresh cookies are sent.
Tip: connect eagerly when the chat widget mounts (not on first user message). anton-connect wakes the agent container server-side, so doing it early shaves first-message latency.
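Steps 2 to 6 of the flow above can be sketched as one function. Network and socket access are injected through a hypothetical `Deps` interface so the sketch also runs outside a browser; `Deps`, `SocketLike`, `parseConnected`, and `connectAnton` are illustrative names. The `{url}` response shape is taken from the SDK example on this page, and the `connectionId` field on the connected frame follows the Web PubSub JSON subprotocol. Step 1 is left out because the response shape of /api/_me is not shown here.

```typescript
// In a browser, getJson/postJson would wrap fetch with credentials: "include",
// and openSocket would be (url, proto) => new WebSocket(url, proto).
interface SocketLike {
  onmessage: ((ev: { data: string }) => void) | null;
}
interface Deps {
  getJson(path: string): Promise<any>;
  postJson(path: string, body: unknown): Promise<any>;
  openSocket(url: string, subprotocol: string): SocketLike;
}

// Returns the connectionId from a "connected" system frame, else null.
function parseConnected(raw: string): string | null {
  const frame = JSON.parse(raw);
  return frame.type === "system" && frame.event === "connected" ? frame.connectionId : null;
}

async function connectAnton(deps: Deps) {
  // Step 2: one-time WebSocket URL.
  const { url } = await deps.getJson("/api/_rt/ws");
  // Step 3: open the socket with the JSON subprotocol.
  const ws = deps.openSocket(url, "json.webpubsub.azure.v1");
  // Step 4: wait for the connected system frame.
  const connectionId = await new Promise<string>((resolve) => {
    ws.onmessage = (ev) => {
      const id = parseConnected(ev.data);
      if (id) resolve(id);
    };
  });
  // Step 5: the backend joins this connection to the channel.
  const { channel, credits } = await deps.postJson("/api/anton-connect", { connectionId });
  // Step 6: the caller now replaces ws.onmessage with its event handler.
  return { ws, connectionId, channel, credits };
}
```

Calling `connectAnton` when the widget mounts matches the tip above, since the container wake-up then overlaps with the user typing.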
Client: payload shape (read this before either transport)
Server-published frames (every event in the tables below — ack, text, turn-started, …) arrive with data as a JSON-encoded string, even though the envelope's dataType is "json". Only client-to-group echoes round-trip as objects. So clients MUST normalize:
const payload = typeof data === 'string' ? JSON.parse(data) : data;
Skip this and payload.type is undefined for every server event.
Server-published payloads also carry an extra _from: "anton" marker. Ignore it — and ignore any other unknown fields; the protocol may add more.
Client: SDK alternative
@azure/web-pubsub-client handles reconnect, subprotocol framing, token refresh, and group filtering. The event protocol is the same; only the transport changes.
npm install @azure/web-pubsub-client
import { WebPubSubClient } from "@azure/web-pubsub-client";

const client = new WebPubSubClient({
  // Re-invoked on every (re)connect, so it always returns a fresh token.
  getClientAccessUrl: async () => {
    const r = await fetch("/api/_rt/ws", { credentials: "include" });
    const { url } = (await r.json()) as { url: string };
    return url;
  },
});

client.on("group-message", (e) => {
  if (e.message.dataType !== "json") return;
  // Server frames are JSON-encoded strings; echoes are objects. Normalize.
  const raw = e.message.data;
  const payload = typeof raw === "string" ? JSON.parse(raw) : raw;
  if (!payload || typeof payload !== "object" || Array.isArray(payload)) return;
  if (typeof payload.type !== "string") return;
  handleMessage(payload as { type: string; [k: string]: unknown });
});

// anton-connect joins the connection to the group server-side. Call it from
// the "connected" event, which fires on the first connect and on every
// reconnect (connectionId changes each time). Register the handler before
// start() so the first event is not missed.
let channel: string | undefined; // undefined until the first anton-connect returns
let credits: number | undefined;
client.on("connected", async (e) => {
  const r = await fetch("/api/anton-connect", {
    method: "POST",
    credentials: "include",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ connectionId: e.connectionId }),
  });
  ({ channel, credits } = await r.json());
});

await client.start();
Send:
await client.sendToGroup(
  channel,
  { type: "message", text, id: crypto.randomUUID() },
  "json",
  { noEcho: true }, // server drops your own copy, so no fromUserId filter is needed
);
The SDK does NOT replace app-level ACK recovery: if Anton's subscription dropped mid-turn (container restart), the message is lost even though the transport succeeded. Keep the ACK-timeout → anton-connect re-send flow from "ACK system" below.
Client: message envelope (raw WS path)
Skip this section if you're using the SDK above.
Send:
ws.send(JSON.stringify({
  type: 'sendToGroup',
  group: channel,   // from anton-connect response
  data: payload,    // one of the payloads below
  dataType: 'json',
}));
Receive — messages arrive on ws.onmessage:
{"type":"message","from":"group","fromUserId":"...","data":<payload>}
<payload> is either a JSON object (your own echoes) or a JSON-encoded string (every server event). See "payload shape" above — you MUST JSON.parse strings before dispatching.
ws.onmessage = (ev) => {
  const m = JSON.parse(ev.data);
  if (m.type !== 'message' || m.from !== 'group') return;
  if (m.fromUserId === myUserId) return; // skip your own echoes
  const payload = typeof m.data === 'string' ? JSON.parse(m.data) : m.data;
  handleMessage(payload); // payload._from === "anton" on server frames; ignore it
};
Client: send events (client → server)
type | Fields | Purpose |
|---|---|---|
message | text, id? | Chat message; include id (UUID) to receive an ack |
status | — | Request current agent state + credits |
cancel | — | Cancel the current AI turn |
new-thread | — | Start a new conversation |
restart | — | Restart the agent container |
fast | — | Enable fast mode for next turn |
effort | level:'low'|'medium'|'high' | Set reasoning effort |
thread-list | — | Request past conversations |
thread-resume | threadId | Resume a conversation |
compact | — | Compact current thread context |
rollback | turns? | Undo turns (default 1) |
mcp-list | — | Request MCP server list |
tool-result | reqId, result?, error? | Only used if you configured uiTools (see end of doc). |
Client: receive events (server → client)
Anton envelope events
type | Fields | Notes |
|---|---|---|
ack | id | Server received your message — clear the pending ACK timer |
status | agent, credits, sessionCredits | Update credits + agent state |
credits | credits, sessionCredits | Credit-only update |
info | message | Info message |
idle-timeout | deadline, timeoutMs | Idle warning |
anton-event | event | Lifecycle event |
mcp-list | servers:[{name, status}] | MCP server list |
thread-list | threads:[{id, preview, createdAt}] | Past threads. createdAt is unix seconds (UTC), not ms or ISO. preview is the first user message, trimmed to 200 chars. Sorted newest first, capped at 50. |
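Because createdAt is in unix seconds while JavaScript dates use milliseconds, a conversion step is needed before display. A small sketch, with field names taken from the thread-list row above and an illustrative label format:

```typescript
// Field names come from the thread-list row above; the label format is illustrative.
interface ThreadSummary { id: string; preview: string; createdAt: number }

// createdAt is unix seconds, while Date expects milliseconds.
function threadDate(thread: ThreadSummary): Date {
  return new Date(thread.createdAt * 1000);
}

// Short list label: UTC day followed by the preview text.
function threadLabel(thread: ThreadSummary): string {
  const day = threadDate(thread).toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return `${day}  ${thread.preview}`;
}
```

Passing createdAt straight to `new Date(...)` without the multiplication would place every thread in January 1970.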
Codex-translated events
These come from the codex AI engine, translated server-side into a stable vocabulary so you don't need to know codex internals.
type | Fields | Notes |
|---|---|---|
text | delta | Assistant text delta — concat into a buffer, render markdown |
user-message | text | Past user turn — only inside history (not live) |
turn-started | — | Show spinner |
turn-completed | status, error? | Hide spinner, refresh app data. Status: 'completed'|'failed' |
tool-start | name, id | Built-in tool call |
tool-end | id | |
command-start | command, id | Shell command |
command-output | delta | Stdout/stderr |
command-end | id, exitCode? | |
file-start | path, id | File change |
file-output | delta | Diff delta |
file-end | id | |
mcp-tool-start | id, server, tool, argsPreview? | MCP tool — emoji-map by tool name |
mcp-tool-end | id, durationMs?, error? | |
mcp-status | name, status | MCP server liveness |
tool-progress | message | MCP tool progress |
thinking-start | id | Reasoning began |
thinking | delta | Reasoning delta |
thinking-end | id | |
web-search | id, query, action? | Agent searched the web. Action: 'search'|'openPage'|'findInPage'|'other' |
plan-updated | steps:[{step, status}], explanation? | Full plan snapshot, NOT a delta — replaces any prior plan. Status: 'pending'|'inProgress'|'completed' |
compacted | — | Context auto-compacted — optional banner so users understand earlier turns appear forgotten |
history | items: ClientMessage[] | Replay batch on thread-resume — items use the same vocab as live events; render through the same dispatcher |
error | message, willRetry? | Show error to user |
tool-call | tool, params, reqId? | Only fires if you configured uiTools (see end of doc). |
Markdown rendering for text
Codex text deltas are GitHub-flavored markdown — concat into a buffer per assistant message and re-render the full buffer on each update. Never render deltas in isolation (a code fence or link can span multiple deltas). Treat the agent's output as untrusted (it can embed user-supplied content) — sanitize before injecting as HTML.
Expect: headings, bold/italic, inline code, fenced code blocks (with language tags), bullet/numbered lists, tables, links, blockquotes. Use a markdown library that tolerates incomplete fences during streaming.
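The buffer-and-re-render approach can be sketched like this. The `render` callback stands in for whatever markdown library and HTML sanitizer the app uses, and `closeOpenFence` is a simple heuristic (it counts lines that start with a fence) for libraries that do not tolerate an unterminated code block. Both names are illustrative.

```typescript
// Three backticks, written as escapes so this block can itself live inside a fence.
const FENCE = "\u0060\u0060\u0060";

// Appends a closing fence when the buffer ends inside an open code block, so
// a streaming render does not treat the rest of the message as code.
function closeOpenFence(buffer: string): string {
  const fenceLines = buffer.split("\n").filter((line) => line.trim().indexOf(FENCE) === 0);
  return fenceLines.length % 2 === 1 ? `${buffer}\n${FENCE}` : buffer;
}

// One instance per assistant message. `render` receives the FULL markdown
// text each time and is expected to convert and sanitize it.
function createAssistantMessage(render: (markdown: string) => void) {
  let buffer = "";
  return {
    // Call for each {type:'text', delta}: append, then re-render everything.
    append(delta: string): void {
      buffer += delta;
      render(closeOpenFence(buffer));
    },
  };
}
```

The temporary closing fence only affects what is rendered; the stored buffer stays exactly as received, so the real closing fence lands in the right place when it arrives.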
History on thread-resume
When the user picks a past thread, the server fires one {type:'history', items:[…]} before any new turn. Each item is a ClientMessage from the table above (including turn-started/turn-completed boundaries). Run them through your live dispatcher — same render code.
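Replaying history through the live dispatcher can be as small as a recursive call. The sketch below handles only a handful of event types, and the `ChatState` shape is an assumption made for illustration.

```typescript
// Only a few event types are handled; the state shape is illustrative.
type ClientMessage = { type: string; [k: string]: unknown };
interface Entry { role: "user" | "assistant"; text: string }
interface ChatState { generating: boolean; current: Entry | null; transcript: Entry[] }

function dispatch(state: ChatState, msg: ClientMessage): void {
  switch (msg.type) {
    case "history": {
      // Replay: feed every item through this same dispatcher, in order.
      const items = Array.isArray(msg.items) ? (msg.items as ClientMessage[]) : [];
      for (const item of items) dispatch(state, item);
      break;
    }
    case "user-message":
      state.transcript.push({ role: "user", text: String(msg.text) });
      state.current = null;
      break;
    case "turn-started":
      state.generating = true; // show spinner
      break;
    case "text":
      // Start a new assistant entry on the first delta, then keep appending.
      if (!state.current) {
        state.current = { role: "assistant", text: "" };
        state.transcript.push(state.current);
      }
      state.current.text += String(msg.delta);
      break;
    case "turn-completed":
      state.generating = false; // hide spinner
      state.current = null;
      break;
    default:
      break; // unknown types are ignored; the protocol may add more
  }
}
```

Since the replayed items include turn-completed boundaries, the spinner state ends up correct after a replay without any history-specific code.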
Client: ACK system
Anton's Web PubSub subscription can drop mid-flight (container restart, etc.), in which case the transport send succeeds but Anton never receives the message. The ACK pattern detects that and recovers by re-calling anton-connect.
function send() {
  const id = crypto.randomUUID();
  sendToGroup({ type: 'message', text, id });
  setWaitingAck(true);
  setGenerating(true);
  pendingAckRef.current = {
    id,
    timer: setTimeout(() => {
      // No ACK after 10s: recover by calling anton-connect
      // again with the pending message in the body.
      pendingAckRef.current = null;
      setWaitingAck(false);
      fetch('/api/anton-connect', {
        method: 'POST',
        credentials: 'include',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ connectionId, message: text }),
      });
    }, 10000),
  };
}

// Assumed helper: cancel the timer once the server has confirmed receipt.
function clearPendingAck() {
  clearTimeout(pendingAckRef.current?.timer);
  pendingAckRef.current = null;
  setWaitingAck(false);
}

// In handleMessage:
if (type === 'ack' && pendingAckRef.current?.id === data.id) clearPendingAck();
if (type === 'turn-started') clearPendingAck();
Without ACK recovery the chat still works — the user just refreshes manually if Anton stops responding.
Client: data refresh
When turn-completed fires, refresh the host app's data — the agent may have modified collections, created files, etc.
if (type === 'turn-completed') {
  // ...cleanup
  onDataChanged?.();
}
Optional: UI tools
Skip this section unless you have a specific reason to push UI side-channel events from codex (e.g., highlight a row, open a modal, ask the user to confirm a destructive action). For normal chat apps, you do NOT need this — the assistant's text replies plus turn-completed → onDataChanged() covers almost every use case.
If you do need it: pass uiTools on connect with each tool's description + params (and returns for round-trip tools). Codex sees them as MCP tools and may call them. Tool calls arrive on the frontend as {type:'tool-call', tool, params, reqId?}; if reqId is set, reply with {type:'tool-result', reqId, result} within 30s.
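On the frontend, handling a tool-call could look like the sketch below. The event and reply field names come from the event tables on this page. The handler registry, the `send` callback (a wrapper around sendToGroup in either transport), and the use of a string for `error` are assumptions for illustration.

```typescript
type ToolCall = { type: "tool-call"; tool: string; params: Record<string, unknown>; reqId?: string };
type ToolResult = { type: "tool-result"; reqId: string; result?: unknown; error?: string };
type Handler = (params: Record<string, unknown>) => unknown | Promise<unknown>;

// `send` publishes a payload to the channel; it is injected so the sketch
// does not depend on a particular transport.
async function handleToolCall(
  call: ToolCall,
  handlers: Record<string, Handler>,
  send: (payload: ToolResult) => void,
): Promise<void> {
  const handler = handlers[call.tool];
  try {
    if (!handler) throw new Error(`Unknown UI tool: ${call.tool}`);
    const result = await handler(call.params);
    // Fire-and-forget tools carry no reqId, so there is nothing to answer.
    if (call.reqId) send({ type: "tool-result", reqId: call.reqId, result });
  } catch (err) {
    // Report failures too, so the agent is not left waiting for a reply.
    if (call.reqId) send({ type: "tool-result", reqId: call.reqId, error: String(err) });
  }
}
```

Handlers that wait on user input (a confirmation dialog, for example) need their own deadline so the reply still goes out inside the 30s window.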
uiTools shape
Each entry is { description, params?, required?, returns? }. params accepts either form — both are normalized server-side:
uiTools: {
  // No-params tool. Fire-and-forget (no `returns`) → frontend doesn't reply.
  'refresh-list': {
    description: 'Tell the UI to reload the current list. Call after you mutate data.',
    params: {},
  },
  // Tool with params + round-trip (`returns` is set, so the frontend MUST
  // send `{type:'tool-result', reqId, result}` within 30s).
  'highlight-row': {
    description: 'Visually highlight a row by id and return whether it was found.',
    params: {
      id: { type: 'string', description: 'Row id to highlight.' },
      color: { type: 'string', enum: ['red', 'yellow', 'green'] },
    },
    required: ['id'],
    returns: {
      type: 'object',
      properties: { found: { type: 'boolean' } },
      required: ['found'],
    },
  },
  // Same tool, written as a full JSON schema (also accepted). Use whichever
  // form your LLM-generated code reaches for.
  'highlight-row-alt': {
    description: 'Same as above.',
    params: {
      type: 'object',
      properties: {
        id: { type: 'string' },
        color: { type: 'string', enum: ['red', 'yellow', 'green'] },
      },
      required: ['id'],
    },
    returns: { /* same as above */ },
  },
}
Codex picks them up via -c MCP server flags when the container boots. Adding or changing uiTools on a later connect call does NOT propagate to a running container. To apply changes, call ctx.anton.close({ identityId }) (kills the container), then reconnect.
Optional: API tools — cost tuning
By default the agent gets your app's entire backend REST API as MCP tools — every endpoint (user/role CRUD, auth, your own collections) becomes a callable tool. Those tool definitions are re-sent to the model uncached on every turn, so a large API is a real per-turn input-token cost even when the agent never touches most of it. For a chat assistant that genuinely works with your data this is usually what you want. For a UI-only or game agent that acts purely through uiTools, it's pure overhead.
These tools are generated straight from your OpenAPI, so a function is only drivable by the agent if its inputs are declared in the spec — see Functions the agent calls must declare their inputs.
Control it with apiTools on connect:
ctx.anton.connect({
  identityId,
  channelName,
  connectionId,
  uiTools: { /* your gameplay/UI tools */ },
  apiTools: 'none', // worm-game agent plays via uiTools only, so skip the API
});
apiTools | Agent sees | Use when |
|---|---|---|
'all' (default) | every endpoint as a tool | the agent works directly with your API/data |
'dynamic' | 3 meta-tools (list-api-endpoints, get-api-endpoint-schema, invoke-api-endpoint) | you want the API available but cheap — the agent discovers endpoints on demand (adds round-trips; best with a capable model) |
'explicit' | only the operations in apiToolsAllow (e.g. ['GET-user', 'POST-order']) | the agent needs a known handful of endpoints |
'none' | nothing from the backend API | UI-only / game agents that work purely through uiTools |
apiTools is applied via -c MCP flags at container boot. Changing it on a later connect does NOT affect a running container — ctx.anton.close({ identityId }) then reconnect to apply.