Agent & Chat

The agent API powers Ghost’s AI assistant — an LLM-based system that can analyze traffic, find security vulnerabilities, generate test scenarios, run external scanners, and produce reports. Unlike typical chat APIs that return a single response, Ghost’s agent streams its work in real time using Server-Sent Events (SSE), showing you each thought, tool call, and result as it happens.

Think of the agent like a security researcher or QA engineer sitting next to you. You describe what you want (“find all authentication issues in this session”), and the agent plans its approach, uses tools to examine traffic, runs analysis, and reports back — all while you watch its progress live.

POST /api/v1/agent/chat

Sends a message to the AI agent and receives a real-time stream of events as the agent thinks, plans, uses tools, and responds. The response uses Server-Sent Events (SSE) — a standard web technology where the server keeps the connection open and pushes events one at a time, rather than sending everything at once.

SSE was chosen over WebSocket for chat because it supports POST bodies (you can send a message and context in the initial request) and has simpler lifecycle management (one-way stream, automatic reconnection in browsers).

Request body (JSON, 64 KB limit):

{
  "message": "Find all API errors in this session and suggest fixes",
  "conversation_id": "",
  "session_id": "01HWXYZ...",
  "mode": "qa",
  "security_mode": "web",
  "scan_mode": "passive"
}
| Field | Required | Default | Description |
|---|---|---|---|
| message | Yes | | Your message to the agent. Can be a question, instruction, or follow-up in an existing conversation |
| session_id | Yes | | Which session's traffic the agent should analyze. The agent can only see flows in this session |
| conversation_id | No | (new) | If empty, starts a new conversation. If provided, continues an existing conversation with full context of previous messages |
| mode | No | "qa" | "qa" for QA testing mode (bug finding, test generation) or "security" for security analysis mode (vulnerability hunting, penetration testing) |
| security_mode | No | "web" | "web" for web application security or "mobile" for mobile app security. Only relevant when mode is "security" |
| scan_mode | No | "passive" | How aggressively the security agent can test: "passive" (read-only analysis), "active-safe" (safe active testing), or "active-full" (full penetration testing with potentially destructive tools) |

Validation:

  • Empty message → 400 “message is required”
  • Empty session_id → 400 “session_id is required”
  • Agent not configured (no LLM API key set) → 503 “LLM provider not configured — set API key in settings”

When conversation_id is empty, a new conversation is created automatically:

  • The ID is a ULID (unique, time-sortable)
  • The title is generated from the first message — truncated to 50 bytes with UTF-8 safety (won’t cut a multi-byte character in half), falling back to “New conversation” if the message is empty
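The truncation rule above can be sketched in a few lines (illustrative Python; Ghost itself implements this in Go):

```python
def make_title(message: str, max_bytes: int = 50) -> str:
    """Derive a conversation title from the first message.

    Truncates to at most max_bytes of UTF-8 without splitting a
    multi-byte character, falling back to a default for empty input.
    """
    message = message.strip()
    if not message:
        return "New conversation"
    raw = message.encode("utf-8")
    if len(raw) <= max_bytes:
        return message
    # Decoding with errors="ignore" drops any trailing partial character,
    # so a multi-byte character is never cut in half.
    return raw[:max_bytes].decode("utf-8", errors="ignore")
```
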

Response headers:

Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Accel-Buffering: no

The X-Accel-Buffering: no header tells reverse proxies (like Nginx) not to buffer the response — without this, the proxy would wait for the entire response before forwarding any events.

Each event is sent as event: <type>\ndata: <json>\n\n following the SSE specification. Here are all the event types the stream can emit:

| Event | When it fires | Payload fields | Description |
|---|---|---|---|
| chunk | LLM generates text | type, content | A piece of the agent's response text. Chunks arrive token by token as the LLM generates them, creating the "typing" effect. Concatenate all chunks to build the full response |
| tool_call | Agent decides to use a tool | type, tool_call_id, tool_name, tool_input | The agent is calling one of its ~60 tools. tool_input is a JSON object with the tool's parameters, so you see what the agent is doing before the result comes back |
| tool_result | Tool execution completes | type, tool_call_id, tool_name, content | The result of a tool call. content is truncated to 500 characters for display purposes (the agent sees the full result internally) |
| plan_created | Agent creates a work plan | type, plan | The agent has laid out a structured plan with numbered steps. The plan object includes goal, scope, and steps (each with description, category, status, and suggested tools) |
| step_completed | Agent finishes a plan step | type, step_id | A specific step in the plan has been completed. step_id is the 1-based step number |
| plan_revised | Agent modifies its plan | type, plan | The agent adjusted its plan based on what it discovered (added steps, reordered priorities, etc.). Plans can be revised up to 5 times |
| plan_completed | All plan steps done | type, plan | The entire plan has been completed. The plan object shows all steps with their final status and results |
| options | Agent presents interactive choices | type, question, options | The agent is asking the user to choose from a set of options. question is the prompt text; options is an array of objects with label (display text), description (optional context), and value (returned when selected). The agent loop pauses until the user responds by sending a new chat message with the selected value. The Run Summary is suppressed during this state |
| steer | Steering message consumed | type, content | A user steering message (sent via the steer endpoint) was injected into the agent's context. content shows what was injected |
| metrics | Agent run finishing | type, metrics | Run statistics including iterations, tool calls, duration, findings count, termination reason, and more (see the Metrics section below) |
| error | Any error occurs | type, error | Something went wrong. The stream ends immediately after an error event |
| done | Stream complete | type, conversation_id, message_id | The agent has finished. conversation_id lets you continue this conversation later; message_id identifies the specific assistant message |

Note: A step_started event type is defined in the codebase but is never actually emitted.
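A minimal client-side parser for this framing, assuming each event carries one `event:` line and one JSON `data:` line as described above (Python for illustration; a browser client would read the fetch response body stream instead):

```python
import json
from typing import Iterator

def parse_sse(lines: Iterator[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_type, payload) pairs from an SSE line stream.

    Assumes Ghost's framing: an `event:` line, a `data:` line with a
    JSON payload, and a blank line terminating each event.
    """
    event_type, data = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and event_type is not None:
            # Blank line ends the event; emit and reset.
            yield event_type, json.loads("\n".join(data))
            event_type, data = None, []
```

Concatenating the content of all chunk events rebuilds the agent's full response text.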

Messages are saved to the database during streaming so conversations survive application restarts:

  • User messages are saved immediately before the agent starts running
  • Assistant messages are saved per-iteration (the agent may go through multiple iterations of thinking → tool use → thinking)
  • Tool results are saved as separate messages with role "tool" and the matching tool_call_id
  • Steering nudges (prefixed with __nudge__:) are saved as "user" role messages to maintain the required user/assistant alternation pattern

When loading a conversation’s history to feed back to the LLM, Ghost runs a sanitization pass that fixes two problems:

  1. Orphaned tool messages — if a tool result message references a tool_call_id that doesn’t appear in any preceding assistant message, it’s dropped. This can happen with conversations saved before per-iteration persistence was implemented
  2. Consecutive assistant messages — if two assistant messages appear back-to-back without a user message in between (which violates the LLM’s expected alternation pattern), a synthetic [continued] user message is inserted between them
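A sketch of both fixes, assuming messages shaped like the API's message objects, with tool_calls stored as a JSON string (illustrative Python; Ghost's actual pass is in Go):

```python
import json

def sanitize_history(messages: list[dict]) -> list[dict]:
    """Drop orphaned tool messages and break up consecutive assistant
    messages with a synthetic [continued] user message."""
    seen_tool_call_ids: set[str] = set()
    cleaned: list[dict] = []
    for msg in messages:
        if msg["role"] == "assistant":
            # tool_calls is a JSON string; record the ids it declares.
            for call in json.loads(msg.get("tool_calls") or "[]"):
                seen_tool_call_ids.add(call["id"])
            if cleaned and cleaned[-1]["role"] == "assistant":
                # Restore the user/assistant alternation pattern.
                cleaned.append({"role": "user", "content": "[continued]"})
        elif msg["role"] == "tool":
            if msg.get("tool_call_id") not in seen_tool_call_ids:
                continue  # orphaned tool result: drop it
        cleaned.append(msg)
    return cleaned
```
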

The metrics event includes these fields:

| Field | Type | Description |
|---|---|---|
| started_at | timestamp | When the agent run began |
| completed_at | timestamp | When the agent run finished |
| duration | nanoseconds | Total wall-clock time |
| iterations | integer | How many think→act cycles the agent went through |
| max_iterations | integer | The iteration cap (25) |
| tool_calls | integer | Total number of tool calls made |
| unique_tools | integer | Number of distinct tools used |
| failed_tools | integer | Number of tool calls that returned errors |
| plan_steps | integer | Number of steps in the plan (0 if no plan was created) |
| plan_revisions | integer | How many times the plan was revised |
| steps_completed | integer | How many plan steps were completed |
| reflections | integer | Number of reflection iterations (thinking without tool use) |
| termination_reason | string | Why the agent stopped (e.g., "plan_reflection_stop", "user_stop", "loop_detected", "waiting for user choice") |
| loops_detected | integer | Number of times the loop detector fired |
| findings_total | integer | Total security findings reported (security mode only) |
| findings_by_severity | object | Findings broken down by severity level (e.g., {"high": 3, "medium": 5}) |
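For orientation, a few values a client might derive from a metrics payload. Note that duration is reported in nanoseconds (the payload values here are invented for illustration):

```python
metrics = {
    "duration": 12_345_000_000,   # nanoseconds
    "iterations": 7,
    "tool_calls": 19,
    "failed_tools": 1,
    "findings_by_severity": {"high": 3, "medium": 5},
}

# Convert nanoseconds to seconds for display.
seconds = metrics["duration"] / 1e9

# Fraction of tool calls that returned errors.
failure_rate = metrics["failed_tools"] / metrics["tool_calls"]

# Severity buckets sum to the total finding count.
total_findings = sum(metrics["findings_by_severity"].values())
```
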

The plan object carried by plan_created, plan_revised, and plan_completed events:

| Field | Type | Description |
|---|---|---|
| goal | string | What the agent is trying to accomplish |
| scope | string | Boundaries of the analysis (which hosts, endpoints, etc.) |
| steps | array | Ordered list of plan steps |
| current_step | integer | Which step is currently active |
| created_at | timestamp | When the plan was first created |
| revised_at | timestamp | When the plan was last revised (empty if never revised) |
| revisions | integer | Total number of revisions (0–5) |
| status | string | "active", "completed", or "aborted" |

Each step has:

| Field | Type | Description |
|---|---|---|
| id | integer | Step number (1-based) |
| description | string | What this step does |
| category | string | "recon", "analysis", "active_test", "exploit", or "report" |
| tools | array | Suggested tools for this step |
| status | string | "pending", "in_progress", "completed", "skipped", or "failed" |
| result | string | Summary of what was found (max 500 characters) |
| substeps | array | Finer-grained sub-steps (strings) |
| started_at | timestamp | When work on this step began |
| completed_at | timestamp | When this step finished |

POST /api/v1/agent/stop

Signals the running agent to stop. No request body needed.

The stop is not immediate — the agent finishes its current iteration, then gets 1–2 more iterations to write a summary of what it found before the stream ends. This ensures you always get a final report, even if you stop the agent early.

Internally, this cancels the agent’s context, which causes the RunWithRegistry goroutine to exit at the next iteration boundary.

Response: {"status": "stopped"}

Errors: 503 if the agent is not configured (no LLM API key)

POST /api/v1/agent/steer

Injects a message into the running agent’s next iteration — like tapping a researcher on the shoulder and saying “look at this instead.” The agent sees your message prefixed with [USER STEERING] and adjusts its approach accordingly.

Request body (JSON, 64 KB limit):

{
  "message": "Focus on authentication endpoints instead"
}

Steering channel: The steer channel buffers up to 5 messages. If all 5 slots are full (the agent hasn’t consumed them yet), new steering messages are silently dropped with a warning logged. This prevents memory buildup if you send many steering messages quickly.
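The drop-when-full behavior can be modeled with a bounded queue (Python sketch; the server uses a Go channel of capacity 5):

```python
import queue

# Bounded buffer: at most 5 pending steering messages.
steer_queue: "queue.Queue[str]" = queue.Queue(maxsize=5)

def steer(message: str) -> bool:
    """Enqueue a steering message; returns False if it was dropped
    because the buffer is full (the server logs a warning here)."""
    try:
        steer_queue.put_nowait(message)
        return True
    except queue.Full:
        return False
```
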

Response: {"status": "delivered"}

Errors: 400 (empty message), 503 (agent not configured)

Conversations are persistent chat threads. Each conversation belongs to a session and contains an ordered list of messages (user, assistant, and tool messages).

GET /api/v1/agent/conversations?session_id=01HWXYZ...

Returns all conversations for a session, ordered by creation time.

Query parameters:

  • session_id (required) — which session’s conversations to list

Response:

[
  {
    "id": "01HWXYZ...",
    "session_id": "01HWABC...",
    "title": "Analyze authentication flows",
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:35:00Z"
  }
]

GET /api/v1/agent/conversations/{id}

Returns the full conversation including all messages — user messages, assistant responses, and tool call results.

Response:

{
  "conversation": {
    "id": "01HWXYZ...",
    "session_id": "01HWABC...",
    "title": "Analyze authentication flows",
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:35:00Z"
  },
  "messages": [
    {
      "id": "01HWABC...",
      "conversation_id": "01HWXYZ...",
      "role": "user",
      "content": "Find all authentication issues",
      "created_at": "2024-01-15T10:30:00.123456789Z"
    },
    {
      "id": "01HWDEF...",
      "conversation_id": "01HWXYZ...",
      "role": "assistant",
      "content": "I'll analyze the authentication flows in this session...",
      "tool_calls": "[{\"id\":\"tc_1\",\"name\":\"list_flows\",\"input\":{\"host\":\"auth.example.com\"}}]",
      "created_at": "2024-01-15T10:30:01.000000000Z"
    },
    {
      "id": "01HWGHI...",
      "conversation_id": "01HWXYZ...",
      "role": "tool",
      "content": "{\"flows\":[...]}",
      "tool_call_id": "tc_1",
      "created_at": "2024-01-15T10:30:02.000000000Z"
    }
  ]
}

Message fields:

| Field | Type | Description |
|---|---|---|
| id | string | Message ULID |
| conversation_id | string | Which conversation this message belongs to |
| role | string | "user" (your messages), "assistant" (agent responses), or "tool" (tool execution results) |
| content | string | The message text (for tool messages, the tool's output) |
| tool_calls | string | For assistant messages that used tools: a JSON string containing an array of tool call objects (id, name, input). Empty for user/tool messages |
| tool_call_id | string | For tool messages: which tool call this result belongs to. Empty for user/assistant messages |
| created_at | string | Timestamp in RFC3339Nano format (nanosecond precision) |

Note that tool_calls is a JSON string, not a parsed object — the frontend needs to JSON.parse() it to access the individual tool calls.
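For example, decoding a stored assistant message (Python shown for illustration; a browser client would call JSON.parse):

```python
import json

# An assistant message as returned by the conversations endpoint.
message = {
    "role": "assistant",
    "content": "I'll analyze the authentication flows in this session...",
    "tool_calls": '[{"id":"tc_1","name":"list_flows","input":{"host":"auth.example.com"}}]',
}

# tool_calls is a JSON *string*; decode it to reach the call objects.
calls = json.loads(message["tool_calls"]) if message["tool_calls"] else []
for call in calls:
    print(call["name"], call["input"])  # prints: list_flows {'host': 'auth.example.com'}
```
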

DELETE /api/v1/agent/conversations/{id}

Deletes a conversation and all its messages. The messages are automatically deleted via SQLite’s ON DELETE CASCADE foreign key constraint — when the conversation row is removed, all message rows referencing it are removed too.

Response: {"status": "deleted"}

Returns 404 if the conversation doesn’t exist.

POST /api/v1/agent/upload
Content-Type: multipart/form-data

Uploads a file for the agent to reference during analysis. The file content is returned in the response (not stored on disk) — the frontend includes it in the next chat message so the agent can read it.

Overall body limit: 512 KB

Accepted file types and per-type size limits:

| Extension | Max size | Use case |
|---|---|---|
| .txt | 100 KB | Plain text files, logs, notes |
| .md | 100 KB | Markdown documents, requirements |
| .json | 200 KB | API specs, configuration files, test data |
| .yaml | 200 KB | Configuration files, CI/CD specs |
| .yml | 200 KB | Same as .yaml |

Validation:

  • Unsupported file extension → 400 with list of supported types
  • File exceeds per-type size limit → 413
  • File contains non-UTF-8 bytes → 400 “File contains invalid characters. Only UTF-8 text files supported.”
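These rules can be sketched as follows (illustrative Python using the limits from the table above; the server implements them in Go):

```python
# Per-type limits in bytes, matching the documented table.
LIMITS = {".txt": 100 * 1024, ".md": 100 * 1024,
          ".json": 200 * 1024, ".yaml": 200 * 1024, ".yml": 200 * 1024}

def validate_upload(filename: str, data: bytes) -> tuple[int, str]:
    """Return an (http_status, detail) pair for an uploaded file."""
    ext = "." + filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in LIMITS:
        return 400, "unsupported type; supported: " + ", ".join(sorted(LIMITS))
    if len(data) > LIMITS[ext]:
        return 413, "file exceeds size limit"
    try:
        data.decode("utf-8")  # reject non-UTF-8 content
    except UnicodeDecodeError:
        return 400, ("File contains invalid characters. "
                     "Only UTF-8 text files supported.")
    return 200, "ok"
```
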

Response:

{
  "filename": "api-spec.json",
  "size": 4567,
  "type": ".json",
  "content": "{\"openapi\": \"3.0.0\", ...}"
}

The content field contains the full file contents as a string.

The agent can write files to a per-session workspace directory at ~/.ghost/workspaces/{session_id}/. These files include generated reports, test scripts, PoC exploits, exported data, and other artifacts. The workspace endpoints let you browse and read these files.

GET /api/v1/agent/workspace/files?session_id=01HWXYZ...

Returns all files in the session’s workspace directory.

Query parameters:

  • session_id (required) — which session’s workspace to list

Path traversal protection: The session ID is validated against ^[A-Za-z0-9_-]+$ to prevent directory traversal attacks (e.g., ../../etc/passwd).

Response:

{
  "session_id": "01HWXYZ...",
  "workspace": "/Users/you/.ghost/workspaces/01HWXYZ...",
  "files": [
    {
      "path": "report.md",
      "size": 4567,
      "dir": false
    },
    {
      "path": "poc",
      "size": 0,
      "dir": true
    },
    {
      "path": "poc/sqli-test.py",
      "size": 1234,
      "dir": false
    }
  ]
}

If the workspace directory doesn’t exist (no files have been generated for this session yet), returns an empty files array — not an error.

GET /api/v1/agent/workspace/files/{path}?session_id=01HWXYZ...

Returns the content of a specific file from the workspace. The {path} is a wildcard that captures the full path after /files/ — so /files/poc/sqli-test.py reads the file at ~/.ghost/workspaces/{session_id}/poc/sqli-test.py.

Security:

  • Session ID regex validation (^[A-Za-z0-9_-]+$)
  • Path is canonicalized with filepath.Abs() and checked with strings.HasPrefix() to ensure it stays within the workspace directory → 403 “path outside workspace” if traversal is detected
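A Python sketch of the same two checks (the server does this in Go with filepath.Abs and strings.HasPrefix):

```python
import os
import re

# Session IDs must be plain ULID-style tokens; no separators allowed.
SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")

def resolve_workspace_path(workspaces_root: str, session_id: str,
                           rel_path: str) -> str:
    """Validate the session id, then canonicalize the requested path
    and confirm it stays inside the session's workspace directory."""
    if not SESSION_ID_RE.match(session_id):
        raise ValueError("invalid session_id")          # -> 400
    workspace = os.path.abspath(os.path.join(workspaces_root, session_id))
    target = os.path.abspath(os.path.join(workspace, rel_path))
    if not target.startswith(workspace + os.sep):
        raise PermissionError("path outside workspace")  # -> 403
    return target
```
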

File size limit: 10 MB. Files larger than this return 413.

Response: Raw file content as text/plain; charset=utf-8 (not JSON-wrapped). The HTTP status is 200.

Errors:

  • Missing session_id → 400
  • Missing file path → 400
  • Path is a directory → 400
  • File not found → 404
  • File too large → 413
  • Path outside workspace → 403

Limits:

| Limit | Value | Context |
|---|---|---|
| Chat/steer body size | 64 KB | Maximum JSON payload for chat and steer requests |
| Upload body size | 512 KB | Maximum multipart form data for file upload |
| Upload per-type: .txt/.md | 100 KB | Individual file size cap |
| Upload per-type: .json/.yaml/.yml | 200 KB | Individual file size cap |
| Workspace file read | 10 MB | Maximum file size for workspace file reads |
| Agent iterations | 25 | Maximum think→act cycles per run |
| LLM output tokens | 8,192 | Maximum tokens per LLM response |
| Steering channel | 5 messages | Buffer capacity for steering messages |
| SSE write deadline | 60 seconds | Per-iteration timeout for SSE writes |
| SSE done/error deadline | 10 seconds | Timeout for the final done event |
| Conversation title | 50 bytes | Maximum title length (UTF-8-safe truncation) |
| Tool result display | 500 characters | Truncation for tool_result SSE events (the agent sees the full result) |
| Plan revisions | 5 | Maximum number of times a plan can be revised |