Agent & Chat
The agent API powers Ghost’s AI assistant — an LLM-based system that can analyze traffic, find security vulnerabilities, generate test scenarios, run external scanners, and produce reports. Unlike typical chat APIs that return a single response, Ghost’s agent streams its work in real time using Server-Sent Events (SSE), showing you each thought, tool call, and result as it happens.
Think of the agent like a security researcher or QA engineer sitting next to you. You describe what you want (“find all authentication issues in this session”), and the agent plans its approach, uses tools to examine traffic, runs analysis, and reports back — all while you watch its progress live.
Chat (SSE Streaming)
`POST /api/v1/agent/chat`

Sends a message to the AI agent and receives a real-time stream of events as the agent thinks, plans, uses tools, and responds. The response uses Server-Sent Events (SSE) — a standard web technology where the server keeps the connection open and pushes events one at a time, rather than sending everything at once.
SSE was chosen over WebSocket for chat because it supports POST bodies (you can send a message and context in the initial request) and has simpler lifecycle management (one-way stream, automatic reconnection in browsers).
Request
Body (JSON, 64 KB limit):
```json
{
  "message": "Find all API errors in this session and suggest fixes",
  "conversation_id": "",
  "session_id": "01HWXYZ...",
  "mode": "qa",
  "security_mode": "web",
  "scan_mode": "passive"
}
```

| Field | Required | Default | Description |
|---|---|---|---|
| message | Yes | — | Your message to the agent. Can be a question, instruction, or follow-up in an existing conversation |
| session_id | Yes | — | Which session’s traffic the agent should analyze. The agent can only see flows in this session |
| conversation_id | No | (new) | If empty, starts a new conversation. If provided, continues an existing conversation with full context of previous messages |
| mode | No | "qa" | "qa" for QA testing mode (bug finding, test generation) or "security" for security analysis mode (vulnerability hunting, penetration testing) |
| security_mode | No | "web" | "web" for web application security or "mobile" for mobile app security. Only relevant when mode is "security" |
| scan_mode | No | "passive" | How aggressively the security agent can test. "passive" (read-only analysis), "active-safe" (safe active testing), or "active-full" (full penetration testing with potentially destructive tools) |
Validation:
- Empty message → 400 “message is required”
- Empty session_id → 400 “session_id is required”
- Agent not configured (no LLM API key set) → 503 “LLM provider not configured — set API key in settings”
New Conversation Creation
When conversation_id is empty, a new conversation is created automatically:
- The ID is a ULID (unique, time-sortable)
- The title is generated from the first message — truncated to 50 bytes with UTF-8 safety (won’t cut a multi-byte character in half), falling back to “New conversation” if the message is empty
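The byte-limited, UTF-8-safe truncation can be sketched as follows. This is an illustrative helper (`truncateTitle` is a hypothetical name, not Ghost's actual function), showing one way to avoid splitting a multi-byte character at the 50-byte boundary:

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// truncateTitle shortens a title to at most max bytes without cutting a
// multi-byte UTF-8 sequence in half, falling back to a default when the
// message is empty. Sketch only; not Ghost's actual implementation.
func truncateTitle(msg string, max int) string {
	msg = strings.TrimSpace(msg)
	if msg == "" {
		return "New conversation"
	}
	if len(msg) <= max {
		return msg
	}
	// Walk backwards from the byte limit until we land on a rune boundary.
	cut := max
	for cut > 0 && !utf8.RuneStart(msg[cut]) {
		cut--
	}
	return msg[:cut]
}

func main() {
	fmt.Println(truncateTitle("Find all authentication issues in this session, please", 50))
	fmt.Println(truncateTitle("", 50)) // falls back to "New conversation"
}
```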
Response Headers
```
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
X-Accel-Buffering: no
```

The X-Accel-Buffering: no header tells reverse proxies (like Nginx) not to buffer the response — without this, the proxy would wait for the entire response before forwarding any events.
SSE Event Types
Each event is sent as `event: <type>\ndata: <json>\n\n` following the SSE specification. Here are all the event types the stream can emit:
| Event | When It Fires | Payload Fields | Description |
|---|---|---|---|
| chunk | LLM generates text | type, content | A piece of the agent’s response text. These arrive token-by-token as the LLM generates them, creating the “typing” effect. Concatenate all chunks to build the full response |
| tool_call | Agent decides to use a tool | type, tool_call_id, tool_name, tool_input | The agent is calling one of its ~60+ tools. tool_input is a JSON object with the tool’s parameters. You’ll see what the agent is doing before the result comes back |
| tool_result | Tool execution completes | type, tool_call_id, tool_name, content | The result of a tool call. content is truncated to 500 characters for display purposes (the agent sees the full result internally) |
| plan_created | Agent creates a work plan | type, plan | The agent has laid out a structured plan with numbered steps. The plan object includes goal, scope, steps (each with description, category, status, and suggested tools) |
| step_completed | Agent finishes a plan step | type, step_id | A specific step in the plan has been completed. step_id is the 1-based step number |
| plan_revised | Agent modifies its plan | type, plan | The agent adjusted its plan based on what it discovered (added steps, reordered priorities, etc.). Plans can be revised up to 5 times |
| plan_completed | All plan steps done | type, plan | The entire plan has been completed. The plan object shows all steps with their final status and results |
| options | Agent presents interactive choices | type, question, options | The agent is asking the user to choose from a set of options. question is the prompt text, options is an array of objects with label (display text), description (optional context), and value (returned when selected). The agent loop pauses until the user responds by sending a new chat message with the selected value. The Run Summary is suppressed during this state |
| steer | Steering message consumed | type, content | A user steering message (sent via the steer endpoint) was injected into the agent’s context. The content shows what was injected |
| metrics | Agent run finishing | type, metrics | Run statistics including iterations, tool calls, duration, findings count, termination reason, and more (see Metrics section below) |
| error | Any error occurs | type, error | Something went wrong. The stream ends immediately after an error event |
| done | Stream complete | type, conversation_id, message_id | The agent has finished. conversation_id lets you continue this conversation later, message_id identifies the specific assistant message |
Note: A step_started event type is defined in the codebase but is never actually emitted.
Message Persistence
Messages are saved to the database during streaming so conversations survive application restarts:
- User messages are saved immediately before the agent starts running
- Assistant messages are saved per-iteration (the agent may go through multiple iterations of thinking → tool use → thinking)
- Tool results are saved as separate messages with role "tool" and the matching tool_call_id
- Steering nudges (prefixed with __nudge__:) are saved as "user" role messages to maintain the required user/assistant alternation pattern
History Sanitization
When loading a conversation’s history to feed back to the LLM, Ghost runs a sanitization pass that fixes two problems:
- Orphaned tool messages — if a tool result message references a tool_call_id that doesn’t appear in any preceding assistant message, it’s dropped. This can happen with conversations saved before per-iteration persistence was implemented
- Consecutive assistant messages — if two assistant messages appear back-to-back without a user message in between (which violates the LLM’s expected alternation pattern), a synthetic [continued] user message is inserted between them
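Both fixes can be sketched in a single pass over the history. The message shape and function name below are assumptions for illustration, not Ghost's actual types:

```go
package main

import "fmt"

// msg is a simplified message shape for illustration only.
type msg struct {
	Role        string
	Content     string
	ToolCallID  string   // set on "tool" messages
	ToolCallIDs []string // IDs issued by an "assistant" message
}

// sanitizeHistory drops orphaned tool results and inserts a synthetic
// "[continued]" user message between back-to-back assistant messages.
func sanitizeHistory(history []msg) []msg {
	issued := map[string]bool{}
	var out []msg
	for _, m := range history {
		switch m.Role {
		case "assistant":
			for _, id := range m.ToolCallIDs {
				issued[id] = true // remember which tool calls were actually made
			}
			if len(out) > 0 && out[len(out)-1].Role == "assistant" {
				// Repair the alternation pattern.
				out = append(out, msg{Role: "user", Content: "[continued]"})
			}
		case "tool":
			if !issued[m.ToolCallID] {
				continue // orphaned tool result: drop it
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	h := []msg{
		{Role: "user", Content: "hi"},
		{Role: "assistant", ToolCallIDs: []string{"tc_1"}},
		{Role: "tool", ToolCallID: "tc_1"},
		{Role: "tool", ToolCallID: "tc_99"}, // orphan: dropped
		{Role: "assistant", Content: "part 1"},
		{Role: "assistant", Content: "part 2"}, // triggers "[continued]"
	}
	for _, m := range sanitizeHistory(h) {
		fmt.Println(m.Role)
	}
}
```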
Run Metrics
The metrics event includes these fields:
| Field | Type | Description |
|---|---|---|
| started_at | timestamp | When the agent run began |
| completed_at | timestamp | When the agent run finished |
| duration | nanoseconds | Total wall-clock time |
| iterations | integer | How many think→act cycles the agent went through |
| max_iterations | integer | The iteration cap (25) |
| tool_calls | integer | Total number of tool calls made |
| unique_tools | integer | Number of distinct tools used |
| failed_tools | integer | Number of tool calls that returned errors |
| plan_steps | integer | Number of steps in the plan (0 if no plan was created) |
| plan_revisions | integer | How many times the plan was revised |
| steps_completed | integer | How many plan steps were completed |
| reflections | integer | Number of reflection iterations (thinking without tool use) |
| termination_reason | string | Why the agent stopped (e.g., “plan_reflection_stop”, “user_stop”, “loop_detected”, “waiting for user choice”) |
| loops_detected | integer | Number of times the loop detector fired |
| findings_total | integer | Total security findings reported (security mode only) |
| findings_by_severity | object | Findings broken down by severity level (e.g., {"high": 3, "medium": 5}) |
Task Plan Structure
The plan object in plan events:
| Field | Type | Description |
|---|---|---|
| goal | string | What the agent is trying to accomplish |
| scope | string | Boundaries of the analysis (which hosts, endpoints, etc.) |
| steps | array | Ordered list of plan steps |
| current_step | integer | Which step is currently active |
| created_at | timestamp | When the plan was first created |
| revised_at | timestamp | When the plan was last revised (empty if never revised) |
| revisions | integer | Total number of revisions (0–5) |
| status | string | "active", "completed", or "aborted" |
Each step has:
| Field | Type | Description |
|---|---|---|
| id | integer | Step number (1-based) |
| description | string | What this step does |
| category | string | "recon", "analysis", "active_test", "exploit", or "report" |
| tools | array | Suggested tools for this step |
| status | string | "pending", "in_progress", "completed", "skipped", or "failed" |
| result | string | Summary of what was found (max 500 characters) |
| substeps | array | Finer-grained sub-steps (strings) |
| started_at | timestamp | When work on this step began |
| completed_at | timestamp | When this step finished |
Stop Agent
`POST /api/v1/agent/stop`

Signals the running agent to stop. No request body is needed.
The stop is not immediate — the agent finishes its current iteration, then gets 1–2 more iterations to write a summary of what it found before the stream ends. This ensures you always get a final report, even if you stop the agent early.
Internally, this cancels the agent’s context, which causes the RunWithRegistry goroutine to exit at the next iteration boundary.
Response: {"status": "stopped"}
Errors: 503 if the agent is not configured (no LLM API key)
Steer Agent
`POST /api/v1/agent/steer`

Injects a message into the running agent’s next iteration — like tapping a researcher on the shoulder and saying “look at this instead.” The agent sees your message prefixed with [USER STEERING] and adjusts its approach accordingly.
Request body (JSON, 64 KB limit):
```json
{ "message": "Focus on authentication endpoints instead" }
```

Steering channel: The steer channel buffers up to 5 messages. If all 5 slots are full (the agent hasn’t consumed them yet), new steering messages are silently dropped with a warning logged. This prevents memory buildup if you send many steering messages quickly.
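The drop-when-full behavior is the standard non-blocking send on a buffered Go channel. A minimal sketch (`steer` is a hypothetical helper, not Ghost's actual code):

```go
package main

import "fmt"

// steer attempts a non-blocking send into a buffered channel, dropping
// the message when all buffer slots are full.
func steer(ch chan string, message string) bool {
	select {
	case ch <- message:
		return true // delivered to a free slot
	default:
		// Buffer full: drop silently (real code would log a warning).
		return false
	}
}

func main() {
	ch := make(chan string, 5) // buffer capacity matches the 5-message limit
	for i := 0; i < 7; i++ {
		fmt.Println(i, steer(ch, fmt.Sprintf("steer %d", i)))
	}
	// The first 5 sends succeed; the last 2 are dropped.
}
```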
Response: {"status": "delivered"}
Errors: 400 (empty message), 503 (agent not configured)
Conversations
Conversations are persistent chat threads. Each conversation belongs to a session and contains an ordered list of messages (user, assistant, and tool messages).
List Conversations
`GET /api/v1/agent/conversations?session_id=01HWXYZ...`

Returns all conversations for a session, ordered by creation time.
Query parameters:
- session_id (required) — which session’s conversations to list
Response:
```json
[
  {
    "id": "01HWXYZ...",
    "session_id": "01HWABC...",
    "title": "Analyze authentication flows",
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:35:00Z"
  }
]
```

Get Conversation

`GET /api/v1/agent/conversations/{id}`

Returns the full conversation including all messages — user messages, assistant responses, and tool call results.
Response:
```json
{
  "conversation": {
    "id": "01HWXYZ...",
    "session_id": "01HWABC...",
    "title": "Analyze authentication flows",
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:35:00Z"
  },
  "messages": [
    {
      "id": "01HWABC...",
      "conversation_id": "01HWXYZ...",
      "role": "user",
      "content": "Find all authentication issues",
      "created_at": "2024-01-15T10:30:00.123456789Z"
    },
    {
      "id": "01HWDEF...",
      "conversation_id": "01HWXYZ...",
      "role": "assistant",
      "content": "I'll analyze the authentication flows in this session...",
      "tool_calls": "[{\"id\":\"tc_1\",\"name\":\"list_flows\",\"input\":{\"host\":\"auth.example.com\"}}]",
      "created_at": "2024-01-15T10:30:01.000000000Z"
    },
    {
      "id": "01HWGHI...",
      "conversation_id": "01HWXYZ...",
      "role": "tool",
      "content": "{\"flows\":[...]}",
      "tool_call_id": "tc_1",
      "created_at": "2024-01-15T10:30:02.000000000Z"
    }
  ]
}
```

Message fields:
| Field | Type | Description |
|---|---|---|
| id | string | Message ULID |
| conversation_id | string | Which conversation this message belongs to |
| role | string | "user" (your messages), "assistant" (agent responses), or "tool" (tool execution results) |
| content | string | The message text (for tool messages, this is the tool’s output) |
| tool_calls | string | For assistant messages that used tools — a JSON string containing an array of tool call objects (id, name, input). Empty for user/tool messages |
| tool_call_id | string | For tool messages — which tool call this result belongs to. Empty for user/assistant messages |
| created_at | string | Timestamp in RFC3339Nano format (nanosecond precision) |
Note that tool_calls is a JSON string, not a parsed object — the frontend needs to JSON.parse() it to access the individual tool calls.
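Decoding therefore takes two steps: read the tool_calls string, then parse it as JSON. A Go sketch (the `toolCall` struct and `parseToolCalls` helper are illustrative names, matching the id/name/input shape described above):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the objects inside the tool_calls string.
type toolCall struct {
	ID    string          `json:"id"`
	Name  string          `json:"name"`
	Input json.RawMessage `json:"input"` // tool parameters, kept raw
}

// parseToolCalls turns the tool_calls JSON *string* into typed values.
func parseToolCalls(raw string) ([]toolCall, error) {
	var calls []toolCall
	err := json.Unmarshal([]byte(raw), &calls)
	return calls, err
}

func main() {
	raw := `[{"id":"tc_1","name":"list_flows","input":{"host":"auth.example.com"}}]`
	calls, err := parseToolCalls(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(calls[0].Name) // prints "list_flows"
}
```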
Delete Conversation
`DELETE /api/v1/agent/conversations/{id}`

Deletes a conversation and all its messages. The messages are automatically deleted via SQLite’s ON DELETE CASCADE foreign key constraint — when the conversation row is removed, all message rows referencing it are removed too.
Response: {"status": "deleted"}
Returns 404 if the conversation doesn’t exist.
File Upload
```
POST /api/v1/agent/upload
Content-Type: multipart/form-data
```

Uploads a file for the agent to reference during analysis. The file content is returned in the response (not stored on disk) — the frontend includes it in the next chat message so the agent can read it.
Overall body limit: 512 KB
Accepted file types and per-type size limits:
| Extension | Max Size | Use Case |
|---|---|---|
.txt | 100 KB | Plain text files, logs, notes |
.md | 100 KB | Markdown documents, requirements |
.json | 200 KB | API specs, configuration files, test data |
.yaml | 200 KB | Configuration files, CI/CD specs |
.yml | 200 KB | Same as .yaml |
Validation:
- Unsupported file extension → 400 with list of supported types
- File exceeds per-type size limit → 413
- File contains non-UTF-8 bytes → 400 “File contains invalid characters. Only UTF-8 text files supported.”
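The three checks can be sketched in order: extension, per-type size cap, then UTF-8 validity. Function name and error strings below are illustrative, not Ghost's exact responses:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"unicode/utf8"
)

// Per-extension size caps from the table above.
var maxSize = map[string]int{
	".txt": 100 << 10, ".md": 100 << 10,
	".json": 200 << 10, ".yaml": 200 << 10, ".yml": 200 << 10,
}

// validateUpload mirrors the validation order described above.
// Sketch only; hypothetical helper name.
func validateUpload(name string, data []byte) error {
	ext := strings.ToLower(filepath.Ext(name))
	limit, ok := maxSize[ext]
	if !ok {
		return fmt.Errorf("unsupported file type %q (400)", ext)
	}
	if len(data) > limit {
		return fmt.Errorf("file exceeds %d bytes (413)", limit)
	}
	if !utf8.Valid(data) {
		return fmt.Errorf("file contains invalid characters (400)")
	}
	return nil
}

func main() {
	fmt.Println(validateUpload("api-spec.json", []byte(`{"openapi":"3.0.0"}`)))
	fmt.Println(validateUpload("binary.exe", []byte{0x4d, 0x5a})) // rejected
}
```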
Response:
```json
{
  "filename": "api-spec.json",
  "size": 4567,
  "type": ".json",
  "content": "{\"openapi\": \"3.0.0\", ...}"
}
```

The content field contains the full file contents as a string.
Workspace Files
The agent can write files to a per-session workspace directory at ~/.ghost/workspaces/{session_id}/. These files include generated reports, test scripts, PoC exploits, exported data, and other artifacts. The workspace endpoints let you browse and read these files.
List Workspace Files
`GET /api/v1/agent/workspace/files?session_id=01HWXYZ...`

Returns all files in the session’s workspace directory.
Query parameters:
- session_id (required) — which session’s workspace to list
Path traversal protection: The session ID is validated against ^[A-Za-z0-9_-]+$ to prevent directory traversal attacks (e.g., ../../etc/passwd).
Response:
```json
{
  "session_id": "01HWXYZ...",
  "workspace": "/Users/you/.ghost/workspaces/01HWXYZ...",
  "files": [
    { "path": "report.md", "size": 4567, "dir": false },
    { "path": "poc", "size": 0, "dir": true },
    { "path": "poc/sqli-test.py", "size": 1234, "dir": false }
  ]
}
```

If the workspace directory doesn’t exist (no files have been generated for this session yet), returns an empty files array — not an error.
Read Workspace File
`GET /api/v1/agent/workspace/files/{path}?session_id=01HWXYZ...`

Returns the content of a specific file from the workspace. The {path} is a wildcard that captures the full path after /files/ — so /files/poc/sqli-test.py reads the file at ~/.ghost/workspaces/{session_id}/poc/sqli-test.py.
Security:
- Session ID regex validation (^[A-Za-z0-9_-]+$)
- Path is canonicalized with filepath.Abs() and checked with strings.HasPrefix() to ensure it stays within the workspace directory → 403 “path outside workspace” if traversal is detected
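The two checks combine as follows. This is a sketch of the described technique, not Ghost's actual code (`resolveWorkspacePath` is a hypothetical name), using the fact that `filepath.Abs` cleans `..` segments before the prefix check runs:

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

var sessionIDRe = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// resolveWorkspacePath canonicalizes the requested path and verifies it
// stays inside the session's workspace directory.
func resolveWorkspacePath(root, sessionID, reqPath string) (string, error) {
	if !sessionIDRe.MatchString(sessionID) {
		return "", fmt.Errorf("invalid session_id (400)")
	}
	workspace, err := filepath.Abs(filepath.Join(root, sessionID))
	if err != nil {
		return "", err
	}
	full, err := filepath.Abs(filepath.Join(workspace, reqPath))
	if err != nil {
		return "", err
	}
	// filepath.Abs cleans ".." segments, so a simple prefix check now
	// catches traversal attempts like ../../etc/passwd.
	if !strings.HasPrefix(full, workspace+string(filepath.Separator)) {
		return "", fmt.Errorf("path outside workspace (403)")
	}
	return full, nil
}

func main() {
	_, err := resolveWorkspacePath("/home/you/.ghost/workspaces", "01HWXYZ", "../../etc/passwd")
	fmt.Println(err) // traversal detected
}
```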
File size limit: 10 MB. Files larger than this return 413.
Response: Raw file content as text/plain; charset=utf-8 (not JSON-wrapped). The HTTP status is 200.
Errors:
- Missing session_id → 400
- Missing file path → 400
- Path is a directory → 400
- File not found → 404
- File too large → 413
- Path outside workspace → 403
Summary of Limits
| Limit | Value | Context |
|---|---|---|
| Chat/steer body size | 64 KB | Maximum JSON payload for chat and steer requests |
| Upload body size | 512 KB | Maximum multipart form data for file upload |
| Upload per-type: .txt/.md | 100 KB | Individual file size cap |
| Upload per-type: .json/.yaml/.yml | 200 KB | Individual file size cap |
| Workspace file read | 10 MB | Maximum file size for workspace file reads |
| Agent iterations | 25 | Maximum think→act cycles per run |
| LLM output tokens | 8,192 | Maximum tokens per LLM response |
| Steering channel | 5 messages | Buffer capacity for steering messages |
| SSE write deadline | 60 seconds | Per-iteration timeout for SSE writes |
| SSE done/error deadline | 10 seconds | Timeout for the final done event |
| Conversation title | 50 bytes | Maximum title length (UTF-8 safe truncation) |
| Tool result display | 500 characters | Truncation for tool_result SSE events (agent sees full result) |
| Plan revisions | 5 | Maximum times a plan can be revised |