Testing Guide

Ghost’s testing infrastructure focuses on the Go backend, which is where the critical logic lives — the MITM proxy, SQLite database, API handlers, WebSocket hub, AI agent, and security interceptor. The frontend and extension rely on TypeScript type checking and linting rather than runtime tests.

Test Overview

Layer	Test infrastructure	Quality gate
Go backend	31 test files, ~10,500 lines, 357+ test functions, 1 benchmark	`go test -race -cover`
React frontend	No test runner installed	`tsc --noEmit` (TypeScript check) + ESLint
Chrome extension	No test files	`tsc --noEmit` (TypeScript check)

The Go tests cover 9 of the 16 backend packages, with the heaviest coverage on the agent system (~4,300 lines of tests), proxy (~1,740 lines), API (~1,180 lines), and store (~1,170 lines).

Test Distribution by Package

Package	Test files	Test lines	What’s tested
`internal/agent`	11	~4,300	Tool execution, plan lifecycle, termination signals, context management, parallel execution, reflections, QA tools, browser tools, attacker engine
`internal/proxy`	8	~1,740	Proxy server integration, MITM certificate issuance, noise detection, CA generation, protobuf decoding, interceptor pipeline, flow model, path normalization
`internal/api`	2	~1,180	API server integration (full request/response testing), WebSocket hub (broadcast, client lifecycle)
`internal/store`	1	~1,170	SQLite store — flow CRUD, session management, FTS search, migrations, pagination, purging
`internal/inspector`	4	~830	Traffic-UI correlation, selector generation, WebView detection, noise domain filtering
`internal/extension`	2	~670	Extension WebSocket hub (connect/disconnect, message routing), interaction correlation
`internal/addon`	1	~325	Addon engine — VM lifecycle, handler execution, ghost.* API, action dispatching
`internal/device`	1	~190	Android ADB interaction parsing
`internal/config`	1	~140	AES-256-GCM encryption/decryption round-trip, nonce uniqueness, tamper detection
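
The `internal/config` row mentions three properties: round-trip, nonce uniqueness, and tamper detection. The sketch below illustrates what such checks exercise using only the standard library. The `encrypt` and `decrypt` names and the nonce-prepended layout are assumptions made for illustration, not Ghost's actual implementation.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext (layout assumed for this sketch).
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and auth tag to the nonce slice.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and authenticates the rest.
func decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	a, err := encrypt(key, []byte("secret"))
	if err != nil {
		panic(err)
	}
	b, err := encrypt(key, []byte("secret"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, a)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", bytes.Equal(plain, []byte("secret")))
	fmt.Println("ciphertexts differ:", !bytes.Equal(a, b))
	a[len(a)-1] ^= 0xff // flip one bit of the auth tag
	_, err = decrypt(key, a)
	fmt.Println("tamper detected:", err != nil)
}
```

Because GCM is authenticated, flipping any byte of the sealed value makes `Open` return an error instead of corrupted plaintext, which is what a tamper-detection test relies on.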

Packages Without Tests

These packages currently have no test files:

Package	Why	Risk
`internal/certinstall`	Platform-specific OS commands — hard to test portably	Low (small, well-defined behavior)
`internal/sysproxy`	Platform-specific OS commands	Low (same)
`internal/frida`	Requires Frida runtime + connected device	Medium (subprocess management)
`internal/sectools`	Requires external tool binaries	Low (thin wrapper)
`internal/testrail`	Requires TestRail server	Low (HTTP client)
`cmd/ghost`	Entry point wiring — tested indirectly by integration tests	Low

Running Tests

Makefile Targets

# Run all tests with race detector, coverage, and no caching
make test              # go test -race -cover -count=1 ./...

# Same with verbose output (shows each test function name)
make test-v            # go test -race -cover -count=1 -v ./...

# Run with the -short flag and without -cover (no tests currently check
# testing.Short(), so apart from skipping coverage this behaves like make test)
make test-short        # go test -race -short -count=1 ./...

# Run tests for a specific package
go test -race ./internal/proxy/...
go test -race ./internal/store/...
go test -race ./internal/agent/...

The -count=1 flag disables Go’s test result caching. This is important because Ghost’s tests create SQLite databases, bind network ports, and start HTTP servers — cached results from a previous run might not reflect the current state.

Coverage Report

make cover

This runs all tests with coverage profiling enabled, generates a coverage.out file, and opens an interactive HTML report (coverage.html) in your browser. The report shows which lines of code were executed during tests and which were not, colored green and red respectively.

Race Detector

All test commands include -race by default. Go’s race detector instruments the compiled binary to detect concurrent access to shared variables without proper synchronization. This is critical for Ghost because many subsystems are accessed concurrently:

The WebSocket hub broadcasts events to multiple clients simultaneously
The proxy handles many connections in parallel, each potentially modifying shared state
The SQLite store uses a single-writer model but concurrent readers
Background goroutines (WAL checkpoint, auto-purge, device discovery) access shared data structures

The race detector typically increases execution time by 2-20x and memory usage by 5-10x, which is why it’s used only during development and testing, not in production builds.
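
To make the kind of bug concrete, here is a minimal sketch (not taken from Ghost's code) of a counter shared by many goroutines. The `countConcurrently` name is hypothetical. With a plain `int64` and `n++`, `go test -race` or `go run -race` would report a data race; the atomic type removes it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently increments a shared counter from many goroutines.
func countConcurrently(workers int) int64 {
	// atomic.Int64 makes each increment a synchronized operation.
	// Replacing it with a plain int64 and n++ would be a data race.
	var n atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			n.Add(1)
		}()
	}
	wg.Wait()
	return n.Load()
}

func main() {
	fmt.Println(countConcurrently(100))
}
```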

Test Patterns

Ghost’s tests follow consistent patterns within each package. There is no shared testutil package — each package defines its own local helpers in _test.go files. This keeps test dependencies explicit and avoids coupling between packages.

SQLite Store Tests

Store tests create isolated database instances so each test starts with a clean slate:

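// newTestStore opens a fresh SQLite database in a per-test temp directory.
func newTestStore(t *testing.T) *SQLiteStore {
    t.Helper() // report failures at the caller's line
    dir := t.TempDir()
    store, err := OpenSQLite(context.Background(), filepath.Join(dir, "test.db"), slog.Default())
    require.NoError(t, err)
    t.Cleanup(func() { store.Close() }) // runs when the test finishes
    return store
}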

Key points:

Uses t.TempDir() which creates a temporary directory that’s automatically cleaned up when the test finishes
Each test gets its own SQLite file, so tests can run in parallel without interference
t.Cleanup ensures the database is properly closed even if the test panics
Helper functions like newTestSession(t, store, name) and newTestFlow(id, sessionID) create test fixtures with minimal boilerplate

API Integration Tests

API tests create a complete server environment with a real HTTP server, database, and all subsystems wired together:

func setupTestEnv(t *testing.T) *testEnv {
    // Creates: httptest.Server, API Server, :memory: SQLite store,
    // generated CA, config.NewTestManager, Frida manager, hub
    // ...
}

The testEnv struct provides:

httptest.Server — a real HTTP server on a random port
SQLite store (:memory: — in-memory database, no disk I/O)
Generated CA certificate and key
config.NewTestManager(cfg) — config manager that skips disk persistence
Auth token for authenticated requests
Started WebSocket hub with automatic cleanup
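
As a rough illustration of the pattern (not Ghost's actual `setupTestEnv`), the sketch below starts a real HTTP server with `httptest.NewServer` and exercises an authenticated route. The `/api/ping` route, the `testToken` value, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

const testToken = "test-token" // hypothetical auth token for the sketch

// newTestServer starts a real HTTP server on a random loopback port.
func newTestServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/ping", func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+testToken {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "pong")
	})
	return httptest.NewServer(mux)
}

// statusFor sends a GET with the given token and returns the status code.
func statusFor(srv *httptest.Server, token string) (int, error) {
	req, err := http.NewRequest(http.MethodGet, srv.URL+"/api/ping", nil)
	if err != nil {
		return 0, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := srv.Client().Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newTestServer()
	defer srv.Close() // in a test this would be t.Cleanup(srv.Close)

	withToken, err := statusFor(srv, testToken)
	if err != nil {
		panic(err)
	}
	withoutToken, err := statusFor(srv, "")
	if err != nil {
		panic(err)
	}
	fmt.Println(withToken, withoutToken)
}
```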

SSRF override: Test environments replace the SSRF-safe replay transport with http.DefaultTransport. This is necessary because httptest.NewServer binds to 127.0.0.1, which the SSRF protection would normally block (it rejects requests to private/loopback addresses). This override is safe because it only applies to test code.
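
To show why a loopback test server trips this kind of protection, here is a minimal sketch of an address check built on the standard `net` package. The `isBlockedIP` name is hypothetical and the logic is a simplification; Ghost's real check may differ, for example by resolving hostnames first.

```go
package main

import (
	"fmt"
	"net"
)

// isBlockedIP reports whether an IP literal points at a loopback,
// private, link-local, or unspecified address.
func isBlockedIP(host string) bool {
	ip := net.ParseIP(host)
	if ip == nil {
		// Not an IP literal. This sketch fails closed; a fuller
		// check would resolve the name and test every address.
		return true
	}
	return ip.IsLoopback() || ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() || ip.IsUnspecified()
}

func main() {
	for _, h := range []string{"127.0.0.1", "10.1.2.3", "169.254.169.254", "8.8.8.8"} {
		fmt.Printf("%-16s blocked=%v\n", h, isBlockedIP(h))
	}
}
```

Under a check like this, every `httptest` server address (127.0.0.1) is rejected, which is why the test environment swaps the transport.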

Proxy Integration Tests

Proxy tests start a real proxy server and upstream targets:

func startTestProxy(t *testing.T, pipeline []Interceptor) (*Server, int) {
    // Creates real proxy on random port with generated CA
    // Returns the server and its port number
}

Tests create httptest.NewServer instances as upstream targets, configure the proxy to intercept traffic to those targets, and verify that the proxy correctly handles, modifies, and forwards requests. A collectingInterceptor records all flows for assertions:

type collectingInterceptor struct {
    mu    sync.Mutex
    flows []*Flow
}

The mutex is necessary because the proxy processes requests concurrently, and the test interceptor must be thread-safe.
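
A minimal sketch of how such a recorder might be completed is shown below. The `flow` type is a stand-in for the proxy's `Flow`, and the `record` and `snapshot` method names are hypothetical; the real interceptor interface is not reproduced here.

```go
package main

import (
	"fmt"
	"sync"
)

// flow is a stand-in for the proxy's Flow type.
type flow struct {
	Method, URL string
}

type collector struct {
	mu    sync.Mutex
	flows []*flow
}

// record appends a flow; it is safe to call from many goroutines.
func (c *collector) record(f *flow) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.flows = append(c.flows, f)
}

// snapshot returns a copy so assertions never touch the live slice.
func (c *collector) snapshot() []*flow {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]*flow(nil), c.flows...)
}

// recordConcurrently simulates n parallel requests hitting the collector.
func recordConcurrently(n int) int {
	var c collector
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.record(&flow{Method: "GET", URL: fmt.Sprintf("http://upstream/%d", i)})
		}(i)
	}
	wg.Wait()
	return len(c.snapshot())
}

func main() {
	fmt.Println(recordConcurrently(50))
}
```

Returning a copy from `snapshot` lets a test assert on the recorded flows while the proxy may still be appending new ones.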

Agent/Tool Tests

Agent tests use a comprehensive mockStore that implements the full store.Store interface with in-memory maps. This avoids any database dependency and makes tests fast and deterministic:

type mockStore struct {
    flows        map[string]*proxy.Flow
    sessions     map[string]*store.Session
    interactions []store.Interaction
    // ... more fields for each store interface
}

Tool tests verify that each agent tool produces the correct output given specific inputs, without calling any real LLM. The mock store is pre-populated with test flows, sessions, and other data.
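
The idea can be sketched with a much smaller interface than the real `store.Store`. Everything below (the `flowStore` interface, the `flow` type, and the `summarizeFlow` "tool") is hypothetical and only illustrates how code written against an interface can be tested with an in-memory map.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// flow and flowStore are small stand-ins for the real types.
type flow struct {
	ID, Method, URL string
}

type flowStore interface {
	SaveFlow(f *flow) error
	GetFlow(id string) (*flow, error)
}

var errNotFound = errors.New("flow not found")

// mockStore keeps flows in a map instead of a database.
type mockStore struct {
	mu    sync.Mutex
	flows map[string]*flow
}

func newMockStore() *mockStore {
	return &mockStore{flows: make(map[string]*flow)}
}

func (m *mockStore) SaveFlow(f *flow) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.flows[f.ID] = f
	return nil
}

func (m *mockStore) GetFlow(id string) (*flow, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	f, ok := m.flows[id]
	if !ok {
		return nil, errNotFound
	}
	return f, nil
}

// Compile-time check that the mock satisfies the interface.
var _ flowStore = (*mockStore)(nil)

// summarizeFlow is a toy "tool" that depends only on the interface.
func summarizeFlow(s flowStore, id string) (string, error) {
	f, err := s.GetFlow(id)
	if err != nil {
		return "", fmt.Errorf("summarize %q: %w", id, err)
	}
	return f.Method + " " + f.URL, nil
}

func main() {
	s := newMockStore()
	if err := s.SaveFlow(&flow{ID: "f1", Method: "GET", URL: "https://example.com/"}); err != nil {
		panic(err)
	}
	fmt.Println(summarizeFlow(s, "f1"))
	fmt.Println(summarizeFlow(s, "missing"))
}
```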

Config Tests

Config tests use config.NewTestManager(cfg) which creates a manager with noSave: true. This prevents tests from writing to the filesystem, avoiding interference between tests and ensuring no test artifacts are left behind.

WebSocket Hub Tests

Both the main hub and extension hub are tested using httptest.Server + gorilla/websocket:

func setupHubTestServer() (*httptest.Server, *Hub) {
    hub := NewHub(slog.Default())
    go hub.Run()
    server := httptest.NewServer(http.HandlerFunc(hub.HandleWebSocket))
    return server, hub
}

Tests connect WebSocket clients, send messages, and verify that broadcasts reach all connected clients. The hub is started in a goroutine with t.Cleanup ensuring it’s stopped.
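
The broadcast behavior these tests verify can be sketched without a WebSocket library. The example below swaps WebSocket connections for buffered channels, so it is an analogy rather than Ghost's hub; the type, its methods, and the event names are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// hub fans each message out to every subscribed client channel.
type hub struct {
	mu      sync.Mutex
	clients map[chan string]struct{}
}

func newHub() *hub {
	return &hub{clients: make(map[chan string]struct{})}
}

// subscribe registers a new client with a small send buffer.
func (h *hub) subscribe() chan string {
	ch := make(chan string, 8)
	h.mu.Lock()
	defer h.mu.Unlock()
	h.clients[ch] = struct{}{}
	return ch
}

// unsubscribe removes the client and closes its channel once.
func (h *hub) unsubscribe(ch chan string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if _, ok := h.clients[ch]; ok {
		delete(h.clients, ch)
		close(ch)
	}
}

// broadcast delivers msg to every client and returns the delivery count.
func (h *hub) broadcast(msg string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for ch := range h.clients {
		select {
		case ch <- msg:
			delivered++
		default: // buffer full: skip the slow client instead of blocking
		}
	}
	return delivered
}

func main() {
	h := newHub()
	a, b := h.subscribe(), h.subscribe()
	fmt.Println("delivered:", h.broadcast("flow.created"))
	fmt.Println(<-a, <-b)
	h.unsubscribe(b)
	fmt.Println("delivered:", h.broadcast("flow.updated"))
}
```

A test in this style connects several clients, broadcasts once, and asserts that each client received the message and that disconnected clients are no longer counted.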

Benchmarks

Ghost includes one benchmark for URL path normalization — the function that replaces dynamic segments (UUIDs, numeric IDs, hex hashes) with {id} for session comparison:

go test -bench=. ./internal/proxy/...

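The BenchmarkNormalizePath benchmark measures this function because it runs on every flow during session comparison and needs to be fast. The command above also runs the package’s unit tests first; add -run='^$' to run only the benchmark, and -benchmem to report allocations.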
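
For readers unfamiliar with the technique, here is a rough sketch of segment-based path normalization plus a benchmark body. It is not Ghost's implementation: the regular expressions, the 16-character threshold for hex hashes, and the function names are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

var (
	uuidRe    = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)
	numericRe = regexp.MustCompile(`^[0-9]+$`)
	// Assumption: a "hex hash" is 16 or more hex characters, so short
	// words such as "cafe" are left alone.
	hexRe = regexp.MustCompile(`^[0-9a-fA-F]{16,}$`)
)

// normalizePath replaces dynamic-looking segments with {id}.
func normalizePath(p string) string {
	segs := strings.Split(p, "/")
	for i, s := range segs {
		if uuidRe.MatchString(s) || numericRe.MatchString(s) || hexRe.MatchString(s) {
			segs[i] = "{id}"
		}
	}
	return strings.Join(segs, "/")
}

// benchNormalizePath has the shape of a Go benchmark function.
func benchNormalizePath(b *testing.B) {
	const p = "/users/123/orders/550e8400-e29b-41d4-a716-446655440000"
	for i := 0; i < b.N; i++ {
		normalizePath(p)
	}
}

func main() {
	fmt.Println(normalizePath("/users/123/orders/550e8400-e29b-41d4-a716-446655440000"))
	// testing.Benchmark runs a benchmark function outside of go test.
	fmt.Println(testing.Benchmark(benchNormalizePath))
}
```

The first line printed is `/users/{id}/orders/{id}`; the benchmark figures depend on the machine.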

Linting

make lint              # golangci-lint run ./...
make vet               # go vet ./...
make fmt               # gofmt -s -w + goimports -w

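golangci-lint runs with its built-in default configuration (no .golangci.yml config file exists in the repository). The default linter set is small and focuses on correctness rather than style:

errcheck: unchecked error return values
govet: suspicious constructs (the same analyzers as go vet)
ineffassign: assignments whose value is never used
staticcheck: likely bugs, deprecated API usage, and code simplifications
unused: unused constants, variables, functions, and types

Style, security, and shadowed-variable checks are not part of the defaults. They require enabling linters such as revive, gosec, or govet’s shadow analyzer in a config file.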

go vet is Go’s built-in static analysis tool. It catches issues like incorrect Printf format strings, unreachable code, and suspicious constructs that compile but are almost certainly bugs.

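gofmt -s formats code with the -s (simplify) flag, which drops redundant type names in composite literals, shortens slice expressions such as s[a:len(s)] to s[a:], and removes unneeded blank identifiers in range clauses. goimports additionally adds missing imports, removes unused ones, and sorts import statements, keeping the standard library in its own group (a separate group for the module’s own packages requires its -local flag).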

Frontend Type Checking

cd frontend
npx tsc --noEmit       # Type check without producing output files
npm run build          # Runs tsc -b first, then Vite build
npm run lint           # ESLint

The frontend has no runtime test runner (no Vitest, Jest, or Playwright). The quality gate is zero TypeScript errors from tsc --noEmit. This catches type mismatches, missing properties, incorrect function signatures, and other static issues at compile time.

ESLint provides additional checks for React hooks rules, React refresh compatibility, and general code quality.

Build Verification

Before declaring any change complete, verify that the entire codebase compiles cleanly:

# Go: all packages compile without errors
go build ./...

# Go: static analysis passes
go vet ./...

# Frontend: TypeScript type check passes
cd frontend && npx tsc --noEmit

These three commands catch compile errors, common static mistakes, and frontend type errors, which account for most issues. They do not exercise runtime behavior, so also run make test for changes that touch backend logic.

Quality Standards

Ghost maintains strict coding standards that are enforced during code review. These rules exist because these exact bugs appeared repeatedly in quality gates — the goal is to prevent them from being written in the first place.

Go — Mandatory Checks

Rule	Why it matters
Always check `json.Unmarshal` errors	`_ = json.Unmarshal()` silently leaves structs zero-valued or partially filled, leading to mysterious nil pointer dereferences later
Always check `url.Parse` errors	When parsing fails, `parsed, _ := url.Parse()` leaves `parsed` nil, and the next line accessing `parsed.Host` panics
Always handle `io.ReadAll` errors	When the body matters (not just draining), an error means the data is incomplete
SSRF protection on user-provided URLs	Any handler that sends HTTP to a user-provided URL needs `isPrivateHost()` validation and scheme checking to prevent internal network scanning
`io.LimitReader` on every request body	Named constant for each limit. Prevents denial-of-service via oversized payloads
Binary content detection for JSON responses	Any response body going into a JSON field needs base64 encoding if it contains binary data — raw binary breaks JSON
Use `strconv.Atoi` for integer parsing	Don’t hand-roll integer parsers; they miss edge cases such as signs, overflow, and empty input
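
Several of these rules can be seen together in one small decoder. The sketch below is illustrative only: the `replayRequest` shape, the `maxReplayBodyBytes` limit, and the function names are hypothetical, and the SSRF host check is left out to keep it short.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/url"
	"strconv"
	"strings"
)

// maxReplayBodyBytes is a named limit (value assumed for the sketch).
const maxReplayBodyBytes = 1 << 20

type replayRequest struct {
	URL     string `json:"url"`
	Retries string `json:"retries"`
}

// decodeReplayRequest reads a bounded body and checks every error.
func decodeReplayRequest(r io.Reader) (*url.URL, int, error) {
	// Read one byte past the limit so oversized bodies are detectable.
	data, err := io.ReadAll(io.LimitReader(r, maxReplayBodyBytes+1))
	if err != nil {
		return nil, 0, fmt.Errorf("read body: %w", err)
	}
	if len(data) > maxReplayBodyBytes {
		return nil, 0, errors.New("body exceeds limit")
	}
	var req replayRequest
	if err := json.Unmarshal(data, &req); err != nil {
		return nil, 0, fmt.Errorf("decode body: %w", err)
	}
	u, err := url.Parse(req.URL)
	if err != nil {
		return nil, 0, fmt.Errorf("parse url: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return nil, 0, fmt.Errorf("unsupported url %q", req.URL)
	}
	retries, err := strconv.Atoi(req.Retries)
	if err != nil {
		return nil, 0, fmt.Errorf("parse retries: %w", err)
	}
	return u, retries, nil
}

// decodeString is a convenience wrapper for the examples.
func decodeString(body string) (*url.URL, int, error) {
	return decodeReplayRequest(strings.NewReader(body))
}

func main() {
	u, n, err := decodeString(`{"url":"https://example.com/a","retries":"3"}`)
	fmt.Println(u, n, err)
	_, _, err = decodeString(`{"url":"ftp://example.com","retries":"1"}`)
	fmt.Println(err)
}
```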

TypeScript — Mandatory Checks

Rule	Why it matters
No empty `catch {}` blocks	Every catch must set error state, show a toast, or at least `console.error`. Silent failures are invisible failures.
Escape special characters in output	Markdown tables need `\|` escaping, shell commands need space quoting, code generators need language-specific escaping
`AbortController` for HTTP calls in send/submit actions	Without it, navigating away while a request is in-flight causes a state update on an unmounted component
Reset view state when data changes	Zoom level, filters, collapse state, scroll position — all need resetting when the underlying data source changes
`Cmd/Ctrl+Enter` for send actions	Consistent keyboard shortcut across all send buttons

Both Languages

Rule	Why it matters
Bidirectional comparison	If comparing A to B, always also verify B to A. Asymmetric bugs are common in diff/comparison features.
Non-standard input handling	Test with HTTP methods beyond GET/POST, binary content types, unicode in filenames, pipes in header values
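
The bidirectional rule can be illustrated with a small header diff. This is a sketch with hypothetical names (`onlyIn`, `diffHeaders`), not code from Ghost: the point is that swapping the arguments should swap the added and removed sets.

```go
package main

import (
	"fmt"
	"sort"
)

// onlyIn returns the keys present in a but missing from b, sorted.
func onlyIn(a, b map[string]string) []string {
	var keys []string
	for k := range a {
		if _, ok := b[k]; !ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

type headerDiff struct {
	Removed, Added []string
}

// diffHeaders compares in both directions, so neither side is ignored.
func diffHeaders(a, b map[string]string) headerDiff {
	return headerDiff{Removed: onlyIn(a, b), Added: onlyIn(b, a)}
}

func main() {
	a := map[string]string{"Accept": "*/*", "X-Trace": "1"}
	b := map[string]string{"Accept": "*/*", "Cookie": "s=1"}
	fmt.Printf("%+v\n", diffHeaders(a, b)) // {Removed:[X-Trace] Added:[Cookie]}
	fmt.Printf("%+v\n", diffHeaders(b, a)) // {Removed:[Cookie] Added:[X-Trace]}
}
```

A comparison that only walked the keys of `a` would report the removed header but silently miss the added one, which is the asymmetric bug this rule targets.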