Session Comparison
Session comparison answers the big question: “what changed between these two testing sessions?” Instead of comparing individual requests one by one, Ghost analyzes all traffic from two sessions at once, groups it by endpoint, and tells you exactly which APIs broke, slowed down, appeared, or disappeared.
This is your primary tool for regression testing. Capture a session before a deployment, capture another session after the deployment with the same test flow, and let Ghost show you every difference. It’s like a “git diff” but for API behavior instead of code.
When to Use Session Comparison
- Before/after deployment — Did the new code break any existing API endpoints? Are there new errors that weren’t there before?
- Environment comparison — Staging vs. production: are all APIs behaving the same way?
- Test run validation — Run the same test flow twice and verify consistent behavior
- Performance monitoring — Did any endpoints get slower after the latest release?
- API surface tracking — Are there new endpoints that weren’t there last week? Did any endpoints disappear?
How to Open Session Comparison
Three ways to start:
- Keyboard shortcut: Press Cmd+Shift+K (macOS) or Ctrl+Shift+K (Windows) to toggle the comparison view on or off
- Toolbar button: Click the diff icon (two stacked rectangles) in the command bar — it has a tooltip “Compare Sessions (Cmd+Shift+K)”
- Drag-and-drop: Drop a .har or .json file onto Ghost — it imports the file into a new session and automatically opens a comparison between the imported session and your active session
When the comparison opens, it replaces the main traffic area with a dedicated comparison view. Your regular traffic list comes back when you close the comparison.
Choosing Sessions
At the top of the comparison view, two dropdown selectors let you pick Source (baseline — what you expect) and Target (what you’re comparing against). All your sessions appear in both dropdowns. A swap button lets you quickly flip source and target.
Once both sessions are selected, Ghost automatically fetches and analyzes the comparison. A loading indicator shows progress, and if anything goes wrong, an error message with a retry button appears.
Limits: Each session can contain up to 50,000 flows for comparison. If a session exceeds this limit, Ghost returns a “too large” error (HTTP 413). The comparison analysis has a 30-second timeout — for very large sessions, this ensures the UI doesn’t hang indefinitely.
Overview Banner
After both sessions load, a banner at the top shows key statistics for both sessions side by side:
| Metric | What It Shows |
|---|---|
| Flow count | Total number of HTTP flows captured in each session |
| Error count | Flows with HTTP status 400+ or connection errors |
| Average duration | Mean response time across all flows in the session |
| Total size | Sum of all response body sizes |
| Unique hosts | How many distinct servers/domains appeared (case-insensitive) |
| Unique endpoints | How many distinct API endpoints were called |
Between the two session cards, a delta strip shows the key differences:
| Delta | What It Means |
|---|---|
| New endpoints | API endpoints that exist only in the target session (not in the source) |
| Removed endpoints | Endpoints that existed in the source but are missing from the target |
| Changed endpoints | Endpoints that exist in both but show meaningful differences |
| Unchanged endpoints | Endpoints that behaved identically in both sessions |
| Regressions | Endpoints where the error rate increased significantly |
| Performance issues | Endpoints where response time got significantly worse |
How Endpoints Are Grouped
Ghost doesn’t compare individual flows directly — that would be overwhelming with thousands of requests. Instead, it groups flows into endpoints by combining the HTTP method, hostname, and normalized URL path.
Path normalization is key: Ghost automatically replaces dynamic segments in URLs with {id} placeholders so that requests to the same logical endpoint get grouped together even when the IDs differ. For example:
| Original Path | Normalized | Why |
|---|---|---|
| /api/products/12345 | /api/products/{id} | Numeric IDs (pure digits) are replaced |
| /api/users/550e8400-e29b-41d4-a716-446655440000 | /api/users/{id} | UUIDs (36 hex chars with dashes) are replaced |
| /api/orders/01HWRFQP3K5M9TGWQX7Z | /api/orders/{id} | ULIDs (26-char Crockford base32) are replaced |
| /api/files/a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4 | /api/files/{id} | Hex hashes (32+ hex characters) are replaced |
| /api/products?page=2&sort=name | /api/products | Query strings are stripped |
This means GET /api/products/123 and GET /api/products/456 are treated as the same endpoint (GET /api/products/{id}), so their statistics are aggregated together for comparison.
Additional normalization: trailing slashes are stripped (except root /), double slashes are collapsed, and hostnames are lowercased.
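Put together, the normalization rules above can be sketched as a small function. This is an illustrative sketch, not Ghost’s actual implementation — the regexes simply mirror the patterns described in the table (numeric IDs, UUIDs, ULIDs, 32+ character hex hashes), plus the query-string, slash, and trailing-slash rules:

```python
import re

# Patterns for dynamic segments, matching the rules in the table above.
NUMERIC = re.compile(r"\d+")
UUID = re.compile(r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
                  r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}")
ULID = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")   # Crockford base32, 26 chars
HEX_HASH = re.compile(r"[0-9a-fA-F]{32,}")     # hex hashes, 32+ chars

def normalize_path(path: str) -> str:
    """Replace dynamic URL segments with {id} and clean up the path."""
    path = path.split("?", 1)[0]          # strip query string
    path = re.sub(r"/{2,}", "/", path)    # collapse double slashes
    segments = []
    for seg in path.split("/"):
        if (NUMERIC.fullmatch(seg) or UUID.fullmatch(seg)
                or ULID.fullmatch(seg) or HEX_HASH.fullmatch(seg)):
            seg = "{id}"
        segments.append(seg)
    path = "/".join(segments)
    if len(path) > 1 and path.endswith("/"):  # strip trailing slash, keep root "/"
        path = path[:-1]
    return path or "/"
```

Hostname lowercasing happens separately when the grouping key (method + host + normalized path) is built, so this sketch handles only the path component.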
Change Classification
Each endpoint that exists in both sessions is analyzed and classified by comparing its statistics. Ghost uses specific thresholds to avoid false positives:
Regression
An endpoint is flagged as a regression when ALL of these conditions are true:
- The source session’s error rate was below 50% (it was mostly working)
- The target session’s error rate increased by at least 10 percentage points (e.g., from 5% to 15%)
- The dominant status code class changed from success (2xx) to error (4xx or 5xx)
This catches the most critical problem: an API that was working fine is now returning errors.
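The three conditions can be expressed directly in code. A sketch with hypothetical helper names — error rates are fractions (0.0–1.0) and dominant status codes are integers, per the statistics table later in this page:

```python
def is_regression(src_error_rate: float, tgt_error_rate: float,
                  src_dominant_status: int, tgt_dominant_status: int) -> bool:
    """All three regression conditions from the text must hold."""
    was_mostly_working = src_error_rate < 0.5                    # below 50% errors
    error_rate_jumped = tgt_error_rate - src_error_rate >= 0.10  # +10 percentage points
    class_flipped = (200 <= src_dominant_status < 300            # 2xx -> 4xx/5xx
                     and tgt_dominant_status >= 400)
    return was_mostly_working and error_rate_jumped and class_flipped
```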
Performance Degradation
An endpoint is flagged as a performance issue when BOTH of these conditions are true:
- The target’s average response time is at least 2× the source (doubled or worse)
- The absolute increase is at least 100 milliseconds
The dual threshold prevents false positives: a 1ms → 3ms change is 3× but insignificant in practice. A 500ms → 600ms change is only 1.2× and within normal variation. But a 200ms → 500ms change is 2.5× AND 300ms — that’s a real problem.
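The dual threshold is a one-liner; this sketch encodes exactly the two conditions stated above:

```python
def is_perf_degradation(src_avg_ms: float, tgt_avg_ms: float) -> bool:
    """At least 2x slower AND at least 100 ms absolute increase."""
    return tgt_avg_ms >= 2 * src_avg_ms and tgt_avg_ms - src_avg_ms >= 100

# The examples from the text:
#   1 ->   3 ms: 3x but only +2 ms   -> not flagged
# 500 -> 600 ms: +100 ms but 1.2x    -> not flagged
# 200 -> 500 ms: 2.5x AND +300 ms    -> flagged
```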
Status Changed
The dominant status code (the most common status) changed between sessions. For example, an endpoint that mostly returned 200 in the source now mostly returns 301 in the target. This is only flagged when the endpoint isn’t already classified as a regression.
Size Changed
The average response body size changed by at least 20% AND at least 1,024 bytes (1 KB). This catches APIs that started returning significantly more or less data — which could indicate a schema change, missing pagination, or broken serialization.
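As a sketch, again with both thresholds required (the zero-baseline branch is an assumption for the degenerate case, which the text doesn’t specify):

```python
def is_size_changed(src_avg_bytes: float, tgt_avg_bytes: float) -> bool:
    """Flag when average body size moved by >= 20% AND >= 1,024 bytes."""
    delta = abs(tgt_avg_bytes - src_avg_bytes)
    if src_avg_bytes == 0:
        return delta >= 1024  # assumption: growth from an empty baseline
    return delta / src_avg_bytes >= 0.20 and delta >= 1024
```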
Severity Ordering
Endpoints are sorted by severity (most critical first):
- Regression (highest priority)
- Removed (endpoint disappeared)
- Performance degradation
- Status code changed
- New (endpoint appeared)
- Size changed
- Unchanged (lowest priority)
Within the same severity level, endpoints are sorted alphabetically by their path pattern.
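The two-level ordering maps naturally onto a compound sort key — a sketch with hypothetical status labels:

```python
# Severity order from most to least critical, as listed above.
SEVERITY = ["regression", "removed", "performance",
            "status_changed", "new", "size_changed", "unchanged"]
RANK = {status: i for i, status in enumerate(SEVERITY)}

def sort_endpoints(endpoints: list[dict]) -> list[dict]:
    """Sort by severity rank first, then alphabetically by path pattern."""
    return sorted(endpoints, key=lambda e: (RANK[e["status"]], e["path"]))
```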
Endpoint Table
The main area of the comparison view is a virtualized table listing every endpoint found across both sessions. Endpoints are grouped under collapsible host headers (e.g., “api.example.com”), with aggregate status for each host.
Columns
| Column | What It Shows |
|---|---|
| Status icon | Color-coded indicator: green circle for new, red minus for removed, amber dot for changed, red down-arrow for regression, orange clock for performance, grey check for unchanged |
| Method | HTTP method badge (GET, POST, PUT, DELETE, etc.) |
| Endpoint | The normalized path pattern (e.g., /api/products/{id}) |
| Source | Status code distribution from the source session (e.g., “200 ×5, 404 ×1”) |
| Target | Status code distribution from the target session |
| Duration | Average duration for both sessions with delta and multiplier (e.g., “142ms → 340ms (+198ms, 2.4×)”) |
| Size | Average response size for both sessions with delta |
| Count | Number of flows in each session |
Rich Tooltips
Hovering over any endpoint row shows a detailed tooltip with the full statistics for both source and target side by side:
- Call count
- Status code distribution (every code and its count)
- Average and p95 duration
- Average response size
- Up to 5 sample paths (original, before normalization) — so you can see the actual URLs that were grouped together
Filters
A segmented control bar above the table lets you filter which endpoints are shown:
| Filter | What It Shows | Count Badge |
|---|---|---|
| All | Every endpoint from both sessions | Total endpoint count |
| Changed | Endpoints with any detected change (status, size, regression, performance) | Number of changed endpoints |
| Regressions | Only endpoints classified as regressions (error rate increased) | Number of regressions |
| Performance | Only endpoints with performance degradation (2×+ slower and 100ms+ increase) | Number of performance issues |
| New | Endpoints that exist only in the target session | Number of new endpoints |
| Removed | Endpoints that exist only in the source session | Number of removed endpoints |
The default filter when you first open the comparison is Changed — showing you only what’s different, which is usually what you care about most.
A search input below the filters lets you type to find specific endpoints. It supports:
- Substring matching — type checkout to find endpoints with “checkout” in the host or path
- Wildcard patterns — type api.*.com or /products/* for glob-style matching
- Sample path matching — also searches against the original (pre-normalization) sample paths
- Debounced input — waits 300ms after you stop typing before filtering, to avoid a laggy UI while typing
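The matching semantics (substring, glob wildcards, and sample-path fallback) might look like this sketch — the function name is hypothetical, and Ghost’s exact glob dialect may differ from Python’s fnmatch:

```python
from fnmatch import fnmatch

def matches_search(query: str, host: str, path: str,
                   sample_paths: list[str]) -> bool:
    """Match a search query against host, normalized path, and sample paths."""
    q = query.lower()
    for haystack in (host, path, *sample_paths):
        h = haystack.lower()
        if q in h:                      # substring match
            return True
        if "*" in q and fnmatch(h, q):  # glob-style wildcard match
            return True
    return False
```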
Endpoint Drill-Down
Click any endpoint row to open a detailed side-by-side view in the bottom panel. This shows the actual request-response data for a matched flow pair from that endpoint.
Flow Pair Navigation
When an endpoint has multiple flows in each session (which is common — you might have 5 requests to /api/products in session A and 7 in session B), Ghost matches them into pairs and lets you navigate between them:
- Prev/Next arrows to step through pairs
- “X of Y” counter showing which pair you’re viewing (up to 100 pairs per endpoint)
- Unmatched flow counts — if one session has more flows than the other, the drill-down shows “3 extra in Source” or “2 extra in Target” so you know there’s a volume difference
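The text doesn’t specify Ghost’s pairing strategy; a plausible sketch is positional pairing (i-th flow with i-th flow), capped at the documented 100 pairs, with the leftovers reported as extras:

```python
def pair_flows(source_ids: list[str], target_ids: list[str],
               limit: int = 100) -> tuple[list[tuple[str, str]], int, int]:
    """Pair flows positionally (an assumption), capped at `limit` pairs.
    Returns (pairs, extra_in_source, extra_in_target)."""
    n = min(len(source_ids), len(target_ids), limit)
    pairs = list(zip(source_ids[:n], target_ids[:n]))
    return pairs, len(source_ids) - n, len(target_ids) - n
```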
Drill-Down Sections
Each flow pair is displayed with the source flow on the left and target flow on the right:
URL Diff — Full URLs side-by-side. Highlights if different (useful when the same normalized endpoint had different query parameters).
Response Summary — Status code (highlighted if changed), content length, and duration with delta (green for faster, red for slower).
Timing Breakdown — A phase-by-phase comparison of the request lifecycle:
| Phase | What It Measures | Why It Matters |
|---|---|---|
| DNS | Domain name resolution time | DNS server issues, caching differences |
| TCP Connect | Network connection establishment | Network latency, routing changes |
| TLS Handshake | HTTPS encryption setup | Certificate chain issues, crypto overhead |
| TTFB | Time to First Byte (server thinking time) | The most important metric — how fast the server processes the request |
| Transfer | Response body download time | Large response, bandwidth issues |
| Total | End-to-end duration | Overall request performance |
Each phase shows the source time, target time, and delta. Phases where the target is significantly slower (50ms+ difference) are highlighted as bottlenecks — drawing your eye to exactly where the slowdown occurred.
Request Headers Diff — All request headers compared key-by-key with color coding: green (added in target), red (removed in target), amber (value changed), no highlight (identical).
Request Body Diff — For JSON content types, a structural field-by-field comparison showing added, removed, and changed fields at every depth. For non-JSON content, raw side-by-side display. Bodies larger than 1 MB show a “too large to diff” message to prevent browser memory issues.
Response Headers Diff — Same format as request headers.
Response Body Diff — Same structural diffing as request body. A legend shows counts of differences: “N only in Source, N only in Comparison, N changed.” This is where you’ll find the most important changes — new fields in the API response, changed values, missing data.
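A minimal structural diff over JSON objects might look like the sketch below. It recurses through nested objects and buckets dotted field paths into the three legend categories; treating arrays and scalars as whole values is an assumption — Ghost may diff arrays element by element:

```python
def diff_json(src, tgt, prefix: str = ""):
    """Collect field-level differences between two JSON values.
    Returns (only_in_source, only_in_target, changed) as lists of dotted paths."""
    only_src, only_tgt, changed = [], [], []
    if isinstance(src, dict) and isinstance(tgt, dict):
        for key in sorted(set(src) | set(tgt)):
            path = f"{prefix}.{key}" if prefix else key
            if key not in tgt:
                only_src.append(path)
            elif key not in src:
                only_tgt.append(path)
            else:
                s, t, c = diff_json(src[key], tgt[key], path)
                only_src += s; only_tgt += t; changed += c
    elif src != tgt:  # arrays and scalars compared as whole values
        changed.append(prefix)
    return only_src, only_tgt, changed
```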
Press Escape to close the drill-down and return to the endpoint table. Press Escape again to close the entire comparison view.
Per-Endpoint Statistics
For each endpoint in each session, Ghost calculates comprehensive statistics that power the comparison:
| Statistic | How It’s Calculated |
|---|---|
| Count | Total number of flows matching this endpoint |
| Status code distribution | Map of each status code to its occurrence count (e.g., {200: 45, 404: 3, 500: 2}) |
| Average duration | Mean of all flow durations in milliseconds |
| Min/Max duration | Fastest and slowest response times |
| p95 duration | 95th percentile — 95% of requests were faster than this. Calculated by sorting all durations and picking the value at the 95% position. This is the industry-standard metric for “real-world worst case” performance. |
| Average response size | Mean response body size in bytes |
| Error rate | Fraction of flows with HTTP status 400+ or connection errors (0.0 to 1.0) |
| Flow IDs | Up to 100 flow IDs for drill-down pair matching |
| Sample paths | Up to 5 original (pre-normalization) paths for reference |
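The p95 and error-rate calculations from the table can be sketched as follows. The p95 uses a nearest-rank convention (sort, pick the value at the 95% position); Ghost’s exact indexing at the boundaries may differ, and connection errors are passed as a separate count since they have no HTTP status:

```python
def p95(durations_ms: list[float]) -> float:
    """95th percentile: sort all durations, pick the value at the 95% position."""
    if not durations_ms:
        return 0.0
    ordered = sorted(durations_ms)
    index = min(len(ordered) * 95 // 100, len(ordered) - 1)  # nearest-rank style
    return ordered[index]

def error_rate(status_codes: list[int], connection_errors: int = 0) -> float:
    """Fraction of flows with HTTP status 400+ or connection errors (0.0 to 1.0)."""
    total = len(status_codes) + connection_errors
    if total == 0:
        return 0.0
    failures = sum(1 for s in status_codes if s >= 400) + connection_errors
    return failures / total
```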