How it works

A pipeline that reasons like an attacker.

HELIX doesn't run a checklist. It runs a loop: form a hypothesis, execute a real tool, observe what came back, and re-decide what to try next. Underneath sit a search-driven planner and a six-layer guardrail engine that keep every move in scope and in bounds.

The playbook

Six stages, recon to verification

Each engagement moves through the same disciplined arc a senior operator would follow by hand, only autonomously.

Discover

Map the target: enumerate its surface, endpoints, parameters and entry points. Every later move depends on this recon.

Understand

Build a model of how the target behaves: auth flows, roles, data shapes and business logic. This is the context the planner reasons over.

Exploit

Test hypotheses with real offensive tools. The agent decides the approach; sqlmap, nuclei, Frida and the rest do the work.

Chain

Combine individual weaknesses into a higher-impact path within the engagement, turning a low-severity foothold into a real one.

Prove

Capture a copy-pasteable reproducer, assign CVSS and CWE, and write language-specific remediation. No proof, no finding.

Verify

A Skeptic agent refutes anything unproven and the correlator dedupes across agents, so only confirmed, distinct issues ship.
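
As a rough illustration of the Prove and Verify stages, the sketch below drops any finding without a reproducer and then keeps one finding per weakness and endpoint. The `Finding` fields, the `verify` function and the choice of (CWE, endpoint) as the dedupe key are assumptions made for this example, not HELIX's actual data model or correlator logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    agent: str                 # which agent reported it (hypothetical field)
    cwe: str                   # e.g. "CWE-89"
    endpoint: str              # e.g. "/auth"
    reproducer: Optional[str]  # copy-pasteable proof, or None if unproven


def verify(findings: list[Finding]) -> list[Finding]:
    """Drop unproven findings, then keep one finding per (CWE, endpoint)."""
    confirmed: dict[tuple[str, str], Finding] = {}
    for finding in findings:
        if not finding.reproducer:  # "No proof, no finding"
            continue
        # Assumed dedupe key; the first agent to report a weakness wins.
        confirmed.setdefault((finding.cwe, finding.endpoint), finding)
    return list(confirmed.values())
```

Two agents reporting the same SQL injection on `/auth` would collapse into one finding here, and an XSS report with no reproducer would be discarded.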

The reasoning engine

Monte-Carlo Tree Search, not a mega-prompt

A stateful MCTS planner drives the whole engagement. It hypothesizes candidate moves, executes the most promising one for real, observes the result, scores it, and re-decides. It uses UCB1 to balance exploiting a promising lead against exploring new ground, and prunes branches that fail, so it never bangs on the same closed door twice. Deduce there's a WAF in the way? Change strategy.

UCB1 · branch pruning · stateful · multi-agent blackboard

hypoth login form may allow SQLi
execute sqlmap --level 3 /auth
observe 403, WAF signature detected
decide prune branch · pivot

hypoth JSON body bypasses WAF rule
execute tamper via content-type
observe time-based delay confirmed
decide promote · capture reproducer
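
To make the explore-versus-exploit trade-off concrete, here is a minimal sketch of textbook UCB1 selection over candidate branches, with pruned branches skipped. The `Branch` class, the reward scale and the exploration constant are assumptions for illustration; HELIX's planner may score and prune quite differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    visits: int = 0
    total_reward: float = 0.0
    pruned: bool = False  # set once a branch has failed for good


def ucb1(branch: Branch, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1: mean reward plus an exploration bonus."""
    if branch.visits == 0:
        return math.inf  # untried branches are always worth one look
    mean = branch.total_reward / branch.visits
    bonus = c * math.sqrt(math.log(max(parent_visits, 1)) / branch.visits)
    return mean + bonus


def select(branches: list[Branch]) -> Branch:
    """Pick the live branch with the highest UCB1 score."""
    live = [b for b in branches if not b.pruned]
    if not live:
        raise ValueError("every branch has been pruned")
    parent_visits = sum(b.visits for b in branches)
    return max(live, key=lambda b: ucb1(b, parent_visits))
```

In the trace above, the blocked `sqlmap` attempt would be marked `pruned`, so `select` would move on to the untried JSON-body hypothesis instead of retrying the closed door.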

The guardrail engine

Six layers on every tool call

Autonomy without recklessness. Every single tool call passes through all six layers, in order, before anything touches your target. See the full controls on the Security page.

Layer 1

Scan mode

Passive, safe or full: you set the aggressiveness per engagement, so HELIX never pushes harder than you authorized.

Layer 2

Scope respect

A hard in-scope allow-list. Anything outside the boundary you defined is rejected before it can run.

Layer 3

Destructive-action blocking

A pattern detector stops data-destroying and service-saturating actions before they execute.

Layer 4

Budget cap

A hard LLM-spend ceiling per engagement. The operator stays within a cost bound you control.

Layer 5

Rate limiting

Request pacing that keeps engagements from degrading availability. Pressure, not a flood.

Layer 6

Human-in-the-loop

Production targets require explicit approval gates. Staging first, then prod, with a human in the seat.
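
One way to picture six checks applied in order to every tool call is a short chain in which the first failing layer rejects the call. Everything in this sketch is hypothetical: the `ToolCall` and `Policy` fields, the three destructive patterns and the reading of "passive" as "no intrusive calls" are assumptions for illustration, not HELIX's implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class ToolCall:
    tool: str
    target_url: str
    command: str
    intrusive: bool = False  # hypothetical flag: does this send attack traffic?


@dataclass
class Policy:
    scan_mode: str            # "passive", "safe" or "full"
    allowed_hosts: frozenset  # in-scope allow-list
    budget_remaining_usd: float
    requests_this_second: int
    max_requests_per_second: int
    is_production: bool
    human_approved: bool = False


# Illustrative patterns only; a real detector would be far broader.
DESTRUCTIVE = re.compile(r"drop\s+table|truncate\s+table|rm\s+-rf", re.IGNORECASE)


def scan_mode(call: ToolCall, policy: Policy) -> Optional[str]:
    if policy.scan_mode == "passive" and call.intrusive:
        return "scan mode: intrusive call in passive mode"
    return None


def scope(call: ToolCall, policy: Policy) -> Optional[str]:
    host = urlparse(call.target_url).hostname
    if host not in policy.allowed_hosts:
        return f"scope: {host} is not in the allow-list"
    return None


def destructive(call: ToolCall, policy: Policy) -> Optional[str]:
    if DESTRUCTIVE.search(call.command):
        return "destructive: command matches a blocked pattern"
    return None


def budget(call: ToolCall, policy: Policy) -> Optional[str]:
    if policy.budget_remaining_usd <= 0:
        return "budget: LLM-spend ceiling reached"
    return None


def rate_limit(call: ToolCall, policy: Policy) -> Optional[str]:
    if policy.requests_this_second >= policy.max_requests_per_second:
        return "rate: request pacing limit reached"
    return None


def human_gate(call: ToolCall, policy: Policy) -> Optional[str]:
    if policy.is_production and not policy.human_approved:
        return "human: production target needs explicit approval"
    return None


# Layers 1 to 6, evaluated in this order on every call.
LAYERS = [scan_mode, scope, destructive, budget, rate_limit, human_gate]


def evaluate(call: ToolCall, policy: Policy) -> Optional[str]:
    """Return the first rejection reason, or None if all six layers pass."""
    for layer in LAYERS:
        reason = layer(call, policy)
        if reason is not None:
            return reason
    return None
```

Returning the first rejection reason keeps the decision auditable: each blocked call can be logged with the layer that stopped it.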

Continuous

Posture that doesn't go stale between engagements

Schedule re-scans on an interval and HELIX produces run-over-run diffs (what's new, what's resolved, what regressed), so you watch posture move rather than reading a one-off snapshot. You can also trigger an engagement straight from your CI pipeline via the public API, on whatever event matters to you.

scheduled re-scans · run diffs · CI via API

run #41 → run #42 diff
+ new 2 BOLA on /v2/invoices
- resolved 5 XSS on /search
~ regressed 1 auth bypass returned
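
A run-over-run diff like the one above can be sketched with plain set arithmetic. This example assumes each finding has a stable string ID and that a "regression" means a finding resolved in some earlier run that has reappeared. Both the IDs and that definition are assumptions for illustration, not HELIX's documented behavior.

```python
def diff_runs(
    previous: set[str],
    current: set[str],
    resolved_history: set[str],
) -> dict[str, set[str]]:
    """Classify findings between two runs.

    previous / current: finding IDs present in each run.
    resolved_history: IDs that were marked resolved in any earlier run.
    """
    appeared = current - previous
    return {
        "new": appeared - resolved_history,        # never seen before
        "resolved": previous - current,            # gone since the last run
        "regressed": appeared & resolved_history,  # fixed once, now back
    }
```

A finding that is present in both runs falls into none of the three buckets, which matches a diff that only reports movement.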

trigger: scheduled · interval 24h
also available: POST /v1/engagements
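
From a CI job, triggering an engagement might look something like the sketch below, which only builds the request and does not send it. Only the `POST /v1/engagements` path comes from this page. The base URL, the bearer-token header and the JSON fields (`target`, `scan_mode`) are placeholders invented for this example, so check the API documentation for the real contract.

```python
import json
import urllib.request


def build_engagement_request(
    base_url: str, token: str, target: str
) -> urllib.request.Request:
    """Build (but do not send) a request to start an engagement."""
    # Hypothetical payload: these field names are assumptions, not the real schema.
    payload = {"target": target, "scan_mode": "safe"}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/engagements",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )
```

A CI step would pass the result to `urllib.request.urlopen(request)` and fail the job on a non-2xx response, reading the token from a secret rather than hard-coding it.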

Watch the loop close on your target

Recon to reproducible proof, with guardrails you control.

Request demo · Meet the agents