Built by an operator, for operators.
HELIX is the autonomous offensive-security operator. It runs a full engagement end to end (recon, exploitation, chaining, reporting) and hands back a triaged report with proof. It was built from inside the problem it solves.
Offensive security is bottlenecked by human throughput.
One skilled pentester runs one engagement at a time. The work is deep, manual, and serial: recon, then a hypothesis, then a tool, then a re-read of the response, then the next hypothesis. It does not parallelize, and it does not scale with the attack surface.
So every security team is structurally understaffed against the surface it owns. A traditional assessment costs roughly USD 5K–15K, takes one to three weeks, and happens once or twice a year. Between engagements, dozens of deploys ship untested. The gap is not a lack of talent; it is a lack of throughput.
One operator, one engagement, one week
The skill exists. The hours do not. A single pentester cannot watch every endpoint, every spec change, and every new deploy across the surface a modern team ships. The work that protects you is exactly the work that does not fit in a calendar.
LLMs can finally reason about attack chains.
For years, models could generate a payload but not run an engagement: they could write a snippet, but not decide what to try next, observe the response, and re-plan. That has changed. A model can now hold an offensive hypothesis, reason about why a door is closed, and pick a different door.
That is the moment the throughput gap can close. HELIX's engine is a stateful Monte-Carlo Tree Search planner that drives 40+ specialized agents over a shared blackboard, wrapped in a six-layer guardrail engine. That combination is what makes it safe to point that reasoning at real systems instead of a sandbox demo.
Reason, execute, observe, re-decide
An MCTS/UCB1 planner proposes candidate moves, executes the most promising one with real tools, scores the result, and prunes branches that fail, so it never bangs on the same closed door. Guardrails sit on every tool call, in order, before anything touches your systems.
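As a rough illustration of the select, score, and prune loop described above, here is a minimal Python sketch of textbook UCB1 selection over candidate moves. It is not HELIX's planner: the `Move` class, the `record` helper, and the "prune after three zero-reward attempts" rule are assumptions made for the example, and a full MCTS would also expand and back up through a tree.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Move:
    """One candidate action in the search (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    pruned: bool = False


def ucb1(move: Move, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if move.visits == 0:
        return math.inf  # untried moves are explored first
    mean = move.total_reward / move.visits
    return mean + c * math.sqrt(math.log(parent_visits) / move.visits)


def select(moves: List[Move]) -> Optional[Move]:
    """Return the live move with the highest UCB1 score, or None if all are pruned."""
    live = [m for m in moves if not m.pruned]
    if not live:
        return None
    parent_visits = sum(m.visits for m in live) or 1
    return max(live, key=lambda m: ucb1(m, parent_visits))


def record(move: Move, reward: float, prune_after: int = 3) -> None:
    """Score a result; prune a branch that keeps failing (assumed rule)."""
    move.visits += 1
    move.total_reward += reward
    if move.visits >= prune_after and move.total_reward == 0.0:
        move.pruned = True
```

In this sketch, a move that returns zero reward three times is marked `pruned` and `select` never proposes it again, which is one simple way to avoid retrying a closed door.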
Built from the inside of the problem.
HELIX is built by Cristian, an offensive-security team lead based in Buenos Aires, Argentina. He runs these engagements by hand, professionally: the recon, the hypotheses, the dead ends, the chains, the write-ups. HELIX is not an outsider's idea of what a pentest looks like. It is the operator's own workflow, encoded.
Most security tooling is built by people who have never sat through the eighth hour of a manual access-control review. The decisions HELIX makes (to prune a branch, to demand a reproducer before it believes a finding, to ask for human approval before touching production) are the decisions a good operator already makes. The difference is that HELIX can make them in parallel, continuously, across a surface no single person could cover.
What we will not compromise on.
Evidence over alerts
A scanner says what might be wrong. HELIX proves what is. Every finding ships with a copy-pasteable reproducer, a CVSS score, a CWE, and remediation. A Skeptic agent refutes anything that lacks a runtime witness. If we cannot reproduce it, we do not report it.
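To make the "no witness, no report" rule concrete, a finding can be modeled as a record that is only reportable once its reproducer has been re-run successfully. The sketch below is illustrative only: the `Finding` fields and the `skeptic_filter` function are hypothetical names, not HELIX's schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    """Hypothetical finding record; field names are illustrative."""
    title: str
    cvss: float                        # CVSS base score, 0.0 to 10.0
    cwe: str                           # CWE identifier, e.g. "CWE-639"
    remediation: str
    reproducer: Optional[str] = None   # copy-pasteable command or request
    witnessed: bool = False            # set once the reproducer was re-run successfully


def skeptic_filter(findings: List[Finding]) -> List[Finding]:
    """Keep only findings that carry both a reproducer and a runtime witness."""
    return [f for f in findings if f.reproducer and f.witnessed]
```

Under these assumptions, a finding with a score and a CWE but no witnessed reproducer is dropped before it reaches the report.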
Autonomy with guardrails
Full autonomy in the middle, hard constraints at the edges. Every tool call passes through six guardrail layers in order: scan mode, scope, destructive-action blocking, budget cap, rate limiting, and human-in-the-loop. Production targets require explicit approval. Safety is a design constraint, not a setting.
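One way to picture an ordered guardrail chain is as a list of checks that each either pass a tool call or block it with a reason, with the first failing layer deciding the outcome. The Python sketch below follows the six layers named above, but every name in it (`ToolCall`, `Policy`, the individual check functions, the policy fields) is an assumption for illustration, not HELIX's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ToolCall:
    """Hypothetical description of one tool invocation."""
    tool: str
    target: str
    destructive: bool = False
    cost: float = 0.0
    production: bool = False
    approved: bool = False


@dataclass
class Policy:
    """Hypothetical engagement policy; all fields are assumptions."""
    passive_only: bool
    passive_tools: Tuple[str, ...]
    scope: Tuple[str, ...]
    budget_left: float
    calls_left_in_window: int


# Each layer returns None to pass, or a reason string to block.
Layer = Callable[[ToolCall, Policy], Optional[str]]


def scan_mode(c: ToolCall, p: Policy) -> Optional[str]:
    if p.passive_only and c.tool not in p.passive_tools:
        return "scan mode: tool not allowed in passive mode"
    return None


def scope(c: ToolCall, p: Policy) -> Optional[str]:
    return None if c.target in p.scope else "scope: target not in scope"


def destructive(c: ToolCall, p: Policy) -> Optional[str]:
    return "destructive: action blocked" if c.destructive else None


def budget(c: ToolCall, p: Policy) -> Optional[str]:
    return "budget: cap exceeded" if c.cost > p.budget_left else None


def rate_limit(c: ToolCall, p: Policy) -> Optional[str]:
    return "rate: limit reached" if p.calls_left_in_window <= 0 else None


def human_in_the_loop(c: ToolCall, p: Policy) -> Optional[str]:
    if c.production and not c.approved:
        return "human: production target needs explicit approval"
    return None


# Order matters: the layers run exactly in this sequence.
LAYERS: List[Layer] = [scan_mode, scope, destructive, budget, rate_limit, human_in_the_loop]


def check(call: ToolCall, policy: Policy) -> Optional[str]:
    """Run every layer in order; return the first block reason, or None if allowed."""
    for layer in LAYERS:
        reason = layer(call, policy)
        if reason is not None:
            return reason
    return None
```

In this sketch a call to an out-of-scope target is rejected at the scope layer before the destructive-action, budget, or approval checks are ever consulted, so the cheaper, stricter checks sit in front of the ones that involve spend or a human.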
Operator, not assistant
HELIX does not autocomplete your pentest; it runs it. You set the scope and review the output; everything in between is autonomous. It is not a copilot waiting for the next prompt. It is an operator that owns the engagement from recon to triaged report.
Honest about the stage.
HELIX is early. We are pre-launch, and we would rather tell you that plainly than dress it up.
The engine is real and running. We validate it the way an operator validates anything: on public benchmarks, including the deliberately vulnerable OWASP Juice Shop application, and on authorized bug-bounty scope where we have permission to test live systems. Findings are reproducible and we do not cherry-pick the wins.
We are now onboarding three to five design partners: teams that want continuous offensive coverage and are willing to shape the product with us while it is still early enough to do that. If that is you, the next step is below.
Become a design partner.
Help shape an autonomous operator while it is still early. We are onboarding three to five teams.