Sondera Compiles Natural-Language Rules into Provable Control Over AI Agent Actions

PR Newswire

NEW YORK, June 30, 2026

Sondera's autoformalization research was accepted at ICML 2026's Agents in the Wild workshop and at FLoC 2026's LLM-Solve workshop, with a related tool demo at Black Hat Arsenal. Sondera transforms natural-language policy into formally verified rules that run on any agent.

NEW YORK, June 30, 2026 /PRNewswire/ -- Every organization runs on rules: security, compliance, and business logic written in natural language. As AI agents take on real work, those rules have to bind them too. Sondera today announced that its research on compiling natural-language policy into formally verified controls for AI agents has been accepted at the Agents in the Wild workshop at ICML 2026 and at the LLM-Solve workshop at the Federated Logic Conference (FLoC) 2026, and selected for a tool demonstration at Black Hat Arsenal.

The paper, "Autoformalization of Agent Instructions into Policy-as-Code," by Sondera's Adam Mondl, Matthew Maisel, and John Brock, was accepted at the Second Workshop on Agents in the Wild at ICML 2026, one of the leading machine learning conferences, and at LLMSolve, a workshop at FLoC on applying large language models to formal and solver-based reasoning. A related tool, "GolemHalt: A Deterministic Reference Monitor for AI Coding Agents," will be demonstrated at Black Hat Arsenal, Black Hat's curated showcase of open-source security tools.

In peer-reviewed research using MedAgentBench, an independent benchmark for clinical AI agents published in NEJM AI, Sondera's pipeline autoformalized more of the 88-rule policy than prior published work had hand-coded (23 of 88), and the resulting rules blocked every adversarial unsafe write attempt (99 of 99).

Autoformalization is critical for enforcing agent policies because real policy and business logic live in natural language, in documents like a HIPAA manual, FINRA guidelines, a standard operating procedure, or an agent's own instructions. Traditionally, turning natural-language policies into something a machine can enforce has meant hand-coding rules one at a time, which cannot keep pace with how fast agents are being deployed.

Sondera's pipeline reads the natural language and compiles it directly into formally verified Cedar policy-as-code, with a theorem prover checking every rule and adversarial simulation stress-testing it before production, both to find edge cases and to confirm that legitimate work is still permitted. The approach is neurosymbolic: neural classifiers such as LLMs-as-judges evaluate what an agent is doing probabilistically, while symbolic rules decide deterministically what the agent is allowed to do.
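The adversarial simulation step described above can be pictured with a small Python sketch. It is illustrative only and is not Sondera's pipeline: the `Case` type, the `stress_test` function, the `requires_order_id` rule, and the three sample cases are all hypothetical, and the theorem-prover check is not modeled. The sketch only shows the two questions such a test asks of a rule: were unsafe attempts blocked, and was legitimate work still permitted?

```python
from typing import Callable, Iterable, NamedTuple


class Case(NamedTuple):
    """One simulated agent action plus its ground-truth label (hypothetical)."""
    action: dict
    unsafe: bool


def stress_test(rule: Callable[[dict], bool], cases: Iterable[Case]) -> dict:
    """Count blocked unsafe actions and permitted legitimate actions.

    `rule` returns True when the action is permitted.
    """
    counts = {"unsafe_blocked": 0, "unsafe_total": 0,
              "legit_allowed": 0, "legit_total": 0}
    for case in cases:
        permitted = rule(case.action)
        if case.unsafe:
            counts["unsafe_total"] += 1
            counts["unsafe_blocked"] += int(not permitted)
        else:
            counts["legit_total"] += 1
            counts["legit_allowed"] += int(permitted)
    return counts


def requires_order_id(action: dict) -> bool:
    """Hypothetical rule: a write is permitted only if it carries an order ID."""
    return action["op"] != "write" or action.get("order_id") is not None


# Made-up cases for illustration; they are not benchmark data.
cases = [
    Case({"op": "write"}, unsafe=True),
    Case({"op": "write", "order_id": "A1"}, unsafe=False),
    Case({"op": "read"}, unsafe=False),
]

report = stress_test(requires_order_id, cases)
print(report)
```

A rule that passes this kind of check blocks every case labeled unsafe while leaving every legitimate case permitted; a rule that is too broad would show up as a shortfall in the `legit_allowed` count.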

At runtime, a verified deterministic rule returns a decision (such as allow, deny, or escalate) for each agent action. Because the policy-as-code is enforced deterministically outside the context window, rather than by an AI-as-judge, prompt injection, emergent behavior, and model drift cannot override the policy enforcement layer. And because enforcement is stateful, tracking the agent's full trajectory and the flow of information across it, the same action can be allowed or denied depending on what the agent did earlier in the run.
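As a rough illustration of stateful, deterministic enforcement, the following Python sketch shows a monitor whose answer for the same action changes with what happened earlier in the run. It is a sketch under stated assumptions, not Sondera's implementation and not Cedar: the `Monitor` class, the tool names, and the `sensitive` flag (assumed to be supplied by an upstream classifier) are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Action:
    tool: str                 # hypothetical tool name, e.g. "read_file"
    target: str               # hypothetical resource the tool touches
    sensitive: bool = False   # assumed to come from an upstream classifier


@dataclass
class Monitor:
    """Hypothetical stateful reference monitor that runs outside the model."""
    tainted: bool = False                       # has sensitive data been read?
    history: list = field(default_factory=list)

    def decide(self, action: Action) -> Decision:
        if action.tool == "read_file":
            if action.sensitive:
                self.tainted = True             # remember the information flow
            decision = Decision.ALLOW
        elif action.tool == "http_post":
            # Same action, different answer depending on the trajectory so far.
            decision = Decision.DENY if self.tainted else Decision.ALLOW
        elif action.tool == "delete_file":
            decision = Decision.ESCALATE        # hand off to a human reviewer
        else:
            decision = Decision.DENY            # default-deny unknown tools
        self.history.append((action, decision))
        return decision


monitor = Monitor()
post = Action("http_post", "https://example.com/upload")

first = monitor.decide(post)                    # nothing sensitive read yet
monitor.decide(Action("read_file", "payroll.csv", sensitive=True))
second = monitor.decide(post)                   # identical action, later in the run

print(first.value, second.value)
```

The decision here depends only on the action and the recorded state, never on text in the model's context window, which is why the same inputs always produce the same result.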

"The agent incidents we see today aren't from prompt injection and hijacking. They're from authorized humans asking authorized agents to do legitimate tasks, like analyzing a financial file or configuring a server. Along the way, the agent reaches the goal with unintended behavior, like leaking or destroying data," said Josh Devon, co-founder and CEO of Sondera. "Even beyond that, enterprises are struggling to apply business logic at scale to their agents, rules that aren't security or compliance but just standard operating procedure, like a coding agent that should stand up a server on the company's approved cloud account rather than an unapproved vendor. Our research is focused on letting organizations turn on the most powerful, long-running agents possible while having the confidence that they will follow the rules."

Sondera's policy and agent control plane is currently in private beta. Teams interested in participating can visit sondera.ai.

The full paper "Autoformalization of Agent Instructions into Policy-as-Code" is available at arxiv.org/abs/2606.26649.

Sondera's open-source harness and SDKs are at github.com/sondera-ai.

About Sondera

Sondera gives enterprises provable control over AI agents, in natural language. Its approach is neurosymbolic: neural classifiers detect what an agent is doing, while symbolic rules decide what it is allowed to do. Through a process called autoformalization, Sondera compiles an organization's natural-language rules into formally verified Cedar policy-as-code, checked by a theorem prover and enforced on every action an agent takes, on any agent runtime or harness. Because symbolic enforcement runs outside the model and across the full run, prompt injection and drift can't bypass it, and the same action can be allowed or denied based on what the agent did before it.

Teams use Sondera to apply complex business logic, security, and compliance rules to coding agents, the agents they build, and the agents they sell. The open-source harness is available at github.com/sondera-ai. Learn more at sondera.ai.

Contacts
Media Contact
pr@sondera.ai

View original content to download multimedia: https://www.prnewswire.com/news-releases/sondera-compiles-natural-language-rules-into-provable-control-over-ai-agent-actions-302814775.html

SOURCE Sondera, Inc.