Motivation
We all want outcomes.
Agents that work on our behalf — reliable co-workers — while we’re out on a hike.
The bottleneck is not intelligence. It’s reliability. It’s trust.
- One day — my agents build me a full SaaS app from a single prompt.
- The next day — they empty the entire contents of my Solana wallet.
Thesis
Today’s agents are mismanaged geniuses
The intelligence is there.
The missing layer is how we specify, manage, reuse, and verify the work.
Recursive Language Models
Context itself is the object of computation
- Externalize — the full prompt lives in a REPL, not the context window.
- Operate — the model writes code to inspect, slice, and transform it.
- Recurse — it sub-queries itself over the slices.
Recursive Language Models, arXiv:2512.24601
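The three steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's reference implementation: `run_rlm`, `sub_query`, and `toy_model` are hypothetical names, and the "model" is a stub that always returns the same small program. The point is only the shape: the model sees metadata about the prompt, the code it writes operates on the prompt as a variable, and that code can call the same procedure on a slice.

```python
from typing import Callable

# Hypothetical stand-in for code a model might write. It counts lines that
# contain "ERROR", splitting the prompt in half whenever it has more than 2 lines.
TOY_PROGRAM = r'''
lines = prompt.splitlines()
if len(lines) <= 2:
    answer = sum("ERROR" in line for line in lines)
else:
    mid = len(lines) // 2
    answer = sub_query("\n".join(lines[:mid])) + sub_query("\n".join(lines[mid:]))
'''


def toy_model(message: str) -> str:
    """Stub model (assumption): ignores the message and returns a fixed program."""
    return TOY_PROGRAM


def run_rlm(prompt: str, model: Callable[[str], str],
            depth: int = 0, max_depth: int = 8):
    """Run one RLM step; the model never receives `prompt` itself."""
    if depth > max_depth:
        raise RecursionError("RLM recursion limit reached")

    # Recurse: model-written code may call the same procedure on a slice.
    def sub_query(slice_text: str):
        return run_rlm(slice_text, model, depth + 1, max_depth)

    # Externalize: the prompt is a variable in the environment.
    env = {"prompt": prompt, "sub_query": sub_query, "answer": None}

    # Operate: the model is told only the prompt's size and returns code.
    code = model(f"`prompt` has {len(prompt)} chars; write code that sets `answer`.")
    exec(code, env)  # a real system would sandbox this
    return env["answer"]


log = "\n".join(["ok", "ERROR a", "ok", "ERROR b", "ok", "ok", "ERROR c"])
print(run_rlm(log, toy_model))  # → 3
```

Swapping `toy_model` for a real model call is where the hard parts begin (sandboxing, budgets, malformed code), none of which this sketch attempts.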
Code Execution As Reasoning
- Can process inputs way beyond the context window. (RLM arXiv paper)
- RLM is itself a powerful memory system. (LongMemEval results)
- RLM can achieve SOTA on long reasoning tasks, even with very small models. (LongCoT results)

The RLM rubric
Lots of things feel close.
| Approach | Executable environment | Prompt externalized | Code calls the model | Model picks decomposition | State stays symbolic |
|---|---|---|---|---|---|
| Plain long-context call (RAG / reasoning-only) | | | | | |
| Subagents (verbal delegation) | | | | | |
| Coding agent + bash (CodeAct-style, one session) | | | | | |
| Agentic loops (Ralph + the 2026 'loop engineering' wave) | | | | | |
| Hardcoded map-reduce (developer-authored pipeline, e.g. λ-RLM) | | | | | |
| Recursive Language Model (passes every check) | | | | | |
Only an RLM checks every operational box here.
Towards Recursive Coding Agents
Either... Trick question: RLMs are Recursive Coding Agents.
Or... How can we apply the principles of RLM to coding agents?
My Experiments
Finding ypi
Built on Pi (minimal, extensible). Previously, Pi extensions could not support recursion, so I forked it. The Y is for the Y combinator.
- Wrapper CLI: ypi, a fully recursive Pi agent.
- Pi Extension: pi-recursive, which makes any existing Pi config recursive.
The RLM ecosystem
Other notable projects
- alexzhang13/rlm (Python): The reference implementation from Alex Zhang and the RLM paper authors; the cleanest place to read the core recursion loop.
- stanfordnlp/dspy (Python): dspy.RLM exposes RLM as a composable module inside larger DSPy programs; it is what I use for most of my own experiments.
- ax-llm/ax (TypeScript): A TypeScript, DSPy-style framework with first-class RLM support: AxAgent recursive decomposition and a persistent JS runtime.
- openprose/unix-rlm (Shell): An RLM whose sandbox is a full Linux filesystem: one bash script, the whole computer as the environment.
- openprose/prose (TypeScript): A declarative markdown language the agent compiles into reliable, RLM-style execution.
Dynamic workflows made Claude Code recursive.
Claude can write an orchestration script, then run a fleet of subagents. The line is whether the model chooses the decomposition, or the script fixes it ahead of time.
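The line drawn above can be made concrete with a small Python sketch. Everything here is hypothetical: `fixed_fanout`, `model_fanout`, `toy_planner`, and `toy_subagent` are illustrative names, not Claude Code APIs, and the planner is a stub that groups files by top-level directory where a real system would ask the model.

```python
from typing import Callable, Sequence

# Hypothetical signatures for illustration only.
Subagent = Callable[[str, Sequence[str]], str]
Planner = Callable[[str, Sequence[str]], list[list[str]]]


def fixed_fanout(task: str, files: Sequence[str], subagent: Subagent) -> list[str]:
    """Script fixes the decomposition ahead of time: one subagent per file."""
    return [subagent(task, [path]) for path in files]


def model_fanout(task: str, files: Sequence[str],
                 planner: Planner, subagent: Subagent) -> list[str]:
    """Model chooses the decomposition: the planner decides the grouping."""
    return [subagent(task, group) for group in planner(task, files)]


def toy_planner(task: str, files: Sequence[str]) -> list[list[str]]:
    """Stub planner (assumption): group files by their top-level directory."""
    groups: dict[str, list[str]] = {}
    for path in files:
        groups.setdefault(path.split("/")[0], []).append(path)
    return list(groups.values())


def toy_subagent(task: str, files: Sequence[str]) -> str:
    """Stub subagent (assumption): just reports what it was handed."""
    return f"{task}: {', '.join(files)}"


files = ["api/a.py", "api/b.py", "ui/c.ts"]
print(fixed_fanout("review", files, toy_subagent))
print(model_fanout("review", files, toy_planner, toy_subagent))
```

Both functions launch subagents; the only difference is who decides the shape of the work, which is the property the rubric calls "model picks decomposition".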
Claude Code blog · dynamic workflows

For (almost) any coding agent
A language compiled by the agent, not the computer.
A markdown spec plus a giant prompt, in logical English. No new syntax to learn.
The key: a declarative contract the agent must satisfy to be “done.” That’s the answer to slide 2 — trust.
Any agent with a filesystem and subagents can run it — and behave like an RLM.
See “Stop Babysitting Agents, Start Authoring Outcomes” on Turing Post.
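One way to picture a declarative "done" contract is as a set of named checks that must all hold before the agent is allowed to stop. The Python below is a rough sketch of that idea only. It is not OpenProse syntax (OpenProse contracts are written in markdown), and `unmet`, `run_until_done`, and `toy_step` are hypothetical names.

```python
from typing import Callable

Check = Callable[[dict], bool]


def unmet(contract: dict[str, Check], state: dict) -> list[str]:
    """Return the names of contract clauses the current state does not satisfy."""
    return [name for name, check in contract.items() if not check(state)]


def run_until_done(contract: dict[str, Check],
                   agent_step: Callable[[dict, list[str]], dict],
                   state: dict, max_steps: int = 5) -> tuple[bool, dict]:
    """Keep stepping the agent until every clause holds or the budget runs out."""
    for _ in range(max_steps):
        missing = unmet(contract, state)
        if not missing:
            return True, state
        state = agent_step(state, missing)
    return not unmet(contract, state), state


# Hypothetical contract: both clauses must hold for the work to count as done.
contract = {
    "tests_pass": lambda s: s.get("tests") == "pass",
    "readme_exists": lambda s: "README.md" in s.get("files", []),
}


def toy_step(state: dict, missing: list[str]) -> dict:
    """Stub agent (assumption): fixes the first unmet clause per step."""
    new_state = dict(state)
    if missing[0] == "tests_pass":
        new_state["tests"] = "pass"
    elif missing[0] == "readme_exists":
        new_state["files"] = list(state.get("files", [])) + ["README.md"]
    return new_state


done, final_state = run_until_done(contract, toy_step, {})
print(done, final_state)
```

The agent does not get to declare itself finished; the checks do, and an exhausted budget returns `False` instead of a silent success.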
OpenProse explicitly declares subagent work
Here are two OpenProse examples where the model turns an external handle into smaller handles, then verifies the child-work trace.
- Starts from an external prompt_handle; root does not read the whole thing.
- The model decides terminal vs. nonterminal handle.
- Nonterminal handles produce child handles and call the same contract again.
    if nonterminal:
        for child in manifest:
            recurse(child.path)

Directory handle slicer (directory-handle-slicer.prose.md)
- Starts from a repo or directory handle, not copied root context.
- The model uses search to choose relevant file handles for the question.
- Workers inspect only assigned handles; aggregation cites worker evidence.
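The three properties in this list can be approximated in plain Python. This is a sketch under loud assumptions, not the contents of directory-handle-slicer.prose.md: the directory is an in-memory dict, `toy_choose` is a keyword match over file names standing in for the model's search, and `answer_over_directory`, `toy_worker`, and the report fields `source` and `finding` are hypothetical.

```python
from typing import Callable


def answer_over_directory(question: str, directory: dict[str, str],
                          choose: Callable[[str, list[str]], list[str]],
                          worker: Callable[[str, str, str], dict]) -> list[str]:
    """Slice a directory handle into file handles, run workers, check provenance."""
    # The chooser sees handle names only, never the file contents.
    manifest = choose(question, sorted(directory))
    reports = []
    for path in manifest:
        # Each worker receives exactly one assigned handle.
        report = worker(question, path, directory[path])
        # Provenance check: evidence must come from the assigned handle.
        if report["source"] != path:
            raise ValueError(f"worker for {path} cited {report['source']}")
        reports.append(report)
    # Aggregation cites worker evidence instead of re-reading files.
    return [f"{r['source']}: {r['finding']}" for r in reports]


def toy_choose(question: str, paths: list[str]) -> list[str]:
    """Stub search (assumption): keep paths whose name contains the question."""
    return [p for p in paths if question in p]


def toy_worker(question: str, path: str, content: str) -> dict:
    """Stub worker (assumption): reports the first line of its one file."""
    return {"source": path, "finding": content.splitlines()[0]}


repo = {
    "src/auth.py": "def login(): ...\n",
    "src/db.py": "def connect(): ...\n",
    "docs/auth.md": "Login flow\n",
}
print(answer_over_directory("auth", repo, toy_choose, toy_worker))
```

A worker that reports a source other than its assigned handle makes the whole run fail, which is the cheap version of "validate worker provenance".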
    manifest = model_slice(directory)
    for child in manifest:
        worker(child.path only)
    validate worker provenance

Recursive Coding Agents FTW
Trust is reliability — the next step is behavioral, not more model intelligence.
A new paradigm of inference-time compute — RLMs are the new reasoning models → recursive coding agents are the new coding agents.
Coding agents can be RLMs — Claude Code dynamic workflows and OpenProse show two concrete paths.
Until Next Time...
Always Recurse Responsibly
Raymond Weitekamp
Presentation at recursivecodingagents.com | Companion GitHub repo

