Recursive language models (RLMs) have demonstrated that recursion over model calls is a potent technique for long-context reasoning. Recent developments in production coding agents, particularly Anthropic's dynamic workflows, have led to the introduction of the Recursive Agent Harness (RAH), which employs a full agent framework equipped with filesystem tools, code execution, and planning capabilities. This study evaluates RAH's performance against existing baselines, revealing that it improved the Codex coding-agent baseline accuracy from 71.75% to 81.36% on the Oolong-Synthetic dataset. Additionally, with the Claude Sonnet 4.5 backbone, RAH achieved an accuracy of 89.77%, highlighting the effectiveness of harness recursion in enhancing coding agent performance.
Recursive Agent Harness Enhances Long-Context Reasoning in AI Coding Agents
More Articles From This Day
MetaX LLC Aims for Hong Kong Listing Amid AI Industry Surge
MetaX LLC, a Chinese chipmaker, is set to pursue a listing on the Hong Kong stock exchange to capitalize on the burgeoning demand within the artificial intelligence sector. The announcement comes during the World Artificial Intelligence Conference (WAIC) held in Shanghai, which runs from July 6 to July 8, 2023. This strategic move reflects the company's efforts to leverage market opportunities in AI technology and related chip manufacturing.
