InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning


Original Author: Yuchen Yan et al.


February 6, 2026


InftyThink+ is a new reinforcement learning framework for iterative reasoning in large models. It optimizes when the model summarizes its reasoning and how it resumes from that summary. Trained in two stages, it improves accuracy by 21% on AIME24, outperforms conventional long chain-of-thought reinforcement learning, generalizes better to unseen benchmarks, and reduces inference latency.

InftyThink+: A Breakthrough in Infinite-Horizon Reasoning via Reinforcement Learning

InftyThink+ is a new end-to-end reinforcement learning framework for infinite-horizon reasoning in large models. It optimizes iterative reasoning to improve accuracy while reducing inference latency.

InftyThink+ reasons iteratively: the model summarizes its intermediate thoughts and resumes reasoning from the summary rather than from the full history. Its reinforcement learning framework optimizes the entire reasoning trajectory, including where the model itself places iteration boundaries and how it writes the explicit summaries.
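The reason-summarize-resume loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `iterative_reason` and `toy_step`, the summary format, and the `max_iters` cap are all assumptions made for illustration, and the reinforcement learning part is not shown. The toy step function stands in for a model call and simply adds one number per iteration, carrying its progress forward only through the summary.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical step signature (an assumption, not from the paper):
# (question, previous_summary) -> (reasoning_segment, new_summary, answer_or_None)
StepFn = Callable[[str, str], Tuple[str, str, Optional[str]]]


def iterative_reason(
    question: str, step: StepFn, max_iters: int = 8
) -> Tuple[Optional[str], List[str]]:
    """Run reasoning in iterations; only the summary is carried forward."""
    summary = ""
    segments: List[str] = []
    for _ in range(max_iters):
        segment, new_summary, answer = step(question, summary)
        segments.append(segment)
        if answer is not None:
            # The step function decided that reasoning is complete.
            return answer, segments
        # Discard the full segment; resume from the summary alone.
        summary = new_summary
    return None, segments


def toy_step(question: str, summary: str) -> Tuple[str, str, Optional[str]]:
    """Toy stand-in for a model call: adds one term of 'a+b+c' per iteration."""
    numbers = [int(tok) for tok in question.split("+")]
    if summary:
        consumed, total = (int(part) for part in summary.split(","))
    else:
        consumed, total = 0, 0
    total += numbers[consumed]
    consumed += 1
    segment = f"added {numbers[consumed - 1]}, running total {total}"
    if consumed == len(numbers):
        return segment, "", str(total)
    # Summary encodes "terms consumed, running total".
    return segment, f"{consumed},{total}", None


answer, segments = iterative_reason("3+4+5", toy_step)
print(answer, len(segments))
```

In this sketch the context passed between iterations never grows beyond the summary string, which is the property that lets iterative reasoning keep per-step context short. In the framework described here, the model itself decides where each iteration ends and what the summary contains.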

In experiments with the DeepSeek-R1-Distill-Qwen-1.5B model, InftyThink+ improves accuracy on the AIME24 benchmark by 21% and surpasses conventional long chain-of-thought reinforcement learning. It also generalizes better to out-of-distribution benchmarks and reduces inference latency, so the gains in accuracy come with better efficiency.
