MiroThinker Technical Reports and Public Resources Compilation

This comprehensive compilation organizes the technical reports and publicly available resources for MiroThinker, an advanced open-source research agent system. Currently, MiroMind has released three major technical reports covering MiroThinker 1.0, MiroThinker 1.7, and MiroFlow.

Official Resources

Project Websites

Resource	Link	Description
Project Homepage	mirothinker.io	Official introduction, technical features, model version comparisons
Web Demo	dr.miromind.ai	Interactive web-based demo for direct testing
Company Page	miromind.ai	MiroMind team introduction and project ecosystem

Open Source Code and Model Resources

GitHub Repositories

Main Repository: MiroMindAI/MiroThinker
MiroFlow: MiroMindAI/MiroFlow - Understood as a Deep Research agent framework, possibly part of MiroThinker's harness system

Hugging Face Models and Datasets

Model/Dataset Name	Parameters	Context	Tool Calls	Link
MiroThinker-1.7-mini	30B	256K	300	HF Link
MiroThinker-1.7	235B	256K	300	HF Link
MiroThinker-v1.5-30B	30B	256K	400	HF Link
MiroThinker-v1.5-235B	235B	256K	400	HF Link
MiroThinker-v1.0 (8B/30B/72B)	Multiple	256K	600	HF Collection
MiroVerse-v0.1 (Dataset)	147K+ trajectories	-	-	HF Link

Core Project Ecosystem

The MiroMind Open Deep Research (ODR) ecosystem consists of four interconnected components:

MiroMind ODR (Open Deep Research)
├── MiroThinker → Model (Tool-augmented reasoning LLM)
├── MiroFlow → Agent Framework (Reproducible multi-agent orchestration)
├── MiroVerse → Dataset (147K+ research trajectory samples)
└── MiroTrain → Training Infrastructure (RL and long-context training support)

Technical Innovations and Algorithm Overview

Core Innovation: Interactive Scaling

MiroThinker introduces Interactive Scaling as the "third dimension" of model performance, standing alongside model scale and context length as fundamental performance axes. This represents a paradigm shift in how research agent capabilities are measured and optimized.

Training Methodology

The training pipeline employs a sophisticated three-stage optimization approach:

Mid-training Phase: Reinforces planning and tool interaction capabilities
SFT (Supervised Fine-Tuning): Establishes base competencies
DPO (Direct Preference Optimization): Aligns model outputs with human preferences
RL (Reinforcement Learning): Further optimizes decision-making

A critical innovation is the time-sensitive sandbox training approach, which prevents "future information leakage" during the training process—ensuring the model learns to reason through problems sequentially rather than cheating by accessing information it shouldn't have at each reasoning step.

Reasoning Mechanism

MiroThinker supports a complete hypothesis-driven research loop:

Hypothesis → Search → Verify → Revise

This closed-loop reasoning is supported by dual validation mechanisms:

Local Validation: Verifies single-step logical consistency
Global Validation: Ensures overall coherence and consistency across the entire reasoning chain

The system supports up to 600 tool calls per task, enabling extremely thorough and comprehensive research processes.

Tool Integration

The framework integrates multiple external tools:

Web Search: Serper API integration
Web Scraping: Jina AI for content extraction
Code Execution: E2B sandboxed execution environment
Document Parsing: Multi-format document processing
Multimodal Processing: Image and video analysis capabilities

Official Documentation

Documentation Resources

Document Type	Location	Content
README	GitHub/README.md	Quick start, configuration, benchmark results
Tool Documentation	libs/miroflow-tools/README.md	MCP tool configuration, API key instructions
Deployment Guide	GitHub Wiki / docs/ directory	SGLang/vLLM deployment, quantization, Docker support

Technical Reports

Paper Title	arXiv ID	Release Date	Core Contribution
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling	2511.11793	November 2025	Introduces Interactive Scaling, v1.0 benchmark results
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification	2603.15726	March 2026	Introduces verification mechanisms, 1.7 and H1 version technical details
MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework	2602.22808	February 2026	Agent framework design, high concurrency and reproducible evaluation support

Official Blog

The official blog at miromind.ai/blog provides updates, though the technical depth varies across posts.

Chinese Community Third-Party Analyses

Several Chinese technology media outlets have published human-written analyses of MiroThinker:

Publisher	Article
Quantum位 (QbitAI)	Chen Tianqiao and Dai Jifeng Fire the First Shot of 2026 Large Models: 30B Parameters Achieve 1T Performance
StartZhi AI	MiroThinker Open Source: Built for Deep Research and Solving Multi-Step Complex Tasks
AI Product Silver Sea	Now Open Source! This Search Agent Model Has a Unique Approach
OpenCSG	MiroThinker-1.7: When AI Learns "Slow Thinking", Reasoning Abilities Make a Qualitative Leap

Significance and Impact

MiroThinker represents a significant advancement in open-source research agents. By treating interactive tool usage as a first-class scaling dimension alongside model size and context length, the MiroMind team has demonstrated that carefully orchestrated tool interaction can dramatically enhance research capabilities without requiring proportional increases in model parameters.

The release of the MiroVerse dataset (147K+ research trajectories) provides the community with valuable training data, potentially accelerating further research in this domain. The modular architecture—separating the core model (MiroThinker), orchestration framework (MiroFlow), training infrastructure (MiroTrain), and dataset (MiroVerse)—enables researchers to innovate on individual components while maintaining compatibility with the broader ecosystem.