This compilation organizes the technical reports and publicly available resources for MiroThinker, an open-source research agent system. To date, MiroMind has released three technical reports, covering MiroThinker 1.0, MiroThinker 1.7, and MiroFlow.

Official Resources

Project Websites

Resource | Link | Description
Project Homepage | mirothinker.io | Official introduction, technical features, model version comparisons
Web Demo | dr.miromind.ai | Interactive web-based demo for direct testing
Company Page | miromind.ai | MiroMind team introduction and project ecosystem

Open Source Code and Model Resources

GitHub Repositories

Hugging Face Models and Datasets

Model/Dataset Name | Parameters | Context | Tool Calls | Link
MiroThinker-1.7-mini | 30B | 256K | 300 | HF Link
MiroThinker-1.7 | 235B | 256K | 300 | HF Link
MiroThinker-v1.5-30B | 30B | 256K | 400 | HF Link
MiroThinker-v1.5-235B | 235B | 256K | 400 | HF Link
MiroThinker-v1.0 (8B/30B/72B) | Multiple | 256K | 600 | HF Collection
MiroVerse-v0.1 (Dataset) | 147K+ trajectories | - | - | HF Link

Core Project Ecosystem

The MiroMind Open Deep Research (ODR) ecosystem consists of four interconnected components:

MiroMind ODR (Open Deep Research)
├── MiroThinker → Model (Tool-augmented reasoning LLM)
├── MiroFlow → Agent Framework (Reproducible multi-agent orchestration)
├── MiroVerse → Dataset (147K+ research trajectory samples)
└── MiroTrain → Training Infrastructure (RL and long-context training support)

Technical Innovations and Algorithm Overview

Core Innovation: Interactive Scaling

MiroThinker introduces Interactive Scaling as the "third dimension" of model performance, alongside model scale and context length. In practice, this means the amount of tool interaction per task is treated as something to measure and optimize in its own right, in the same way as parameter count or context window.

Training Methodology

The training pipeline runs through four sequential stages:

  1. Mid-training Phase: Reinforces planning and tool interaction capabilities
  2. SFT (Supervised Fine-Tuning): Establishes base competencies
  3. DPO (Direct Preference Optimization): Aligns model outputs with human preferences
  4. RL (Reinforcement Learning): Further optimizes decision-making

A key element is the time-sensitive sandbox training approach, which prevents "future information leakage" during training. At each reasoning step the model can only use information it is supposed to have at that point, so it learns to reason through problems sequentially instead of shortcutting with information it should not have.
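One way to picture the time-sensitive sandbox is as a date filter applied to every retrieval result before the model sees it. The sketch below is illustrative only: the `Document` schema, the `filter_by_cutoff` helper, and the example URLs are assumptions made for this example, not MiroThinker's actual implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    """A retrieved page with its publication date (hypothetical schema)."""
    url: str
    published: date
    text: str


def filter_by_cutoff(results: list[Document], cutoff: date) -> list[Document]:
    """Keep only documents published on or before the task's cutoff date."""
    return [doc for doc in results if doc.published <= cutoff]


# Toy corpus: one page predates the cutoff, one postdates it.
corpus = [
    Document("https://example.com/a", date(2023, 5, 1), "early report"),
    Document("https://example.com/b", date(2024, 9, 1), "later follow-up"),
]

# A training task anchored at 2024-01-01 must not see the later page.
visible = filter_by_cutoff(corpus, cutoff=date(2024, 1, 1))
print([doc.url for doc in visible])
```

With the cutoff set to 2024-01-01, only the first page remains visible, so an answer that depends on the later follow-up cannot leak into the trajectory.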

Reasoning Mechanism

MiroThinker supports a complete hypothesis-driven research loop:

Hypothesis → Search → Verify → Revise

This closed-loop reasoning is supported by dual validation mechanisms:

  • Local Validation: Verifies single-step logical consistency
  • Global Validation: Ensures overall coherence and consistency across the entire reasoning chain

The system supports up to 600 tool calls per task (the figure listed for MiroThinker-v1.0; the model table lists 400 for v1.5 and 300 for 1.7), which allows long, multi-step research processes.
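The loop, the two validation levels, and the tool-call budget can be sketched together as a small driver function. This is a minimal illustration under stated assumptions: `research_loop` and its callback parameters (`propose`, `search`, `verify_step`, `verify_chain`) are hypothetical names, and the toy callbacks stand in for a real model and real tools.

```python
from typing import Callable, Optional

MAX_TOOL_CALLS = 600  # per-task budget quoted in the text


def research_loop(
    question: str,
    propose: Callable[[str, list[str]], str],
    search: Callable[[str], str],
    verify_step: Callable[[str, str], bool],
    verify_chain: Callable[[list[str]], bool],
    max_tool_calls: int = MAX_TOOL_CALLS,
) -> tuple[Optional[str], int]:
    """Run Hypothesis -> Search -> Verify -> Revise until accepted or out of budget."""
    evidence: list[str] = []
    calls = 0
    while calls < max_tool_calls:
        hypothesis = propose(question, evidence)      # Hypothesis
        finding = search(hypothesis)                  # Search (one tool call)
        calls += 1
        if not verify_step(hypothesis, finding):      # Local validation
            evidence.append(f"refuted: {hypothesis}")  # Revise and try again
            continue
        evidence.append(finding)
        if verify_chain(evidence):                    # Global validation
            return hypothesis, calls
    return None, calls  # budget exhausted without an accepted answer


# Toy callbacks: three candidate answers, only "C" is supported.
candidates = ["A", "B", "C"]
propose = lambda q, ev: candidates[len(ev)]
search = lambda h: f"{h}: supported" if h == "C" else f"{h}: contradicted"
verify_step = lambda h, finding: finding.endswith("supported")
verify_chain = lambda ev: any(e.endswith("supported") for e in ev)

answer, used = research_loop("Which candidate holds?", propose, search,
                             verify_step, verify_chain)
print(answer, used)  # C 3
```

Passing `max_tool_calls=2` to the same toy setup returns `(None, 2)`, which shows how the budget caps the loop before the third candidate is ever tested.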

Tool Integration

The framework integrates multiple external tools:

  • Web Search: Serper API integration
  • Web Scraping: Jina AI for content extraction
  • Code Execution: E2B sandboxed execution environment
  • Document Parsing: Multi-format document processing
  • Multimodal Processing: Image and video analysis capabilities
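A common way to expose a set of tools like this to an agent is a name-to-function registry with a single dispatch entry point. The sketch below assumes that pattern; the stub functions are placeholders that return canned strings and do not call Serper, Jina AI, or E2B, and names such as `TOOLS` and `call_tool` are hypothetical.

```python
from typing import Callable

ToolFn = Callable[[str], str]


def web_search(query: str) -> str:
    """Placeholder for a search backend (no network call is made)."""
    return f"[search results for {query!r}]"


def scrape(url: str) -> str:
    """Placeholder for a content-extraction backend."""
    return f"[page text from {url}]"


def run_code(source: str) -> str:
    """Placeholder for a sandboxed executor; the code is NOT actually run."""
    return f"[sandbox output for {len(source)} chars of code]"


# Hypothetical registry mapping tool names to callables.
TOOLS: dict[str, ToolFn] = {
    "web_search": web_search,
    "scrape": scrape,
    "run_code": run_code,
}


def call_tool(name: str, argument: str) -> str:
    """Dispatch a tool call by name, returning an error string for unknown tools."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"error: unknown tool {name!r}"
    return tool(argument)


print(call_tool("web_search", "MiroThinker"))
print(call_tool("translate", "hello"))
```

Returning an error string for an unknown tool, instead of raising, lets the agent read the failure as an observation and revise its next call.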

Official Documentation

Documentation Resources

Document Type | Location | Content
README | GitHub/README.md | Quick start, configuration, benchmark results
Tool Documentation | libs/miroflow-tools/README.md | MCP tool configuration, API key instructions
Deployment Guide | GitHub Wiki / docs/ directory | SGLang/vLLM deployment, quantization, Docker support

Technical Reports

Paper Title | arXiv ID | Release Date | Core Contribution
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling | 2511.11793 | November 2025 | Introduces Interactive Scaling, v1.0 benchmark results
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification | 2603.15726 | March 2026 | Introduces verification mechanisms, 1.7 and H1 version technical details
MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework | 2602.22808 | February 2026 | Agent framework design, high concurrency and reproducible evaluation support

Official Blog

The official blog at miromind.ai/blog provides updates, though the technical depth varies across posts.

Chinese Community Third-Party Analyses

Several Chinese technology media outlets have published human-written analyses of MiroThinker:

Significance and Impact

MiroThinker treats interactive tool usage as a first-class scaling dimension alongside model size and context length. The MiroMind team's results indicate that carefully orchestrated tool interaction can improve research capabilities without requiring proportional increases in model parameters.

The release of the MiroVerse dataset (147K+ research trajectories) gives the community training data that may accelerate further research in this domain. The modular architecture separates the core model (MiroThinker), orchestration framework (MiroFlow), training infrastructure (MiroTrain), and dataset (MiroVerse), so researchers can innovate on individual components while staying compatible with the broader ecosystem.