Xiaomi Launches MiMo-V2.5-Pro: Game-Changing AI Model with Vision, Audio, and Video Capabilities
Tech Giant Rolls Out Advanced Multimodal AI System as Part of Aggressive Development Strategy
Xiaomi has introduced MiMo-V2.5 and MiMo-V2.5-Pro, marking a significant milestone in artificial intelligence development. The new multimodal models consolidate text, image, audio, and video processing into a single unified system, a combination the company's previous generation split across separate products.
This latest release demonstrates Xiaomi's accelerating momentum in the AI sector. The company unveiled MiMo-V2-Pro, a trillion-parameter model, only weeks earlier; that model had circulated anonymously on OpenRouter under the codename "Hunter Alpha" before its official reveal. The rapid succession of releases reflects an unusually fast iteration pace for the industry.
What Sets the New Models Apart
The previous iteration, MiMo-V2-Pro, handled text and code exclusively. While a sibling model, MiMo-V2-Omni, possessed multimodal capabilities, it operated as a separate product with lower benchmark performance. The V2.5 series fundamentally restructures this approach, delivering all functionality within a single, optimized platform.
The integration matters significantly for practical applications. Users can now photograph their refrigerator and request meal recommendations, upload instructional videos for step-by-step analysis, or process meeting recordings to extract action items—all without managing multiple tools or separate pricing structures.
Performance and Capabilities
Xiaomi asserts that MiMo-V2.5-Pro represents "a major leap" in autonomous task completion, complex software engineering, and extended-duration operations. The model matches frontier competitors including Claude Opus 4.6 and GPT-5.4 across most coding and agent benchmarks, though performance gaps persist on advanced reasoning challenges.
On the SWE-bench Pro coding evaluation, where models address genuine software bugs in actual startup codebases, MiMo-V2.5-Pro resolves 57.2% of tasks, positioning it among industry leaders; typical models achieve approximately 25% resolution rates. Similar competitiveness appears on τ3-bench and ClawEval assessments. However, on Humanity's Last Exam, a comprehensive graduate-level academic challenge across numerous disciplines, MiMo scores 48.0% compared to GPT-5.4's 58.7%, a gap of 10.7 percentage points.
Model Variants and Pricing Structure
Two distinct models address different user needs. MiMo-V2.5-Pro functions as the premium tier, processing 60–80 tokens per second at $1.00 per million input tokens and $3.00 per million output tokens. The system can autonomously complete professional operations involving more than 1,000 tool interactions—tasks requiring days of expert human effort.
MiMo-V2.5 serves as the everyday option, operating at 100–150 tokens per second at $0.40 per million input tokens and $2.00 per million output tokens. Despite its lower cost, it keeps access to all modality functions (image, audio, and video processing), which the Pro variant restricts. Both models support a 1-million-token context window, enough for approximately 750,000 words in a single conversation.
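To see how the quoted list prices translate into per-request costs, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not Xiaomi's billing logic: the function name `request_cost` and the sample workload of 200,000 input tokens and 20,000 output tokens are assumptions chosen for the example. Only the per-million-token prices come from the figures quoted for the two models.

```python
# Per-million-token list prices quoted in the article (USD).
PRICES = {
    "MiMo-V2.5-Pro": {"input": 1.00, "output": 3.00},
    "MiMo-V2.5": {"input": 0.40, "output": 2.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted list prices.

    Hypothetical helper: it ignores credits, discounts, and any other
    billing rules that the article does not describe.
    """
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Assumed workload for illustration: a long document plus a short answer.
for model in PRICES:
    cost = request_cost(model, input_tokens=200_000, output_tokens=20_000)
    print(f"{model}: ${cost:.2f} per request")
```

Because output tokens cost more than input tokens on both tiers, workloads that generate long responses narrow the price gap between the two models more than input-heavy workloads do.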
Token Efficiency Advantage
A significant differentiator emerges in token efficiency. Xiaomi reports that MiMo-V2.5-Pro consumes 42% fewer tokens than Kimi K2.6 at equivalent performance levels, while MiMo-V2.5 requires nearly half the tokens of Muse Spark for comparable results. For developers managing thousands of daily requests, this efficiency translates to substantial cost reductions.
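The effect of a 42% token reduction on a daily bill can be estimated with a short calculation. The sketch below rests on loud assumptions: the workload (10,000 requests per day at 2,000 output tokens each) is hypothetical, the reduction is applied to output tokens only, and both scenarios are priced at MiMo-V2.5-Pro's quoted $3.00 per million output tokens to isolate the token-count effect. Actual savings against Kimi K2.6 would depend on that model's pricing, which the article does not give.

```python
MIMO_PRO_OUTPUT_PRICE = 3.00  # USD per million output tokens, from the article
REPORTED_REDUCTION = 0.42     # Xiaomi's reported token reduction vs. Kimi K2.6


def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Tokens needed once a fractional reduction is applied to a baseline."""
    return round(baseline_tokens * (1 - reduction))


def output_cost(tokens: int, price_per_million: float) -> float:
    """USD cost of a number of output tokens at a per-million price."""
    return tokens * price_per_million / 1_000_000


# Hypothetical workload: 10,000 requests per day, 2,000 output tokens each.
baseline_tokens = 10_000 * 2_000
reduced_tokens = tokens_after_reduction(baseline_tokens, REPORTED_REDUCTION)

baseline_cost = output_cost(baseline_tokens, MIMO_PRO_OUTPUT_PRICE)
reduced_cost = output_cost(reduced_tokens, MIMO_PRO_OUTPUT_PRICE)

print(f"baseline: {baseline_tokens:,} tokens, ${baseline_cost:.2f} per day")
print(f"reduced:  {reduced_tokens:,} tokens, ${reduced_cost:.2f} per day")
```

At a fixed per-token price, the cost falls by the same fraction as the token count, so the reported efficiency figure maps directly onto the bill under these assumptions.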
On multimodal tasks, MiMo-V2.5 is competitive with GPT-5.4 and Gemini 3.1 Pro, with scores approaching those of Claude Opus 4.6.
Rapid Release Cadence Signals Major Investment
Since December 2025, Xiaomi has completed three consecutive major model launches. CEO Lei Jun committed at least $8.7 billion to artificial intelligence over three years, a pledge disclosed immediately after the MiMo-V2-Pro release. The accelerated release schedule suggests those resources are already being deployed.
Market momentum provides context for this development velocity. According to Digital Applied data from early April, Xiaomi models represented approximately 21% of all OpenRouter traffic, with growth exceeding 42% during the preceding seven-day period. When a previous release ranks among the most competitive offerings on one of the world's largest AI routing platforms, both resources and competitive pressure drive rapid iteration.
The surge partly resulted from Hermes, an agentic AI platform, partnering with Xiaomi to provide complimentary MiMo-V2-Pro access for a limited period. Although that promotional window has closed, the momentum it generated established Xiaomi as a significant competitive force.
Availability and Future Direction
The models are accessible through Xiaomi's AI Studio and the Xiaomi MiMo API, with the latter serving as the primary developer integration point. The company announced plans to open-source the models in the near future.
New users receive full credit resets as launch incentives. Xiaomi also discontinued the additional multipliers for using the maximum 1-million-token context window, which reduces the cost of long-document analysis.
The company confirmed that next-generation model development is underway, featuring "deeper reasoning, tighter tool integration, and richer real-world grounding." Given the current development velocity, this announcement may arrive sooner than industry observers anticipate.