Xiaomi MiMo-V2.5 API prices slashed by up to 99% permanently; Token Plan gets 5–8× more credits at no extra cost

Xiaomi’s MiMo open platform has announced a permanent, across-the-board price cut for its MiMo-V2.5 series API, taking effect globally and simultaneously on May 27, 2026 at 00:00 CST. The new pricing cuts costs by up to 99% relative to the earlier rates and removes the separate billing for short and long inputs, a complexity that had frustrated developers. Alongside the price cuts, the Token Plan billing system has been restructured: subscribers get 5 to 8 times more usable credits for the same price, under updated rules designed to be more transparent. As a one-time benefit, every user with an active Token Plan subscription will have their credit usage reset to zero, restoring the full quota, at the moment the new prices take effect; this includes users who received plans through the 100 Trillion Token Creator Incentive Program and Apache Software Foundation members. Xiaomi also said that past paying users whose subscriptions have already expired will receive a separate surprise announcement within the coming week.
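As a rough illustration of what "5 to 8 times more credits for the same price" means per credit, the arithmetic can be sketched as follows. This assumes that a credit buys the same amount of usage before and after the change, which the announcement does not spell out; the function name is illustrative and not part of any Xiaomi API.

```python
def effective_discount(credit_multiplier: float) -> float:
    """Fractional drop in cost per credit when the same price
    buys `credit_multiplier` times as many credits.

    Assumption: one credit covers the same usage before and after.
    """
    if credit_multiplier <= 0:
        raise ValueError("credit_multiplier must be positive")
    return 1.0 - 1.0 / credit_multiplier


# The two ends of the range quoted in the announcement.
low = effective_discount(5)
high = effective_discount(8)
print(f"Per-credit cost falls by {low:.1%} to {high:.1%}")
```

Under that assumption, the restructured Token Plan works out to a per-credit reduction of 80% at the low end of the range and 87.5% at the high end.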

The price reduction rests on sustained inference-system engineering by Xiaomi’s technical team. By fully implementing Sliding Window Attention (SWA) via SGLang HiCache, the team cut KV Cache data transfers across GPU memory, CPU memory, and SSD storage to roughly one-seventh of their pre-optimization level, and raised the number of cacheable tokens to approximately five times the prior capacity. Both changes materially lift cache hit rates and lower per-token serving costs. Separately, the 100 Trillion Token Creator Incentive Program, which launched on April 28, concluded ahead of schedule on May 26 at 16:08 CST after all 100T tokens had been distributed; Apache Software Foundation member benefits remain active and are unaffected by the program’s close.
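To give a feel for why sliding-window layers shrink the KV cache, here is a back-of-the-envelope sketch. It counts how many per-layer token entries must be kept for one sequence when some layers only need the most recent tokens. The layer counts, window size, and sequence length are hypothetical and are not MiMo-V2.5's actual configuration, and the sketch does not model HiCache itself.

```python
def kv_cache_entries(seq_len: int, full_layers: int,
                     swa_layers: int, window: int) -> int:
    """Count per-layer token entries kept in the KV cache for one sequence.

    Full-attention layers keep every token; sliding-window layers
    only need the most recent `window` tokens.
    """
    full = full_layers * seq_len
    swa = swa_layers * min(seq_len, window)
    return full + swa


# Hypothetical configuration, chosen only for illustration.
SEQ_LEN = 32_768
FULL_LAYERS = 8
SWA_LAYERS = 40
WINDOW = 128

# Treating every layer as full attention vs. exploiting the window.
naive = kv_cache_entries(SEQ_LEN, FULL_LAYERS + SWA_LAYERS, 0, WINDOW)
swa_aware = kv_cache_entries(SEQ_LEN, FULL_LAYERS, SWA_LAYERS, WINDOW)
ratio = swa_aware / naive
print(f"SWA-aware cache is {ratio:.2f}x the naive size")
```

With these made-up numbers the SWA-aware cache is about 0.17 times the naive size. Less data per sequence means less to move between GPU memory, CPU memory, and SSD, and more sequences fit in the same cache, which is the general mechanism behind the figures reported above.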

Xiaomi MiMo Open Platform
