Cohere releases its flagship open-source model Command A+: a 219B-parameter MoE model runnable on just two H100 GPUs

On May 20, Cohere released Command A+, fully open-sourcing it under the Apache 2.0 license — this marks the first flagship model from the company ever released under such a permissive agreement, a milestone confirmed personally by CEO Aidan Gomez and co-founder Nick Frosst. Built on a Mixture-of-Experts (MoE) architecture, Command A+ boasts a total of 219 billion parameters, though only 25 billion are activated per inference. It delivers high-performance inference using as few as two H100 GPUs or a single B200 GPU. The model supports a 128K context window, 48 languages, and multimodal capabilities for both text and images; native support was also added to HuggingFace Transformers on the same day. For quantization, Command A+ applies 4-bit quantization (W4A4) solely to its MoE expert components while keeping attention pathways at full precision, achieving near-lossless compression via Quantization-Aware Distillation. Under low-concurrency conditions, it generates up to 375 tokens per second with a Time To First Token (TTFT) of just 113 milliseconds — representing roughly 63% faster output speed and 17% lower latency compared to its predecessor, Command A Reasoning.

At the architectural level, researchers have identified several unconventional design choices in Command A+: parallel computation of attention and MoE layers, query vectors whose total dimensionality equals four times the hidden layer width, use of LayerNorm instead of RMSNorm, just 32 layers overall, and no preceding dense layers. In terms of enterprise features, the model natively supports tool calls and Agentic workflows, plus a grounding span mechanism that explicitly cites source documents or database entries when retrieving external information — a critical capability for regulated industries like finance, healthcare, and law. Cohere also offers multiple quantized model weights and Model Vault, its managed inference service. In his launch post, Nick Frosst hinted that the decision to adopt the Apache 2.0 license is directly tied to recent partnerships with German corporate clients.

Cohere Blog | HuggingFace