Claude Code introduces a fast mode; Opus 4.6 runs 2.5 times faster

Anthropic has published a research preview of “Fast Mode” in the Claude Code documentation: a high-speed API configuration designed specifically for Claude Opus 4.6. The mode delivers responses 2.5 times faster than standard mode, at a higher cost per token. Fast Mode is not a new model; Opus 4.6’s quality and capabilities are unchanged, and the difference lies solely in API settings tuned for minimal latency. Pricing is set at $30 per million input tokens and $150 per million output tokens. All subscription tiers (Pro, Max, Team, Enterprise) have access to the mode, but its usage is billed as “additional usage” rather than counted against existing plan limits. Users can toggle Fast Mode on or off by typing /fast in the Claude Code CLI or the VS Code extension; once it is active, an icon appears next to the prompt. Alternatively, setting "fastMode": true in your config file keeps it enabled persistently.
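To make the pricing concrete, here is a minimal Python sketch that estimates what a single request would cost at the Fast Mode rates quoted above. The helper name and the token counts are hypothetical, and the calculation assumes plain per-token billing with no caching or other discounts, which the source does not describe.

```python
# Fast Mode rates quoted in the article, in US dollars per million tokens.
FAST_INPUT_USD_PER_MTOK = 30.0
FAST_OUTPUT_USD_PER_MTOK = 150.0


def fast_mode_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the Fast Mode cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * FAST_INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * FAST_OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative numbers only: 200,000 input tokens and 20,000 output tokens.
    print(f"${fast_mode_cost_usd(200_000, 20_000):.2f}")  # prints $9.00
```

Because output tokens cost five times as much as input tokens at these rates, long generated responses dominate the bill even when the prompt is much larger.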

If rate limits are hit, the system automatically falls back to standard Opus 4.6; the icon dims until the cooldown ends, after which Fast Mode resumes on its own. Fast Mode is not available through third-party cloud platforms such as Amazon Bedrock, Google Vertex AI, or Microsoft Azure Foundry, and it requires Claude Code version 2.1.36 or newer. For Team and Enterprise organizations, Fast Mode is disabled by default; administrators must enable it manually and may also configure fastModePerSessionOptIn to require re-activation in each session as a cost-control measure. Anthropic emphasizes that the feature remains in research preview, so pricing and availability may change based on user feedback. Because enabling Fast Mode mid-conversation reprices the entire dialogue history at Fast Mode rates, the recommended practice is to activate it at the very start of each session to keep costs down.
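The advice to enable Fast Mode at the start of a session can be illustrated with a small Python sketch. It assumes a simplified model in which switching mid-conversation bills every token already in the conversation context once at the Fast Mode input rate of $30 per million tokens. The helper name and the context sizes are hypothetical, and actual billing may differ.

```python
# Fast Mode input rate quoted in the article, in US dollars per million tokens.
FAST_INPUT_USD_PER_MTOK = 30.0


def switch_surcharge_usd(context_tokens: int) -> float:
    """One-time cost of repricing the existing context (simplified assumption)."""
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    return context_tokens / 1_000_000 * FAST_INPUT_USD_PER_MTOK


if __name__ == "__main__":
    # Compare switching at session start (empty context) with switching later.
    for context_tokens in (0, 50_000, 150_000):
        cost = switch_surcharge_usd(context_tokens)
        print(f"{context_tokens:>7} tokens in context -> ${cost:.2f}")
```

Under this assumption the surcharge grows linearly with the size of the conversation so far, which is why switching at the start of a session, when the context is nearly empty, is the cheapest option.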

Claude Code Docs