DeepSeek has updated the concurrency limits in its API documentation: an account may have at most 500 concurrent requests to the deepseek-v4-pro model and 2500 to deepseek-v4-flash. Requests beyond these limits are rejected with an HTTP 429 error. Users who need higher concurrency can apply for additional quota free of charge; once the application is approved, each user_id is subject to the respective model's per-user concurrency cap in addition to the overall account-level limit.
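One way a client might stay under these limits is to bound its own in-flight requests and back off when a 429 still arrives. The following is a minimal sketch under stated assumptions, not DeepSeek's recommended client: `send` is a hypothetical caller-supplied coroutine that performs the real request, `RateLimited` is a hypothetical exception it raises on HTTP 429, and the limits are copied from the text above and should be checked against the current documentation.

```python
import asyncio
import random

# Limits as stated in the text above; confirm against the current documentation.
MODEL_LIMITS = {"deepseek-v4-pro": 500, "deepseek-v4-flash": 2500}


class RateLimited(Exception):
    """Hypothetical: raised by `send` when the server answers HTTP 429."""


async def call_with_limit(send, payloads, model, max_retries=5, base_delay=0.5):
    """Run send(payload) for every payload without exceeding the model's cap.

    `send` is an assumed coroutine function supplied by the caller; it is
    expected to raise RateLimited on a 429 response.
    """
    semaphore = asyncio.Semaphore(MODEL_LIMITS[model])

    async def one(payload):
        for attempt in range(max_retries + 1):
            try:
                # The semaphore bounds how many requests are in flight at once.
                async with semaphore:
                    return await send(payload)
            except RateLimited:
                if attempt == max_retries:
                    raise
                # Exponential backoff with jitter; the slot is already released here.
                delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
                await asyncio.sleep(delay)

    # gather preserves the order of `payloads` in the returned list.
    return await asyncio.gather(*(one(p) for p in payloads))
```

The backoff sleep happens after the semaphore slot is released, so a request that is waiting to retry does not count against the local cap.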
The update also introduces the user_id parameter, which enables finer-grained management under a single account by providing isolation in three areas: content safety, KVCache, and request scheduling. For regular API users, concurrent requests from all user_ids count toward the account's total quota. Accounts with elevated quotas are governed by both the account-wide limit and a cap on each individual user_id. The feature is aimed primarily at developers in multi-tenant environments who need to isolate the traffic of their downstream users.
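The two-tier rule described above can be modeled on the client side to predict when a request would be rejected. This is only an illustrative sketch: `TwoTierGate` and its parameters are hypothetical names, the per-user cap value is left to the caller because the text does not state it, and the real server-side accounting may differ.

```python
from collections import Counter


class TwoTierGate:
    """Hypothetical client-side model of an account cap plus a per-user_id cap."""

    def __init__(self, account_limit, per_user_limit=None):
        self.account_limit = account_limit
        # None models a regular account: only the account-wide total applies.
        self.per_user_limit = per_user_limit
        self.in_flight = Counter()  # user_id -> requests currently in flight

    def try_acquire(self, user_id):
        """Return True if the request may start, False where a 429 would be expected."""
        # All user_ids are aggregated toward the account-wide total.
        if sum(self.in_flight.values()) >= self.account_limit:
            return False
        # Elevated-quota accounts are additionally capped per user_id.
        if self.per_user_limit is not None and self.in_flight[user_id] >= self.per_user_limit:
            return False
        self.in_flight[user_id] += 1
        return True

    def release(self, user_id):
        """Mark one request for user_id as finished."""
        if self.in_flight[user_id] > 0:
            self.in_flight[user_id] -= 1
```

In a multi-tenant service, a gate like this could be consulted with the downstream tenant's identifier before each call, so that one tenant exhausting its own cap does not consume the slots the others rely on.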