
Models & Pricing

The prices listed below are per 1M tokens. A token is the smallest unit of text that the model recognizes; it can be a word, a number, or even a punctuation mark. Billing is based on the total number of input and output tokens processed by the model.
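
As an illustration of what gets billed, here is a minimal sketch that reads the input (prompt) and output (completion) token counts from a completion's usage field, assuming the OpenAI-compatible Python SDK and a placeholder API key:

```python
from openai import OpenAI

# Placeholder key; the endpoint follows the OpenAI-compatible API format.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Billing is based on these counts: input tokens + output tokens.
usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)
```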

Pricing Details

| MODEL (1) | CONTEXT LENGTH | MAX OUTPUT TOKENS (2) | INPUT PRICE (CACHE HIT) (3) | INPUT PRICE (CACHE MISS) | OUTPUT PRICE |
|---|---|---|---|---|---|
| deepseek-chat | 128K | 4K (8K in Beta) | $0.014 / 1M tokens | $0.14 / 1M tokens | $0.28 / 1M tokens |
  • (1) The DeepSeek V2 Chat and DeepSeek Coder V2 models have been merged and upgraded into the new model, DeepSeek V2.5. The new model significantly surpasses the previous versions in both general capabilities and code abilities. For backward compatibility, API users can access the new model through either deepseek-coder or deepseek-chat.
  • (2) The 8K output token limit of the Chat Completion API is in Beta and requires the user to set base_url="https://api.deepseek.com/beta". If the base_url is not set to the Beta URL, or the max_tokens parameter is not set, the limit is 4K tokens (see the sketch after these notes).
  • (3) Please check this article for the details of Context Caching.
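
For illustration, a minimal sketch of requesting the 8K output limit via the Beta endpoint using the OpenAI-compatible Python SDK; the API key and prompt are placeholders:

```python
from openai import OpenAI

# Point the client at the Beta endpoint to unlock the 8K output token limit.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",           # placeholder
    base_url="https://api.deepseek.com/beta",  # Beta URL required for 8K output
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a long report on ..."}],
    max_tokens=8192,  # must be set explicitly; otherwise the 4K limit applies
)
print(response.choices[0].message.content)
```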

Deduction Rules

Expense = number of tokens × price. The corresponding fees are deducted directly from your topped-up balance or granted balance; when both balances are available, the granted balance is used first.
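
As a worked example of the deduction formula, here is a minimal sketch with prices hard-coded from the table above and made-up token counts:

```python
# Prices from the table above, in USD per 1M tokens.
PRICE_INPUT_CACHE_HIT = 0.014
PRICE_INPUT_CACHE_MISS = 0.14
PRICE_OUTPUT = 0.28

def request_cost(cache_hit_tokens: int, cache_miss_tokens: int, output_tokens: int) -> float:
    """Expense = number of tokens × price, summed over each price tier."""
    return (
        cache_hit_tokens / 1_000_000 * PRICE_INPUT_CACHE_HIT
        + cache_miss_tokens / 1_000_000 * PRICE_INPUT_CACHE_MISS
        + output_tokens / 1_000_000 * PRICE_OUTPUT
    )

# Example: 20K cached input tokens, 5K uncached input tokens, 2K output tokens.
print(f"${request_cost(20_000, 5_000, 2_000):.6f}")  # $0.001540
```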

Product prices may vary and DeepSeek reserves the right to adjust them. We recommend topping up based on your actual usage and regularly checking this page for the most recent pricing information.