Skip to main content

Change Log

Version: 2024-06-28

deepseek-chat

Model's reasoning capabilities have improved, as shown in relevant benchmarks:

  • Coding: HumanEval Pass@1 79.88% -> 84.76%
  • Mathematics: MATH ACC@1 55.02% -> 71.02%
  • Reasoning: BBH 78.56% -> 83.40%

In the Arena-Hard evaluation, the win rate against GPT-4-0314 increased from 41.6% to 68.3%.

The model's role-playing capabilities have significantly enhanced, allowing it to act as different characters as requested during conversations.

Version: 2024-06-14

deepseek-coder

The deepseek-coder model has been upgraded to DeepSeek-Coder-V2 Instruct, significantly enhancing its coding capabilities. It has reached the level of GPT-4-Turbo-0409 in code generation, code understanding, code debugging, and code completion. Additionally, it possesses excellent mathematical and reasoning abilities, and its general capabilities are on par with DeepSeek-V2 Chat.

Version: 2024-05-17

deepseek-chat

The model has seen a significant improvement in following instructions, with the IFEval Benchmark Prompt-Level accuracy jumping from 63.9% to 77.6%. Additionally, on API end, we have optimized model ability to follow instruction filled in the ``system" part. This optimization has significantly elevated the user experience across a variety of tasks, including immersive translation, Retrieval-Augmented Generation (RAG), and more.

The model's accuracy in outputting JSON format has been enhanced. In our internal test set, the JSON parsing rate increased from 78% to 85%. By introducing appropriate regular expressions, the JSON parsing rate was further improved to 97%.