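Among the tech world's highlights of 2025, one new AI model has shone like a rising star and set a new trend in intelligent technology. DeepSeek-R1, built by the Chinese startup Hangzhou DeepSeek Artificial Intelligence Fundamental Technology Research Co., Ltd., not only rose to fame quickly but also became one of the industry's brightest names thanks to its revolutionary technology, cost efficiency and open-source spirit. From topping the app store after its release to winning wide praise across the industry, DeepSeek's achievements show once again the key role and enormous potential of innovative technology in driving an industry forward. This breakthrough not only marks China's rise in the field of large language models, but also signals the tech world's eager pursuit, and successful practice, of an efficient, open and shared path of development.
In this issue, we have invited Yu Ling of Business English Class 2401 to look back with us at this new breakthrough in AI: DeepSeek.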
The artificial intelligence (AI) community is abuzz with excitement over DeepSeek-R1, a new open-source model developed by Chinese startup DeepSeek.
Released on Jan 20, it soared to the top of the free charts on Apple's App Store by the following Monday, surpassing OpenAI's ChatGPT.
Here's what DeepSeek has done and why it is taking the AI industry by surprise.
What is DeepSeek?
Officially known as Hangzhou DeepSeek Artificial Intelligence Fundamental Technology Research Co., Ltd., the firm was founded in July 2023. As an innovative technology startup, DeepSeek is dedicated to developing cutting-edge large language models (LLMs) and related technologies.
The just-released R1 model has achieved an important technological breakthrough: it uses pure deep learning methods to let reasoning capabilities emerge spontaneously in the AI.
Unlike traditional approaches such as Chain-of-Thought (CoT) and Supervised Fine-Tuning (SFT), DeepSeek has distinguished itself in the AI industry by adopting Reinforcement Learning (RL) as a core training method.
While CoT and SFT rely on step-by-step reasoning and huge amounts of labeled data, respectively, RL enables models to learn through interaction and reward mechanisms, making it better suited for complex and dynamic tasks.
The adoption of RL has allowed DeepSeek to enhance its models' reasoning, adaptability and efficiency, setting it apart as a frontrunner in the field.
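To make "learning through interaction and reward mechanisms" concrete, here is a minimal sketch of a reward-driven learner. It is a toy epsilon-greedy example, not DeepSeek's actual training procedure: the function name `train_bandit`, the three hypothetical "answer strategies" and their reward values are all assumptions made for illustration.

```python
import random

def train_bandit(rewards, steps=500, epsilon=0.1, seed=0):
    """Learn which action pays best purely from trial and reward."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # current value estimate per action
    counts = [0] * len(rewards)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action with the highest estimate so far.
            action = max(range(len(rewards)), key=estimates.__getitem__)
        reward = rewards[action]      # feedback from the (toy) environment
        counts[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Hypothetical environment: three answer strategies with fixed rewards.
estimates = train_bandit([0.0, 0.2, 1.0])
best_action = max(range(len(estimates)), key=estimates.__getitem__)
print("learned value estimates:", estimates)
print("preferred action:", best_action)
```

No labeled examples are supplied here. The learner is only told how good each attempt was, and it gradually shifts toward the action that earns the most reward.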
Bigger is no longer always smarter.
According to the technical report for its V3 model, DeepSeek's training cost was approximately $5.57 million, making it among the least expensive LLMs. Compared with other well-known models, DeepSeek cut costs by an order of magnitude.
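The headline figure can be reproduced with simple arithmetic. The sketch below uses the GPU-hour counts and the assumed rental price of $2 per H800 GPU hour as stated in the DeepSeek-V3 technical report; the variable and function names are illustrative. The report itself notes that the figure covers only the official training run, not earlier research or ablation experiments.

```python
# GPU hours per training phase, as reported in the DeepSeek-V3 technical report.
PHASE_GPU_HOURS = {
    "pre-training": 2_664_000,
    "context extension": 119_000,
    "post-training": 5_000,
}
# Rental price assumed by the report, in US dollars per H800 GPU hour.
PRICE_PER_GPU_HOUR = 2.0

def training_cost_usd(phase_hours, price_per_hour):
    """Total cost = total GPU hours x price per GPU hour."""
    return sum(phase_hours.values()) * price_per_hour

cost = training_cost_usd(PHASE_GPU_HOURS, PRICE_PER_GPU_HOUR)
print(f"total GPU hours: {sum(PHASE_GPU_HOURS.values()):,}")
print(f"estimated training cost: ${cost / 1e6:.3f} million")
```

The result, about $5.576 million, matches the roughly $5.57 million quoted above.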
The AI industry's development has long relied on piling up computing power. The cost-efficient DeepSeek model may upend the AI landscape.
US investment bank and financial services provider Morgan Stanley said DeepSeek demonstrates an alternative path to efficient model training, one that differs from the current arms race among hyperscalers, by significantly increasing data quality and improving model architecture.
"Bigger is no longer always smarter," it said.
Open-source model
Open source allows researchers, developers and users to access the model's underlying code and its weights -- the parameters that determine how the model processes information -- enabling them to use, modify or enhance the model to suit their needs.
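As a rough illustration of what "weights" means, the toy sketch below treats a model as nothing more than a list of numbers applied to an input. Everything in it (the `predict` function, the two-weight "model", the values) is hypothetical and far simpler than a real LLM, but the principle is the same: anyone who has the weights can run the model and change them.

```python
def predict(weights, bias, features):
    """A tiny linear model: the weights decide how each input is used."""
    return sum(w * x for w, x in zip(weights, features)) + bias

features = [1.0, 2.0]

# Hypothetical released parameters, as if downloaded from an open-source project.
released = {"weights": [0.5, -0.25], "bias": 0.1}
original = predict(released["weights"], released["bias"], features)

# Because the weights are open, a user can modify them to suit their needs.
modified = dict(released, weights=[0.5, 0.25])
adapted = predict(modified["weights"], modified["bias"], features)

print("original output:", original)
print("adapted output:", adapted)
```

The code is identical in both calls. Only the weights differ, and so does the output.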
DeepSeek has greatly benefited from open-source principles and, in turn, demonstrates a strong commitment to sharing knowledge and contributing to the collective advancement of technology.
Meta's chief AI scientist Yann LeCun said: "They came up with new ideas and built them on top of other people's work. Because their work is published and open source, everyone can profit from it."
This article is excerpted from Shanbay English, Feb 14, 2025.