Last month, DeepSeek shook up the AI landscape with a new, competitive simulated reasoning model that it released for free under an MIT license, making it available for anyone to download and use. Now the company is going a step further: starting next week, it plans to publish five open source repositories containing some of the underlying code that powers the model.
In a social media announcement late Thursday, DeepSeek detailed plans for its upcoming "Open Source Week." The company said the daily releases would document building blocks of its online service that have already been deployed and tested in production. DeepSeek framed the effort as a commitment to the open source community, saying that every line of code it shares adds momentum to the field's collective progress.
Specific details about the forthcoming code remain sparse, but the accompanying GitHub page for DeepSeek Open Infra promises transparency, describing the releases as code that has driven the company's work so far. It also links to a 2024 paper documenting DeepSeek's training architecture and software stack.
The open source initiative stands in stark contrast to industry leader OpenAI, whose ChatGPT models remain entirely proprietary, making it difficult for outside users and researchers to examine how they work. Open releases also let users run DeepSeek's models locally rather than sending data through its mobile app, which is facing privacy scrutiny from several governments.
DeepSeek's initial model release included open weights, allowing users to access the data representing the strength of connections within the model's simulated neurons. This enables end users to fine-tune model parameters with additional data for specific purposes. Major models like Google's Gemma and Meta's Llama have also adopted this open weights approach, often accompanying it with open source inference-time code.
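To make that concrete, here is a minimal sketch of what an open weights release allows in practice, using the Hugging Face transformers and peft libraries. The model identifier is illustrative, and the LoRA-style fine-tuning shown is one common lightweight approach, not anything DeepSeek prescribes:

```python
# Minimal sketch: loading open weights locally, then preparing a
# lightweight fine-tune. Model ID and LoRA settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed open-weights checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because the weights are on disk, inference needs no vendor API.
inputs = tokenizer("Open model weights are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Open weights also permit fine-tuning on your own data; LoRA adapters
# (via peft) train only a small set of added parameters.
lora_config = LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16,
                         target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction
```

Both steps run against locally downloaded weights, with no calls back to the vendor's servers.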
However, it remains uncertain whether DeepSeek's releases will extend to the training code used to develop the model. According to the Open Source Initiative (OSI), truly open source AI must include the full training code and enough detail about the training data for a skilled person to recreate the system. A fully open source release would make it easier for researchers to identify biases and limitations baked into the model.
In related news, Elon Musk's xAI released an open source version of Grok 1's inference-time code and has pledged to do the same for Grok 2 soon, though Grok 3 remains proprietary, available only to X Premium subscribers. Meanwhile, Hugging Face quickly built an open source clone of OpenAI's Deep Research feature, another sign of momentum behind open AI releases.