<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>DeepSeek Papers</title>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
    <style>
        body {
            font-family: 'Arial', sans-serif;
            margin: 0;
            padding: 0;
            line-height: 1.6;
            color: #333;
            background-color: #f9f9f9;
        }
        header {
            background: #4CAF50;
            color: white;
            padding: 20px 0;
            text-align: center;
        }
        h1 {
            margin: 0;
            font-size: 2.5em;
        }
        .container {
            max-width: 800px;
            margin: 20px auto;
            padding: 20px;
            background: white;
            border-radius: 8px;
            box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
        }
        .paper {
            margin-bottom: 20px;
        }
        .paper a {
            text-decoration: none;
            color: #4CAF50;
            font-weight: bold;
        }
        .paper a:hover {
            text-decoration: underline;
        }
        .coming-soon {
            color: #e74c3c;
            font-size: 0.9em;
            margin-left: 10px;
        }
        footer {
            text-align: center;
            padding: 10px 0;
            background: #4CAF50;
            color: white;
            margin-top: 20px;
        }
    </style>
</head>
<body>
    <header>
        <h1>DeepSeek Papers</h1>
    </header>
    <div class="container">
        <h2>DeepSeek Research Contributions</h2>
        <p>Below is a list of significant papers by DeepSeek detailing advances in large language models (LLMs). Each entry includes a brief description; in-depth write-ups are coming soon.</p>
        <!-- Paper List -->
        <div class="paper">
| <a href="#">DeepSeekLLM: Scaling Open-Source Language Models with Longer-termism</a> | |
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> November 29, 2023<br>
            This foundational paper explores scaling laws and the trade-offs between data and model size, establishing the groundwork for subsequent models.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> May 2024<br>
            This paper introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing training costs by 42.5% compared to DeepSeek 67B.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeek-V3 Technical Report</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> December 2024<br>
            This report discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed precision training and HPC co-design strategies.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> January 20, 2025<br>
            The R1 model enhances reasoning capabilities through large-scale reinforcement learning, competing directly with leading models like OpenAI's o1.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> February 2024<br>
            This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group Relative Policy Optimization (GRPO) algorithm.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> May 2024<br>
            Focuses on enhancing theorem-proving capabilities in language models using large-scale synthetic data for training.</p>
        </div>
        <div class="paper">
            <a href="#">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</a>
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> June 2024<br>
            This paper details advancements in code-related tasks with an emphasis on open-source methodologies, improving upon earlier coding models.</p>
        </div>
        <div class="paper">
| <a href="#">DeepSeekMoE</a> | |
            <span class="coming-soon">[Deep Dive Coming Soon]</span>
            <p><strong>Release Date:</strong> January 2024<br>
            Discusses the Mixture-of-Experts approach within the DeepSeek framework, introducing fine-grained expert segmentation and shared expert isolation to improve expert specialization.</p>
        </div>
    </div>
    <footer>
        © 2025 DeepSeek Research. All rights reserved.
    </footer>
</body>
</html>