DeepSeek Revelation: The Unpredictable Path to Greatness

02/05 2025

As Liang Wenfeng eloquently stated, "Innovation arises naturally, not through deliberate arrangement."

Written by | She Zongming

"National-level technological breakthrough", "America's Sputnik moment", "AI Pearl Harbor event"... The "coolest national AI trend" ignited by DeepSeek shows no signs of waning.

The United States has launched a "witch hunt" against DeepSeek under the guise of national security investigations, inadvertently fueling its popularity in public opinion.

DeepSeek appears to be the clear winner of the "Spring Festival season." Founded just a year and a half ago, it quietly brewed a thunderbolt, like a "rebellious demon child." With its formidable techniques it shook the Nasdaq and boosted the confidence of the Chinese people, earning the label of "great hero." Having sparked a "computing power revolt" in the AI field and rewritten the rules of the AI power game, it is said to deserve being "deified."

As a result, DeepSeek has been surrounded by various shocking headlines in domestic and international public opinion.

▲Feng Ji, producer of "Black Myth: Wukong," regards DeepSeek-R1 as a national-level technological achievement.

In the grand narrative, DeepSeek's meteoric rise is readily compared to Huawei breaking through US blockades to relaunch an "aspirational" 5G phone, the Huawei Mate 60 Pro, and is tied to the discourse of China's rise as a technological power.

With the Chips Act and TikTok divestiture bill becoming direct reflections of the intensity of the China-US technology competition, such interpretations are inevitable.

However, this should not obscure the "de-nationalism" aspect of the DeepSeek miracle. It should be noted that many specific factors contribute to DeepSeek becoming what it is today, such as Liang Wenfeng's extreme technological idealism and DeepSeek's unconventional approach to talent utilization.

It is precisely these multiple unique aspects of DeepSeek that allow it to bloom like a thorn flower, unafraid of harsh winds and arid lands.

In my opinion, if I had to encapsulate the essence of the DeepSeek Revelation in one sentence, it would be: greatness cannot be planned.

01

The phrase "greatness cannot be planned" implies that one should not attempt to design and plan everything, as many things are the result of "unintentional successes."

DeepSeek's simultaneous topping of the download charts on Apple's App Store in both China and the United States is the most intuitive proof. Its rise has been filled with numerous "unexpected surprises."

Nine months ago, most people's impression of China's AI front-runners was still dominated by large companies such as BAT and ByteDance, along with the "Big Six" large-model startups (Zhipu AI, Moonshot AI, Baichuan AI, MiniMax, StepFun, and 01.AI).

Who could have imagined that the "relatively unknown" DeepSeek would break through the ceiling of cost-effectiveness, as if to declare: "This may be a film starring the mainstream players, but I insist on having my name in the credits"?

Just over a month ago, most people still assumed that OpenAI's challengers were all in Silicon Valley. Sequoia Capital had described the AI field as a five-way rivalry: Microsoft + OpenAI, Amazon + Anthropic, Google, Meta, and xAI.

Who could have imagined that DeepSeek would launch DeepSeek-V3, which can compete with top closed-source models like GPT-4o and Claude 3.5 Sonnet, with only about one-tenth of the pre-training costs of OpenAI?

Just over ten days ago, some people still believed that DeepSeek-V3 was just a flash in the pan, and that even DeepSeek itself would struggle to replicate its success.

Who could have imagined that DeepSeek would then unveil DeepSeek-R1, still low-cost but even more powerful (with performance comparable to the official release of OpenAI o1, and API pricing some 27 to 55 times lower), directly shocking the European and American technology communities?

▲After DeepSeek-R1 triggered an earthquake in US stocks, it attracted widespread attention from domestic and foreign media.

In fact, even when DeepSeek-V3 was released, some domestic experts still likened DeepSeek to a Xiaomi SU7: "resembling a certain brand in appearance, with single-layer, civilian-grade brake calipers and average sound insulation... Its brakes fade badly after a few laps on the track, and in any case it is something that NIO, Xpeng, and Li Auto have already played with, without breakthroughs in technology or form." Some foreign observers held that although DeepSeek had delivered a "$30 iPhone," it was only a substitute.

However, the shock from Trump, the admiration from Sam Altman, and the veiled digs from Elon Musk that greeted DeepSeek-R1 proved that it was indeed no ordinary entity.

Now, many domestic netizens are filling their screens with exclamations that add up to the Emperor Xuanzong meme: "How many more surprises do you have that I don't know about?"

02

"Greatness cannot be planned" is also because many "great" things may start out as "small."

There may still be debate about whether DeepSeek can carry the title of a "national-level achievement," but it certainly deserves the word "amazing."

What exactly makes DeepSeek-R1 so impressive? Those who understand technology may immediately spout a bunch of technical terms: synthetic data, knowledge distillation, FP8 low precision, sparse models, MoE, multi-head attention mechanism...

These technologies may not be original, but DeepSeek's ability to utilize existing technologies to achieve extreme improvements in training efficiency and computing power energy efficiency represents a phenomenal breakthrough.

It is often said that "no matter how many carriages you add together, you can't make a car." When Steve Jobs invented the iPhone, he didn't just stack MP3 and camera functions onto a feature phone; instead, he redefined the phone with touchscreen experiences and hardware-software integration. The rarity of DeepSeek also lies in its "redefinition" – it breaks path dependence and redefines the way computing power is enhanced.

Some people use this metaphor: If enhancing computing power is like building a building, then OpenAI is like piling up bricks (chip hardware), while DeepSeek relies on inventing reinforced concrete (mathematical framework innovation) to reshape the construction method.

OpenAI has turned large model research and development into a competition to see who has more bricks, while DeepSeek has turned it into a competition to see who has a more efficient construction method.

▲Some netizens joke about the difference between OpenAI and DeepSeek in this way.

With just over 200 employees, DeepSeek is able to leverage its engineering capabilities to "invent" a computing power multiplier through algorithmic optimization, accomplishing many things that many large domestic and foreign companies have failed to do. This inevitably reminds people of Kevin Kelly's words in "What Technology Wants": "The most successful company in the future will inevitably be a small company that is still unknown today and operates outside the social media sphere."

Small is big. When the successful experiences of large companies become their shackles, startups can demonstrate greater innovation potential through curiosity-driven innovation momentum and flat, hierarchy-free organizational structures.

DeepSeek is a typical example. The high vitality of AI startups, combined with Liang Wenfeng's high-dimensional cognition, produces a stunning chemical reaction.

Liang Wenfeng's technical faith in AGI (Artificial General Intelligence), his clear insight that the essence of the gap between Chinese and American AI is the difference between originality and imitation, and his forward-looking judgment that "the moat of closed source is short-lived, and OpenAI's closed source cannot prevent being surpassed" all reflect a cognitive ability above the industry.

DeepSeek's hierarchy-free, flexible way of collaborating, its recruitment standards that put potential and curiosity ahead of industry experience, and its widely respected open-source strategy all stem from this outlook and amplify the company's potential.

Therefore, DeepSeek can adhere to a long-term strategic focus on "not doing applications, but focusing on large model research" at a time when the OpenAI route is prevalent. It can embrace a more efficient and open AI development path.

Part of the answer to why DeepSeek is successful lies therein.

03

"Greatness cannot be planned" also means that those contingencies and uniqueness should not be ignored.

Industry expert Yang Kuan said: "While OpenAI indulges in brute-force stacking of hardware, the DeepSeek team is playing 'computing power Tetris,' squeezing the value of each CUDA core to four decimal places. In terms of hardware utilization, Silicon Valley's GPU clusters, after communication losses, reach 30%-40%, while DeepSeek's self-developed MoE + dynamic routing algorithms reach 78%. This is not a technological gap, but a generational crushing in engineering thinking."

This is inseparable from the MLA (Multi-head Latent Attention) architectural innovation proposed by the DeepSeek team, which cuts GPU memory usage to 5%-13% of that of traditional methods. Behind it was a young researcher's sudden inspiration.
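To make the scale of that memory saving concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the saving comes from caching one small compressed vector per token instead of a full key and value for every attention head. The function names and all dimensions are hypothetical, chosen only for illustration; they are not DeepSeek's published configuration.

```python
def cache_elems_standard(n_heads: int, head_dim: int) -> int:
    """Elements cached per token per layer by standard multi-head attention:
    one key vector and one value vector for every head."""
    return 2 * n_heads * head_dim


def cache_elems_latent(latent_dim: int, rope_dim: int) -> int:
    """Elements cached per token per layer when only a shared compressed
    latent vector plus a small positional key is stored."""
    return latent_dim + rope_dim


def cache_ratio(n_heads: int, head_dim: int, latent_dim: int, rope_dim: int) -> float:
    """Latent cache size as a fraction of the standard cache size."""
    return cache_elems_latent(latent_dim, rope_dim) / cache_elems_standard(n_heads, head_dim)


# Hypothetical dimensions, for illustration only.
ratio = cache_ratio(n_heads=32, head_dim=128, latent_dim=512, rope_dim=64)
print(f"latent cache is {ratio:.1%} of the standard cache")
```

With these made-up numbers the compressed cache comes to roughly 7% of the standard one, inside the 5%-13% range quoted above. A model with more heads or a smaller latent vector would land at a different ratio.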

He also mentioned that Liang Wenfeng injected quantitative trading thinking into AI training: risk hedging (constructing an "investment portfolio" of multimodal data), high-frequency parameter tuning (optimizing hyperparameters every 2 hours, against an industry norm of once every 72 hours), and dynamic stop-loss (automatically terminating inefficient training branches). This set of "Wall Street alchemy" makes each DeepSeek training run feel like a speedrun of the technology tree in "Civilization VI".
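The "dynamic stop-loss" idea can be pictured with a generic early-termination loop. The sketch below is not DeepSeek's procedure, which the article does not describe; it is a minimal Python illustration under the assumption that several training branches are compared at regular checkpoints and any branch trailing the current best by more than a set tolerance is cut. The `Branch` class, the `run_with_stop_loss` function, the branch labels, and the simulated loss values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Branch:
    name: str
    losses: List[float]                      # simulated validation loss per checkpoint
    alive: bool = True
    history: List[float] = field(default_factory=list)


def run_with_stop_loss(branches: List[Branch], tolerance: float) -> List[str]:
    """At each checkpoint, stop every branch whose latest loss is more than
    `tolerance` worse than the best surviving branch. Returns survivor names."""
    n_checkpoints = min(len(b.losses) for b in branches)
    for step in range(n_checkpoints):
        live = [b for b in branches if b.alive]
        for b in live:
            b.history.append(b.losses[step])  # "train" one more interval
        best = min(b.history[-1] for b in live)
        for b in live:
            if b.history[-1] > best + tolerance:
                b.alive = False               # stop-loss: cut the lagging branch
    return [b.name for b in branches if b.alive]


# Simulated loss curves for three hypothetical hyperparameter settings.
branches = [
    Branch("lr=1e-3", [2.0, 1.5, 1.2, 1.0]),
    Branch("lr=1e-2", [2.1, 2.0, 1.95, 1.9]),
    Branch("lr=3e-4", [2.2, 1.7, 1.3, 1.05]),
]
survivors = run_with_stop_loss(branches, tolerance=0.3)
print(survivors)
```

In this toy run the slowly improving branch is cut at the second checkpoint, so no further compute is spent on it, while the two competitive branches run to the end. Checking more often, as in the 2-hour cadence mentioned above, simply means cutting losers sooner.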

This easily brings to mind the examples given by Kenneth Stanley and Joel Lehman in "Why Greatness Cannot Be Planned": the Wright brothers, who invented the airplane, were originally bicycle manufacturers; the vacuum tube was a foundational component of early computers, but its birth had nothing to do with computers...

Liang Wenfeng, who started his career in quantitative trading, has created an AI large model with superior "cost-effectiveness ratio," adding another case to this.

▲DeepSeek, which is called the "mysterious oriental force" by many foreigners, has a unique development path.

DeepSeek is also well known for its open-source model: with OpenAI having strayed from its original intentions and become "CloseAI," DeepSeek has become the truly open AI; while OpenAI treats developers as "digital sharecroppers," DeepSeek uses open-source licenses to launch an "AI land revolution"... This is also where it shines.

This is also closely related to Liang Wenfeng's technical idealism. Someone else might have adopted Sam Altman's tactics to compete with rivals.

Liang Wenfeng said, "Innovation arises naturally, not through deliberate arrangement."

This is equivalent to patting Kenneth Stanley on the back in agreement, because Kenneth Stanley said that true greatness cannot be planned, and that following curiosity step by step is the right path to achieving extraordinary things.

04

"Greatness cannot be planned," so those seemingly small seeds may also "blossom and stretch out new branches"; those explorations in marginal areas, peripheral regions, and hidden corners may also "create miracles with small forces."

Like foreign companies such as GAFA (Google, Apple, Facebook, Amazon) and NVIDIA, and domestic companies such as BAT, DeepSeek was not planned but grew in suitable soil.

Since innovation is the product of stimulating curiosity, activating creativity, and breaking away from path dependence, rather than a planned outcome, the proper care and incentive for innovation should not be a return to path dependence, but should provide a good institutional environment for curiosity and creativity, including an inclusive atmosphere and room for trial and error.

Currently, after DeepSeek's popularity, some reactions in the public opinion field are worth being vigilant about. Some of these reactions further lead to a path dependence on "planning." Specifically, they include:

First, importing DeepSeek's breakthroughs into a nationalist context, believing that it should be incorporated as the "AI national team" and strongly supported.

The "self-generated" DeepSeek does not need to be forced to grow prematurely; it only needs a climate and soil suitable for innovation.

A closer look will reveal that from "Black Myth: Wukong" to Unitree Technology's robot dog to DeepSeek's top-ranked large model in the StyleCtrl category, they were all born in Hangzhou.

This is not accidental. Generally speaking, a market with a solid foundation, many private enterprises, a strong innovation atmosphere, sufficient economic vitality, and broad development opportunities has a high probability of becoming a future technology center. They are all interconnected.

Respecting the market, encouraging innovation, embracing openness, and tolerating failure to cultivate a market ecology suitable for innovation may lead to the emergence of more enterprises – including small and medium-sized private enterprises – like DeepSeek.

On the other hand, interventions in the name of care and coercion under the pretext of responsibility may inhibit their vitality.

Second, pushing DeepSeek into the whirlpool of the great era with an attitude of "overthrowing Silicon Valley and challenging Wall Street" in the spirit of "Wow, my DS is amazing."

Amidst the United States' escalating restrictions on Chinese AI chips, DeepSeek has ingeniously leveraged Huawei chips, local Chinese AI expertise, and reduced computing costs to develop leading-edge large models. This achievement naturally ignites national self-confidence and pride, and is poised to disrupt the current stalemate. However, it's crucial not to overstate DeepSeek's accomplishments or tacitly align with the notion of "decoupling and severing chains."

Viewing DeepSeek's breakthrough as "national-level" is understandable from a perspective of lifting spirits and asserting independence. Nevertheless, one should refrain from associating the company with terms like "overthrowing" or "challenging," as this might inadvertently play into external "threat theories" and unwittingly place it in the crosshairs.

▲Currently, DeepSeek is facing a "witch hunt" orchestrated by the US under the pretext of a national security investigation.

In essence, we must avoid the extremes of "self-deprecation" and "arrogance." Wang Weijia, author of "Dark Knowledge," contends that while DeepSeek's achievements have narrowed the AI technology gap between China and the US, the overall AI landscape remains unchanged.

He emphasizes, "In key AI technology areas, China still significantly lags behind the US in chip technology. As for algorithmic breakthroughs over the past decade, from AlexNet in 2012 to Transformer in 2017, to ChatGPT in 2022 and subsequent advances such as chain-of-thought, RAG, and reasoning training, most originated in the US, with a small share coming from France's Mistral. DeepSeek's contribution stands at around 5%, which is nonetheless remarkable."

Moving forward, we need more companies like DeepSeek to emerge and continue narrowing the gap, rather than extinguishing the spark through overhype.

05

Ultimately, DeepSeek's success epitomizes the view in "Why Greatness Cannot Be Planned" – "In the quest to explore uncharted territories, remain open to intriguing possibilities. With sufficient stepping stones in place, great achievements will unexpectedly materialize." DeepSeek has taken a pivotal step in AI history and will undoubtedly face scrutiny amidst future great power rivalries and technological competitions.

Yet, regardless of the circumstances, remember this – DeepSeek's innovation was not premeditated, nor should it be coerced by "plans" post-recognition.
