02/10 2025
This is the 808th article published by the Xijing Research Institute, compiled from the Institute's semi-monthly live broadcast with some omissions.
During the Spring Festival, a young man in his thirties from Hangzhou suddenly shattered the long silence on the global technological stage, simultaneously causing significant ripples in global capital markets, particularly a direct impact on US stock giants like NVIDIA. For the AI-backed US stock market, this was undoubtedly a black swan event. Although NVIDIA's stock price has somewhat recovered in recent days, it will take more time to observe when the substantial gap and damaged confidence can fully heal. For investors, the crucial question is whether this is a fleeting technological shock or a profound technological shift.
We have previously emphasized that September 24th marked a policy shift—a massive monetary easing that signaled a change in top-level economic thinking and underscored confidence in sustaining the capital market, driving an overall recovery in valuation levels. The new policy put option ensures that the index operates around a new central pivot of around 3200 points, and even with fluctuations, it will not easily breach that previous low. Since the September 24th shift, the technology sector has achieved significant gains, and the valuation center of technology stocks has shifted notably upwards. Presently, we need to consider whether there is also a possibility of a trending improvement on the technology front and whether DeepSeek is sparking a revolutionary and disruptive shift in the technology field.
I. A Significant Engineering Innovation
The primary question now is how to define the impact of DeepSeek. We must set aside the simplistic and extreme self-media narratives of the 'national destiny theory' and the 'farce theory' and assess it objectively from the perspective of the technology itself. First, let me share my conclusion. After in-depth research over the past few days, my current understanding of DeepSeek remains consistent with the views in my earlier article "Is DeepSeek the Rise of National Destiny or a Miracle?" Although I cannot claim that DeepSeek represents a great technological revolution, it is indeed a milestone in engineering innovation and another exemplary demonstration of the strengths at the core of Chinese engineering culture.
We can divide technological revolutions into two processes: scientific revolutions and industrial revolutions. The development of artificial intelligence is no different. AI has a long history of research and development, and the field is conventionally dated to the mid-1950s. If we trace it back to when Turing proposed related concepts, it would be even earlier. However, what truly made AI widely recognized was Google DeepMind's AlphaGo. Like the later GPT models, it made use of reinforcement learning (RL). AlphaGo first learned from records of human expert games and then kept improving through reinforcement learning in self-play. Its strength lies in not relying entirely on what it had learned from past games but in continuing to learn and reinforce itself, marking a significant technological milestone in the AI revolution.
Cars were not invented in China, yet today China is the world's largest automobile producer. The core technology of new energy vehicles was not invented in China either, yet Chinese new energy vehicles now lead the global market. What truly made cars affordable for Americans was Ford's assembly line, which optimized processes and reduced costs. Printing offers a similar lesson: copying the Bible by hand in medieval Europe was very costly, but after China's movable type printing was introduced to Europe, the Bible became affordable and spread quickly, so that far more people could read it. Knowledge was no longer monopolized, and human value was highlighted, directly leading to a series of changes such as the Reformation, the Scientific Revolution, and the Enlightenment. Is it the scientist who invented the car who is great, or the engineer who made car ownership possible for everyone?
The same applies to artificial intelligence. Without DeepSeek, the democratization of large model applications would still be a distant dream. Many overseas observers, including Silicon Valley engineers and even some overseas media that are traditionally unfriendly, have praised DeepSeek highly. It can be said that this is a significant engineering transformation, and it can even be considered an industrial revolution in the field of artificial intelligence.
II. Viewing the Development Path and Trends of AI from the Rise of DeepSeek
In 2017, Google proposed a model architecture designed for machine translation. It introduced a self-attention mechanism that captures long-distance dependencies in sequence data more efficiently and supports parallel computing, greatly increasing the speed of training and inference. This is the Transformer architecture. Thanks to these advantages, the Transformer quickly spread to other areas of natural language processing and gradually became the mainstream architecture for large language models, propelling the development of generative artificial intelligence.
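To make the self-attention idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not any production implementation. For simplicity it assumes that the queries, keys, and values are the token vectors themselves (a real Transformer first applies learned projections), and the three toy token vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (simplifying assumption)."""
    d = len(x[0])
    weights = []
    for q in x:
        # Compare this token with every token in the sequence, near or far.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    # Each output vector is a weighted blend of all token vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical toy "sentence" of three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Every token's output draws on all other tokens regardless of their distance in the sequence, and each row of weights can be computed independently of the others, which is what makes parallel computation possible.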
The essence of artificial intelligence lies in the application of mathematics and physics, built step by step on physical foundations such as the electrical revolution and the chip revolution. The model's working principle is actually not complex; it is a very typical applied mathematics problem that draws mainly on three fields: linear algebra, statistics, and calculus. First, linear algebra is used to convert text into numerical vectors. Statistics is then used to estimate, from those vectors, which text is most likely to come next, and calculus is used to adjust the model's parameters during training so that these estimates improve. With massive amounts of data and computation, and a very large space of possible outputs, artificial intelligence can produce a seemingly limitless variety of results.
The mathematicians and physicists in Silicon Valley are a group of idealistic intellectual elites who have spearheaded this technological revolution. However, they often overlook engineering problems such as saving costs and improving efficiency. DeepSeek's strength lies in its use of FP8, an 8-bit floating-point format, to accomplish tasks that overseas developers carry out with FP32 computing power, turning the aloof elite models of Silicon Valley into more accessible and affordable options and bringing AI to a broader audience. From the perspective of Silicon Valley scientists, whose ambitions reach for the stars, using FP8 instead of the higher-precision FP32 format is a step backward, even though FP8 significantly reduces costs. It can be said that this method was forced by circumstance. DeepSeek has also made substantial engineering optimizations on both the training and inference sides, particularly in its use of distillation.
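The trade-off between the two formats can be sketched with a little arithmetic. The example below is a rough illustration under stated assumptions, not DeepSeek's actual training scheme: it assumes an FP8 variant with 3 mantissa bits (the E4M3 layout), ignores exponent range limits and special values, and uses a hypothetical model of one billion parameters.

```python
import math

def round_to_mantissa_bits(x, bits):
    """Round x to the nearest value that keeps only `bits` explicit mantissa bits.

    Assumption: exponent range limits, subnormals, and special values are ignored.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)     # +1 accounts for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

def weight_memory_bytes(num_params, bits_per_param):
    """Memory needed to store the weights alone."""
    return num_params * bits_per_param // 8

value = 0.3
fp8_like = round_to_mantissa_bits(value, 3)     # FP8 (E4M3): 3 mantissa bits
fp32_like = round_to_mantissa_bits(value, 23)   # FP32: 23 mantissa bits

params = 1_000_000_000                          # hypothetical model size
print("FP8-like value :", fp8_like, "error", abs(fp8_like - value))
print("FP32-like value:", fp32_like, "error", abs(fp32_like - value))
print("FP32 weights (bytes):", weight_memory_bytes(params, 32))
print("FP8 weights (bytes) :", weight_memory_bytes(params, 8))
```

The lower-precision format stores each number far more coarsely but needs only a quarter of the memory, which is the cost advantage described above.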
Scientists in Silicon Valley often focus on developing cutting-edge technologies. In an environment where money and chips are abundant, they tend to overlook cost. However, high-end chips are expensive and the cost of training large models is enormous, making it hard for ordinary participants to take part. This gap has raised the technological threshold ever higher, and it may become even more pronounced in the future, with only a few players capable of leading technological development. Moreover, as the available historical data is used up and training costs rise, it becomes difficult to keep improving models through pre-training, and the pre-training era may soon come to an end. Therefore, some companies have begun exploring other training methods, such as reinforcement learning and supervised fine-tuning, to reduce dependence on pre-training, thereby improving model efficiency while lowering costs. Additionally, some companies have adopted a mixture-of-experts architecture and a multi-head attention mechanism to further optimize inference. With a mixture-of-experts design, the model calls only the necessary parameters during inference, thereby saving substantial computing power. The application of these new technologies may usher in a new industrial revolution and propel the development of the global technology ecosystem.
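The idea that a mixture-of-experts model calls only the parameters it needs can be sketched as follows. This is a toy illustration under assumptions, not any company's actual implementation: the four "experts" are hypothetical one-line functions, the gate scores are made up, and the router simply keeps the two highest-scoring experts.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts; the others are never evaluated."""
    output = 0.0
    used = []
    for idx, weight in top_k_route(gate_scores, k):
        output += weight * experts[idx](x)
        used.append(idx)
    return output, used

# Hypothetical experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 1.5]   # made-up router scores for one input

y, used = moe_forward(10.0, experts, gate_scores, k=2)
print("experts used:", used, "of", len(experts))
print("output:", round(y, 3))
```

Only two of the four experts do any work for this input, so the computation per input grows with the number of experts selected rather than with the total size of the model.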
I believe that this technological revolution will not only transform the technology industry but also have a profound impact on society as a whole. The reduction in costs may enable more people to utilize these technologies, fostering the enhancement of industrial manufacturing capabilities and the advancement of human cognitive equity. Simultaneously, it may also trigger a series of social and economic issues that require our close attention.
Finally, I would like to pose a question for your consideration. Will the large model engineering innovation sparked by DeepSeek add momentum to the A-share bull market that began on September 24th, 2024, turning it from a 'monetary bull market' into a 'tech bull market', as happened in the US stock market after ChatGPT emerged in November 2022? Of course, the 'tech bull market' in the US relies heavily on the seven tech giants. Does China's 'tech bull market' have seven such giants to support it? I remain cautiously optimistic.