Large Model Daily | Gemini's Lead Engineer is Amazed

01/05/2026

01

Major Launches (New Models/Products/Open Source)

① NVIDIA Platform Unveils Domestic Cutting-Edge Models, GLM-4.7 and Minimax-M2.1 Now Free to Use

Recently, as Chinese AI firms Zhipu and Minimax announced their respective plans for listing on the Hong Kong Stock Exchange, NVIDIA promptly responded by officially integrating the latest large language models from these two entities—GLM-4.7 and Minimax M2.1—into its NVIDIA NIM API platform. This platform is designed to package large models into plug-and-play microservices, thereby substantially reducing the deployment and debugging hurdles for developers.

Currently, after registering an NVIDIA account and generating an API key, users can call both models programmatically at no cost. Although they are not yet listed in the official model marketplace, the interfaces themselves are already live.
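NIM exposes an OpenAI-compatible chat-completions endpoint, so a call reduces to an authorized POST with a JSON body. The sketch below only assembles that request rather than sending it; the base URL matches NVIDIA's hosted NIM endpoint, but the model identifier passed in by the caller (e.g. a GLM-4.7 id) is an assumption and should be taken from the platform's own model listing.

```python
import json

NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"  # NIM's OpenAI-compatible endpoint


def build_chat_request(api_key: str, model: str, user_message: str) -> tuple[str, dict, str]:
    """Assemble the URL, headers, and JSON body for a NIM chat-completion call."""
    url = f"{NIM_BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",  # key generated in the NVIDIA account console
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": True,  # stream tokens back to gauge responsiveness
    })
    return url, headers, body
```

The returned triple can be sent with any HTTP client, for example `requests.post(url, headers=headers, data=body)`; the model id used below ("zai/glm-4.7") is illustrative, not confirmed.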

Initial tests indicate that in general conversation scenarios, GLM-4.7 responds at roughly 25 tokens per second, while Minimax-M2.1 reaches up to 150 tokens per second. The disparity may simply reflect how recently the models launched, with compute resources still being provisioned.
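Figures like the 25 vs. 150 tokens per second above come from timing a streamed response: count the token chunks as they arrive and divide by elapsed wall-clock time. A minimal harness, assuming the caller feeds it the per-chunk deltas of any streaming API response:

```python
import time
from typing import Iterable, Tuple


def measure_throughput(token_stream: Iterable[str]) -> Tuple[int, float]:
    """Consume a stream of token chunks and return (token_count, tokens_per_second)."""
    start = time.perf_counter()
    count = 0
    for _ in token_stream:  # each item is one streamed chunk
        count += 1
    elapsed = time.perf_counter() - start
    tps = count / elapsed if elapsed > 0 else float("inf")
    return count, tps
```

Note this counts chunks rather than true tokenizer tokens, so it approximates the figures quoted above; for exact counts you would tokenize the concatenated output with the model's own tokenizer.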

Short Comment:

This move benefits every party involved: developers skip cumbersome deployment and integrate advanced model capabilities quickly; Chinese AI firms gain global visibility for their technology through NVIDIA's extensive ecosystem; and NVIDIA, by offering free trials of the latest models, strengthens its developer community and platform loyalty. Technology transcends borders and ecosystems thrive together, perhaps heralding a new chapter in global AI collaboration.

② Potential Leak of New Grok Version, xAI's Enigmatic Model Surfaces on Evaluation Leaderboards

Recently, AI researchers have spotted records of several oddly named models, including 'Vortexshade,' 'Quantumcrow,' and 'Obsidian,' on well-known model evaluation platforms such as LMArena and DesignArena.

Their naming conventions closely resemble those of previous xAI product code names, leading most to surmise that they could be upcoming iterations of the Grok series.

Short Comment:

Although Grok has neared the top tier in raw intelligence, its actual user base and ecosystem influence still trail far behind leading products like GPT and Gemini. As AI competition increasingly centers on practical applications, Grok continues to face challenges in multimodality and safety compliance.

③ OpenAI Envisions Screenless Voice Interaction, Plans to Launch Conversational AI Hardware

According to TechCrunch, OpenAI is assembling a dedicated team to craft a new generation of AI voice devices capable of engaging in continuous and natural conversations, with the ambition of unveiling the first screenless personal assistant product by 2026. This device aims to transcend the current mechanical 'question-and-answer' interaction mode of voice assistants, supporting more human-like communication methods such as interruptions and overlapping dialogues, akin to having an intelligent companion by your side.

In reality, OpenAI is not the sole entity focusing on the voice sector: Meta has integrated a multi-microphone system into its Ray-Ban smart glasses to enhance voice recognition in noisy environments; Google is experimenting with 'voice summary search,' converting text information into voice broadcasts; and Tesla has incorporated Grok into its vehicle system, enabling natural language control of vehicle functions.

Short Comment:

From graphical interfaces to touch operations, each shift in interaction methods reshapes the technological landscape. OpenAI's current endeavor represents a forward-looking wager on interaction forms in the 'post-screen era.'

However, the path of voice interaction hardware is not devoid of precedents: Humane AI Pin stumbled due to a subpar user experience, and Google's XR exploration has repeatedly fallen short of expectations. Thus, the crucial question that all entrants must address is how to translate technological capabilities into seamless and reliable user experiences.

02

Technical Advancements (Papers/SOTA/Algorithms)

① Claude Code's Programming Prowess Amazes the Industry, Claims to 'Accomplish a Year's Workload in One Hour'

Over the past 24 hours, Jaana Dogan, a lead engineer on Google's Gemini team, has been posting on social media that after she gave Claude Code the requirements for her team's 'distributed agent orchestrator,' a project that took a year to build, it produced structurally complete, directly runnable code in just one hour.

She expressed 'not anxiety, but admiration,' noting that although the code could not be directly utilized in Google's core projects due to security considerations, its level of completion matched the results of her team's year-long exploration.

Dogan also underscored that AI-generated code still necessitates manual review and iterative optimization, but this marks a significant leap in AI programming capabilities from 'fragment completion' to 'system-level intent understanding.'

Short Comment:

Although this is presently just an individual case shared without divulging the complete code, the trend it unveils is unmistakably clear: AI programming is rapidly entering a new phase of 'system-level assistance.' Programmers may not lose their jobs due to this, but mastering AI tools to enhance efficiency will undoubtedly become an indispensable skill for future developers.

From 'not reinventing the wheel' to 'not reinventing the entire vehicle,' the automated generation of high-quality code is poised to become one of the most seamless areas for AI technology commercialization.

03

Computing Power and Infrastructure (Chips/Cloud/Data Centers)

① Anthropic Invests $21 Billion, Procures One Million TPU Chips from Broadcom

According to semiconductor analysis firm SemiAnalysis, Anthropic has inked an agreement with Broadcom to acquire approximately one million TPU v7p chips for its self-built data center clusters. These chips were co-designed by Google and Broadcom, but in this transaction, Broadcom will directly supply Anthropic with whole rack systems, while Google will receive corresponding fees as the IP licensor.

Broadcom's CEO confirmed at a December investor meeting that Anthropic's total AI-related orders have surged to $21 billion, and that, because of the whole-rack shipment model, the gross margin on this batch of orders is lower.

Short Comment:

This procurement trend mirrors the profound evolution of the AI computing power supply chain: Broadcom is transitioning from a chip designer to a system integration supplier, while Google is shifting from hardware sales to intellectual property licensing.

As leading players such as Google, OpenAI, and Anthropic sign substantial orders with Broadcom and increasingly adopt self-developed or customized chips (such as TPUs), NVIDIA's long-standing 'sole dominant' position in the high-end AI compute market is quietly starting to fragment.

Disclaimer: copyright in this article belongs to the original author. It is reprinted solely to share information more widely. If the author attribution is incorrect, please contact us promptly to correct or remove it. Thank you.