07/09 2024
Promoting sharing through joint consultation and promoting wisdom through good governance.
In early July, Shanghai was abuzz with a topic even hotter than the sweltering weather: artificial intelligence (AI). For four consecutive days, an exhibition space where the present meets the future was packed with tech companies from home and abroad.
This year's World Artificial Intelligence Conference (WAIC 2024) brought together 1,300 global leaders, exhibitors, and delegations from over 50 countries and regions, including 9 Turing Award, Fields Medal, and Nobel Prize winners, as well as 88 top domestic and foreign academicians. The exhibition area exceeded 52,000 square meters, with over 500 renowned enterprises showcasing more than 1,500 exhibits and over 50 new product launches, both reaching record highs.
With the theme of "Promoting sharing through joint consultation and promoting wisdom through good governance," the conference focused on three core areas: core technologies, intelligent terminals, and application empowerment, spotlighting the development of large models, computing power, robots, autonomous driving, and other fields.
01. Large Models Empowering All Industries
As application-level large models move toward commercialization, finding a niche and creating popular use cases has become the way for them to show their strength. Last year, by contrast, the industry focused mainly on foundational models, which led to "benchmark racing" and a proliferation of slogans like "surpassing GPT-4."
In 2024, the application and commercialization of large models have become the focus. Baidu founder Robin Li bluntly stated at the conference forum, "Without applications, foundational models, whether open-source or closed-source, are worthless."
Tencent showcased its Hunyuan large model and various AI applications, such as Tencent Yuanbao and Tencent Yuange, covering work efficiency enhancement, lifestyle entertainment, and other scenarios. According to "QuJieShangYe," Tencent's Hunyuan large model has been tested in nearly 700 internal Tencent businesses and scenarios. Additionally, Tencent demonstrated applications in the medical industry, such as digital humans and holographic cabins, leading the intelligent upgrade of enterprise services by quickly creating intelligent, image-based, and interactive "digital avatars."
Vimi, the first controllable character video generation large model for C-end users created by SenseTime, was selected as the "Treasure of the Exhibition Hall," the highest honor at WAIC, making it one of the most innovative exhibits at the conference. According to "QuJieShangYe," based on the powerful capabilities of SenseTime's Ririxin large model, Vimi can generate character videos that match target actions using only a photo of any style and supports multiple driving methods, including existing character videos, animations, voices, texts, and other elements.
As 80% of short videos and live streams are currently centered around people, Vimi provides a much-needed creation tool for C-end video creators.
In addition to the well-established large models from tech giants like Alibaba Tongyi, Baidu Wenxin, Tencent Hunyuan, and iFLYTEK Spark, "new forces" such as Bilibili, Zhipu AI, MiniMax, Jieyue Xingchen, and Mianbi AI have also attracted significant attention.
Bilibili exhibited its self-developed large language model series for the first time at WAIC, including the open-source Index-1.9B chat and Index-1.9B character models, which support knowledge Q&A, copywriting, logical reasoning, code generation, and more. Depending on the settings, they can generate Bilibili-flavored content in a range of styles. Bilibili also showcased various AI technology achievements and AIGC creative content, including a custom AI voicebank, the self-developed audio-video large model Bijian Studio, and self-developed AI dynamic comic technology.
Jieyue Xingchen, with its family of general large models and a commitment to multimodal fusion, unveiled three new Step series general large models at the conference: Step-2, a trillion-parameter language model (official version), Step-1.5V, a multimodal large model, and Step-1X, an image generation large model. Since its official announcement in March this year, the Step series has achieved comprehensive progress from 100 billion to trillion parameters, from language models to multimodal models, and from understanding to generation in just around 100 days. With the innovation of the Step series, Jieyue Xingchen won the title of SAIL Star at WAIC 2024.
Dr. Jiang Daxin, founder and CEO of Jieyue Xingchen, said, "To climb the AGI mountain, both 'trillion parameters' and 'multimodal fusion' are indispensable. Trillion-parameter scale is the basic threshold for achieving AGI, while multimodal large models are the only way to AGI."
Unlike last year's "Hundred Models War" at WAIC, this year's exhibition featured more applications of large models in specific scenarios, with medical care, education, and government affairs being the key focus areas.
BaiChuan Intelligence showcased its R&D progress over the past year, including four open-source large models, seven closed-source large models, and its AI application for the general public, "BaiXiaoYing."
BaiChuan Intelligence also presented its latest breakthroughs in general medical enhancement large models and AI medical applications. AI Health Advisor, an AI medical application built on BaiChuan Intelligence's general medical enhancement large model, was unveiled for the first time.
According to Wang Xiaochuan, founder and CEO of BaiChuan, the company will focus on exploring and breaking through in AI medical applications in the future.
The combination of "large models + education" is booming in the industry, with various vertical education large models continuing to be applied. At this year's WAIC, companies like NetEase Youdao, Yuanli Technology, and Xueersi showcased their latest achievements in the field of vertical education large models.
NetEase Youdao displayed its latest AI learning hardware, "Youdao Dictionary Pen X7," which is fully equipped with two key native applications of the Ziyue Education large model: Hi Echo, a virtual human English tutor, and XiaoP Teacher, an AI general family tutor. Additionally, the first highly integrated smart sports terminal based on the Ziyue Education large model, "Youdao Fun Screen," made a stunning debut, demonstrating NetEase Youdao's innovative fusion of education and technology.
After obtaining large model registration in May this year, Yuanli Technology showcased its "family bucket" of educational products based on its self-developed large model for the first time at WAIC, including domestic educational service products like Feixiang Planet, Yuan Coding, Xiaoyuan Learning Machine, Dolphin AI Learning, and two AI education overseas products, CheckMath and LeapMath. According to "QuJieShangYe," Yuanli Technology's large model technology covers dialogue tutoring, oral practice, reading comprehension, and other application scenarios for family education, as well as homework correction, academic analysis, answer counseling, and other educational application scenarios serving governments and schools.
Xueersi presented its two flagship products, the Jiuzhang large model and Xueersi Learning Machine, showcasing the latest application achievements and future prospects of AI in smart learning hardware and product applications.
Government affairs scenarios are now becoming an important area for the commercial application of large models.
At this WAIC, Kingsoft Office publicly unveiled its 13B-level self-developed government affairs model, Kingsoft Government Affairs Office Model 1.0. This model excels in official document writing and can generate five types of official documents, including notices, requests, speeches, bulletins, and plans.
Kingsoft Office also launched WPS AI Government Affairs Edition, which uses AI to improve internal office processes and to empower external-facing government service systems with functions such as policy inquiries, matter inquiries, and result inquiries.
Midu released its Midu Nest Government Affairs Large Model 3.0 and brought more than 20 application scenarios, including government hotlines, intelligent government Q&A, judicial document proofreading, law enforcement document writing assistance, intelligent tour guides, and promotional copywriting assistance.
Hanwang Technology, known for its digital reading technology and smart e-paper products, also showcased deployments of its TianDi large model in nine industries, including ancient Chinese, office work, and education, as well as at benchmark institutions such as national libraries and large state-owned banks.
Hu Shiwei, co-founder and president of Fourth Paradigm, said that AI milestone events often lead to an overestimation of technology's short-term value and an underestimation of its long-term value, followed by a period of silence. However, these technologies will gradually become the new infrastructure for "AI + all industries." Consequently, Fourth Paradigm's Xianzhi AI platform is committed to solving high-value problems within the industry, such as predicting the future operating status of components in hydropower equipment and assessing future risks in the financial industry.
Compared to last year's "Hundred Models War," this year's WAIC is perhaps more aptly described as "each one blooming in its own way." Today, AI is rapidly infiltrating various application scenarios, deeply integrating into production, operations, and personal lives.
02. The Arrival of the Embodied Intelligence Era
According to ITjuzi's statistics, China's robotics sector raised 24 billion yuan in funding in 2023, including four investment events worth over one billion yuan each. In the past six months, there have been around a hundred emerging domestic embodied intelligence companies.
Among the forms embodied intelligence can take, humanoid robots best match the public's perception of AGI and are the simplest, most direct physical carriers for unleashing productivity in diverse scenarios, making them one of the most popular tracks in the field.
WAIC showcased 45 intelligent robots, including 25 humanoid robots. The most eye-catching display was the humanoid robot array composed of "eighteen guardian deities," showcasing the latest robot technology and the development level of embodied intelligence. Some of these robots are equipped with advanced AI large models that enhance their intelligence and interaction capabilities, and their applications in education, industry, and manufacturing attracted particularly wide attention.
At the exhibition, "Qinglong," China's first full-size, open-source, general-purpose humanoid robot, was unveiled. Developed by Humanoid Robot (Shanghai) Co., Ltd., Qinglong stands 185 cm tall and weighs 82 kg, and it is leading the open-source trend in embodied intelligence.
Currently, Qinglong has 28 active degrees of freedom, enabling flexible movement and precise, compliant grasping. It supports four major motion functions: fast walking, agile obstacle avoidance, stable uphill and downhill walking, and resistance to shocks and interference. It is an ideal carrier for the development of general AI hardware and software.
Tesla's Optimus Generation 2 humanoid robot was also one of the highlights of the exhibition. With the support of the AI large model, the second-generation robot weighs less than the first and has enhanced body control capabilities. Its hand joints move more naturally, and its fingers are equipped with tactile sensors, enabling it to perform delicate tasks such as grasping eggs.
At its exhibition booth, Tesla told the media that it expects to begin limited production of humanoid robots next year. According to "QuJieShangYe," more than 1,000 Optimus robots are expected to take over some motion training and production tasks from human workers at Tesla factories. In the future, the cost per robot is to be held at around $10,000, with an expected selling price of $20,000.
Fourier Robotics, the first in the industry to achieve mass production and delivery, serves more than 2,000 institutions and hospitals in over 40 countries and regions worldwide. The humanoid robot GR-1, unveiled at last year's WAIC, has now been upgraded, and the company also showcased the lower-limb exoskeleton robot ExoMotus M4, which aids rehabilitation training for patients with lower-limb motor dysfunction caused by conditions such as stroke and spinal cord injury.
CloudMind Robotics unveiled its latest humanoid bipedal robot, XR4 "Seven Fairies" Xiao Zi, at the conference for the first time. Standing 168cm tall and made of carbon fiber composite materials, this full-size bipedal robot boasts over 60 intelligent flexible joints and can move at a speed of 3.5 km/h, with arms capable of carrying up to 10 kg of weight.
Powered by RobotGPT-driven cloud brains, XR4 possesses a high degree of autonomous learning and adaptation capabilities, laying the foundation for the widespread application of robots in human society in the future.
Other exhibitors with distinctive designs and business strategies have also made R&D and commercial progress over the past year, with innovative robot forms and industrial use cases.
As one of the important innovations in the field of smart inspection, the Mover Mini INS substation inspection robot made its debut at the conference. Based on an integrated "cloud-network-end" architecture, this tailored smart inspection robot addresses the pain points of traditional manual inspections, providing a one-stop, all-weather, high-precision intelligent solution