07/02 2024 543
Kuaishou has introduced an Apple Vision Pro version of its app, allowing users to open multiple screens and pages to achieve "browsing videos and commenting simultaneously." Baidu Maps has also launched its "Baidu Maps Time Machine" Vision Pro app, allowing users to experience a 360-degree panoramic map.
Runway's newly released Gen-3 Alpha video generation model has significant improvements in fidelity, consistency, and motion performance. It not only generates stable lighting and shadows but also demonstrates powerful imagination. Beta testers have showcased the 3D giant subtitle effect.
What other AI news from China and abroad is worth noting from the past day? Let Crow Magpie walk you through it.
/ 01 / Large Models
1) Meta's newly released LLM Compiler reaches 77% of autotuning's optimization potential
Meta has released the open-source model LLM Compiler, which reaches 77% of the optimization potential of an autotuning search and could make compiler code optimization considerably faster and cheaper. On disassembly tasks it achieves a 45% round-trip success rate, making it a useful tool for reverse engineering and legacy code maintenance. It was trained on a large corpus of LLVM-IR and assembly code, which improves its understanding of compiler intermediate representations and assembly language.
2) Zhipu AI claims that the performance of the domestic large model GLM-4-9B surpasses Google Gemma
In response to Google's newly released Gemma-2 open-source model, Zhipu AI, a domestic large model unicorn, showed the media a set of data indicating that its open-source model GLM-4-9B, released nearly a month earlier, appears to come out ahead in several benchmark comparisons. Specifically, the GLM-4-9B-Chat version performed strongly on semantic, mathematical, reasoning, coding, and knowledge datasets.
3) FaceWall Intelligence contributes to the birth of the first judicial trial vertical large model in China
FaceWall Intelligence announced the launch in Shenzhen of China's first vertical large model for judicial trials. The Shenzhen Intermediate People's Court has launched an AI-assisted trial system that is intended to analyze cases accurately and address the challenges of applying AI in the judicial field. The system's features include AI support throughout the entire trial process, element-based entry of case materials, a pioneering tree-structured prompt engineering scheme, an authoritative knowledge service system, and standardized judicial chains of thought.
4) Honor collaborates with ByteDance's Doubao large model
Volcano Engine announced a collaboration between Honor and ByteDance's Doubao large model. Volcano Engine first provided Honor with the Doubao large model family, which includes models for speech recognition, role-playing, and more, to build the basic capabilities for Honor's vertical model applications. In the field of intelligent office, the Doubao large model can help Honor provide users with functions such as interactive Q&A based on document understanding, meeting minutes, and assisted creation.
5) Runway Gen-3 can generate 3D giant subtitle effects for movie title sequences
Runway's newly released Gen-3 Alpha video generation model shows significant improvements in fidelity, consistency, and motion. It not only generates stable lighting and shadows but also demonstrates powerful imagination. Beta testers have showcased the 3D giant subtitle effect, and Gen-3 will soon be open to everyone.
6) GPTpdf: Parsing PDF files with a GPT-4o-like multimodal LLM
The open-source project "GPTpdf" has gained popularity on GitHub. It uses a GPT-4o-like visual large language model to parse PDF files and convert them into Markdown format. The project's code is concise, consisting of only 293 lines, yet it is reported to parse typesetting, mathematical formulas, tables, images, and charts almost perfectly. The average cost per page is 0.013 USD.
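To put the per-page figure in perspective, here is a minimal sketch that scales the quoted average of 0.013 USD per page to a few document sizes. The function name `estimate_parse_cost` is hypothetical and is not part of the project; actual costs will depend on page content and the model's pricing.

```python
from decimal import Decimal

# Average per-page cost quoted in the item above (USD).
COST_PER_PAGE_USD = Decimal("0.013")


def estimate_parse_cost(pages: int) -> Decimal:
    """Return a rough cost estimate in USD for parsing `pages` pages.

    Hypothetical helper: it only multiplies the quoted average by the
    page count and ignores any variation between pages.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return COST_PER_PAGE_USD * pages


if __name__ == "__main__":
    for pages in (1, 100, 1000):
        print(f"{pages:>5} pages: about {estimate_parse_cost(pages)} USD")
```

At the quoted average, a 100-page document would cost roughly 1.30 USD to convert.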
/ 02 / AI Applications
1) Apple may be introducing Apple Intelligence to Vision Pro
According to the latest report by technology journalist Mark Gurman, Apple's AI suite, "Apple Intelligence," is coming to Vision Pro headsets, though the relevant features are not expected to roll out until next year. Vision Pro has 16GB of memory, enough to support Apple Intelligence, and its operating system, visionOS, is essentially a variant of iPadOS. Gurman therefore believes that adapting Apple Intelligence for the headset will not be too difficult.
2) Kuaishou and Baidu Maps Time Machine launch on Apple Vision Pro headsets
Kuaishou officially announced the launch of its Apple Vision Pro version app, allowing users to open multiple screens and pages to achieve "browsing videos, personal profiles, and comment panels simultaneously." Baidu Maps has also launched its "Baidu Maps Time Machine" Vision Pro app, allowing users to experience a 360-degree panoramic map. Users can view the current map's time and location information by looking down, and switch between scenes by "joining hands".
3) Baidu: Nearly 80% of examinees use AI to fill out college application forms
According to Baidu's official data, on June 25 alone, over 10 million users used Baidu's AI college application assistant to help fill out their college application forms. Following the college entrance examination, over 13 million examinees nationwide are reported to be entering the college application process. Additionally, Quark App has launched an intelligent college selection service, and Reliable AI has introduced what it describes as China's first AI tool for filling out college application forms powered by multiple large language models.
4) ByteDance releases Doubao MarsCode, an intelligent development tool
ByteDance has released Doubao MarsCode, an intelligent development tool based on the Doubao large model, which is free and open to domestic developers. MarsCode comes in two product forms, a programming assistant and a Cloud IDE, and supports over 100 mainstream programming languages. It offers project Q&A, code completion, and unit test generation across three scenarios: feature development, bug fixing, and learning from open-source projects.
5) Popular AI search tool Perplexity accused of citing erroneous information
The AI search tool Perplexity has been found citing erroneous AI-generated spam, including LinkedIn articles. Startup GPTZero found that a growing share of the sources Perplexity links to are AI-generated, and that Perplexity may even repeat outdated and incorrect information from these sources.
6) Audi and Microsoft collaborate: Approximately 2 million vehicles will soon be integrated with ChatGPT
Audi plans to integrate ChatGPT technology into approximately 2 million vehicles starting from July this year to enhance voice control functionality. Audi models equipped with the Modular Infotainment System (MIB3) will allow owners to query information using natural language while driving through ChatGPT. New models such as the Q6 e-tron and future models equipped with the E3 1.2 electronic architecture will integrate ChatGPT to expand the functionality of the Audi assistant.
7) Zhihu officially launches its AI search function: "Zhihu Direct Answer"
Zhihu has released its latest AI product, "Zhihu Direct Answer." "Discovery · AI Search" is a beta version of an AI search function that integrates new search, real-time Q&A, and follow-up questions based on the capabilities of the "Zhihai Map AI" large model. Zhihu has also announced that "Zhihu Direct Answer" will gradually introduce App development and multimodal capabilities in the future.
8) CharacterAI introduces a new voice function, allowing users to "call" AI characters
Character.AI has introduced a real-time voice call function with AI characters, supporting multiple languages including English, Spanish, and Chinese. This function has been tested by over 3 million users, ensuring a natural and smooth calling experience with no significant differences from communicating with real people. Character.AI has also enhanced the realism of AI characters, allowing users to choose or create over 1 million unique voices.
/ 03 / Investment and Financing Intelligence
1) AI document search company Hebbia completes nearly $100 million in Series B funding, with a valuation of $800 million
According to TechCrunch, three informed sources revealed that Hebbia, a startup developing generative AI search tools for large documents, has recently completed a Series B funding round of nearly $100 million led by Andreessen Horowitz (a16z).
2) Andrew Ng plans to raise an additional $120 million for his AI fund
Andrew Ng plans to raise over $120 million for his AI fund, demonstrating his continued investment and influence in the AI field. This move also reflects the industry's development trends and potential bubble risks.
3) OpenAI reportedly hires Zapier's former Chief Revenue Officer as Head of Sales Strategy
According to reports, OpenAI is expanding its enterprise software business, and Giancarlo Lionetti, who has served as Zapier's Chief Revenue Officer for over two years, has joined OpenAI as Head of Sales Strategy.
/ 04 / AI Infrastructure
1) SoftBank's Masayoshi Son plans to raise $100 billion to establish an AI chip company
According to media reports from February this year, SoftBank is formulating a plan to invest approximately $100 billion in AI-related chips, with the project named "Izanagi." Last week, when asked about "Izanagi" by a shareholder, Masayoshi Son said that he would be committed to achieving results and striving to meet his set goals, but did not elaborate.
2) Microsoft AI leader: Future knowledge production costs will be reduced to zero marginal cost
Mustafa Suleyman, CEO of Microsoft AI, said that for much of the content on the open web, the default social contract treats it as fair use, so it can be copied and reused. Existing intellectual property laws have some flexibility, which is being challenged in the AI era. Suleyman believes that the economics of information is about to change fundamentally, "as we will reduce the cost of knowledge production to zero marginal cost."
3) PAB: A new method for accelerated video generation, capable of real-time video generation at 21.6 frames per second
Researchers from the National University of Singapore and Purdue University have proposed PAB (Pyramid Attention Broadcast), a technique that enables real-time video generation with Diffusion Transformer (DiT) models. By reducing redundant attention calculations, it reaches a generation speed of up to 21.6 frames per second, a speedup of 10.6 times, and is applicable to multiple popular DiT video generation models.
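The idea of skipping redundant attention calculations can be illustrated with a toy sketch: compute an attention output once, then reuse ("broadcast") the cached result for the next few denoising steps. This is not the authors' implementation. The class `BroadcastAttention`, the stand-in `toy_attention` function, and the per-layer broadcast ranges below are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


class BroadcastAttention:
    """Wrap an attention function and reuse its output across steps.

    Hypothetical helper: the output is recomputed only once every
    `broadcast_range` steps and the cached value is returned in between.
    """

    def __init__(self, attn_fn: Callable[[Vector], Vector], broadcast_range: int) -> None:
        if broadcast_range < 1:
            raise ValueError("broadcast_range must be at least 1")
        self.attn_fn = attn_fn
        self.broadcast_range = broadcast_range
        self.calls = 0  # number of real attention computations so far
        self._cache: Optional[Vector] = None
        self._last_step = 0

    def __call__(self, step: int, x: Vector) -> Vector:
        stale = self._cache is None or step - self._last_step >= self.broadcast_range
        if stale:
            self._cache = self.attn_fn(x)
            self._last_step = step
            self.calls += 1
        # Between recomputations the cached output is reused, which is an
        # approximation: it ignores how x changed since the last real call.
        return self._cache


def toy_attention(x: Vector) -> Vector:
    """Stand-in for an expensive attention computation."""
    return [0.5 * v for v in x]


def count_attention_calls(num_steps: int, ranges: Dict[str, int]) -> Dict[str, int]:
    """Run `num_steps` steps and count real computations per layer type."""
    layers = {name: BroadcastAttention(toy_attention, r) for name, r in ranges.items()}
    x = [1.0, 2.0, 3.0]
    for step in range(num_steps):
        for layer in layers.values():
            layer(step, x)
    return {name: layer.calls for name, layer in layers.items()}


if __name__ == "__main__":
    # Assumed ranges; a different range per attention type gives the "pyramid".
    print(count_attention_calls(30, {"spatial": 2, "temporal": 4, "cross": 6}))
```

With these assumed ranges, 30 steps need 28 real attention computations instead of 90. The trade-off is that cached outputs are approximations, so larger ranges save more compute at a greater risk to output quality.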
4) The first multimodal video arena, Video-MME, is released
The first multimodal LLM video analysis comprehensive evaluation benchmark, Video-MME, has been released. In its benchmark tests, Gemini 1.5 Pro led the way, comprehensively surpassing GPT-4o in a new and more complex multimodal examination.
5) Sam Altman: AGI may double global GDP within a decade
OpenAI CEO Sam Altman believes that AGI could double global GDP, adding that "this makes sense to me and is certainly consistent with other technological revolutions. We do believe it will be a huge productivity driver, and even in the early stages, we've already seen people using it to greatly improve products and services."