AI Daily Report: Apple Responds to Claim That iPhone Call Recording Notifies the Other Party; Baidu Wenku's AI Features Top the Domestic Overall Ranking of the AI Product Rankings in May

06/13 2024 363

The claim that "iPhone has call recording, but the recording will automatically notify the other party" has recently sparked heated discussion. Apple's official customer service said that whether call recording notifies the other party may not be known until iOS 18 is officially released.

According to the latest May ranking released by the "AI Product Rankings," Baidu Wenku's AI features top the domestic overall ranking. This veteran domestic product, now integrated with AI features, has ranked first for two consecutive months.

What other AI industry news from home and abroad over the past day is worth following? Crow News takes a look.

/ 01 / Large Models

1) Stable Diffusion 3 counts down to open-sourcing; the standalone 2B model can outperform closed-source Midjourney

At the Computex 2024 conference, Stability AI's co-CEO Christian Laforte officially announced that SD 3 Medium will be publicly released on June 12 (tomorrow). The 4B and 8B versions will reportedly also be open-sourced later. Two months ago, Stable Diffusion 3 surpassed DALL-E 3 and Midjourney v6 in human preference evaluations.

/ 02 / AI Applications

1) From AI applications to AI workflows: Meitu releases 6 products at its third Imaging Festival

Meitu released 6 products at the Meitu Imaging Festival. Chen Jianyi, Senior Vice President of Meitu Group, released three of them: Meitu Cloud Retouching V2, Kaipai V2, and Meitu Design Studio V3. The other three are ZCOOL Design Services for design needs, Qimi for game advertising and marketing, and MOKI, an AI short film production tool.

2) Surpassing Kimi and Doubao! Baidu Wenku's AI features rank first in the domestic overall ranking of the AI Product Rankings

According to the latest May ranking released by the "AI Product Rankings," Baidu Wenku's AI features rank first in the domestic overall ranking. This veteran domestic product, now integrated with AI features, has ranked first for two consecutive months.

Reportedly, after being rebuilt on large models, Baidu Wenku supports intelligent PPT generation, intelligent documents, intelligent research reports, intelligent mind maps, document summarization and Q&A, and full-scenario instruction editing, along with advanced functions such as intelligent picture books, intelligent novels and comics, and copywriting generated from images.

3) Apple responds to iPhone recording notifying the other party: Uncertain, wait for the official release of iOS 18

The recent claim that "iPhone has call recording, but the recording will automatically notify the other party" has sparked heated discussion. In response, Apple's official customer service said that whether call recording notifies the other party may not be known until iOS 18 is officially released.

4) 360 responds to image incident: the image was a redraw, and the other party demands that it buy the model at ten times the price

AIGC creator DynamicWang accused 360 of stealing, for its AI new product launch conference, an original image generated with the creator's AI drawing model. 360 responded that the image was redrawn based on the original rather than stolen, and raised questions about the copyright. The creator proposed that 360 purchase the model at 10 times the original price and pay additional compensation. 360 disagreed and decided to settle the copyright issue through litigation.

5) Microsoft has reportedly outsourced AI research and development projects to OpenAI, and Google is expected to benefit from it

Todd McKinnon, CEO of the cybersecurity company Okta, stated that Google is working hard to avoid outsourcing research and development in order to defend its position as the search engine giant. He also said that Microsoft has outsourced all of its advanced AI tool and software research and development to OpenAI, which may benefit Google, while Microsoft risks being reduced to a "consultant" in the field of artificial intelligence.

/ 03 / Investment and Financing Intelligence

1) French large model company Mistral AI completes $644 million Series B financing at a $6 billion valuation

French startup Mistral AI has completed its much-speculated Series B financing, raising 600 million euros (approximately $644 million) in a mix of equity and debt. General Catalyst led the round, and the post-financing valuation reached $6 billion.

Mistral AI reportedly raised $112 million in seed financing about a year ago to compete with OpenAI, Anthropic, and other AI giants. It has also released pre-trained and fine-tuned models with open weights under open-source licenses.

2) CampusAI raises $10 million in seed funding to create a metaverse for learning AI skills

Warsaw-based CampusAI announced the completion of a $10 million seed round of financing aimed at establishing a virtual online campus, providing a new platform for AI training and practice for businesses and individuals. This funding came from Polish angel investor Maciej Zientara. CampusAI plans to use this funding to expand into 10 new markets.

3) AI+edge computing company Jiangxing Intelligence secures Series B investment from Langmafeng Venture Capital, Zhuoyuan Asia, and others

Jiangxing Intelligence is committed to deeply integrating edge computing and artificial intelligence technology, focusing on new-generation cloud-edge collaborative intelligent IoT products and services. This round was a joint strategic investment by Langmafeng Venture Capital, Zhuoyuan Asia, Songhe Capital, Lenovo Venture Capital, Baidu Ventures, and other domestic AI investors.

/ 04 / AI Infrastructure

1) US government aims to restrict exports of AI chip core technologies GAA and high-bandwidth memory HBM

According to Bloomberg, informed sources said that the Biden administration is considering making it more difficult for Chinese companies to purchase and develop advanced AI server chips. The new rules may affect Chinese companies' independent research and development of AI chips.

The report stated that the US Department of Commerce is considering whether to restrict China's ability to use the next-generation advanced chip manufacturing technology GAA (Gate-All-Around). Early discussions also involved restricting the export of high-bandwidth memory (HBM) chips, which are a key component of high-performance AI chips.

2) Former Biren president Xu Lingjie starts a new venture; the newly established company focuses on server clusters and computing power optimization

Xu Lingjie, the former president of Biren Technology, formally established Shanghai Mojing Intelligence Co., Ltd. this month, four months after leaving Biren, with a registered capital of up to $10 million. An informed source told China Business News that Mojing Intelligence's business is currently in a very early stage, mainly focusing on data center-related businesses such as server clusters and computing power optimization.

3) Intel Labs releases research results: Using neural architecture search to efficiently "slim down" LLMs

Intel Labs claims that it can efficiently "slim down" LLMs using neural architecture search (NAS). Their experiments based on the LLaMA2-7B model showed that this technology can not only reduce model size but sometimes even improve model accuracy.

4) Mobile phones smoothly run a 47-billion-parameter large model: Shanghai Jiao Tong University releases the LLM mobile inference framework PowerInfer-2, with up to a 29-fold speedup

The IPADS laboratory at Shanghai Jiao Tong University has introduced a large model inference engine for mobile phones called "PowerInfer-2.0". The engine enables fast inference on smartphones with limited memory, reaching 11 tokens/s on the Mixtral 47B model on a phone. The paper is publicly available on arXiv. Compared with the popular open-source inference framework llama.cpp, PowerInfer-2.0 achieves an average inference speedup of 25 times, with a maximum of 29 times.

5) An all-Chinese team launches a new benchmark for multimodal large models, with GPT-4o's accuracy rate at only 65.5%, and all models are most prone to perceptual errors

Researchers from institutions including Shanghai AI Lab, the University of Hong Kong, Shanghai Jiao Tong University, and Zhejiang University have proposed a comprehensive multimodal benchmark called MMT-Bench, which aims to evaluate the performance of large vision-language models (LVLMs) in multimodal, multitask understanding. The evaluation found that perceptual errors and reasoning errors are the two most common error types across all models.

The researchers used MMT-Bench to evaluate 30 publicly available LVLMs. The results showed that even advanced models such as InternVL-Chat, GPT-4o, and GeminiProVision achieved accuracy rates of only 63.4%, 65.5%, and 61.6%, respectively.

6) A trilogy written jointly by top ML/LLM experts titled "What We Learned from Building LLMs in the Past Year?"

This trilogy shares valuable experiences and lessons learned in building LLM applications over the past year, covering aspects from tactics to strategy.

1. Tactical Chapter

Prompt Engineering: How to design effective prompts? Use small, focused prompts instead of complex prompts.

Information Retrieval: Best practices for improving knowledge bases and output quality.

Workflow: Design a reliable workflow to ensure process manageability.

Fine-tuning and Evaluation: How to fine-tune and evaluate when prompt engineering is insufficient?
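The "small, focused prompts" advice in the tactical chapter can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not from the trilogy itself: `fake_llm` is a hypothetical stand-in for a real model call that only echoes its input, and the three task strings are arbitrary examples of single-purpose prompts.

```python
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it echoes the task line
    so the sketch runs offline."""
    return f"[response to: {prompt.splitlines()[0]}]"


def build_focused_prompts(document: str) -> List[str]:
    """Replace one catch-all request with small prompts that each do one job."""
    # Illustrative task list (an assumption, not taken from the source).
    tasks = [
        "Extract the key facts from the text as a bullet list.",
        "Summarize the text in two sentences.",
        "List any claims in the text that need a citation.",
    ]
    return [f"{task}\n\nText:\n{document}" for task in tasks]


def run_pipeline(document: str, llm: Callable[[str], str]) -> List[str]:
    """Send each focused prompt separately and collect the answers in order."""
    return [llm(prompt) for prompt in build_focused_prompts(document)]


if __name__ == "__main__":
    sample = "Intel Labs says neural architecture search can slim down LLMs."
    for answer in run_pipeline(sample, fake_llm):
        print(answer)
```

Because each prompt has a single responsibility, each step can be inspected, evaluated, and revised on its own, which also fits the workflow advice above. Swapping `fake_llm` for a real client would only require a function that takes a prompt string and returns a string.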

2. Operational Chapter

Team Building: How to build a diverse and efficient LLM team?

Continuous Deployment: Reliable and sustainable deployment recommendations to ensure product quality and user experience.

3. Strategic Chapter

Product-Market Fit: Find PMF first, build systems rather than models.

Cost Control and Iteration: How to control costs and iterate products while maintaining competitiveness?

Practical Applications of LLMs: Emphasize that LLMs are mature enough for practical use.

Finally, building an effective LLM application is harder than putting together a demo and requires attention at the tactical, operational, and strategic levels.
