06/19 2024
In the 1980s, Yann LeCun, now chief AI scientist at Meta, was still a university student.
Back then, deep learning was a "discredited" technological approach, with only a handful of people persevering in its development, including a group of Japanese scientists.
Yann LeCun found that most of the deep learning papers of that time were written in English by Japanese researchers, and these papers were a major source of inspiration for him.
Interestingly, while everyone is focusing on large models in 2024, Japan has almost disappeared from this wave.
That changed only recently, with the emergence of a homegrown unicorn aiming to build Japan-specific generative AI. According to a report by Asahi Shimbun on Saturday, the Japanese generative AI startup Sakana AI is expected to receive a significant new investment, raising approximately 20 billion yen (about $127 million) by the end of this month, which will bring the company's valuation to 180 billion yen (approximately $1.142 billion).
That would make it the fastest company in Japan to reach unicorn status, less than a year after it was founded.
So, what is the background of this Japanese AI company? What insights does the birth of Japanese-specific generative AI bring to the development of large models?
/ 01 / Merging "Black Magic" Models, Integrating Japanese Language Understanding with Language, Math, and Vision
Sam Altman, the "father of ChatGPT," predicted in late May that China would produce large models with its own unique characteristics. This leads us to consider the necessity of more nations having culturally specific large models.
Sakana AI takes care to align the cultural character and artistic feel of its AI-generated content with Japanese culture and the values of its users. "Sakana" is itself the Japanese word for fish. The company's logo is shaped like a fish, and fish drawn by generative AI appear all over its official website.
In March, Sakana AI open-sourced on Hugging Face and GitHub a technique that imitates biological evolution to combine multiple AI models into a more capable one, along with models built this way, including the vision-language model EvoVLM-JP. On April 22, it announced EvoSDXL-JP, a high-speed image generation model that supports Japanese prompts and is intended for educational use. Counting EvoLLM-JP, which solves math problems in Japanese, the company has launched three Japanese-specific generative AI models to date.
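To give a rough feel for what "combining models through evolution" can mean, here is a toy sketch in Python. It is not Sakana AI's actual method or code: the two "parent models" are reduced to three made-up weights each, the `fitness` function scores a merged model by its distance to a hypothetical `target` vector rather than by a real benchmark, and the names `merge`, `fitness`, and `evolve_merge` are invented for this illustration. The loop simply mutates a set of mixing coefficients and keeps whichever variant scores best.

```python
import random


def merge(weights_a, weights_b, alphas):
    """Blend two weight vectors element-wise: alpha * a + (1 - alpha) * b."""
    return [al * a + (1 - al) * b for a, b, al in zip(weights_a, weights_b, alphas)]


def fitness(weights, target):
    """Toy score (higher is better): negative squared distance to a target vector.

    A real system would instead score the merged model on a benchmark.
    """
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def evolve_merge(weights_a, weights_b, target,
                 generations=50, children=20, sigma=0.1, seed=0):
    """Elitist evolutionary search over per-weight mixing coefficients."""
    rng = random.Random(seed)
    best = [0.5] * len(weights_a)  # start from a plain average of the parents
    best_score = fitness(merge(weights_a, weights_b, best), target)
    for _ in range(generations):
        for _ in range(children):
            # Mutate: add Gaussian noise, then clamp each coefficient to [0, 1].
            child = [min(1.0, max(0.0, al + rng.gauss(0.0, sigma))) for al in best]
            score = fitness(merge(weights_a, weights_b, child), target)
            # Select: keep the child only if it beats the current best.
            if score > best_score:
                best, best_score = child, score
    return best, best_score


# Hypothetical "parent models", reduced to three weights each.
model_a = [1.0, 0.0, 2.0]
model_b = [0.0, 1.0, 0.0]
target = [0.75, 0.25, 1.0]

alphas, score = evolve_merge(model_a, model_b, target)
print("mixing coefficients:", [round(al, 2) for al in alphas])
print("fitness:", round(score, 4))
```

Because the loop only ever replaces the current best with a strictly better child, the final score can never be worse than the plain 50/50 average it starts from. Real model merging works on billions of weights and far more elaborate recipes, so this should be read only as an illustration of the mutate-and-select idea.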
▲Examples of images generated by EvoSDXL-JP. Prompts: cute knitted elephant, ramen and ukiyo-e, Katsushika Hokusai, origami lunch box, lower town rocket, ukiyo-e, etc. (Image source: Sakana AI official website)
On the social media platform X, posts from the account @hardmaru also show that the model is not limited to Japanese styles: it can generate a range of film styles with excellent results. For example, prompts involving "Musk" and "Zuckerberg" produce images that instantly turn into a confrontation between Jack and Tyler from "Fight Club"; the classic Jacky Cheung meme was extended into a scene of "having dinner with a cat," which made me laugh.
▲Using the "Movie Effect" mode in the SDXL version to generate images of Musk and Zuckerberg
▲SDXL generated some interesting results by extending Hong Kong TV scenes
Specifically, Sakana AI's three models are proficient in Japanese, understanding complex issues and even joking