Baidu showed up this week with a full toolbox. And its newest model, ERNIE 5.0, stole the spotlight.
At its Baidu World 2025 event, the company unveiled ERNIE 5.0, a new multimodal model that can simultaneously understand and generate text, images, audio, and video. The company calls it a “natively omni-modal” system built from the ground up to process different forms of media together.
According to the announcement, ERNIE 5.0 aims to deliver stronger reasoning, better tool use, richer creative output, and more reliable factual processing. A preview is now available through ERNIE Bot for the public and through Baidu’s Qianfan cloud platform for enterprise customers.
Baidu co-founder and CEO Robin Li used the event to argue that AI should be embedded directly into everyday work.
“When you internalize AI, it becomes a native capability and transforms intelligence from a cost into a source of productivity,” he said in remarks shared at the event. He added: “We should focus on integrating AI with every task we do to make it a native driving force for corporate and personal growth.”
A model meant to keep up with global rivals
Baidu positioned ERNIE 5.0 as its answer to accelerating competition from players like OpenAI and Google, as well as domestic rivals such as DeepSeek, Alibaba, and ByteDance.
Bloomberg reported that Baidu showcased benchmark results in which ERNIE 5.0 went head-to-head with Google’s Gemini, OpenAI’s GPT-5, and DeepSeek across language, audio, and visual tasks. The model didn’t always take the top spot, but Baidu’s aim was to show that it remains competitive.
Apollo Go dominance, new chips, and AI-powered search
Baidu also provided an update on its autonomous driving progress. Its ride-hailing service, Apollo Go, has surpassed 17 million rides globally, securing its position as the largest service of its kind in the world.
The fully driverless fleet now completes over 250,000 weekly rides and has logged more than 140 million kilometers in fully driverless mode. Li noted that as robotaxi costs continue to fall, greater affordability will significantly boost demand, transforming urban transportation and social ecosystems.
To support its ambitious AI goals, Baidu is also building the foundation beneath them. The company announced two new custom AI chips, the Kunlunxin M100 and M300, set to launch in 2026 and 2027, respectively. The development is strategically significant for Chinese companies: it offers them a domestically controlled source of computing power that is both powerful and cost-effective, a critical advantage given current US restrictions on exporting advanced AI chips.
Even Baidu’s classic search engine is getting an AI makeover. The company revealed that about 70% of its top search results are now presented in rich media formats, such as images and videos.
“We used AI to reconstruct the search results page—not by simply inserting AI summaries, but by transforming search from a text-and-link-based application into an AI application centered on rich media,” Li said.
A broader lineup of AI products
Baidu also rolled out upgraded and new AI tools:
- A next-gen real-time digital human system.
- Miaoda 2.0, a no-code app builder.
- GenFlow 3.0, now with over 20 million users.
- Oreate, a new global AI workspace that the company says has “attracted over 1.2 million users across global markets.”
- Famou, described as the first commercially available self-evolving agent.
Baidu said its digital human technology is already live in Brazil, with plans to expand into regions including the US and Southeast Asia. During this year’s Double 11 festival, Baidu said, 83% of livestreamers used its digital human technology, with livestream volume up 119% and gross merchandise volume (GMV) up 91% year over year.