Maluuba, the Canadian artificial intelligence (AI) startup that Microsoft acquired earlier this year, has beaten Ms. Pac-Man using its technology. And the feat may have implications beyond the realm of retro arcade games.
The group achieved the highest possible score of 999,990 using a technique called Hybrid Reward Architecture for Reinforcement Learning. Maluuba picked Ms. Pac-Man to test its technology because it requires human-like intelligence to beat, according to the company.
Ms. Pac-Man is also considered tougher to beat than Pac-Man because it was intentionally designed to be less predictable, which kept arcadegoers dropping quarters into the game after it was released in 1982.
Fast forward a few decades, and the classic video game is helping Microsoft advance its AI capabilities with the potential for practical applications in sales organizations and other business environments.
Maluuba's Hybrid Reward Architecture beat Ms. Pac-Man the way workforces tackle complex challenges: by distributing the work. More than 150 AI agents were deployed and rewarded for finding pellets (which are consumed along with bonuses to rack up points and advance through stages) and for avoiding ghosts, Ms. Pac-Man's deadly adversaries.
Those agents would then offer suggestions to a top agent, which evaluated the suggestions and decided in which direction to move Ms. Pac-Man. The top agent also weighed how intensely the agents wanted to move in a particular direction.
Microsoft said this approach helps the AI make better decisions and offered an example. Rather than follow a majority of agents that wanted to move right to chomp down on a pellet, the top agent would heed the handful of agents that noticed a ghost on that lethal path and wanted to head left instead.
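The difference between counting votes and weighing intensity can be shown with a short Python sketch. It is a simplified illustration under stated assumptions, not Maluuba's implementation: the function names, the four-direction move set and the preference strengths (1.0 for each pellet agent, 10.0 for each ghost agent) are all hypothetical values chosen to mirror the example above.

```python
from collections import Counter, defaultdict

# Hypothetical move set for the illustration.
DIRECTIONS = ("up", "down", "left", "right")


def majority_vote(suggestions):
    """Pick the direction favored by the most agents, ignoring intensity."""
    votes = Counter(max(s, key=s.get) for s in suggestions)
    return votes.most_common(1)[0][0]


def weighted_choice(suggestions):
    """Pick the direction with the largest summed preference strength."""
    totals = defaultdict(float)
    for suggestion in suggestions:
        for direction, strength in suggestion.items():
            totals[direction] += strength
    return max(DIRECTIONS, key=lambda d: totals[d])


# Five agents mildly prefer moving right toward a pellet (assumed strength 1.0).
pellet_agents = [{"right": 1.0} for _ in range(5)]
# Two agents strongly prefer moving left away from a ghost (assumed strength 10.0).
ghost_agents = [{"left": 10.0} for _ in range(2)]
suggestions = pellet_agents + ghost_agents

print(majority_vote(suggestions))    # right
print(weighted_choice(suggestions))  # left
```

With these made-up numbers, a simple head count sends Ms. Pac-Man right into the ghost, while summing the strengths lets the two alarmed agents outweigh the five mildly interested ones.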
Maluuba also discovered that the system worked best when agents acted egotistically and were laser-focused on one objective, like finding the best path to a pellet. This created a cooperative environment in which each agent cared about one problem to the exclusion of all others yet, somewhat counterintuitively, contributed positively to the collective.
Maluuba program manager Rahul Mehrotra believes his group's research can be used in software that empowers sales teams in the future.
In his view, the Hybrid Reward Architecture AI technology used to beat Ms. Pac-Man may one day "help a company's sales organization make precise predictions about which potential customers to target at a particular time or on a particular day," wrote Microsoft writer Allison Linn in a blog post. "The system could use multiple agents, each representing one client, with a top agent weighing factors such as which clients are up for contract renewal, which contracts are worth the most to the company and whether the potential customer is typically in the office that day or available at that time."
Beyond sales software, Microsoft envisions applications in natural language processing, financial modeling and robotics. A research paper on Hybrid Reward Architecture for Reinforcement Learning is available here. A video of the technology in action can be viewed here.