
Google DeepMind wants to define what counts as artificial general intelligence

AGI is one of the most disputed concepts in tech. These researchers want to fix that.

Illustration: Stephanie Arnett/MITTR

AGI, or artificial general intelligence, is one of the hottest topics in tech today. It’s also one of the most controversial. A big part of the problem is that few people agree on what the term even means. Now a team of Google DeepMind researchers has put out a paper that cuts through the cross talk with not just one new definition for AGI but a whole taxonomy of them.

In broad terms, AGI typically means artificial intelligence that matches (or outmatches) humans on a range of tasks. But specifics about what counts as human-like, what tasks, and how many all tend to get waved away: AGI is AI, but better.

To come up with the new definition, the Google DeepMind team started with prominent existing definitions of AGI and drew out what they believe to be their essential common features. 

The team also outlines five ascending levels of AGI: emerging (which in their view includes cutting-edge chatbots like ChatGPT and Bard), competent, expert, virtuoso, and superhuman (performing a wide range of tasks better than all humans, including tasks humans cannot do at all, such as decoding other people’s thoughts, predicting future events, and talking to animals). They note that no level beyond emerging AGI has been achieved.

“This provides some much-needed clarity on the topic,” says Julian Togelius, an AI researcher at New York University, who was not involved in the work. “Too many people sling around the term AGI without having thought much about what they mean.”

The researchers posted their paper online last week with zero fanfare. In an exclusive conversation with two team members—Shane Legg, one of DeepMind’s co-founders, now billed as the company’s chief AGI scientist, and Meredith Ringel Morris, Google DeepMind’s principal scientist for human and AI interaction—I got the lowdown on why they came up with these definitions and what they wanted to achieve.

A sharper definition

“I see so many discussions where people seem to be using the term to mean different things, and that leads to all sorts of confusion,” says Legg, who came up with the term in the first place around 20 years ago. “Now that AGI is becoming such an important topic—you know, even the UK prime minister is talking about it—we need to sharpen up what we mean.”

It wasn’t always this way. Talk of AGI was once derided in serious conversation as vague at best and magical thinking at worst. But buoyed by the hype around generative models, buzz about AGI is now everywhere.

When Legg suggested the term to his former colleague and fellow researcher Ben Goertzel for the title of Goertzel’s 2007 book about future developments in AI, the hand-waviness was kind of the point. “I didn’t have an especially clear definition. I didn’t really feel it was necessary,” says Legg. “I was actually thinking of it more as a field of study, rather than an artifact.”

His aim at the time was to distinguish existing AI that could do one task very well, like IBM’s chess-playing program Deep Blue, from hypothetical AI that he and many others imagined would one day do many tasks very well. Human intelligence is not like Deep Blue, says Legg: “It is a very broad thing.”

But over the years, people started to think of AGI as a potential property that actual computer programs might have. Today it’s normal for top AI companies like Google DeepMind and OpenAI to make bold public statements about their mission to build such programs.

“If you start having those conversations, you need to be a lot more specific about what you mean,” says Legg.

For example, the DeepMind researchers state that an AGI must be both general-purpose and high-achieving, not just one or the other. “Separating breadth and depth in this way is very useful,” says Togelius. “It shows why the very accomplished AI systems we’ve seen in the past don’t qualify as AGI.”

They also state that an AGI must not only be able to do a range of tasks but also be able to learn how to do those tasks, assess its own performance, and ask for assistance when needed. And they state that what an AGI can do matters more than how it does it.

It’s not that the way an AGI works doesn’t matter, says Morris. The problem is that we don’t know enough yet about the way cutting-edge models, such as large language models, work under the hood to make this a focus of the definition.

“As we gain more insights into these underlying processes, it may be important to revisit our definition of AGI,” says Morris. “We need to focus on what we can measure today in a scientifically agreed-upon way.”

Measuring up

Measuring the performance of today’s models is already controversial, with researchers debating what it really means for a large language model to pass dozens of high school tests and more. Is it a sign of intelligence? Or a kind of rote learning?

Assessing the performance of future models that are even more capable will be more difficult still. The researchers suggest that if AGI is ever developed, its capabilities should be evaluated on an ongoing basis, rather than through a handful of one-off tests.

The team also points out that AGI does not imply autonomy. “There’s often an implicit assumption that people would want a system to operate completely autonomously,” says Morris. But that’s not always the case. In theory, it’s possible to build super-smart machines that are fully controlled by humans.

One question the researchers don’t address in their discussion of what AGI is: why should we build it at all? Some computer scientists, such as Timnit Gebru, founder of the Distributed AI Research Institute, have argued that the whole endeavor is misguided. In a talk in April on what she sees as the false (even dangerous) promise of utopia through AGI, Gebru noted that the hypothetical technology “sounds like an unscoped system with the apparent goal of trying to do everything for everyone under any environment.”

Most engineering projects have well-scoped goals. The mission to build AGI does not. Even Google DeepMind’s definitions allow for AGI that is indefinitely broad and indefinitely smart. “Don’t attempt to build a god,” Gebru said.

In the race to build bigger and better systems, few will heed such advice. Either way, some clarity around a long-confused concept is welcome. “Just having silly conversations is kind of uninteresting,” says Legg. “There’s plenty of good stuff to dig into if we can get past these definition issues.”
