Deep Learning

Murray Shanahan


An Overview of My Work

After largely abandoning the paradigm of symbolic AI in the early 2000s, I spent a decade or so doing research on the only portion of the space of possible minds to which we had any direct access, the biological brain. I was still teaching AI, but I wasn't actively researching in the field. In the early 2010s, however, AI started to show promise again, thanks to progress in machine learning using neural networks. First came the dramatic success of Geoff Hinton's team in the 2012 ImageNet competition. Then in 2014, my friends and colleagues at DeepMind (this was before I joined) published their pioneering work on deep reinforcement learning (DRL). Their system, DQN, could learn to play retro Atari games, such as Breakout or Space Invaders, from scratch, given only the pixels on the screen and the score, and a repertoire of low-level actions with unknown effects (the game controller). It did this by combining two established machine learning methods: deep learning (with neural networks) and reinforcement learning (i.e. learning to maximise reward through trial and error).
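
To make that combination concrete, here is a minimal sketch (in Python, with PyTorch) of the kind of temporal-difference update at the heart of DQN. The network, tensor shapes, and constants below are illustrative assumptions of mine, not DeepMind's code, which adds convolutional layers over the pixels, a replay buffer, and a separate target network, among other refinements.

    import torch
    import torch.nn as nn

    GAMMA = 0.99  # discount factor

    # Toy Q-network: maps a flattened stack of four 84x84 frames to one value per action.
    q_net = nn.Sequential(
        nn.Flatten(),
        nn.Linear(4 * 84 * 84, 256),
        nn.ReLU(),
        nn.Linear(256, 6),  # six possible joystick actions, say
    )

    def td_loss(obs, action, reward, next_obs, done):
        """One-step temporal-difference loss on a batch of transitions."""
        # Q-value the network currently assigns to the action that was actually taken.
        q_taken = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            # Bootstrap from the best action in the next state (zero if the episode ended).
            next_q = q_net(next_obs).max(dim=1).values * (1.0 - done)
        target = reward + GAMMA * next_q
        return nn.functional.mse_loss(q_taken, target)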

The impact DQN had on me was huge. It was the first demonstration of an AI system with a degree of generality. There were no constraints on the game it could be tasked with learning. DeepMind had evaluated the system on Atari games, but in theory the rules of a game could be whatever you liked, the screen could depict anything at all, and the actions could do whatever you wanted. Would DQN learn to play a new game well? Maybe not, but at least it would try. At least it made sense to ask the question. When DeepMind released the source code (in an obscure language called Lua), we soon got it up and running on the big (for the time) GPU in our lab, and watched it learn to play a decent game of Breakout over a couple of hours. But the experience of getting the code to work and then watching it learn brought some shortcomings to my attention.

DQN was a stunning achievement, but compared to a human it was data hungry and brittle. While a naive human player can learn to play reasonably well after a few dozen games, DQN required tens of thousands before it achieved any competency at all. And while a human who has learned the game can adapt their expertise to minor variations (such as changes in the colours or sizes of the objects), for DQN making such changes was like giving it a whole new task to learn. Humans can even transfer expertise they have acquired in one game (say Breakout) to an altogether different game (say Pong), a capability known as transfer learning, something that was hopelessly beyond DQN. So I began to think about how to overcome these limitations. And, to my surprise, I found myself revisiting symbolic artificial intelligence, the paradigm I had given up on a decade earlier.

My (then) PhD student Marta Garnelo and I put together a paper that was one of the first to set out the limitations of deep reinforcement learning. It also outlined an approach to overcoming those limitations, complete with a simple prototype system along the proposed lines. Our 2016 paper "Towards Deep Symbolic Reinforcement Learning" not only showed how to get the best of both the DRL and symbolic AI worlds, but was also instrumental in getting us both jobs at DeepMind. A central idea in the paper, further explored in our 2019 paper "Reconciling Deep Learning with Symbolic Artificial Intelligence", was to devise architectures that are constrained to learn representations built out of objects, relations, and propositions. Such representations have a compositional structure that promotes abstraction and generality, which in turn helps with data efficiency, robustness, and transfer learning.
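
To give a flavour of what "objects, relations, and propositions" means here, the following is a purely illustrative Python sketch of my own, not the prototype from the paper: a small vocabulary of relation symbols recombines to describe arbitrarily many scenes, which is the compositionality referred to above.

    # A toy relational state for a Breakout-like scene: a set of propositions of the
    # form relation(object, object). The vocabulary is invented for illustration;
    # the paper's prototype learns its representations from pixels.
    state = {
        ("above", "brick_3", "paddle"),
        ("moving_towards", "ball", "brick_3"),
        ("left_of", "paddle", "ball"),
    }

    def holds(relation, a, b):
        """Check whether the proposition relation(a, b) is true in the current state."""
        return (relation, a, b) in state

    # The same relational vocabulary applies unchanged to new objects and new games,
    # which is one sense in which compositional representations promote transfer.
    print(holds("moving_towards", "ball", "brick_3"))  # True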

Well, that's the theory, anyway. Things are not so straightforward in practice. To exploit the power of deep learning, the architecture has to have an important mathematical property, namely end-to-end differentiability. Forcing an end-to-end differentiable system to learn representations with the right structure, however, is not so easy. One of our best attempts appears in our 2020 paper "An Explicitly Relational Neural Network Architecture", which incorporates certain features of the transformer architecture that powered advances in large language models.
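
For readers who want a picture of the transformer ingredient mentioned above, here is a hedged Python (PyTorch) sketch. It is not the architecture from the paper; it simply shows the relevant mechanism: multi-head self-attention applied to a set of entity vectors, so that every pairwise relation between entities is scored while the whole computation remains differentiable end to end. The dimensions are arbitrary assumptions.

    import torch
    import torch.nn as nn

    num_entities, dim = 8, 64               # e.g. eight image regions, each a 64-d feature vector
    entities = torch.randn(1, num_entities, dim)

    # Self-attention over the set of entities: queries, keys, and values all come from
    # the same entity vectors, so the attention weights form a soft, learned
    # "which entity relates to which" matrix.
    attention = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
    relational_features, pairwise_weights = attention(entities, entities, entities)

    print(pairwise_weights.shape)  # torch.Size([1, 8, 8]): one score per ordered pair of entities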

Another theme that has recurred throughout my career is so-called common sense, and especially what I call foundational common sense. Foundational common sense is the basic understanding of the everyday physical world shared by all humans and many animals, a repertoire of concepts such as objects, paths, surfaces, obstructions, containers, liquids and solids, parts and wholes, number and quantity, and so on. In the days of symbolic AI, researchers (myself included) attempted to render these concepts into formal logic, which is a long way removed from their roots in embodied interaction with the world. But thanks to deep reinforcement learning, it became feasible for an artificial agent to acquire foundational common sense in the same way humans and animals acquire it, as we described in a 2020 paper co-authored with Lucy Cheke and others, "Artificial Intelligence and the Common Sense of Animals". As well as offering a new characterisation of foundational common sense (based on J. J. Gibson's idea of affordances), that paper draws heavily on the animal cognition literature, and commends experimental protocols developed by animal cognition researchers for evaluating embodied (or virtually embodied) AI systems.

Foundational common sense exemplifies the brain's capacity for abstraction, which is central to generalisation and transfer. In humans, foundational common sense lies at the base of a hierarchy of abstraction and underpins our highest intellectual achievements. In our 2022 paper, "Abstraction for Deep Reinforcement Learning", Melanie Mitchell and I argue that analogy-making, or "seeing similarity", is the core cognitive operation that makes abstraction possible, and that seeing similarity generates a set of templates for common everyday situations (such as making a journey or taking something out of a container). We also address the tricky relationship between language and abstraction. In humans, certain kinds of abstraction come "pre-packaged" in language, but the behaviour of animals that lack language still exhibits some capacity for abstraction. Conversely, large language models that lack embodiment can perform impressive feats of abstraction even though their grasp of foundational common sense (such as it is) is not directly grounded in sensorimotor interaction with the world.

Since the rise of large language models, interest in deep reinforcement learning has somewhat subsided, especially as a possible route to artificial general intelligence (AGI). But from a cognitive science point of view, and given the goal of understanding the space of possible minds, the topics discussed in my trilogy of papers with Marta Garnelo (2019), Lucy Cheke (2020), and Melanie Mitchell (2022) remain profoundly important. The themes of relational representation, compositionality, common sense, and abstraction are perennial. Cognitively sophisticated agents built around LLMs will have to incorporate solutions to the attendant problems, whether they are explicit or implicit, learned or engineered. Insofar as they are implicit and learned, reverse engineering those solutions will be a fascinating scientific endeavour.


A Selection of Papers

For a more complete list of my publications, see my Google Scholar page.