Home

About Me

I’m a Research Scientist at Google DeepMind in London. I work on the Language Model Interpretability team, trying to better understand how large language models work. We understand quite well how to make large language models, but the algorithmic specifics of how and why they are able to perform a particular behaviour remain poorly understood. Understanding this better seems like an important problem, both for its inherent scientific interest and because it is likely to be useful for ensuring the safety and reliability of such systems, particularly if their capabilities continue to improve at their current pace.

Before I joined DeepMind, I was a machine learning engineer at Cohere, on the foundation model training team, for around a year and a half. I worked on LLM training, and later on finetuning. Before that, I studied for my DPhil at the University of Oxford with the Autonomous Intelligent Machines & Systems CDT, working on machine learning and related fields. Before that, I did my undergrad and masters in physics at the University of Manchester.

At Oxford, I was part of the OATML group, supervised by Yarin Gal.

Research Interests

Currently, I work on mechanistic interpretability in large language models.

At Cohere, I also worked on foundation model pre-training and advanced finetuning, such as preference learning and on-policy reinforcement learning.

In my PhD, I was interested in how machine learning models learn invariances and regularity from data (there is evidence that this doesn’t happen quite the way we would want), and in designing models that do this better. I also worked a lot on structured probabilistic models, like capsule nets, though I mostly decided these weren’t such a great idea, and on designing models which were ‘uncertain by default’.

I’ve always been interested in the robustness of deep models, as well as broader issues of safety and reliability in machine learning, which seem a bit more relevant these days.

Though I haven’t worked directly on applications, as opposed to ‘core’ modelling, for a little while, I’m still interested in them, particularly in physics, and I have worked on practical applications of Bayesian ideas in crowdsourced astronomy.

Personal

Outside of work, I’m really into music, and I play a few instruments (mostly guitar). You can see a sample of music I like on this playlist.

Recent Papers

Recent Blog Posts

See the archives for a complete listing.