About Me
I’m a Research Scientist at Google DeepMind in London, on the Language Model Interpretability team, where I try to better understand how exactly large language models work. We understand quite well how to make large language models, but the algorithmic specifics of how and why they are able to perform a particular behaviour remain poorly understood. Understanding this better seems like an important problem, both for its inherent scientific interest and because it is likely to be useful for ensuring the safety and reliability of such systems, particularly if their capabilities continue to improve at their current pace.

Before I joined DeepMind, I was a machine learning engineer at Cohere for around a year and a half, on the foundation model training team, where I worked on LLM training and later on finetuning. Before that, I studied for my DPhil at the University of Oxford with the Autonomous Intelligent Machines & Systems CDT, working on machine learning and related fields. Before that, I did my undergraduate and master’s degrees in physics at the University of Manchester.
At Oxford, I was part of the OATML group, supervised by Yarin Gal.
Research Interests
Currently I work on mechanistic interpretability in large language models.
At Cohere, I also worked on foundation model pre-training and advanced finetuning, such as preference learning and on-policy reinforcement learning.
In my PhD, I was interested in how machine learning models learn invariances and regularity from data (there is evidence that this doesn’t happen quite the way we would want), and in designing models that do this better. I also worked a lot on structured probabilistic models, like capsule nets (though I mostly decided these weren’t such a great idea), and on designing models that were ‘uncertain by default’.
I’ve always been interested in the robustness of deep models, as well as broader issues of safety and reliability in machine learning, which seems all the more relevant these days.
Though I haven’t worked directly on applications, as opposed to ‘core’ modelling, for a little while, I remain interested in them, particularly in physics, and I have worked on practical applications of Bayesian ideas in crowdsourced astronomy.
Personal
Outside of work, I’m really into music, and I play a few instruments (mostly guitar). You can see a sample of the music I like on this playlist.
Recent Papers
- Uncertainty Quantification for virtual diagnostic of particle accelerators - Physical Review Accelerators and Beams
  [Paper]
- Towards global flood mapping onboard low cost satellites with machine learning - Nature Scientific Reports, 2021
  [Paper]
- Liberty or Depth: Deep Bayesian Neural Nets Do Not Need Complex Weight Posterior Approximations - NeurIPS, 2020
  [Paper] [arXiv]
- Capsule Networks: A Generative Probabilistic Perspective - Object Oriented Learning Workshop, ICML 2020
  [Paper]
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network - ICML, 2020
  [Paper] [BibTex]
- Try Depth Instead of Weight Correlations: Mean-field is a Less Restrictive Assumption for Deeper Networks - Contributed talk, Workshop on Bayesian Deep Learning, NeurIPS 2019
  [Workshop paper] [arXiv]
- Flood Detection On Low Cost Orbital Hardware - Spotlight talk, Artificial Intelligence for Humanitarian Assistance and Disaster Response (AI+HADR) NeurIPS 2019 Workshop
  [arXiv]
- Galaxy Zoo: Probabilistic Morphology through Bayesian CNNs and Active Learning - Monthly Notices of the Royal Astronomical Society, 2019
  [Paper] [arXiv]
- Sufficient Conditions for Idealised Models to Have No Adversarial Examples: a Theoretical and Empirical Study with Bayesian Neural Networks - arXiv, 2018
  [arXiv] [BibTex]
- Understanding Measures of Uncertainty for Adversarial Example Detection - UAI, 2018
  [Paper] [arXiv] [BibTex]
Recent Blog Posts
- The 'Strong' Feature Hypothesis could be wrong. - August 2, 2024
- On AI Risk - December 2, 2023
- Dropout can create a privileged basis in the ReLU output model - April 28, 2023
- A guide to recursion problems for interviews - December 13, 2020
- What does it mean to not have a mean? - September 13, 2020
- Are capsules a good idea? A generative perspective - June 23, 2020
- A gentle introduction to information geometry - September 27, 2019
- Itô and Stratonovich; a guide for the perplexed - September 30, 2018
See the archives for a complete listing.