2025-06-28

Publications and preprints

Distributional Training Data Attribution
Preprint, under review
Bruno Mlodozeniec^†, Isaac Reid^†, Sam Power, David Krueger, Murat Erdogdu, Richard Turner, Roger Grosse
Randomness is an inherent part of machine learning; retraining with an identical dataset can in general return a different model. We argue that the ‘influence’ of samples can be understood by how drastically the distribution over final trained models changes if they are removed. We demonstrate that influence functions – a popular but poorly understood training data attribution tool – emerge organically from this new mathematical formulation.
Paper
^† Denotes equal contribution. Order decided by who could swim the furthest underwater.

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING
ICML 2025, accepted as spotlight paper
Connor Schenck*, Isaac Reid*, Mithun George Jacob*, Alex Bewley*, Joshua Ainslie*, David Rendleman*, Deepali Jain, Mohit Sharma, Avinava Dubey, Ayzaan Wahid, Sumeet Singh, René Wagner, Tianli Ding, Chuyuan Fu, Arunkumar Byravan, Jake Varley, Alexey Gritsenko, Matthias Minderer, Dmitry Kalashnikov, Jonathan Tompson, Vikas Sindhwani, Krzysztof Choromanski*
Separable Translationally Invariant Position Encodings = STRING. A more general extension of Rotary Position Encodings (RoPE) using some very lightweight group theory. Provides big gains in downstream robotics applications and, importantly, has a killer name.
Paper

Linear Transformer Topological Masking with Graph Random Features
ICLR 2025
Isaac Reid, Kumar Avinava Dubey, Deepali Jain, Will Whitney, Amr Ahmed, Joshua Ainslie, Alex Bewley, Mithun Jacob, Aranyak Mehta, David Rendleman, Connor Schenck, Richard E. Turner, René Wagner, Adrian Weller, Krzysztof Choromanski
How can you efficiently incorporate information about the underlying graph structure into a linear attention transformer, where the attention matrix is never explicitly instantiated in memory? Using GRFs, of course.
Paper

Optimal Time Complexity Algorithms for Computing General Random Walk Graph Kernels on Sparse Graphs
AISTATS 2025
Krzysztof Choromanski*, Isaac Reid*, Arijit Sehanobish, Avinava Dubey
By simulating correlated random walks on an ensemble of graphs, you can estimate the graph kernel between any given pair in linear time.
Paper

Variance-Reducing Couplings for Random Features: Perspectives from Optimal Transport
ICLR 2025
Isaac Reid, Stratis Markou, Krzysztof Choromanski, Richard E. Turner, Adrian Weller
Variance reduction in Monte Carlo is really a multi-marginal optimal transport problem, and treating it as such gives us tools to sample more efficiently in Euclidean and discrete space.
Paper Code

~~Universal~~ General Graph Random Features
ICLR 2024
Isaac Reid*, Krzysztof Choromanski*, Eli Berger*, Adrian Weller
You give me an arbitrary function of a weighted adjacency matrix, I give you a random feature mechanism to approximate it efficiently (name changed during review).
Paper Code

Repelling Random Walks
ICLR 2024
Isaac Reid, Eli Berger, Krzysztof Choromanski, Adrian Weller
The QMC scheme below wasn’t so good after all; now we correlate walker directions in an algorithm of broader interest.
Paper Code

Quasi-Monte Carlo Graph Random Features
NeurIPS 2023, accepted as spotlight paper
Isaac Reid, Krzysztof Choromanski, Adrian Weller
A QMC scheme that induces correlations between the lengths of random walks on a graph, with possible applications in bioinformatics and graph-based Transformers.
Paper Code

Simplex Random Features
ICML 2023, accepted with oral presentation
Isaac Reid, Krzysztof Choromanski, Valerii Likhosherstov, Adrian Weller
Derivation of a provably optimal random feature mechanism for unbiased approximation of the Gaussian kernel, motivated by a host of new analytical results and tested with extensive Transformer experiments.
Paper Code

Entanglement Barriers in Dual-Unitary Circuits
Phys. Rev. B 104, 014301 – Published 1 July 2021
Isaac Reid, Bruno Bertini
Exact characterisation of the dynamics of quantum entanglement arising after a quantum quench in a many-body, locally interacting system, including both the integrable and completely chaotic regimes.
Paper