Leopoldo Sarra

AI Research Scientist

Head of AI Research at Axiomatic AI

Building machine learning systems for scientific reasoning and discovery.
Hands-on researcher, engineer, and technical lead.

Leopoldo Sarra

Research

Foundation models, representation learning, formal methods, scientific discovery, and reinforcement learning.

Agentic AI & Formal Methods

Agentic AI & Formal Methods

Current work at Axiomatic AI / Head of AI Research

Building machine-checkable, reusable infrastructure for scientific reasoning through Lean4 libraries and AI-powered autoformalization, to bring to physics the same success formal methods recently had in mathematics.

AxProverBase: A Minimal Agent for Automated Theorem Proving

AxProverBase: A Minimal Agent for Automated Theorem Proving

A minimal agent for automated theorem proving in Lean4, combining LLM reasoning with formal verification to prove real-world mathematical theorems.

SorryDB: A Real-World Benchmark for Theorem Provers

SorryDB: A Real-World Benchmark for Theorem Provers

A dynamically-updating benchmark of theorem proving tasks sourced from real GitHub projects, measuring how well AI provers can complete real-world Lean theorems.

Foundation Models for Science

Foundation Models for Science

Flatiron Institute / Flatiron Research Fellow

Training multimodal foundation models that learn shared representations across scientific data types (images, spectra, time series, and metadata) to enable cross-domain transfer and accelerate discovery.

AION-1

AION-1

Omnimodal foundation model for astronomical sciences, processing 15+ data types including images, spectra, time series, and metadata.

AstroCLIP

AstroCLIP

Cross-modal foundation model for galaxies using contrastive learning to align galactic images and spectroscopic data into a shared embedding space.

Spectral Tokenization

Spectral Tokenization

Universal self-supervised representation learning for astronomical spectra, enabling transfer across instruments and wavelength ranges.

Joint Embeddings go Temporal

Joint Embeddings go Temporal

Adapting Joint Embedding Predictive Architectures (JEPA) for scientific time series, learning temporal representations without contrastive pairs.

Exploration in Reinforcement Learning

Exploration in Reinforcement Learning

Google DeepMind / Research Scientist Intern

Novelty-based Exploration

Novelty-based Exploration

A representation-based approach to long-horizon exploration that achieves state-of-the-art performance in hard sparse-reward environments like Montezuma's Revenge, using learned representations to drive novelty-seeking behavior over extended time horizons.

Artificial Scientific Discovery

Artificial Scientific Discovery

Max Planck Institute for the Science of Light / Doctoral Researcher

Developing machine learning methods for automated scientific reasoning, from unsupervised feature extraction to optimal experimental design and program synthesis.

Renormalized Mutual Information

Renormalized Mutual Information

An unsupervised method for extracting physically meaningful features from complex systems. The resulting paper was selected as the cover for Physical Review Letters Vol. 126, Issue 20.

Bayesian Experimental Design

Bayesian Experimental Design

Deep active learning approach for optimally designing experiments on quantum many-body systems, maximizing information gain per measurement.

Program Synthesis for Quantum Circuits

Program Synthesis for Quantum Circuits

Using program synthesis techniques to automatically discover reusable quantum circuit components, decomposing complex unitaries into interpretable building blocks.

Reinforcement Learning with Learned Gadgets

Reinforcement Learning with Learned Gadgets

An RL agent that automatically discovers reusable quantum circuit primitives (gadgets) and composes them to solve hard optimization problems on real quantum hardware.

PhD Thesis: Artificial Scientific Discovery

Machine learning methods for automated scientific reasoning, combining information-theoretic approaches, active learning, and program synthesis to accelerate discovery in quantum physics.

Statistical Physics, Biophysics and Quantum Information

Sapienza University / collaborations

Spin Glasses

Spin Glasses

Study of longitudinal fluctuations of the Sherrington-Kirkpatrick model, conducted under the supervision of Nobel laureate Giorgio Parisi at Sapienza University.

Gene Expression Patterns

Gene Expression Patterns

Maximum entropy models for patterns of gene expression, applying statistical physics methods to understand biological systems.

Device-Independent Quantum Tests

Device-Independent Quantum Tests

Experimental semi-device-independent tests of quantum channels, bridging quantum information theory with laboratory experiments.

Software

Open-source tools and repositories.

Publications

Research at the intersection of machine learning and physics.

2026

SorryDB: Can AI Provers Complete Real-World Lean Theorems?

A. Letson, L. Sarra, A. Poiroux, O. Dressler, P. Lezeau, D. Aranha, F. Pu, A. Hill, M. Corredera Hidalgo, J. Berman, G. Tsoukalas, L. Taelman

Under review

2026

A Minimal Agent for Automated Theorem Proving

B. Requena, A. Letson, K. Nowakowski, I. Beltran Ferreiro, L. Sarra

Under review

2026

Reinforcement learning with learned gadgets to tackle hard quantum problems on real hardware

A. Kundu, L. Sarra

Commun. Phys. 9, 44

2025

AION-1: Omnimodal Foundation Model for Astronomical Sciences

F. Lanusse, L. Parker, J. Shen, O. Liu, T. Hehir, L. Sarra, L. Meyer, M. Bowles, S. Wagner-Carena, H. Qu, S. Golkar, A. Bietti, H. Bourfoune, P. Cornette, K. Hirashima, G. Krawezik, R. Ohana, N. Lourie, M. McCabe, R. Morel, P. Mukhopadhyay, M. Pettee, B. Blancard, K. Cho, M. Cranmer, S. Ho

NeurIPS 2025

2025

Universal Spectral Tokenization via Self-Supervised Panchromatic Representation Learning

J. Shen, F. Lanusse, L. H. Parker, O. Liu, T. Hehir, L. Sarra, L. Meyer, M. Bowles, S. Wagner-Carena, H. Qu, S. Golkar, A. Bietti, H. Bourfoune, N. Cassereau, P. Cornette, K. Hirashima, G. Krawezik, R. Ohana, N. Lourie, M. McCabe, R. Morel, P. Mukhopadhyay, M. Pettee, B. Régaldo-Saint Blancard, K. Cho, M. Cranmer, S. Ho

arXiv:2510.17959

2025

Maximum entropy models for patterns of gene expression

C. Sarra, L. Sarra, L. Di Carlo, T. GrandPre, Y. Zhang, C. G. Callan Jr., W. Bialek

Phys. Rev. E 112, 014408

2024

Joint Embeddings go Temporal

S. Ennadir, S. Golkar, L. Sarra

NeurIPS 2024 TSALM Workshop

2024

AstroCLIP: A cross-modal foundation model for galaxies

L. Parker, F. Lanusse, S. Golkar, L. Sarra, M. Cranmer, A. Bietti, M. Eickenberg, G. Krawezik, M. McCabe, R. Morel, R. Ohana, M. Pettee, B. Régaldo, K. Cho, S. Ho, The Polymathic AI Collaboration

MNRAS 531(4), 4990–5011

2024 Spotlight

Unlocking the Power of Representations in Long-term Novelty-based Exploration

A. Saade, S. Kapturowski, D. Calandriello, C. Blundell, P. Sprechmann, L. Sarra, O. Groth, M. Valko, B. Piot

ICLR 2024

2024

Discovering Quantum Circuit Components with Program Synthesis

L. Sarra, K. Ellis, F. Marquardt

Mach. Learn.: Sci. Technol. 5, 025029

2023

Deep Bayesian Experimental Design for Quantum Many-Body Systems

L. Sarra, F. Marquardt

Mach. Learn.: Sci. Technol. 4, 045022

2023

Artificial Scientific Discovery

L. Sarra

Ph.D. Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg

2021 PRL Cover

Renormalized Mutual Information for Artificial Scientific Discovery

L. Sarra, A. Aiello, F. Marquardt

Phys. Rev. Lett. 126, 200601

2019

Experimental semi-device-independent tests of quantum channels

I. Agresti, D. Poderini, G. Carvacho, L. Sarra, R. Chaves, F. Buscemi, M. Dall'Arno, F. Sciarrino

Quantum Sci. Technol. 4, 035004

2019

Study of longitudinal fluctuations of the Sherrington-Kirkpatrick model

G. Parisi, L. Sarra, L. Talamanca

J. Stat. Mech. 2019, 033302

View on Google Scholar

Get in touch

Interested in discussions or collaborations?

Connect on LinkedIn