Jason Hartford

I work on building the tools to enable scientific discovery from complex, high-dimensional data. Most of my current research focuses on developing new techniques for causal representation learning and causal inference, as well as active learning techniques for designing experiments to learn causal models. Before starting at Valence and Manchester, I completed a post-doc with Yoshua Bengio at Université de Montréal, and before that, I completed my PhD at the University of British Columbia with Kevin Leyton-Brown.

I am looking for PhD students to work on these problems. Please get in touch if you are interested! I also have two CDT-funded positions available on causal abstraction of dynamical systems and unpaired multimodal representation learning (deadline 31 January 2025).

news

Dec 19, 2024	I am co-organising the Learning Meaningful Representations of Life workshop at ICLR 2025.
Dec 1, 2024	We had two papers in the main conference (Targeted Sequential Indirect Experiment Design and Propensity Score Alignment of Unpaired Multimodal Data) and three in the workshops (foundation models, mechanistic interpretability, and score-based interaction testing) at NeurIPS 2024!
Dec 1, 2024	Our paper on large-scale masked autoencoders for cellular microscopy received the runner-up best paper award at the NeurIPS 2024 Workshop on Foundation Models for Science.
Dec 1, 2024	I gave an oral presentation on our dictionary learning paper at the NeurIPS 2024 Interpretable AI workshop.
Oct 19, 2024	I am now an Ellis member and can supervise students through the Ellis PhD program.

selected publications

Weakly Supervised Representation Learning with Sparse Perturbations

Ahuja, Kartik, Hartford, Jason, and Bengio, Yoshua

In Advances in Neural Information Processing Systems, 2022

The theory of representation learning aims to build methods that provably invert the data generating process with minimal domain knowledge or any source of supervision. Most prior approaches require strong distributional assumptions on the latent variables and weak supervision (auxiliary information such as timestamps) to provide provable identification guarantees. In this work, we show that if one has weak supervision from observations generated by sparse perturbations of the latent variables (e.g., images in a reinforcement learning environment where actions move individual sprites), identification is achievable under unknown continuous latent distributions. We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks. We also show that if these perturbation blocks overlap, we identify latents up to the smallest blocks shared across perturbations. Consequently, if there are blocks that intersect in one latent variable only, then such latents are identified up to permutation and scaling. We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments.
@inproceedings{ahuja2022sparse,
  doi = {10.48550/ARXIV.2206.01101},
  author = {Ahuja, Kartik and Hartford, Jason and Bengio, Yoshua},
  keywords = {Machine Learning (cs.LG), Machine Learning (stat.ML), FOS: Computer and information sciences},
  title = {Weakly Supervised Representation Learning with Sparse Perturbations},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
Sequential Underspecified Instrument Selection for Cause-Effect Estimation

Ailer, Elisabeth, Hartford, Jason, and Kilbertus, Niki

In Proceedings of the 40th International Conference on Machine Learning, 2023 (oral presentation, 2% acceptance)

Instrumental variable (IV) methods are used to estimate causal effects in settings with unobserved confounding, where we cannot directly experiment on the treatment variable. Instruments are variables which only affect the outcome indirectly via the treatment variable(s). Most IV applications focus on low-dimensional treatments and crucially require at least as many instruments as treatments. This assumption is restrictive: in the natural sciences we often seek to infer causal effects of high-dimensional treatments (e.g., the effect of gene expressions or microbiota on health and disease), but can only run few experiments with a limited number of instruments (e.g., drugs or antibiotics). In such under-specified problems, the full treatment effect is not identifiable in a single experiment even in the linear case. We show that one can still reliably recover the projection of the treatment effect onto the instrumented subspace and develop techniques to consistently combine such partial estimates from different sets of instruments. We then leverage our combined estimators in an algorithm that iteratively proposes the most informative instruments at each round of experimentation to maximize the overall information about the full causal effect.
@inproceedings{pmlr-v202-ailer23a,
  title = {Sequential Underspecified Instrument Selection for Cause-Effect Estimation},
  author = {Ailer, Elisabeth and Hartford, Jason and Kilbertus, Niki},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {408--420},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  note = {Oral presentation (2\% acceptance)}
}
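
To make the projection idea in the abstract above concrete, here is a minimal NumPy sketch under a linear model. It is an illustration only, not the paper's estimator: the helper names (`run_experiment`, `iv_moments`, `min_norm_solution`), the simulated data, and the choice to combine experiments by stacking their moment equations are all assumptions made for this example. With fewer instruments than treatments, the moment equation E[Z Y] = E[Z X^T] beta is underdetermined, and its minimum-norm solution is the projection of beta onto the row space of E[Z X^T].

```python
import numpy as np


def run_experiment(A, beta, n, rng):
    """Simulate Z -> X -> Y with an unobserved confounder U of X and Y."""
    k, d = A.shape
    Z = rng.normal(size=(n, k))              # k instruments
    U = rng.normal(size=(n, 1))              # unobserved confounder
    X = Z @ A + U + rng.normal(size=(n, d))  # d treatments, with k < d
    Y = X @ beta + 3.0 * U[:, 0] + rng.normal(size=n)
    return Z, X, Y


def iv_moments(Z, X, Y):
    """Empirical versions of E[Z X^T] (k x d) and E[Z Y] (length k)."""
    n = len(Y)
    return Z.T @ X / n, Z.T @ Y / n


def min_norm_solution(M, m):
    """Minimum-norm beta solving M beta = m in the least-squares sense."""
    return np.linalg.pinv(M) @ m


rng = np.random.default_rng(0)
beta = np.array([1.0, -1.0, 0.5, 2.0])        # hypothetical true effect
A1 = np.array([[1.0, 0.5, 0.0, 0.0],          # experiment 1: two instruments
               [0.0, 1.0, 0.5, 0.0]])
A2 = np.array([[0.0, 0.0, 1.0, 0.5],          # experiment 2: two other instruments
               [0.5, 0.0, 0.0, 1.0]])

M1, m1 = iv_moments(*run_experiment(A1, beta, 100_000, rng))
M2, m2 = iv_moments(*run_experiment(A2, beta, 100_000, rng))

# One experiment alone recovers only the projection onto its instrumented subspace.
beta_1 = min_norm_solution(M1, m1)
projection = np.linalg.pinv(A1) @ A1 @ beta

# Stacking the moment equations of both experiments pins down all four effects.
beta_12 = min_norm_solution(np.vstack([M1, M2]), np.concatenate([m1, m2]))

print("experiment 1 only:", np.round(beta_1, 2))
print("true projection:  ", np.round(projection, 2))
print("both experiments: ", np.round(beta_12, 2))
print("true beta:        ", beta)
```

In this toy setup the two experiments together span all four treatment directions, so the stacked system is fully determined. Choosing which instruments to use next, which the abstract describes, is not covered by this sketch.
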
Properties from mechanisms: an equivariance perspective on identifiable representation learning

Ahuja, Kartik, Hartford, Jason, and Bengio, Yoshua

In International Conference on Learning Representations, 2022 (joint first author; spotlight presentation, 5% acceptance)

A key goal of unsupervised representation learning is "inverting" a data generating process to recover its latent properties. Existing work that provably achieves this goal relies on strong assumptions on relationships between the latent variables (e.g., independence conditional on auxiliary information). In this paper, we take a very different perspective on the problem and ask, "Can we instead identify latent properties by leveraging knowledge of the mechanisms that govern their evolution?" We provide a complete characterization of the sources of non-identifiability as we vary knowledge about a set of possible mechanisms. In particular, we prove that if we know the exact mechanisms under which the latent properties evolve, then identification can be achieved up to any equivariances that are shared by the underlying mechanisms. We generalize this characterization to settings where we only know some hypothesis class over possible mechanisms, as well as settings where the mechanisms are stochastic. We demonstrate the power of this mechanism-based perspective by showing that we can leverage our results to generalize existing identifiable representation learning results. These results suggest that by exploiting inductive biases on mechanisms, it is possible to design a range of new identifiable representation learning approaches.
@inproceedings{ahuja2022properties,
  title = {Properties from mechanisms: an equivariance perspective on identifiable representation learning},
  author = {Ahuja, Kartik and Hartford, Jason and Bengio, Yoshua},
  booktitle = {International Conference on Learning Representations},
  year = {2022},
  url = {https://openreview.net/forum?id=g5ynW-jMq4M},
  note = {Joint first author; spotlight presentation (5\% acceptance)}
}
Valid Causal Inference with (Some) Invalid Instruments

Hartford, Jason, Veitch, Victor, Sridhar, Dhanya, and Leyton-Brown, Kevin

In Proceedings of the 38th International Conference on Machine Learning, 18–24 Jul 2021

Instrumental variable methods provide a powerful approach to estimating causal effects in the presence of unobserved confounding. But a key challenge when applying them is the reliance on untestable "exclusion" assumptions that rule out any relationship between the instrument variable and the response that is not mediated by the treatment. In this paper, we show how to perform consistent IV estimation despite violations of the exclusion assumption. In particular, we show that when one has multiple candidate instruments, only a majority of these candidates—or, more generally, the modal candidate-response relationship—needs to be valid to estimate the causal effect. Our approach uses an estimate of the modal prediction from an ensemble of instrumental variable estimators. The technique is simple to apply and is "black-box" in the sense that it may be used with any instrumental variable estimator as long as the treatment effect is identified for each valid instrument independently. As such, it is compatible with recent machine-learning based estimators that allow for the estimation of conditional average treatment effects (CATE) on complex, high dimensional data. Experimentally, we achieve accurate estimates of conditional average treatment effects using an ensemble of deep network-based estimators, including on a challenging simulated Mendelian Randomization problem.
@inproceedings{pmlr-v139-hartford21a,
  title = {Valid Causal Inference with (Some) Invalid Instruments},
  author = {Hartford, Jason and Veitch, Victor and Sridhar, Dhanya and Leyton-Brown, Kevin},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {4096--4106},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR}
}
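
The modal-ensemble idea in the abstract above can be illustrated with a small NumPy sketch. It is a simplified stand-in, not the paper's implementation: each ensemble member is a single-instrument linear Wald estimator rather than a deep network, the mode is approximated by averaging the `k` estimates in the tightest interval, and the function names and simulated data are hypothetical. In the simulation, four of seven candidate instruments are valid and the other three affect the outcome directly with different strengths, so only the valid ones agree with each other.

```python
import numpy as np


def wald_estimate(z, x, y):
    """Single-instrument linear IV (Wald) estimate: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def modal_estimate(estimates, k):
    """Average the k estimates that fall in the tightest interval."""
    s = np.sort(np.asarray(estimates, dtype=float))
    # Width of every window of k consecutive sorted estimates.
    widths = s[k - 1:] - s[: len(s) - k + 1]
    start = int(np.argmin(widths))
    return float(s[start:start + k].mean())


def simulate(n=20_000, beta=2.0, seed=0):
    """Seven candidate instruments; the last three violate exclusion."""
    rng = np.random.default_rng(seed)
    direct = np.array([0.0, 0.0, 0.0, 0.0, 1.5, -2.0, 3.0])  # direct effects on y
    Z = rng.normal(size=(n, len(direct)))
    u = rng.normal(size=n)                                   # unobserved confounder
    x = Z.sum(axis=1) + u + rng.normal(size=n)
    y = beta * x + Z @ direct + 2.0 * u + rng.normal(size=n)
    return Z, x, y


Z, x, y = simulate()
estimates = [wald_estimate(Z[:, j], x, y) for j in range(Z.shape[1])]

print("per-instrument estimates:", np.round(estimates, 2))
print("plain average:", round(float(np.mean(estimates)), 2))
print("modal estimate:", round(modal_estimate(estimates, k=4), 2))
```

Here `k=4` encodes the assumption that at least four candidates are valid. The plain average is pulled away from the true effect by the invalid instruments, while the tightest cluster is formed by the valid ones.
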
Deep IV: A Flexible Approach for Counterfactual Prediction

Hartford, Jason, Lewis, Greg, Leyton-Brown, Kevin, and Taddy, Matt

In Proceedings of the 34th International Conference on Machine Learning, 06–11 Aug 2017

Counterfactual prediction requires understanding causal relationships between so-called treatment and outcome variables. This paper provides a recipe for augmenting deep learning methods to accurately characterize such relationships in the presence of instrument variables (IVs) – sources of treatment randomization that are conditionally independent from the outcomes. Our IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework allows us to take advantage of off-the-shelf supervised learning techniques to estimate causal effects by adapting the loss function. Experiments show that it outperforms existing machine learning approaches.
@inproceedings{pmlr-v70-hartford17a,
  title = {Deep {IV}: A Flexible Approach for Counterfactual Prediction},
  author = {Hartford, Jason and Lewis, Greg and Leyton-Brown, Kevin and Taddy, Matt},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {1414--1423},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR}
}
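
The two-stage structure described in the abstract above can be illustrated with a small NumPy sketch. This is not the Deep IV implementation: the neural networks are replaced by assumptions chosen to keep the example short (a Gaussian linear first stage, and a second stage that is linear in the polynomial features [1, x, x^2]), and all function names and the simulated data are hypothetical. What the sketch keeps is the shape of the second-stage loss, which compares each outcome with the prediction averaged over treatments sampled from the first-stage model.

```python
import numpy as np


def features(x):
    """Polynomial features [1, x, x^2], a stand-in for a neural network."""
    return np.stack([np.ones_like(x), x, x**2], axis=-1)


def fit_first_stage(z, x):
    """Fit the treatment model x | z ~ N(a + b z, sigma^2) by least squares."""
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    sigma = float(np.std(x - design @ coef))
    return coef, sigma


def fit_second_stage(z, y, coef, sigma, n_samples=200, seed=0):
    """Minimise sum_i (y_i - mean_s h(x_is))^2, with x_is drawn from stage one."""
    rng = np.random.default_rng(seed)
    mean = coef[0] + coef[1] * z
    draws = mean[:, None] + sigma * rng.normal(size=(len(z), n_samples))
    phi_bar = features(draws).mean(axis=1)  # Monte Carlo estimate of E[phi(x) | z]
    w, *_ = np.linalg.lstsq(phi_bar, y, rcond=None)
    return w


rng = np.random.default_rng(1)
n = 10_000
z = rng.normal(size=n)                    # instrument
u = rng.normal(size=n)                    # unobserved confounder
x = z + u + 0.5 * rng.normal(size=n)      # treatment
true_w = np.array([1.0, 2.0, -0.5])       # h(x) = 1 + 2x - 0.5x^2
y = features(x) @ true_w + 2.0 * u + rng.normal(size=n)

coef, sigma = fit_first_stage(z, x)
w_iv = fit_second_stage(z, y, coef, sigma)
w_naive, *_ = np.linalg.lstsq(features(x), y, rcond=None)  # ignores confounding

print("true coefficients:     ", true_w)
print("two-stage estimate:    ", np.round(w_iv, 2))
print("naive regression on x: ", np.round(w_naive, 2))
```

Because the second stage here is linear in its features, the integrated loss reduces to ordinary least squares on the Monte Carlo averaged features. With a neural network in its place, the same loss would typically be minimised by stochastic gradient descent instead.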