On Domain Generalization Datasets as Proxy Benchmarks for Causal Representation Learning

Paper

Abstract

Benchmarking causal representation learning in real-world, high-dimensional settings where most relevant causal variables are not directly observed remains a challenge. Notably, one promise of causal representations is their robustness to interventions, enabling models to generalize effectively under distribution shift—domain generalization. Given this connection, we ask to what extent domain generalization performance can serve as a reliable proxy task/benchmark for causal representation learning on such complex datasets. In this work, we provide theoretical evidence that a domain generalization task is a reliable proxy when non-causal correlations with labels/outcomes that hold in-distribution are reversed, or have a sufficiently reduced signal-to-noise ratio, out-of-distribution. Additionally, we demonstrate that benchmarks with this reversal do not exhibit strong positive correlations between in-distribution (ID) and out-of-distribution (OOD) accuracy, commonly called "accuracy on the line." Finally, we characterize our derived conditions on state-of-the-art domain generalization benchmarks to identify effective proxy tasks for causal representation learning.
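The reversal condition above can be illustrated with a minimal synthetic sketch (my own toy construction, not the paper's): a classifier relying on a causal feature keeps its accuracy OOD, while one relying on a non-causal feature whose correlation with the label flips sign drops below chance, breaking "accuracy on the line."

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, spurious_sign):
    """Toy domain: labels in {-1, +1}; a stable causal feature and a
    non-causal feature whose correlation with y is controlled by
    `spurious_sign` (+1 in-distribution, -1 out-of-distribution)."""
    y = rng.choice([-1.0, 1.0], size=n)
    x_causal = y + rng.normal(0.0, 1.0, size=n)          # stable mechanism
    x_spur = spurious_sign * y + rng.normal(0.0, 0.5, size=n)  # flips OOD
    return x_causal, x_spur, y

def acc(scores, y):
    """Accuracy of the sign-based classifier sign(score) == y."""
    return float(np.mean(np.sign(scores) == y))

xc_id, xs_id, y_id = make_data(10_000, spurious_sign=+1.0)
xc_ood, xs_ood, y_ood = make_data(10_000, spurious_sign=-1.0)

# Causal-feature classifier: roughly the same accuracy ID and OOD.
print("causal   ID/OOD:", acc(xc_id, y_id), acc(xc_ood, y_ood))
# Spurious-feature classifier: high ID accuracy, below chance OOD.
print("spurious ID/OOD:", acc(xs_id, y_id), acc(xs_ood, y_ood))
```

Here the causal classifier scores about 0.84 in both domains, while the spurious one goes from roughly 0.98 ID to roughly 0.02 OOD, so ID accuracy no longer predicts OOD accuracy.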

Cite

@inproceedings{salaudeen2024domain,
  title={On Domain Generalization Datasets as Proxy Benchmarks for Causal Representation Learning},
  author={Salaudeen, Olawale Elijah and Chiou, Nicole and Koyejo, Sanmi},
  year={2024},
  booktitle={NeurIPS 2024 Causal Representation Learning Workshop}
}