On the 2025–26 academic job market, seeking tenure-track positions beginning Fall 2026.
I develop the principles and practices of reliable AI measurement and intervention. This includes studying the external validity of key benchmarks (e.g., ImageNet) in deep learning, the internal validity of benchmarks for out-of-distribution generalization, and frameworks for valid evaluation of latent AI capabilities and traits. This also includes understanding and intervening on mechanisms that determine AI behavior, such as causal versus spurious pathways. My work enables AI systems to generalize and adapt to new environments that differ from their training data, ensuring that AI systems are reliable and safe in dynamic, real-world settings. Application areas of my work include health and medicine, algorithmic fairness, and AI policy.
Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations
Olawale Salaudeen, Haoran Zhang, Kumail Alhamoud, Sara Beery, Marzyeh Ghassemi
NeurIPS 2025. Accepted as a spotlight paper
[paper]
Measurement to Meaning: A Validity-Centered Framework for AI Evaluation
Olawale Salaudeen*, Anka Reuel*, Ahmed Ahmed, Suhana Bedi, Zachary Robertson, Sudharsan Sundar, Ben Domingue, Angelina Wang, Sanmi Koyejo
In Review. Accepted at the NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling
[paper] [webpage] [policy brief]
On Evaluating Methods vs. Evaluating Models
Olawale Salaudeen, Florian Dorner, Peter Hase
Accepted as an oral and received the best paper award at the NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle: Benchmarks, Emergent Abilities, and Scaling
[paper]
On Group Sufficiency Under Label Bias
Haoran Zhang, Olawale Salaudeen, Marzyeh Ghassemi
NeurIPS 2025
[paper]
Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness
Stephen R. Pfohl, Natalie Harris, Chirag Nagpal, David Madras, Vishwali Mhasawade, Olawale Salaudeen, Awa Dieng, Shannon Sequeira, Santiago Arciniegas, Lillian Sung, Nnamdi Ezeanochie, Heather Cole-Lewis, Katherine Heller, Sanmi Koyejo, Alexander D'Amour
NeurIPS 2025 (Preliminary version appeared at NeurIPS 2023 Workshop on Distribution Shifts: New Frontiers with Foundation Models).
[paper]
Improving Single-round Active Adaptation: A Prediction Variability Perspective
Xiaoyang Wang, Yibo Jacky Zhang, Olawale E Salaudeen, Mingyuan Wu, Hongpeng Guo, Chaoyang He, Klara Nahrstedt, Sanmi Koyejo
TMLR 2025
[paper]
Are Domain Generalization Benchmarks with Accuracy on the Line Misspecified?
Olawale Salaudeen, Nicole Chiou, Shiny Weng, Sanmi Koyejo
TMLR 2025. Selected for the TMLR Journal-to-Conference Track at ICLR 2026
[paper] [code] [webpage] [news]
Stop Evaluating AI with Human Tests, Develop Principled, AI-specific Tests instead
Tom Sühr, Florian E. Dorner, Olawale Salaudeen, Augustin Kelava, Samira Samadi
Preprint 2025
[paper]
Toward an Evaluation Science for Generative AI Systems
Laura Weidinger, Inioluwa Deborah Raji, Hanna Wallach, Margaret Mitchell, Angelina Wang, Olawale Salaudeen, Rishi Bommasani, Deep Ganguli, Sanmi Koyejo, William Isaac
The Bridge 2025, National Academy of Engineering
[paper]
What’s in a Query: Polarity-Aware Distribution-Based Fair Ranking
Aparna Balagopalan, Kai Wang, Olawale Salaudeen, Asia Biega, Marzyeh Ghassemi
WWW 2025
[paper] [code]
On Domain Generalization Datasets as Proxy Benchmarks for Causal Representation Learning
Olawale Salaudeen, Nicole Chiou, Sanmi Koyejo
Accepted as an oral at the NeurIPS 2024 Workshop on Causal Representation Learning
[paper] [webpage]
ImageNot: A Contrast with ImageNet Preserves Model Rankings
Olawale Salaudeen, Moritz Hardt
Preprint, 2024
[paper] [code] [webpage]
Causally Inspired Regularization Enables Domain General Representations
Olawale Salaudeen, Oluwasanmi Koyejo
AISTATS 2024 (Preliminary version appeared at NeurIPS 2021 Workshop on Distribution Shift)
[paper] [code] [webpage]
Proxy Methods for Domain Adaptation
Katherine Tsai, Stephen R. Pfohl, Olawale Salaudeen, Nicole Chiou, Matt J. Kusner, Alexander D’Amour, Sanmi Koyejo, Arthur Gretton
AISTATS 2024
[paper] [code]
Adapting to Latent Subgroup Shifts via Concepts and Proxies
α–β. Ibrahim Alabdulmohsin*, Nicole Chiou*, Alexander D’Amour*, Arthur Gretton*, Sanmi Koyejo*, Matt J. Kusner*, Stephen R. Pfohl*, Olawale Salaudeen*, Jessica Schrouff*, Katherine Tsai*
AISTATS 2023. Preliminary version appeared at ICML 2022 Workshop on the Principles of Distribution Shift
[paper] [code] [webpage]
Addressing Observational Biases in Algorithmic Fairness Assessments
Chirag Nagpal, Olawale Salaudeen, Sanmi Koyejo, Stephen Pfohl
NeurIPS 2022 AFCP Workshop (extended abstract)
[poster]
Enhancing fMRI Motion Denoising with ICA-AROMA and Causal Discovery
Olawale Salaudeen, Paul Camacho, Aron Barbey, Brad Sutton, Sanmi Koyejo
In Review
[code]
Ultra-fast 3D fMRI to Explore Cardiac-Induced Fluctuations in BOLD-Based Functional Imaging
Brad Sutton, Aaron Anderson, Benjamin Zimmerman, Paul Camacho, Riwei Jin, Charles Marchini, Olawale Salaudeen, Natalie Ramsy, Davide Boido, Serge Charpak, Andrew Webb, Luisa Ciobanu
International Society for Magnetic Resonance in Medicine (ISMRM), 2022 (abstract)
[link]