Research areas

This page organizes my main research areas. See the list of papers by year here.
(* equal contribution or alphabetical order)


1. Trustworthy AI & uncertainty quantification

When AI models are deployed in high-stakes scenarios, the challenges go beyond accuracy. When can their outputs be trusted? What guarantees can we provide for the cases most relevant to the problem at hand? How can uncertainty quantification guide decisions and resource allocation? My research develops statistical methods with novel guarantees to answer these questions.

  • Selective conformal inference: identify instances worth acting on (e.g., active drugs, reliable LLM outputs) based on imperfect predictions, and provide valid uncertainty quantification after data-driven selection.
  • Conformal prediction in complex settings such as graphs, multiple environments, and unmeasured confounding.
  • Uncertainty-guided decisions: connect uncertainty quantification to downstream decisions, operationalizing guarantees into trust-or-defer policies and real-world resource-allocation pipelines built on foundation models.
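To make the flavor of guarantee concrete, here is a generic split-conformal sketch on toy data (my own illustrative example with a trivial stand-in model, not any of the specific methods below): the prediction intervals attain at least 90% marginal coverage no matter how imperfect the underlying predictor is.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: Y = X + noise; "predict" stands in for any fitted model.
n_cal, n_test = 500, 200
X_cal = rng.uniform(0, 1, n_cal)
Y_cal = X_cal + rng.normal(0, 0.1, n_cal)
X_test = rng.uniform(0, 1, n_test)
Y_test = X_test + rng.normal(0, 0.1, n_test)

predict = lambda x: x

# Split conformal: absolute-residual scores on held-out calibration data.
alpha = 0.1
scores = np.abs(Y_cal - predict(X_cal))
# Finite-sample-valid quantile level: ceil((n+1)(1-alpha)) / n.
q = np.quantile(scores, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal, method="higher")

# Intervals with >= 90% marginal coverage, regardless of model quality.
lo, hi = predict(X_test) - q, predict(X_test) + q
coverage = np.mean((Y_test >= lo) & (Y_test <= hi))
```

The coverage guarantee is distribution-free; it relies only on exchangeability of the calibration and test points.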

Representative papers:

  • Optimized Conformal Selection: Powerful selective inference after conformity score optimization
    Tian Bai and Ying Jin, 2024

  • Conformal alignment: Knowing when to trust foundation models with guarantees
    Yu Gui*, Ying Jin*, and Zhimei Ren*. Conference on Neural Information Processing Systems (NeurIPS), 2024

  • Confidence on the focal: Conformal prediction with selection-conditional coverage
    Ying Jin* and Zhimei Ren*. Journal of the Royal Statistical Society: Series B (JRSS-B), 2025

  • Selection by prediction with conformal p-values
    Ying Jin and Emmanuel Candès. Journal of Machine Learning Research, 2023

More papers on this topic:
  • Multi-distribution robust conformal prediction
    Yuqi Yang and Ying Jin, 2026

  • ACS: An interactive framework for conformal selection
    Yu Gui*, Ying Jin*, Yash Nair*, and Zhimei Ren*, 2025

  • Diversifying conformal selections
    Yash Nair, Ying Jin, James Yang, and Emmanuel Candès, 2025

  • Model-free selective inference under covariate shift via weighted conformal p-values
    Ying Jin and Emmanuel Candès. Biometrika, 2025

  • Uncertainty quantification over graph with conformalized graph neural networks
    Kexin Huang, Ying Jin, Emmanuel Candès, and Jure Leskovec. NeurIPS, 2023 (Spotlight)

  • Sensitivity analysis of individual treatment effects: a robust conformal inference approach
    Ying Jin*, Zhimei Ren*, and Emmanuel Candès. Proceedings of the National Academy of Sciences (PNAS), 2023

2. Inference in AI for science

Generative AI introduces a new paradigm for scientific discovery, but it also raises a central challenge: how do we ensure generated outputs are relevant, testable, and worth pursuing? I develop statistical frameworks and agentic workflows that automate parts of the scientific loop, from hypothesis generation to validation.
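A toy caricature of the sequential-falsification idea: accumulate evidence against a hypothesis as data stream in, via a likelihood-ratio e-process, and stop as soon as the evidence crosses a threshold. (The coin-flip hypothesis, alternative, and threshold here are my own illustrative choices, not the setup of the papers below.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothesis to falsify: the coin is fair (p = 0.5).
# Data actually come from a biased coin (p = 0.7).
p_null, p_alt, alpha = 0.5, 0.7, 0.05

log_e, rejected_at = 0.0, None
for t in range(1, 501):
    flip = rng.binomial(1, 0.7)
    # Multiply in the likelihood ratio: an e-process under the null.
    log_e += np.log(p_alt / p_null) if flip else np.log((1 - p_alt) / (1 - p_null))
    # Ville's inequality: under the null, P(sup_t E_t >= 1/alpha) <= alpha,
    # so we may stop and reject the moment E_t crosses 1/alpha.
    if log_e >= np.log(1 / alpha):
        rejected_at = t
        break
```

Because the stopping rule is anytime-valid, the type-I error guarantee survives optional stopping, which is what makes this style of test natural for an automated validation loop.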

Representative papers:

  • ConfHit: Conformal generative design via nested testing
    Siddhartha Laghuvarapu, Ying Jin, and Jimeng Sun. International Conference on Learning Representations (ICLR), 2026

  • Automated hypothesis validation with agentic sequential falsifications
    Kexin Huang*, Ying Jin*, Ryan Li*, Michael Li, Emmanuel Candès, and Jure Leskovec. ICML, 2025

More papers on this topic:
  • Controllable sequence editing for biological and clinical trajectories
    Michelle Li, Kevin Li, Yasha Ektefaie, Ying Jin, Yepeng Huang, Shvat Messica, Tianxi Cai, Marinka Zitnik. ICLR, 2026

  • Contemporary symbolic regression methods and their relative performance
    William La Cava, Patryk Orzechowski, Bogdan Burlacu, Fabricio Olivetti de França, Marco Virgolin, Ying Jin, Michael Kommenda, and Jason H. Moore. NeurIPS Datasets & Benchmarks, 2021

  • Bayesian symbolic regression
    Ying Jin, Weilin Fu, Jian Kang, Jiadong Guo, and Jian Guo. International Workshop on Statistical Relational Artificial Intelligence, 2020


3. Robust causal inference

Causal insights are most valuable when they transfer to new populations: policymakers care about whether the effect found in an earlier experiment generalizes to new settings, and whether decision rules learned from historical data deliver the desired outcomes in new populations. My research on this topic focuses on the central issue of distribution shift in these problems:

  • Generalizability and replicability: empirical distribution shifts in treatment effect generalization, and methods to address them.
  • Robust and efficient causal inference via covariate adjustments and sensitivity analysis.
  • Policy learning with limited overlap: pessimism principles for efficient learning of data-informed personalized decisions.
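The pessimism principle in the last bullet can be caricatured in a two-arm offline bandit (a toy sketch of my own, with Hoeffding-style confidence widths standing in for the generalized empirical-Bernstein bounds developed in the papers below): when one action is barely observed in the historical data, acting on its raw estimate is risky, while acting on a lower confidence bound penalizes poor coverage.

```python
import numpy as np

rng = np.random.default_rng(1)

# Offline bandit data: arm 0 is well-covered, arm 1 is barely observed.
# True means: arm 0 -> 0.5, arm 1 -> 0.4 (arm 0 is truly better).
counts = np.array([1000, 5])
means_true = np.array([0.5, 0.4])
rewards = [rng.binomial(1, m, n) for m, n in zip(means_true, counts)]

mu_hat = np.array([r.mean() for r in rewards])
delta = 0.05
# Hoeffding-style lower confidence bound; the width blows up for the
# poorly covered arm, encoding distrust of thin data.
width = np.sqrt(np.log(2 / delta) / (2 * counts))
lcb = mu_hat - width

greedy_arm = int(np.argmax(mu_hat))    # may chase the noisy arm
pessimistic_arm = int(np.argmax(lcb))  # penalizes poor coverage
```

The pessimistic rule selects the well-covered arm, which is exactly the behavior needed when overlap between the logging policy and candidate policies is limited.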

Representative papers:

  • Beyond reweighting: On the predictive role of covariate shift in effect generalization
    Ying Jin, Naoki Egami, and Dominik Rothenhäusler. Proceedings of the National Academy of Sciences (PNAS), 2025

  • Cross-balancing for data-informed design and efficient analysis of observational studies
    Ying Jin and José R. Zubizarreta, 2025

  • Policy learning “without” overlap: pessimism and generalized empirical Bernstein’s inequality
    Ying Jin*, Zhimei Ren*, Zhuoran Yang, and Zhaoran Wang. Annals of Statistics, 2025

  • Is pessimism provably efficient for offline RL?
    Ying Jin, Zhuoran Yang, and Zhaoran Wang. Mathematics of Operations Research, 2024+. Short version in ICML 2021

More papers on this topic:
  • Replicability within one study: harnessing multiplicity for observational causal inference
    Ying Jin. Harvard Data Science Review (Column article), 2026

  • Diagnosing the role of observable distribution shift in scientific replications
    Ying Jin*, Kevin Guo*, and Dominik Rothenhäusler, 2023

  • Tailored inference for finite populations: conditional validity and transfer across distributions
    Ying Jin and Dominik Rothenhäusler. Biometrika, 2023

  • Modular regression: improving linear models by incorporating auxiliary data
    Ying Jin and Dominik Rothenhäusler. Journal of Machine Learning Research (JMLR), 2023

  • Towards optimal variance reduction in online controlled experiments
    Ying Jin and Shan Ba. Technometrics, 2023

  • Upper bounds on the Natarajan dimensions of some function classes
    Ying Jin. IEEE International Symposium on Information Theory (ISIT), 2023

  • Sensitivity analysis under the f-sensitivity models: a distributional robustness perspective
    Ying Jin*, Zhimei Ren*, and Zhengyuan Zhou. Operations Research, 2025