About me

Hi! I’m Ying. I am currently a Wojcicki-Troper Postdoctoral Fellow at the Harvard Data Science Initiative, where I am fortunate to be mentored by Professor José Zubizarreta and to also work with Professor Marinka Zitnik.

Starting July 2025, I will be an Assistant Professor in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania. I obtained my PhD in Statistics from Stanford University in 2024, advised by Professors Emmanuel Candès and Dominik Rothenhäusler. Prior to that, I studied Mathematics at Tsinghua University. Here are my CV, GitHub, and Google Scholar pages.

I currently help organize the Online Causal Inference Seminar.


Research interests

I work on statistical problems related to two main themes:

  • Uncertainty quantification
    I develop methods to quantify and control the uncertainty of black-box AI models so they can be deployed with confidence in critical domains. My recent work integrates selective inference ideas into conformal prediction, with the goal of enabling trusted decision-making based on uncertainty principles. One example is Conformal Selection, which uses multiple testing ideas to find unlabeled instances with “good” labels, such as active drugs and trustworthy LLM outputs, that can then be acted upon with confidence.

  • Generalizability and robustness
    I am interested in understanding the generalizability and robustness of statistical findings across datasets, populations, and contexts. My recent work studies the empirical nature of distribution shifts in large-scale replication projects. I also develop methods that address distribution shifts in generalizing treatment effects, learning causal decision rules, and combining datasets.

These questions lead me to the fields of conformal prediction, causal inference, and multiple testing.


News

  • Feb 2025: Imagine LLM agents for scientific discovery: agents that autonomously gather knowledge through creative reasoning and flexible tool use. How can we ensure the soundness of what they acquire? We propose Popper, a framework in which LLM agents design sequential experiments, collect data, and accumulate statistical evidence to validate a free-form hypothesis with error control!

  • Dec 2024: Recent empirical investigations have challenged the sufficiency of covariate shift adjustment for generalization under distribution shift. How can we address what remains unexplained? Analyzing two large-scale multi-site replication projects, our new paper suggests a predictive role for covariate shift: it informs the strength of the unknown conditional shift, which in turn helps generalization!

  • Sept 2024: Outputs from black-box foundation models must align with human values before use. For example, can we ensure only human-quality AI-generated medical reports are deferred to doctors? Our paper Conformal Alignment is accepted to NeurIPS 2024!

  • Sept 2024: My paper on optimal variance reduction in online experiments (a 2021 internship project at LinkedIn) receives the 2024 Jack Youden Prize for the best expository paper in Technometrics! Thank you, ASQ/ASA!

  • March 2024: How can we quantify uncertainty for an “interesting” unit picked by a complicated, data-driven process? Check out JOMI, our framework for conformal prediction with selection-conditional coverage!

  • Sept 2023: I’ll be giving a seminar at Genentech on leveraging Conformal Selection [1, 2] for reliable AI-assisted drug discovery.

  • Sept 2023: Scientists often refer to distribution shifts when effects from two studies differ, e.g., in replicability failures. But do distribution shifts really account for the discrepancy? See our preprint for a formal diagnosis framework. Play with our live app, or explore our data repository! I gave an invited talk about this work at the Causality in Practice conference.

Beyond academics, I love traveling and photography. See my photography gallery!

