
Fall 2023 Seminar Series

Department of Statistics and Data Science 2023-2024 Seminar Series - Fall 2023

The 2023-2024 Seminar Series will primarily be held in person, but some talks will be offered virtually via Zoom. Virtual talks will be clearly designated, and registration is required to receive the Zoom link for those events. Please email Kisa Kowal at k-kowal@northwestern.edu if you have questions.

Seminar Series talks are free and open to faculty, graduate students, and advanced undergraduate students.

Statistical Optimality and Computational Tractability of ICA

Friday, September 29, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Ming Yuan, Professor, Department of Statistics and Associate Director, Data Science Institute, Columbia University

Abstract: Independent component analysis (ICA) is a powerful and general data analysis tool. Yet there is an increasing amount of empirical evidence that the classical methods for ICA are not well suited for modern applications, both computationally and statistically, where the effect of dimensionality is not negligible. We will investigate the optimal sample complexity and statistical performance for ICA, and how considerations of computational tractability may affect them. We will also introduce estimating procedures for ICA that are both statistically efficient and computationally tractable. Our development exploits the close connection between ICA and moment estimation and reveals a number of new insights for both problems.
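
The connection between ICA and moment estimation can be made concrete in a toy numerical sketch (everything below is an invented illustration, not the estimators from the talk): after whitening, a two-dimensional ICA problem reduces to finding a rotation, and a fourth-moment (kurtosis) contrast identifies that rotation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Two independent non-Gaussian sources: uniform and Laplace.
S = np.vstack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])
A = np.array([[2.0, 1.0], [1.0, 1.5]])   # mixing matrix
X = A @ S

# Whiten the mixture (zero mean, identity covariance).
X = X - X.mean(axis=1, keepdims=True)
cov = X @ X.T / n
d, E = np.linalg.eigh(cov)
Z = E @ np.diag(d ** -0.5) @ E.T @ X

def excess_kurtosis(y):
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

# After whitening, ICA reduces to finding a rotation; search for the angle
# that maximizes the sum of squared kurtoses (a fourth-moment criterion).
def contrast(theta):
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    Y = R @ Z
    return excess_kurtosis(Y[0]) ** 2 + excess_kurtosis(Y[1]) ** 2

thetas = np.linspace(0, np.pi / 2, 500)
best = max(thetas, key=contrast)
R = np.array([[np.cos(best), -np.sin(best)],
              [np.sin(best),  np.cos(best)]])
Y = R @ Z  # estimated sources, up to order/sign/scale

# Each recovered component should match one true source closely.
corr = np.abs(np.corrcoef(np.vstack([Y, S]))[:2, 2:])
print(corr.max(axis=1))
```

The grid search over a single angle is only viable in two dimensions; the point of the sketch is that the statistical information used is purely fourth-moment information.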

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

https://planitpurple.northwestern.edu/event/604977

On Fine-Tuning Large Language Models with Less Labeling Cost

Friday, October 13, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Tuo Zhao, Assistant Professor, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech

Abstract: Labeled data is critical to the success of deep learning across various applications, including natural language processing, computer vision, and computational biology. While recent advances like pre-training have reduced the need for labeled data in these domains, increasing the availability of labeled data remains the most effective way to improve model performance. However, human labeling of data continues to be expensive, even when leveraging cost-effective crowd-sourced labeling services. Further, in many domains, labeling requires specialized expertise, which adds to the difficulty of acquiring labeled data.

In this talk, we demonstrate how to utilize weak supervision together with efficient computational algorithms to reduce data labeling costs. Specifically, we investigate various forms of weak supervision, including external knowledge bases, auxiliary computational tools, and heuristic rule-based labeling. We showcase the application of weak supervision to both supervised learning and reinforcement learning across various tasks, including natural language understanding, molecular dynamics simulation, and code generation.
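
A minimal sketch of the rule-based flavor of weak supervision (the labeling functions and mini-corpus below are invented for illustration; the talk's methods are far more sophisticated): several cheap heuristic rules vote on each example, and a majority vote over the non-abstaining rules produces pseudo-labels at no human labeling cost.

```python
ABSTAIN = None

def lf_positive_words(text):   # labeling function 1: positive keywords
    return 1 if any(w in text for w in ("great", "excellent")) else ABSTAIN

def lf_negative_words(text):   # labeling function 2: negative keywords
    return 0 if any(w in text for w in ("terrible", "awful")) else ABSTAIN

def lf_exclamation(text):      # labeling function 3: a deliberately weak heuristic
    return 1 if text.endswith("!") else ABSTAIN

labeling_functions = [lf_positive_words, lf_negative_words, lf_exclamation]

def weak_label(text):
    """Majority vote over non-abstaining labeling functions (ties go positive)."""
    votes = [v for v in (lf(text) for lf in labeling_functions) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN             # no rule fired: leave the example unlabeled
    return int(sum(votes) >= len(votes) / 2)

corpus = [
    "the movie was great!",
    "an excellent and moving film",
    "terrible pacing and awful acting",
    "I watched it on a plane",
]
labels = [weak_label(t) for t in corpus]
print(labels)  # [1, 1, 0, None]
```

The pseudo-labeled examples can then feed an ordinary supervised fine-tuning loop; the abstaining case is what distinguishes weak supervision from a hard-coded classifier.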

This talk will be given in person on Northwestern's Evanston campus.

Gaussian random field approximation for wide neural networks

Friday, October 27, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Nathan Ross, Associate Professor, School of Mathematics and Statistics, University of Melbourne

Abstract: It has been observed that wide neural networks (NNs) with randomly initialized weights may be well-approximated by Gaussian fields indexed by the input space of the NN, and taking values in the output space. There has been a flurry of recent work making this observation precise, since it sheds light on regimes where neural networks can perform effectively. In this talk, I will discuss recent work where we derive bounds on Gaussian random field approximation of wide random neural networks of any depth, assuming Lipschitz activation functions. The bounds are on a Wasserstein transport distance in function space equipped with a strong (supremum) metric, and are explicit in the widths of the layers and natural parameters such as moments of the weights. The result follows from a general approximation result using Stein's method, combined with a novel Gaussian smoothing technique for random fields, which I will also describe. The talk covers joint work with Krishnakumar Balasubramanian, Larry Goldstein, and Adil Salim, and with A. D. Barbour and Guangqu Zheng.
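
The limiting Gaussian-field covariance can be checked numerically in a small sketch (an invented illustration, not the quantitative bounds from the talk): for a one-hidden-layer ReLU network with i.i.d. standard normal weights and 1/sqrt(width) output scaling, the covariance of the outputs at two inputs matches the first-order arc-cosine kernel.

```python
import numpy as np

rng = np.random.default_rng(1)

# Limiting covariance E[relu(w.x) relu(w.y)] for w ~ N(0, I):
# the first-order arc-cosine kernel.
def arccos_kernel(x, y):
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

# f(x) = (1/sqrt(width)) * v . relu(W x), with i.i.d. N(0,1) weights.
def random_network_outputs(x, y, width, n_nets):
    W = rng.standard_normal((n_nets, width, x.size))
    v = rng.standard_normal((n_nets, width))
    hx = np.maximum(W @ x, 0.0)    # hidden layer at input x, all networks at once
    hy = np.maximum(W @ y, 0.0)
    fx = np.einsum("nw,nw->n", v, hx) / np.sqrt(width)
    fy = np.einsum("nw,nw->n", v, hy) / np.sqrt(width)
    return fx, fy

x = np.array([1.0, 0.0])
y = np.array([0.6, 0.8])
fx, fy = random_network_outputs(x, y, width=500, n_nets=5000)

emp = np.mean(fx * fy)             # Monte Carlo estimate of E[f(x) f(y)]
theory = arccos_kernel(x, y)       # limiting Gaussian-field covariance
print(emp, theory)
```

The talk's results go well beyond this second-moment check: they bound a Wasserstein distance between the whole random field and its Gaussian limit, explicitly in the layer widths.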

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

A Single-Level Deep Learning Approach to Solve Stackelberg Mean Field Game Problems

Friday, November 3, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Gökçe Dayanıklı, Assistant Professor, Department of Statistics, University of Illinois Urbana-Champaign

Abstract: In many real-life policy-making applications, the principal (i.e., a governor or regulator) wants to find optimal policies for a large population of interacting agents who optimize their own objectives in a game-theoretical framework. However, it is well known that finding an equilibrium in a game with a large number of agents is challenging because of the growing number of interactions among agents. In this talk, we introduce the Stackelberg mean field game problem to approximate the game between a principal and a large number of agents. In the model, the agents in the population play a non-cooperative game and choose their controls to optimize their individual objectives by interacting with the principal and with other agents in the society through the population distribution. The principal can influence the resulting mean field game Nash equilibrium through incentives in order to optimize her own objective. After analyzing this game using a probabilistic approach, we rewrite the bi-level problem as a single-level problem and propose a deep learning approach to solve the Stackelberg mean field game efficiently. We look at different applications, such as a systemic risk model for a regulator and many banks, and an optimal contract problem between a project manager and a large number of employees.
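
The bi-level structure can be illustrated with a deliberately tiny linear-quadratic toy (all dynamics, objectives, and numbers below are invented for this sketch; the talk's model and its deep learning solver are far more general): the inner level computes the agents' mean field equilibrium by fixed-point iteration, and the outer level searches over the principal's incentive.

```python
import numpy as np

# Toy Stackelberg mean field game. Each agent picks an action a to minimize
# (a - theta*m - lam)^2, where m is the population mean action and lam is the
# principal's incentive. The best response is a = theta*m + lam, so the mean
# field equilibrium solves the fixed point m = theta*m + lam.
theta, target, cost = 0.5, 2.0, 0.1

def mean_field_equilibrium(lam, iters=200):
    m = 0.0
    for _ in range(iters):           # fixed-point iteration on the mean field
        m = theta * m + lam          # aggregate of the agents' best responses
    return m

def principal_objective(lam):
    m = mean_field_equilibrium(lam)
    return (m - target) ** 2 + cost * lam ** 2   # steer m to target, pay for lam

# Bi-level problem collapsed into a single search over the incentive.
grid = np.linspace(0.0, 3.0, 3001)
lam_star = min(grid, key=principal_objective)

# Closed form for this toy: m = lam/(1-theta) = k*lam, and minimizing
# (k*lam - target)^2 + cost*lam^2 gives lam = k*target / (k^2 + cost).
k = 1.0 / (1.0 - theta)
lam_exact = k * target / (k ** 2 + cost)
print(lam_star, lam_exact)
```

Here the inner equilibrium is cheap enough to brute-force; the talk's contribution is making the same single-level reformulation tractable when both levels require deep-learning approximations.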

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/606509

Sparse topic modeling via spectral decomposition and thresholding

Friday, November 10, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Claire Donnat, Assistant Professor, Department of Statistics, University of Chicago

Abstract: By modeling documents as mixtures of topics, Topic Modeling allows the discovery of latent thematic structures within large text corpora, and has played an important role in natural language processing over the past decades. Beyond text data, topic modeling has proven itself central to the analysis of microbiome data, population genetics, or, more recently, single-cell spatial transcriptomics. Given the model’s extensive use, the development of estimators — particularly those capable of leveraging known structure in the data — presents a compelling challenge. In this talk, we focus more specifically on the probabilistic Latent Semantic Indexing model, which assumes that the expectation of the corpus matrix is low-rank and can be written as the product of a topic-word matrix and a word-document matrix. Although various estimators of the topic matrix have recently been proposed, their error bounds highlight a number of data regimes in which the error can grow substantially — particularly in the case where the size of the dictionary p is large. In this talk, we propose studying the estimation of the topic-word matrix under the assumption that the ordered entries of its columns rapidly decay to zero. This sparsity assumption is motivated by the empirical observation that the word frequencies in a text often adhere to Zipf’s law. We introduce a new spectral procedure for estimating the topic-word matrix that thresholds words based on their corpus frequencies, and show that its ℓ1-error rate under our sparsity assumption depends on the vocabulary size p only via a logarithmic term. 
Our error bound is valid for all parameter regimes, in particular for the setting where p is extremely large. Our procedure also performs well empirically relative to well-established methods when applied to a large corpus of research paper abstracts, as well as to single-cell and microbiome data, where the same statistical model is relevant but the parameter regimes are vastly different.
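
A minimal synthetic sketch of the screening idea (this is not the talk's estimator; the Zipf exponent, frequency cutoff, and corpus sizes below are arbitrary choices for illustration): generate a corpus from a low-rank topic model with Zipf-like topic columns, drop low-frequency words, and check that the retained submatrix still shows a clear rank-K spectral gap.

```python
import numpy as np

rng = np.random.default_rng(2)
p, K, n, N = 2000, 3, 300, 5000   # vocab size, topics, documents, words per doc

# Topic-word matrix A with Zipf-like columns: the sorted entries of each
# column decay rapidly, so each topic is effectively sparse.
ranks = np.arange(1, p + 1, dtype=float)
A = np.empty((p, K))
for k in range(K):
    A[:, k] = rng.permutation(ranks ** -1.2)
A /= A.sum(axis=0)

W = rng.dirichlet(np.ones(K), size=n).T            # topic weights per document
D = np.vstack([rng.multinomial(N, A @ W[:, j])     # observed word frequencies
               for j in range(n)]).T / N

# Screening step: threshold words by their corpus frequency, then run the
# SVD on the retained rows only.
freq = D.mean(axis=1)
keep = freq > 0.5 / p                              # toy frequency cutoff
s_kept = np.linalg.svd(D[keep], compute_uv=False)

print(int(keep.sum()), "of", p, "words kept")
print("top singular values:", s_kept[:5])
```

The retained matrix keeps nearly all of the corpus mass while shrinking the vocabulary, which is the mechanism by which the dependence on p can be reduced to a logarithmic term.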

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/607098

The unreasonable effectiveness of negative association

Friday, November 17, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Subhro Ghosh, Assistant Professor, Department of Mathematics and Department of Statistics and Data Science, and faculty affiliate, Institute of Data Science, National University of Singapore

Abstract: In 1960, Wigner published an article famously titled "The Unreasonable Effectiveness of Mathematics in the Natural Sciences". In this talk we will, in a small way, follow the spirit of Wigner's coinage, and explore the unreasonable effectiveness of negatively associated (i.e., self-repelling) stochastic systems far beyond their context of origin. As a particular class of such models, determinantal processes (a.k.a. DPPs) originated in quantum and statistical physics, but have emerged in recent years as a powerful toolbox for many fundamental learning problems. In this talk, we aim to explore the breadth and depth of these applications. On one hand, we will explore a class of Gaussian DPPs and the novel stochastic geometry of their parameter modulation, and their applications to the study of directionality in data and dimension reduction. At the other end, we will consider the fundamental paradigm of stochastic gradient descent, where we leverage connections with orthogonal polynomials to design a minibatch sampling technique based on data-sensitive DPPs, with provable guarantees for a faster convergence exponent compared to traditional sampling. Principally based on the following works:

[1] Gaussian determinantal processes: A new model for directionality in data, with P. Rigollet, Proceedings of the National Academy of Sciences, vol. 117, no. 24 (2020), pp. 13207-13213 (PNAS Direct Submission).

[2] Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD, with R. Bardenet and M. Lin, Advances in Neural Information Processing Systems 34 (Spotlight Paper at NeurIPS 2021).
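
The negative-association property that drives these applications can be checked exactly on a tiny example (the kernel below is invented, and the brute-force enumeration is for illustration only, as it scales exponentially): for a symmetric L-ensemble DPP, the probability that two items appear together never exceeds the product of their marginals.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 3))
L = B @ B.T + 0.5 * np.eye(4)         # symmetric positive definite kernel

items = range(4)

def weight(S):                         # unnormalized probability det(L_S)
    if not S:
        return 1.0                     # det of the empty matrix
    idx = np.array(S)
    return float(np.linalg.det(L[np.ix_(idx, idx)]))

subsets = [S for r in range(5) for S in combinations(items, r)]
Z = sum(weight(S) for S in subsets)    # normalizer; equals det(L + I)
probs = {S: weight(S) / Z for S in subsets}

def p_contains(*wanted):               # P(all wanted items are in the sample)
    return sum(p for S, p in probs.items() if set(wanted) <= set(S))

# Negative association: items repel each other, so joint inclusion never
# exceeds the product of the marginal inclusion probabilities.
for i, j in combinations(items, 2):
    assert p_contains(i, j) <= p_contains(i) * p_contains(j) + 1e-12
print("pairwise negative correlation verified")
```

This repulsion is precisely what makes DPP-sampled minibatches more diverse than i.i.d. sampling, which underlies the convergence gains discussed in the talk.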

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/607099