
Fall 2023 Seminar Series

Department of Statistics and Data Science 2023-2024 Seminar Series - Fall 2023

The 2023-2024 Seminar Series will primarily be held in person, but some talks will be offered virtually via Zoom. Virtual talks will be clearly designated, and registration is required to receive the Zoom link for those events. Please email Kisa Kowal at k-kowal@northwestern.edu if you have questions.

Seminar Series talks are free and open to faculty, graduate students, and advanced undergraduate students.

Statistical Optimality and Computational Tractability of ICA

Friday, September 29, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Ming Yuan, Professor, Department of Statistics and Associate Director, Data Science Institute, Columbia University

Abstract: Independent component analysis (ICA) is a powerful and general data analysis tool. Yet there is an increasing amount of empirical evidence that the classical methods for ICA are not well suited for modern applications, both computationally and statistically, where the effect of dimensionality is not negligible. We will investigate the optimal sample complexity and statistical performance for ICA, and how considerations of computational tractability may affect them. We will also introduce estimating procedures for ICA that are both statistically efficient and computationally tractable. Our development exploits the close connection between ICA and moment estimation and reveals a number of new insights for both problems.
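
The connection between ICA and moment estimation can be made concrete in a toy numerical sketch (everything below is an invented illustration, not the estimators from the talk): after whitening, a two-dimensional ICA problem reduces to finding a rotation, and a fourth-moment (kurtosis) contrast identifies that rotation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Two independent non-Gaussian sources: uniform and Laplace.
S = np.vstack([rng.uniform(-1, 1, n), rng.laplace(0, 1, n)])
A = np.array([[2.0, 1.0], [1.0, 1.5]])   # mixing matrix
X = A @ S

# Whiten the mixture (zero mean, identity covariance).
X = X - X.mean(axis=1, keepdims=True)
cov = X @ X.T / n
d, E = np.linalg.eigh(cov)
Z = E @ np.diag(d ** -0.5) @ E.T @ X

def excess_kurtosis(y):
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

# After whitening, ICA reduces to finding a rotation; search for the angle
# that maximizes the sum of squared kurtoses (a fourth-moment criterion).
def contrast(theta):
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    Y = R @ Z
    return excess_kurtosis(Y[0]) ** 2 + excess_kurtosis(Y[1]) ** 2

thetas = np.linspace(0, np.pi / 2, 500)
best = max(thetas, key=contrast)
R = np.array([[np.cos(best), -np.sin(best)],
              [np.sin(best),  np.cos(best)]])
Y = R @ Z  # estimated sources, up to order/sign/scale

# Each recovered component should match one true source closely.
corr = np.abs(np.corrcoef(np.vstack([Y, S]))[:2, 2:])
print(corr.max(axis=1))
```

The grid search over a single angle is only viable in two dimensions; the point of the sketch is that the statistical information used is purely fourth-moment information.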

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

https://planitpurple.northwestern.edu/event/604977

On Fine-Tuning Large Language Models with Less Labeling Cost

Friday, October 13, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Tuo Zhao, Assistant Professor, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Tech

Abstract: Labeled data is critical to the success of deep learning across various applications, including natural language processing, computer vision, and computational biology. While recent advances like pre-training have reduced the need for labeled data in these domains, increasing the availability of labeled data remains the most effective way to improve model performance. However, human labeling of data continues to be expensive, even when leveraging cost-effective crowd-sourced labeling services. Further, in many domains, labeling requires specialized expertise, which adds to the difficulty of acquiring labeled data.

In this talk, we demonstrate how to utilize weak supervision together with efficient computational algorithms to reduce data labeling costs. Specifically, we investigate various forms of weak supervision, including external knowledge bases, auxiliary computational tools, and heuristic rule-based labeling. We showcase the application of weak supervision to both supervised learning and reinforcement learning across various tasks, including natural language understanding, molecular dynamics simulation, and code generation.
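
A minimal sketch of the rule-based flavor of weak supervision (the labeling functions and mini-corpus below are invented for illustration; the talk's methods are far more sophisticated): several cheap heuristic rules vote on each example, and a majority vote over the non-abstaining rules produces pseudo-labels at no human labeling cost.

```python
ABSTAIN = None

def lf_positive_words(text):   # labeling function 1: positive keywords
    return 1 if any(w in text for w in ("great", "excellent")) else ABSTAIN

def lf_negative_words(text):   # labeling function 2: negative keywords
    return 0 if any(w in text for w in ("terrible", "awful")) else ABSTAIN

def lf_exclamation(text):      # labeling function 3: a deliberately weak heuristic
    return 1 if text.endswith("!") else ABSTAIN

labeling_functions = [lf_positive_words, lf_negative_words, lf_exclamation]

def weak_label(text):
    """Majority vote over non-abstaining labeling functions (ties go positive)."""
    votes = [v for v in (lf(text) for lf in labeling_functions) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN             # no rule fired: leave the example unlabeled
    return int(sum(votes) >= len(votes) / 2)

corpus = [
    "the movie was great!",
    "an excellent and moving film",
    "terrible pacing and awful acting",
    "I watched it on a plane",
]
labels = [weak_label(t) for t in corpus]
print(labels)  # [1, 1, 0, None]
```

The pseudo-labeled examples can then feed an ordinary supervised fine-tuning loop; the abstaining case is what distinguishes weak supervision from a hard-coded classifier.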

This talk will be given in person on Northwestern's Evanston campus.

Gaussian random field approximation for wide neural networks

Friday, October 27, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Nathan Ross, Associate Professor, School of Mathematics and Statistics, University of Melbourne

Abstract: It has been observed that wide neural networks (NNs) with randomly initialized weights may be well-approximated by Gaussian fields indexed by the input space of the NN, and taking values in the output space. There has been a flurry of recent work making this observation precise, since it sheds light on regimes where neural networks can perform effectively. In this talk, I will discuss recent work where we derive bounds on Gaussian random field approximation of wide random neural networks of any depth, assuming Lipschitz activation functions. The bounds are on a Wasserstein transport distance in function space equipped with a strong (supremum) metric, and are explicit in the widths of the layers and natural parameters such as moments of the weights. The result follows from a general approximation result using Stein's method, combined with a novel Gaussian smoothing technique for random fields, which I will also describe. The talk covers joint work with Krishnakumar Balasubramanian, Larry Goldstein, and Adil Salim, and with A. D. Barbour and Guangqu Zheng.
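
The limiting Gaussian-field covariance can be checked numerically in a small sketch (an invented illustration, not the quantitative bounds from the talk): for a one-hidden-layer ReLU network with i.i.d. standard normal weights and 1/sqrt(width) output scaling, the covariance of the outputs at two inputs matches the first-order arc-cosine kernel.

```python
import numpy as np

rng = np.random.default_rng(1)

# Limiting covariance E[relu(w.x) relu(w.y)] for w ~ N(0, I):
# the first-order arc-cosine kernel.
def arccos_kernel(x, y):
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

# f(x) = (1/sqrt(width)) * v . relu(W x), with i.i.d. N(0,1) weights.
def random_network_outputs(x, y, width, n_nets):
    W = rng.standard_normal((n_nets, width, x.size))
    v = rng.standard_normal((n_nets, width))
    hx = np.maximum(W @ x, 0.0)    # hidden layer at input x, all networks at once
    hy = np.maximum(W @ y, 0.0)
    fx = np.einsum("nw,nw->n", v, hx) / np.sqrt(width)
    fy = np.einsum("nw,nw->n", v, hy) / np.sqrt(width)
    return fx, fy

x = np.array([1.0, 0.0])
y = np.array([0.6, 0.8])
fx, fy = random_network_outputs(x, y, width=500, n_nets=5000)

emp = np.mean(fx * fy)             # Monte Carlo estimate of E[f(x) f(y)]
theory = arccos_kernel(x, y)       # limiting Gaussian-field covariance
print(emp, theory)
```

The talk's results go well beyond this second-moment check: they bound a Wasserstein distance between the whole random field and its Gaussian limit, explicitly in the layer widths.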

This talk will be given in person on Northwestern's Evanston campus at the location listed above.

A Single-Level Deep Learning Approach to Solve Stackelberg Mean Field Game Problems

Friday, November 3, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Gökçe Dayanıklı, Assistant Professor, Department of Statistics, University of Illinois Urbana-Champaign

Abstract: In many real-life policy-making applications, the principal (i.e., a governor or regulator) wants to find optimal policies for a large population of interacting agents who optimize their own objectives in a game-theoretical framework. However, it is well known that finding an equilibrium in a game with a large number of agents is challenging because of the growing number of interactions among agents. In this talk, we introduce the Stackelberg mean field game problem to approximate the game between a principal and a large number of agents. In the model, the agents in the population play a non-cooperative game and choose their controls to optimize their individual objectives by interacting with the principal and with other agents in the society through the population distribution. The principal can influence the resulting mean field game Nash equilibrium through incentives in order to optimize her own objective. After analyzing this game using a probabilistic approach, we rewrite the bi-level problem as a single-level problem and propose a deep learning approach to solve the Stackelberg mean field game efficiently. We look at different applications, such as a systemic risk model for a regulator and many banks, and an optimal contract problem between a project manager and a large number of employees.
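
The bi-level structure can be illustrated with a deliberately tiny linear-quadratic toy (all dynamics, objectives, and numbers below are invented for this sketch; the talk's model and its deep learning solver are far more general): the inner level computes the agents' mean field equilibrium by fixed-point iteration, and the outer level searches over the principal's incentive.

```python
import numpy as np

# Toy Stackelberg mean field game. Each agent picks an action a to minimize
# (a - theta*m - lam)^2, where m is the population mean action and lam is the
# principal's incentive. The best response is a = theta*m + lam, so the mean
# field equilibrium solves the fixed point m = theta*m + lam.
theta, target, cost = 0.5, 2.0, 0.1

def mean_field_equilibrium(lam, iters=200):
    m = 0.0
    for _ in range(iters):           # fixed-point iteration on the mean field
        m = theta * m + lam          # aggregate of the agents' best responses
    return m

def principal_objective(lam):
    m = mean_field_equilibrium(lam)
    return (m - target) ** 2 + cost * lam ** 2   # steer m to target, pay for lam

# Bi-level problem collapsed into a single search over the incentive.
grid = np.linspace(0.0, 3.0, 3001)
lam_star = min(grid, key=principal_objective)

# Closed form for this toy: m = lam/(1-theta) = k*lam, and minimizing
# (k*lam - target)^2 + cost*lam^2 gives lam = k*target / (k^2 + cost).
k = 1.0 / (1.0 - theta)
lam_exact = k * target / (k ** 2 + cost)
print(lam_star, lam_exact)
```

Here the inner equilibrium is cheap enough to brute-force; the talk's contribution is making the same single-level reformulation tractable when both levels require deep-learning approximations.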

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/606509

Sparse topic modeling via spectral decomposition and thresholding

Friday, November 10, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Claire Donnat, Assistant Professor, Department of Statistics, University of Chicago

Abstract: By modeling documents as mixtures of topics, Topic Modeling allows the discovery of latent thematic structures within large text corpora, and has played an important role in natural language processing over the past decades. Beyond text data, topic modeling has proven itself central to the analysis of microbiome data, population genetics, or, more recently, single-cell spatial transcriptomics. Given the model’s extensive use, the development of estimators — particularly those capable of leveraging known structure in the data — presents a compelling challenge. In this talk, we focus more specifically on the probabilistic Latent Semantic Indexing model, which assumes that the expectation of the corpus matrix is low-rank and can be written as the product of a topic-word matrix and a word-document matrix. Although various estimators of the topic matrix have recently been proposed, their error bounds highlight a number of data regimes in which the error can grow substantially — particularly in the case where the size of the dictionary p is large. In this talk, we propose studying the estimation of the topic-word matrix under the assumption that the ordered entries of its columns rapidly decay to zero. This sparsity assumption is motivated by the empirical observation that the word frequencies in a text often adhere to Zipf’s law. We introduce a new spectral procedure for estimating the topic-word matrix that thresholds words based on their corpus frequencies, and show that its ℓ1-error rate under our sparsity assumption depends on the vocabulary size p only via a logarithmic term. 
Our error bound is valid for all parameter regimes, in particular for the setting where p is extremely large. Our procedure also performs well empirically relative to well-established methods when applied to a large corpus of research paper abstracts, as well as to single-cell and microbiome data, where the same statistical model is relevant but the parameter regimes are vastly different.
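
A minimal synthetic sketch of the screening idea (this is not the talk's estimator; the Zipf exponent, frequency cutoff, and corpus sizes below are arbitrary choices for illustration): generate a corpus from a low-rank topic model with Zipf-like topic columns, drop low-frequency words, and check that the retained submatrix still shows a clear rank-K spectral gap.

```python
import numpy as np

rng = np.random.default_rng(2)
p, K, n, N = 2000, 3, 300, 5000   # vocab size, topics, documents, words per doc

# Topic-word matrix A with Zipf-like columns: the sorted entries of each
# column decay rapidly, so each topic is effectively sparse.
ranks = np.arange(1, p + 1, dtype=float)
A = np.empty((p, K))
for k in range(K):
    A[:, k] = rng.permutation(ranks ** -1.2)
A /= A.sum(axis=0)

W = rng.dirichlet(np.ones(K), size=n).T            # topic weights per document
D = np.vstack([rng.multinomial(N, A @ W[:, j])     # observed word frequencies
               for j in range(n)]).T / N

# Screening step: threshold words by their corpus frequency, then run the
# SVD on the retained rows only.
freq = D.mean(axis=1)
keep = freq > 0.5 / p                              # toy frequency cutoff
s_kept = np.linalg.svd(D[keep], compute_uv=False)

print(int(keep.sum()), "of", p, "words kept")
print("top singular values:", s_kept[:5])
```

The retained matrix keeps nearly all of the corpus mass while shrinking the vocabulary, which is the mechanism by which the dependence on p can be reduced to a logarithmic term.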

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/607098

The unreasonable effectiveness of negative association

Friday, November 17, 2023

Time: 2:00 p.m. to 3:00 p.m. Central Time

Location: Ruan Conference Room – lower level (Chambers Hall, 600 Foster Street)

Speaker: Subhro Ghosh, Assistant Professor, Department of Mathematics and Department of Statistics and Data Science, and faculty affiliate, Institute of Data Science, National University of Singapore

Abstract: In 1960, Wigner published an article famously titled "The Unreasonable Effectiveness of Mathematics in the Natural Sciences". In this talk we will, in a small way, follow the spirit of Wigner's coinage, and explore the unreasonable effectiveness of negatively associated (i.e., self-repelling) stochastic systems far beyond their context of origin. As a particular class of such models, determinantal processes (a.k.a. DPPs) originated in quantum and statistical physics, but have emerged in recent years as a powerful toolbox for many fundamental learning problems. In this talk, we aim to explore the breadth and depth of these applications. On one hand, we will explore a class of Gaussian DPPs and the novel stochastic geometry of their parameter modulation, and their applications to the study of directionality in data and dimension reduction. At the other end, we will consider the fundamental paradigm of stochastic gradient descent, where we leverage connections with orthogonal polynomials to design a minibatch sampling technique based on data-sensitive DPPs, with provable guarantees for a faster convergence exponent compared to traditional sampling. Principally based on the following works:

[1] Gaussian determinantal processes: A new model for directionality in data, with P. Rigollet, Proceedings of the National Academy of Sciences, vol. 117, no. 24 (2020), pp. 13207-13213 (PNAS Direct Submission).

[2] Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD, with R. Bardenet and M. Lin, Advances in Neural Information Processing Systems 34 (Spotlight Paper at NeurIPS 2021).
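
The negative-association property that drives these applications can be checked exactly on a tiny example (the kernel below is invented, and the brute-force enumeration is for illustration only, as it scales exponentially): for a symmetric L-ensemble DPP, the probability that two items appear together never exceeds the product of their marginals.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 3))
L = B @ B.T + 0.5 * np.eye(4)         # symmetric positive definite kernel

items = range(4)

def weight(S):                         # unnormalized probability det(L_S)
    if not S:
        return 1.0                     # det of the empty matrix
    idx = np.array(S)
    return float(np.linalg.det(L[np.ix_(idx, idx)]))

subsets = [S for r in range(5) for S in combinations(items, r)]
Z = sum(weight(S) for S in subsets)    # normalizer; equals det(L + I)
probs = {S: weight(S) / Z for S in subsets}

def p_contains(*wanted):               # P(all wanted items are in the sample)
    return sum(p for S, p in probs.items() if set(wanted) <= set(S))

# Negative association: items repel each other, so joint inclusion never
# exceeds the product of the marginal inclusion probabilities.
for i, j in combinations(items, 2):
    assert p_contains(i, j) <= p_contains(i) * p_contains(j) + 1e-12
print("pairwise negative correlation verified")
```

This repulsion is precisely what makes DPP-sampled minibatches more diverse than i.i.d. sampling, which underlies the convergence gains discussed in the talk.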

This talk will be given in person on Northwestern's Evanston campus.

https://planitpurple.northwestern.edu/event/607099