
Winter 2019 Seminar Series

 

Wednesday, January 30, 2019

Score-Matching Representative Approach for Big Data Analysis with Generalized Linear Models

Time: 11:00 a.m.

Speaker: Keren Li, Postdoctoral Fellow, NSF-Simons Center for Quantitative Biology, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: We propose a fast and efficient strategy, called the representative approach, for big data analysis with linear models and generalized linear models. Given a partition of a big dataset, this approach constructs a representative data point for each data block and fits the target model using the representative dataset. In terms of time complexity, it is as fast as the subsampling approaches in the literature. In terms of efficiency, its parameter estimates are more accurate than those of the divide-and-conquer method.

With comprehensive simulation studies and theoretical justifications, we recommend two representative approaches. For linear models or generalized linear models with a flat inverse link function and moderate coefficients of continuous variables, we recommend mean representatives (MR). For other cases, we recommend score-matching representatives (SMR).

In an illustrative application to the Airline on-time performance data, MR and SMR perform as well as the full-data estimate, when the latter is available. Furthermore, the proposed representative strategy is ideal for analyzing massive data dispersed over a network of interconnected computers.
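The mean-representative idea described in the abstract can be illustrated with a minimal sketch for a one-covariate linear model. This is not the speaker's implementation: the partition rule (sorting on the covariate and slicing into equal blocks), the weighting of each representative by its block size, and the function names `mean_representatives` and `weighted_simple_ols` are all assumptions made for illustration.

```python
import random


def mean_representatives(xs, ys, n_blocks):
    """Return one (mean_x, mean_y, block_size) triple per data block.

    Assumption: blocks are formed by sorting on x and slicing into
    contiguous chunks; the abstract only says a partition is given.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    block_size = -(-len(xs) // n_blocks)  # ceiling division
    reps = []
    for start in range(0, len(order), block_size):
        idx = order[start:start + block_size]
        m = len(idx)
        reps.append((sum(xs[i] for i in idx) / m,
                     sum(ys[i] for i in idx) / m,
                     m))
    return reps


def weighted_simple_ols(reps):
    """Fit y = intercept + slope * x by least squares weighted by block size."""
    w_total = sum(w for _, _, w in reps)
    x_bar = sum(w * x for x, _, w in reps) / w_total
    y_bar = sum(w * y for _, y, w in reps) / w_total
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in reps)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in reps)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Simulated data (hypothetical): y = 1.5 + 2.0 * x + standard normal noise.
random.seed(0)
n = 20000
xs = [random.uniform(0.0, 10.0) for _ in range(n)]
ys = [1.5 + 2.0 * x + random.gauss(0.0, 1.0) for x in xs]

reps = mean_representatives(xs, ys, n_blocks=50)
intercept, slope = weighted_simple_ols(reps)
print(len(reps), round(intercept, 3), round(slope, 3))
```

Here the model is fitted to 50 representative points instead of 20,000 observations. Score-matching representatives, which the abstract recommends for other cases, are not covered by this sketch.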

https://planitpurple.northwestern.edu/event/543550

Wednesday, February 6, 2019

TBA 

Time: 11:00 a.m.

Speaker: Lihui Zhao, Associate Professor, Preventive Medicine; Feinberg School of Medicine, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: TBA

https://planitpurple.northwestern.edu/event/545466

Wednesday, February 20, 2019

PANDA: AdaPtive Noisy Data Augmentation for Regularization of Undirected Graphical Models

Time: 11:00 a.m.

Speaker: Fang Liu, Associate Professor, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: We propose PANDA, an AdaPtive Noise Augmentation technique to regularize the estimation and construction of single and multiple undirected graphical models (UGMs). PANDA iteratively computes MLEs on noise-augmented data in the regression-based framework until convergence. The noise can be designed to achieve various regularization effects on graph estimation, such as lasso, group lasso, ridge and elastic net, among others. When PANDA is used for constructing multiple graphs simultaneously, two types of noise are added. The first type regularizes the estimation of each graph, while the second type promotes either structural similarities (joint group lasso) or numerical similarities (joint fused ridge) among the edges in the same position across multiple graphs. We establish theoretically that the noise-augmented loss function and its minimizer converge almost surely to the expected penalized loss function and its minimizer, respectively. We also derive the asymptotic distributions and inferences for the regularized regression coefficients through PANDA in the setting of GLMs. PANDA can be easily programmed in any standard software without resorting to complicated optimization techniques. We apply PANDA to autism spectrum disorder data to construct a mixed-node graph, and to real-life lung cancer microarray data to simultaneously construct four protein networks.
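The general principle that fitting an unpenalized model on noise-augmented data can act as a penalty may be illustrated with a small sketch. This is not the PANDA algorithm itself, which is iterative, adaptive, and targets graphical models. It assumes a one-covariate linear regression without intercept and a ridge-type penalty, for which a single augmentation step suffices. The noise variance `lam / n_noise` and the function names are assumptions chosen so that the expected contribution of the noise rows to the squared-error loss equals `lam * beta**2`.

```python
import random


def ridge_closed_form(xs, ys, lam):
    """Minimizer of sum((y - x*beta)^2) + lam * beta^2 for a single covariate."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def noise_augmented_ols(xs, ys, lam, n_noise, rng):
    """Ordinary least squares after appending n_noise rows (e, 0).

    Each e is drawn from N(0, lam / n_noise), so the noise rows add
    sum(e^2) * beta^2 to the loss, which is close to lam * beta^2
    when n_noise is large.
    """
    sd = (lam / n_noise) ** 0.5
    aug_x = list(xs) + [rng.gauss(0.0, sd) for _ in range(n_noise)]
    aug_y = list(ys) + [0.0] * n_noise
    return sum(x * y for x, y in zip(aug_x, aug_y)) / sum(x * x for x in aug_x)


# Simulated data (hypothetical): y = 3 * x + standard normal noise.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [3.0 * x + rng.gauss(0.0, 1.0) for x in xs]

lam = 50.0
beta_ols = ridge_closed_form(xs, ys, 0.0)
beta_ridge = ridge_closed_form(xs, ys, lam)
beta_aug = noise_augmented_ols(xs, ys, lam, n_noise=100000, rng=rng)
print(round(beta_ols, 3), round(beta_ridge, 3), round(beta_aug, 3))
```

With many noise rows, the augmented least-squares estimate should land close to the closed-form ridge estimate and be shrunk relative to the unpenalized fit. Penalties such as the lasso would require noise whose variance depends on the current coefficient estimate, which is where iteration comes in.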

https://planitpurple.northwestern.edu/event/543551

Wednesday, February 27, 2019

Simultaneous Estimation and Variable Selection for Incomplete Event History Data

Time: 11:00 a.m.

Speaker: Jianguo (Tony) Sun, Professor of Statistics, University of Missouri

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: This talk discusses regression analysis of incomplete event history data, with a focus on simultaneous estimation and variable selection. Such data commonly occur in many areas, such as medical studies and the social sciences, and a large literature exists on their analysis, except for the variable selection problem. To address this, we will present a new method, referred to as the broken adaptive ridge regression approach, and establish its asymptotic properties, including the oracle property and the clustering effect. Numerical studies suggest that the proposed method performs well in practical situations and better than existing methods. An application will be presented.

https://planitpurple.northwestern.edu/event/543552

Wednesday, March 13, 2019

TBA

Time: 11:00 a.m.

Speaker: Rosemary Braun, Assistant Professor of Preventive Medicine (Biostatistics) and McCormick School of Engineering, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: TBA

https://planitpurple.northwestern.edu/event/543553
