
Spring 2019 Seminars

How non-ignorable is the selection bias in non-probability samples? An illustration of new measures using a large genetic study on Facebook

Wednesday, April 17, 2019

Time: 11:00 a.m.

Speaker: Brady West, Research Associate Professor, Survey Methodology Program (SMP), Survey Research Center (SRC), Institute for Social Research (ISR), University of Michigan

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: Survey researchers are currently evaluating the utility of "big data" that are not selected by probability sampling. Existing indices of the degree of departure of non-probability samples from representative probability samples, such as the R-Indicator, are agnostic about the relationship between the inclusion probability and survey outcomes, which is crucial to understanding the risk of selection bias in non-probability samples. We describe simple model-based indices of the degree of departure from ignorable selection for estimates of means, proportions, and regression coefficients that correct this deficiency. We then use simulation studies to evaluate the ability of the proposed indices and other existing indices to detect non-ignorable selection bias. Finally, we apply the proposed indices to data from the Genes for Good project at the University of Michigan, which recruits a non-probability sample of study volunteers via Facebook, using genetic data from the Health and Retirement Study as a population benchmark.

Dissecting transcriptional and translational regulatory circuits in human cancers

Wednesday, April 24, 2019

Time: 11:00 a.m.

Speaker: Zhe Ji, Assistant Professor, Pharmacology, Feinberg School of Medicine; Assistant Professor, Biomedical Engineering, McCormick School of Engineering; Robert H. Lurie Comprehensive Cancer Center, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: With advances in genomics technologies, we can now study the steps of gene expression regulation in a systematic and cost-effective manner. I will present our work developing novel computational tools that use machine learning for integrative analyses of multi-omics data and reveal novel molecular mechanisms controlling gene transcription and RNA translation. We have also applied experimental and computational genomics technologies to decode the key regulatory circuits mediating cancer progression and the tumor microenvironment.

xBART: Accelerated Bayesian Additive Regression Trees

Wednesday, May 1, 2019

Time: 11:00 a.m.

Speaker: P. Richard Hahn, Associate Professor of Statistics, School of Mathematical and Statistical Sciences, Arizona State University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: Bayesian additive regression trees (BART) is a powerful predictive model that often outperforms alternative models at out-of-sample prediction. BART is especially well-suited to settings with unstructured predictor variables and substantial sources of unmeasured variation as is typical in the social, behavioral and health sciences. This paper develops a modified version of BART that is amenable to fast posterior estimation. We present a stochastic hill climbing algorithm that matches the remarkable predictive accuracy of previous BART implementations, but is many times faster and less memory intensive. Simulation studies show that the new method is comparable in computation time and more accurate at function estimation than both random forests and gradient boosting.


Bayesian Variable Selection for Integrative Genome-Wide Association Analysis

Wednesday, May 15, 2019

Time: 11:00 a.m.

Speaker: Min Zhang, Professor of Statistics, Department of Statistics, Purdue University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: A variable selection framework is proposed to integrate pathway information for genome-wide association analysis. Unlike other methods that rely on computation-intensive Markov chain Monte Carlo algorithms, we propose an iterated conditional modes/medians algorithm to implement empirical Bayes variable selection. Iterated conditional modes are first used to optimize the values of the hyper-parameters and thereby implement the empirical Bayes method; iterated conditional medians are then used to estimate the model parameters and thereby perform the variable selection. In addition to the advantages of Bayesian inference, the proposed method enjoys efficient computation, increased statistical power, and improved estimation of the model parameters. Simulation studies show the superior performance of the proposed approach, and the method has been applied to real data from genome-wide association studies.


A Novel Application of the Simes Test in Group Sequential Setting

Wednesday, May 22, 2019

Time: 11:00 a.m.

Speaker: Ajit C. Tamhane, Professor of IEMS and Statistics (by courtesy), Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: This talk will be in two parts. The first part will be pedagogic: I will discuss some basic notions and procedures from multiple testing and group sequential testing. The second part will be about an interesting research idea that did not quite pan out. In this part I will discuss an out-of-the-box application of the Simes (1986) test, which plays an important role in multiple testing, to the group sequential setting.

The Simes test is designed to test a single null hypothesis, formed by taking the intersection of multiple null hypotheses, using their p-values. It can be extended to test the component null hypotheses with familywise error rate control by applying the closure method of Marcus, Peritz and Gabriel (1976). In a group sequential setting there is also a single null hypothesis, which can be tested using the Simes test based on the p-values from the sequential looks. However, this application turns out to be less powerful than the classical group sequential tests of Pocock (1977) and O’Brien and Fleming (1979). The reason is that the rejection decision of the Simes test is not necessarily based on sufficient statistics.

Note: This work is joint with Dr. Jiangtao Gou and Dr. Alex Dmitrienko.
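
For readers unfamiliar with the Simes test mentioned in this abstract, the following is a minimal sketch of its standard rejection rule: with n p-values ordered as p_(1) <= ... <= p_(n), the intersection null is rejected at level alpha if p_(i) <= i * alpha / n for at least one i. The function name `simes_test` and the example p-values are illustrative assumptions; this is not the speaker's code, and it does not implement the group sequential application discussed in the talk.

```python
from typing import Sequence


def simes_test(p_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Return True if the Simes test rejects the intersection null at level alpha.

    Hypothetical helper for illustration only.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    # Reject if any ordered p-value p_(i) is at most i * alpha / n.
    return any(p <= (i * alpha) / n for i, p in enumerate(ordered, start=1))


if __name__ == "__main__":
    # Made-up p-values, e.g. from three looks at accumulating data.
    print(simes_test([0.03, 0.03, 0.90]))
    print(simes_test([0.04, 0.03, 0.20]))
```

In the first made-up example the smallest p-value (0.03) exceeds the Bonferroni threshold alpha / n, yet the second ordered p-value meets its threshold 2 * alpha / n, so the Simes test rejects where Bonferroni would not.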


Nonparametric Regression Models of Multilevel Treatment Effect Moderation

Wednesday, May 29, 2019

Time: 11:00 a.m.

Speaker: Jared Murray, Assistant Professor of Statistics, University of Texas at Austin, Department of Information, Risk, and Operations Management and Department of Statistics and Data Science

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: Bayesian nonparametric approaches to causal inference have recently become popular. However, current approaches fail to address three important features in applications: accounting for multilevel structure, allowing for targeted regularization, and providing interpretable summaries of scientifically meaningful quantities. We extend recently proposed BART-based methods to include all of these features. A key component of this model is a parameterization that allows treatment heterogeneity to be regularized separately from prognostic effects, and also parsimoniously incorporates multilevel structure. In an application to the National Study of Learning Mindsets (Yeager et al., 2017) we use these new tools to provide meaningful insights about effect modification at the school and individual level. Our posterior summarization strategy avoids pitfalls common to existing approaches relying on post-hoc data snooping or large collections of hypothesis tests.

