Fall 2016 Seminar Series

Wednesday, October 12, 2016

Discrete Optimization via Simulation using Gaussian Markov Random Fields

Time: 11:15 a.m.

Speaker: Professor Barry L. Nelson, Department of Industrial Engineering & Management Sciences, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: The problem is maximizing or minimizing the expected value of a stochastic performance measure that can be observed by running a dynamic, discrete-event simulation when the feasible solutions are defined by integer decision variables. Inventory sizing, call center staffing and manufacturing system design are common applications. Standard approaches are ranking and selection, which takes no advantage of the relationship among solutions, and adaptive random search, which exploits it but in a heuristic way (“good solutions tend to be clustered”). Instead, we construct an optimization procedure built on modeling the relationship as a discrete Gaussian Markov random field (GMRF). This enables computation of the expected improvement (EI) that could be obtained by running the simulation for any feasible solution, whether actually simulated or not. The computation of EI can be numerically challenging, in general, but the GMRF representation greatly reduces the burden by facilitating the use of sparse matrix methods. By employing a multiresolution GMRF, problems with millions of feasible solutions can be solved.

Wednesday, October 26, 2016

Designing randomized trials for making generalizations to policy-relevant populations

Time: 11:00 a.m.

Speaker: Elizabeth Tipton, Assistant Professor of Applied Statistics, Department of Human Development, Teachers College, Columbia University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: Randomized trials are common in education, the social sciences, and medicine. While random assignment to treatment ensures that the average treatment effect estimated is causal, studies are typically conducted on samples of convenience, making generalizations of this causal effect outside the sample difficult. This talk provides an overview of new methods for improving generalizations through improved research design. These methods include defining an appropriate inference population, developing a sampling plan and recruitment strategies, and taking into account planned analyses for treatment effect heterogeneity. The talk will also briefly introduce a new webtool useful for those planning randomized trials in education research.

Wednesday, November 2, 2016

Combined Hypothesis Testing on Graphs with Applications to Gene Set Enrichment Analysis

Time: 11:00 a.m.

Speaker: Professor Ming Yuan, Department of Statistics, University of Wisconsin-Madison, Medical Sciences Center

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: Motivated by gene set enrichment analysis, we investigate the problem of combined hypothesis testing on a graph. We introduce a general framework to effectively use the structural information of the underlying graph when testing multivariate means. A new testing procedure is proposed within this framework. We show that the test is optimal in that it can consistently detect departure from the collective null at a rate that no other test could improve, for almost all graphs. We also provide general performance bounds for the proposed test under any specific graph, and illustrate their utility through several common types of graphs.

Wednesday, November 9, 2016

Quantifying Nuisance Parameter Effects in Likelihood and Bayesian Inference

Time: 11:00 a.m.

Speaker: Todd Kuffner, Assistant Professor, Department of Mathematics, Washington University in St. Louis

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: In the age of computer-aided statistical inference, the practitioner has at her disposal an arsenal of computational tools to perform inference for low-dimensional parameters of interest, where the elimination of nuisance parameters can be accomplished by optimization, numerical integration, or some other computational sorcery. An operational assumption is that black-box tools can usually vaccinate the final inference for the interest parameter against potential effects arising from the presence of nuisance parameters. At the same time, from a theoretical, analytic perspective, accurate inference on a scalar interest parameter in the presence of nuisance parameters may be obtained by asymptotic refinement of likelihood-based statistics. Among these are Barndorff-Nielsen’s adjustment to the signed root likelihood ratio statistic, the Bartlett correction, and Cornish-Fisher transformations. We show how these adjustments may be decomposed into two terms, the first taking the same value when there are no nuisance parameters or there is an orthogonal nuisance parameter, and the second term being zero when there are no nuisance parameters. Illustrations are given for a number of examples which provide insight into the effect of nuisance parameters on parametric inference for parameters of interest. Connections and extensions for Bayesian inference are also discussed, and some open foundational questions are posed regarding the role of nuisance parameters in Bayesian inference, with some emphasis on possible effects in computational procedures for inference. Time permitting, I will explore potential links with recent work on post-selection inference.


Wednesday, November 30, 2016

Using Machine Learning to Predict Laboratory Test Results

Time: 11:00 a.m.

Speaker: Yuan Luo, Assistant Professor of Preventive Medicine (Health and Biomedical Informatics), Department of Industrial Engineering and Management Sciences, Northwestern University

Place: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road

Abstract: While clinical laboratories report most test results as individual numbers, findings or observations, clinical diagnosis usually relies on the results of multiple tests. Clinical decision support that integrates multiple elements of laboratory data could be highly useful in enhancing laboratory diagnosis. Using the analyte ferritin as a proof of concept, we extracted clinical laboratory data from patient testing and applied a variety of machine learning algorithms to predict ferritin test results from the results of other tests. We compared predicted to measured results and reviewed selected cases to assess the clinical value of predicted ferritin. We show that patient demographics and results of other laboratory tests can discriminate normal from abnormal ferritin results with a high degree of accuracy (AUC as high as 0.97 on held-out test data). Case review indicated that predicted ferritin results may sometimes better reflect underlying iron status than measured ferritin. Our next step is to integrate temporality into the prediction of multivariate analytes. We devise an algorithm that alternates between multiple-imputation-based cross-sectional prediction and stochastic-process-based autoregressive prediction. We show a modest performance improvement for the combined algorithm over either component alone. These findings highlight the substantial informational redundancy present in patient test results and offer a potential foundation for a novel type of clinical decision support aimed at integrating, interpreting and enhancing the diagnostic value of multi-analyte sets of clinical laboratory test results.