Date/Room |
Person |
Title |
Abstract |
|
Elizabeth Sweeney |
The Structure of Structural Magnetic Resonance Imaging Data and Applications in Multiple Sclerosis |
Structural magnetic resonance imaging (MRI) of the brain and other parts of the body provides detailed and accurate anatomical information. Much work has been done on the analysis of statistics and summary metrics derived from these images, but until recently statisticians and biostatisticians have played a limited role in working directly with this data. To familiarize you with this exciting data, I will provide an introduction to structural MRI, including visualization and resources for learning to work with this data in R. I will also discuss two algorithms we developed for detecting and segmenting lesions in the brains of patients with multiple sclerosis in structural MRI: Subtraction-Based Logistic Inference for Modeling and Estimation (SuBLIME) and OASIS is Automated Statistical Inference for Segmentation (OASIS). |
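As a minimal sketch of the kind of R workflow the talk introduces, assuming the oro.nifti package is installed (the file name below is hypothetical):

    # Read and visualize a structural MRI volume in R.
    # "T1.nii.gz" is a hypothetical local NIfTI file.
    library(oro.nifti)

    img <- readNIfTI("T1.nii.gz", reorient = FALSE)  # load the 3D volume
    print(dim(img))                                  # voxel grid dimensions
    orthographic(img)                                # axial/sagittal/coronal views
    hist(img[img > 0], breaks = 100,
         main = "Nonzero voxel intensities")         # intensity distribution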
|
Leonardo Collado Torres |
Implementation of fast, annotation-agnostic differential expression analysis across groups with biological replicates |
Since the development of high-throughput technologies for probing the genome, we have been interested in finding differences across groups that could potentially explain the phenotypic differences we observe. In other words, we want methods for hypothesis generation at a large scale, where we try our best to remove artifacts. The traditional tools have focused on the transcriptome and are highly dependent on existing annotation. Alyssa Frazee et al. developed a statistical framework to find candidate Differentially Expressed Regions (DERs), which I have attempted to make faster. I will introduce the problem we are trying to solve and show examples of https://github.com/lcolladotor/derfinder applied to various data sets. |
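A rough sketch of what a derfinder run can look like, based on the package's expressed-regions workflow; the coverage file names are hypothetical and the package vignette is the authoritative reference:

    # Candidate DERs from base-level coverage with derfinder
    # (https://github.com/lcolladotor/derfinder). Hypothetical BigWig inputs.
    library(derfinder)

    files <- c(sample1 = "sample1.bw", sample2 = "sample2.bw")  # coverage files
    fullCov <- fullCoverage(files = files, chrs = "chr21")      # base-level coverage
    regs <- regionMatrix(fullCov, cutoff = 5, L = 76)           # L = read length
    head(regs$chr21$regions)                                    # candidate DERs (GRanges)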
|
Aaron Fisher |
Intro to Adaptive Trials, and the EAGLE Visualization Tool |
This talk will be an introduction to how adaptive trials work, and the basic theoretical framework behind them. We'll be glossing over the more complicated details, so if you know what a multivariate normal distribution is, and you know what a null distribution is, then it should be a breeze. The specific application we'll be talking about is a trial design where we start by enrolling everyone, and then decide who to continue enrolling based on results from the currently enrolled patients. For example, if one subpopulation doesn't appear to be benefiting, we might stop enrolling them towards the end of the trial. This summer I worked on a Shiny App with Harris Jaffee and Michael Rosenblum, which helps users explore trial designs with this kind of adaptive enrollment. If there's time, I'll show some of the app's bells & whistles. |
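A toy R simulation, entirely made up and not the EAGLE app itself, illustrating the flavor of such an interim enrollment decision:

    # After an interim look, stop enrolling a subpopulation whose estimated
    # treatment effect looks futile. All numbers here are invented.
    set.seed(1)
    interim_n <- 100                      # patients per subpopulation at interim
    effect    <- c(pop1 = 0.5, pop2 = 0)  # true treatment effects
    futility_cutoff <- 0.2                # stop a subpopulation below this estimate

    for (pop in names(effect)) {
      treated <- rnorm(interim_n, mean = effect[pop])
      control <- rnorm(interim_n, mean = 0)
      est <- mean(treated) - mean(control)
      decision <- if (est < futility_cutoff) "stop enrolling" else "continue enrolling"
      cat(sprintf("%s: interim estimate %.2f -> %s\n", pop, est, decision))
    }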
|
Chen Yue |
Principal curves and surfaces |
The concept of principal curves and surfaces was proposed in Trevor Hastie's PhD thesis in 1984 and later published in JASA (1989). The idea is to find a lower-dimensional manifold embedded in a higher-dimensional space. In this presentation, I will share my limited knowledge of the intuition, algorithm, and applications of this interesting concept. I will also show some results on how the original algorithm has been improved. In addition, I will introduce several related concepts such as ISOMAP (published in Science), a current algorithm that helps find low-dimensional manifolds. |
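A minimal sketch of fitting a principal curve in R with the princurve package, which implements the original Hastie-Stuetzle algorithm (not the improvements discussed in the talk):

    # Fit a principal curve to noisy 2-D data lying near a parabola.
    library(princurve)

    t <- runif(200, -1, 1)
    x <- cbind(t, t^2) + matrix(rnorm(400, sd = 0.05), ncol = 2)
    fit <- principal_curve(x)

    plot(x, col = "grey")
    lines(fit)                 # the fitted curve
    segments(x[, 1], x[, 2],   # projections of each point onto the curve
             fit$s[, 1], fit$s[, 2])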
|
Amanda Mejia |
A Layered Grammar of Graphics |
A grammar of graphics is a tool that enables us to concisely describe the components of a graphic. Such a grammar allows us to move beyond named graphics (e.g., the “scatterplot”) and gain insight into the deep structure that underlies statistical graphics. This article builds on Wilkinson, Anand, and Grossman (2005), describing extensions and refinements developed while building an open source implementation of the grammar of graphics for R, ggplot2. The topics in this article include an introduction to the grammar by working through the process of creating a plot, and discussing the components that we need. The grammar is then presented formally and compared to Wilkinson’s grammar, highlighting the hierarchy of defaults, and the implications of embedding a graphical grammar into a programming language. The power of the grammar is illustrated with a selection of examples that explore different components and their interactions, in more detail. The article concludes by discussing some perceptual issues, and thinking about how we can build on the grammar to learn how to create graphical “poems.” |
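For instance, the layered specification of a familiar “scatterplot with a trend line” in ggplot2 looks like this (using the mpg data set shipped with the package):

    # The same plot expressed as data + aesthetic mappings + layers + facets,
    # rather than as a single named chart type.
    library(ggplot2)

    ggplot(mpg, aes(x = displ, y = hwy)) +   # data and aesthetic mappings
      geom_point(aes(colour = class)) +      # a point layer
      geom_smooth(method = "loess") +        # a statistical summary layer
      facet_wrap(~ drv) +                    # faceting
      labs(x = "Displacement (L)", y = "Highway MPG")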
|
Ivan Diaz |
Reaping the Computational Benefits of Targeted Maximum Likelihood Estimation using Exponential Families |
Targeted maximum likelihood estimation (TMLE) is a general template for constructing estimators of parameters in semi- and nonparametric models. A crucial step in the implementation of TML estimators in a nonparametric model is the proposal of a parametric submodel for the relevant components of the likelihood. In the context of causal inference, where TMLE has been most studied, TML estimators can often be implemented by running standard regression software on carefully constructed auxiliary variables, usually referred to as “clever covariates”. In this paper we examine targeted maximum likelihood estimation in a more general setting, exploring the use of an exponential family to define the parametric submodel. We illustrate the method in four examples involving estimation of the mean of an outcome missing at random, median regression, variable importance, and the causal effect of a continuous exposure. We take advantage of the fact that estimation of a parameter in an exponential family is a convex optimization problem, a well-developed area for which software implementing reliable and computationally efficient methods exists. This implementation of TMLE provides a completely general framework in which TML estimators can be computed for any parameter that can be defined in the nonparametric model. |
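As a sketch in standard exponential-tilting notation (the paper's exact parametrization may differ), the exponential-family submodel through an initial density estimate p_0 with clever covariate H is:

    % Exponential tilting submodel with normalizing constant c(\epsilon):
    p_\epsilon(o) = c(\epsilon)\, p_0(o)\, \exp\{\epsilon^\top H(o)\},
    \qquad
    c(\epsilon)^{-1} = \int p_0(o)\, \exp\{\epsilon^\top H(o)\}\, do.

Because the log-likelihood of an exponential family is concave in the natural parameter, the update step for epsilon is a convex optimization problem, which is the computational benefit the title refers to.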
|
Huitong Qiu |
Robust covariance estimation with application in portfolio optimization |
In portfolio optimization, estimating the covariance matrix (or the scatter matrix, which is a matrix proportional to the covariance matrix) of stock returns is the key step. In this paper, we propose a new robust portfolio optimization strategy by resorting to a quantile-based scatter matrix estimator. Computationally, the proposed robust portfolio optimization method is as efficient as its Gaussian-based alternative. Theoretically, by exploiting the quantile-based statistics, we show that the actual portfolio risk approximates the oracle risk at a parametric rate even under very heavy-tailed distributions and stationary time series with weak dependence. The rate of convergence is set in a double asymptotic framework where the portfolio size may scale exponentially with the sample size. The empirical effectiveness of the proposed estimator is demonstrated on both synthetic and real data. The experiments show that the proposed method can significantly stabilize portfolio risk under highly volatile stock returns and effectively avoid extreme losses. |
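To illustrate the general idea of quantile-based scatter estimation (this is a classical Gnanadesikan-Kettenring-style construction built on the robust scale estimator robustbase::Qn, not the estimator proposed in the paper):

    # Robust scatter matrix from a quantile-based scale estimate, using the
    # polarization identity cov(X,Y) = (var(X+Y) - var(X-Y)) / 4.
    library(robustbase)

    robust_scatter <- function(X) {
      p <- ncol(X)
      S <- matrix(0, p, p)
      for (j in seq_len(p)) {
        for (k in seq_len(p)) {
          if (j == k) {
            S[j, j] <- Qn(X[, j])^2
          } else {
            S[j, k] <- (Qn(X[, j] + X[, k])^2 - Qn(X[, j] - X[, k])^2) / 4
          }
        }
      }
      S
    }

    X <- matrix(rt(500 * 3, df = 3), ncol = 3)  # heavy-tailed "returns"
    robust_scatter(X)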
|
John Muschelli |
Statistical Modeling: The Two Cultures |
The paper, Leo Breiman's “Statistical Modeling: The Two Cultures” (Statistical Science, 2001), is available at: http://projecteuclid.org/euclid.ss/1009213726 |
|
Parichoy Pal Choudhury |
Mendelian Randomization: A Review from a Causal Inference Perspective |
In epidemiology it is often of interest to study the causal effect of a modifiable phenotype on the risk of a disease. Though randomized controlled trials are considered “gold standard” for such questions, they are often not ethical or practical to conduct. Moreover, it is difficult to draw causal inference from observational data due to the problems of confounding and reverse causation. One statistical approach to deal with unmeasured confounding is through the use of instrumental variables. When genes are considered as instrumental variables, the method is called “Mendelian randomization”. In this talk, I review some of the statistical methods that exploit the idea of Mendelian randomization. In particular, when a gene satisfies the core conditions that define an instrumental variable, the average causal effect of the phenotype on the outcome is not identified, but it is possible to derive bounds for this parameter. I discuss additional parametric restrictions and other assumptions needed to identify causal parameters of interest. I also highlight some of the challenging open research questions in this area. |
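For context, under additional linearity assumptions (textbook instrumental-variable notation, not necessarily the parametrization used in the talk), the causal effect beta of phenotype X on outcome Y with gene G as a binary instrument is identified by the Wald ratio:

    % Wald (ratio) estimand with a binary instrument G, assuming linear
    % structural models and no effect modification:
    \beta = \frac{\operatorname{Cov}(Y, G)}{\operatorname{Cov}(X, G)}
          = \frac{E[Y \mid G = 1] - E[Y \mid G = 0]}{E[X \mid G = 1] - E[X \mid G = 0]}.

Without such parametric restrictions the effect is not point identified, and only bounds (e.g., of the Balke-Pearl type) are available, as the abstract notes.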
|
Lei Huang |
Bayesian scalar-on-image regression with application to association between intracranial DTI and cognitive outcomes |
Diffusion tensor imaging (DTI) measures water diffusion within white matter, allowing for in vivo quantification of brain pathways. These pathways often subserve specific functions, and impairment of those functions is often associated with imaging abnormalities. As a method for predicting clinical disability from DTI images, we propose a hierarchical Bayesian “scalar-on-image” regression procedure. Our procedure introduces a latent binary map that estimates the locations of predictive voxels and penalizes the magnitude of effect sizes in these voxels, thereby resolving the ill-posed nature of the problem. By inducing a spatial prior structure, the procedure yields a sparse association map that also maintains spatial continuity of predictive regions. The method is demonstrated on a simulation study and on a study of association between fractional anisotropy and cognitive disability in a cross-sectional sample of 135 multiple sclerosis patients. |
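A schematic version of such a model, with notation of my own rather than the paper's:

    % y_i: clinical outcome for subject i; x_{iv}: image value at voxel v;
    % gamma_v in {0,1}: latent indicator that voxel v is predictive.
    y_i = \alpha + \sum_{v} x_{iv}\, \gamma_v\, \beta_v + \varepsilon_i,
    \qquad \varepsilon_i \sim N(0, \sigma^2),

with a spatial (Ising-type) prior on the binary map gamma encouraging contiguous predictive regions, and shrinkage priors on the effect sizes beta.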