THESIS DEFENSE ABSTRACT
Statistical Methods for Failure Time Data with Biased Sampling and Measurement Errors
Substantial methodological and applied research has been dedicated in recent years to survival analysis with covariates subject to measurement error. Current statistical approaches for Cox regression with covariates measured with error focus only on linear log-hazard functions. We propose, develop, and implement a fully Bayesian inferential approach for the Cox model when the log-hazard function contains unknown smooth functions of the variables measured with error. Our approach is to model nonparametrically both the log-baseline hazard and the smooth components of the log-hazard function using low-rank penalized splines. The likelihood of the Cox model is coupled with the likelihood of the measurement error process. Careful implementation of the Bayesian inferential machinery is shown to produce remarkably better results than the naive approach. Our methodology was motivated by the study of progression time to chronic kidney disease (CKD) as a function of baseline kidney function, and was applied to the Atherosclerosis Risk in Communities (ARIC) study, a large epidemiological cohort study.
In many epidemiological and cancer survey studies, units are sampled with probability proportional to some function of their values, leading to what are called biased samples, which have been studied over the last two decades. In such studies, we focus on estimating the marginal causal treatment effect. However, due to systematic differences between the treated and untreated groups with respect to various covariates, direct comparisons of observed outcomes from the two groups are not appropriate. We make inference on the marginal causal survival function and the propensity score under biased sampling. The problem is especially complex because the outcomes (“potential outcomes”), as well as the covariates, are only partially observed. The missingness comes from two different sources: one is due to the hypothetical potential outcome framework; the other is due to the prevalent sampling scheme.
Making causal inference without adjusting for both sources of missingness will lead to biased results. We propose an inverse weighting approach to estimate the marginal causal survival function and develop a method to correct the propensity score. Furthermore, we provide a doubly robust estimator that is asymptotically unbiased if either the underlying propensity score model or the underlying regression function is correctly specified. Our methodology was motivated by and applied to Surveillance, Epidemiology, and End Results (SEER)-Medicare data for women diagnosed with breast cancer.
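The double robustness property mentioned above can be illustrated with a minimal sketch. This is not the estimator developed in the thesis: it is the standard augmented inverse-probability-weighted (AIPW) form for a survival probability under treatment, and it assumes no censoring and no biased sampling. The function name `aipw_survival` and all of its arguments are hypothetical, chosen only for illustration; the propensity scores and outcome-model predictions are assumed to come from models fitted elsewhere.

```python
def aipw_survival(t, times, treated, propensity, outcome_pred):
    """Simplified AIPW estimate of P(T(1) > t), the survival probability
    at time t if everyone were treated.

    Assumptions (not from the abstract): fully observed failure times,
    simple random sampling, and externally supplied model outputs.

    times        : observed failure times
    treated      : 1 if the unit was treated, 0 otherwise
    propensity   : estimated P(treated = 1 | covariates) for each unit
    outcome_pred : estimated P(T > t | covariates, treated = 1) for each unit
    """
    n = len(times)
    total = 0.0
    for time_i, a_i, e_i, m_i in zip(times, treated, propensity, outcome_pred):
        y_i = 1.0 if time_i > t else 0.0       # survival indicator at time t
        ipw_term = a_i * y_i / e_i             # inverse-probability-weighted outcome
        augmentation = (a_i - e_i) / e_i * m_i # correction using the outcome model
        total += ipw_term - augmentation
    return total / n


# Toy data: two treated and two untreated units, true propensity 0.5.
times = [1.0, 2.0, 3.0, 4.0]
treated = [1, 1, 0, 0]
propensity = [0.5, 0.5, 0.5, 0.5]

# A deliberately wrong outcome model (always predicts 0.3).
estimate = aipw_survival(1.5, times, treated, propensity, [0.3] * 4)
```

In this toy example the propensity scores are correct, so replacing the outcome predictions with another constant (for instance all zeros) leaves the estimate unchanged at 0.5. That mirrors, in a very small way, the idea that one of the two working models can be misspecified without biasing the estimator.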