Logistic regression has been the standard method of analysis for case-control studies in epidemiology. The classical justifications (Prentice and Pyke, 1979, Biometrika) for the efficiency of odds-ratio estimates from such a prospective analysis, despite the retrospective case-control sampling of subjects, depend on a nonparametric treatment of the distribution of the covariates in the underlying population. In genetic epidemiologic studies, however, parametric or semiparametric assumptions, such as Hardy-Weinberg equilibrium (HWE) and independence of genetic and environmental factors, can lead to alternative methods for the analysis of case-control studies with major efficiency advantages over the standard analysis. Unlike logistic regression, however, these alternative methods may produce biased inference when the underlying distributional assumptions are violated. In this talk, I will describe a novel Empirical-Bayes-type estimation methodology that can “shrink” the case-control analysis towards the natural assumptions of HWE and gene-environment independence for the underlying population, but only to the extent that the data support those assumptions. Both simulation studies and real data examples will be used to illustrate the trade-off between bias and efficiency achieved by the proposed method. Theoretical issues will be discussed regarding how such “shrinkage” estimators can achieve efficiency beyond known semiparametric efficiency bounds. I conclude that the novel methods could potentially be very useful for the discovery and characterization of genetic associations and gene-environment interactions from case-control studies.
This talk presents joint work with Dr. Raymond Carroll, Dr. Yi-Hau Chen, Dr. Sheng Luo and Dr. Bhramar Mukherjee.
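To make the idea of data-adaptive shrinkage concrete, the following is a minimal sketch for the simplest possible setting: a binary genetic factor G, a binary exposure E, and a multiplicative interaction on the odds-ratio scale. It is an illustration under stated assumptions, not the methodology of the talk. The function names, the 2x2 table layout, and the particular weight theta^2 / (theta^2 + var_cc) are assumptions chosen for this example; the estimators discussed in the talk operate in a more general likelihood framework and also involve HWE.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and Woolf variance for a 2x2 table of G by E counts.

    Assumed cell order: (G=1,E=1), (G=1,E=0), (G=0,E=1), (G=0,E=0).
    """
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance


def shrinkage_interaction(case_table, control_table):
    """Illustrative shrinkage estimate of a G-E interaction log odds ratio.

    Hypothetical helper for this sketch only.
    """
    # Case-only estimate: the G-E log odds ratio among cases. It is efficient,
    # but it is biased when G and E are associated in the population.
    beta_co, var_co = log_odds_ratio(*case_table)

    # G-E log odds ratio among controls: a measure of how far the data depart
    # from gene-environment independence (assuming a rare disease).
    theta, var_theta = log_odds_ratio(*control_table)

    # Standard case-control estimate: robust but less efficient.
    beta_cc = beta_co - theta
    var_cc = var_co + var_theta

    # Assumed weight: close to 0 when the data are consistent with
    # independence, close to 1 when independence is clearly violated.
    weight = theta ** 2 / (theta ** 2 + var_cc)

    beta_shrunk = weight * beta_cc + (1 - weight) * beta_co
    return {
        "case_only": beta_co,
        "case_control": beta_cc,
        "weight_on_case_control": weight,
        "shrunk": beta_shrunk,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    cases = (40, 60, 20, 80)
    controls = (30, 70, 20, 80)
    for name, value in shrinkage_interaction(cases, controls).items():
        print(f"{name}: {value:.3f}")
```

In this sketch the shrunk estimate always lies between the case-only and the case-control estimates, and it coincides with the case-only estimate when the controls show no G-E association at all.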