Publications Search - Abstract View

Title: Testing logistic regression coefficients with clustered data and few positive outcomes.
Authors: Hunsberger S, Graubard BI, Korn EL
Journal: Stat Med
Date: 2008 Apr 15
Branches: BB
PubMed ID: 17705348
PMC ID: not available
Abstract: Applications frequently involve logistic regression analysis with clustered data where there are few positive outcomes in some of the independent variable categories. For example, an application is given here that analyzes the association of asthma with various demographic variables and risk factors using data from the third National Health and Nutrition Examination Survey, a weighted multistage cluster sample. Although there are 742 asthma cases in all (out of 18,395 individuals), for one of the categories of one of the independent variables there are only 25 asthma cases (out of 695 individuals). Generalized Wald and score hypothesis tests, which use appropriate cluster-level variance estimators, and a bootstrap hypothesis test have been proposed for testing logistic regression coefficients with cluster samples. When there are few positive outcomes, simulations presented in this paper show that these tests can sometimes have either inflated or very conservative levels. A simulation-based method is proposed for testing logistic regression coefficients with cluster samples when there are few positive outcomes. This testing methodology is shown to compare favorably with the generalized Wald and score tests and the bootstrap hypothesis test in terms of maintaining nominal levels. The proposed method is also useful when testing goodness-of-fit of logistic regression models using deciles-of-risk tables.
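
Illustrative sketch (not from the paper): to make concrete what a Wald test with a cluster-level variance estimator involves, the Python below fits a two-parameter logistic model by Newton-Raphson and tests the slope using a cluster-robust sandwich variance. It is a simplified stand-in for the generalized Wald test the abstract mentions, and it is not the simulation-based method the authors propose. Everything in it is an assumption made for illustration: the function name cluster_wald_test, the simulated data (40 clusters of 50 individuals, a rare outcome, a true slope of zero), the absence of sampling weights and small-sample corrections, and the chi-square reference distribution with 1 degree of freedom.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cluster_wald_test(x, y, cluster, iters=30):
    """Wald test of b1 in logit P(y=1) = b0 + b1*x with a cluster-robust variance.

    Hypothetical helper: unweighted, no small-sample correction.
    """
    b0 = b1 = 0.0
    for _ in range(iters):  # Newton-Raphson on the ordinary log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # first row of H^{-1} g
        b1 += (h00 * g1 - h01 * g0) / det  # second row of H^{-1} g

    # Information matrix and per-cluster score totals at the final estimate.
    h00 = h01 = h11 = 0.0
    scores = {}
    for xi, yi, ci in zip(x, y, cluster):
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
        s = scores.setdefault(ci, [0.0, 0.0])
        s[0] += yi - p
        s[1] += (yi - p) * xi
    det = h00 * h11 - h01 * h01
    a10, a11 = -h01 / det, h00 / det  # second row of H^{-1}

    # Sandwich variance of b1: sum over clusters of (row2(H^{-1}) . s_c)^2.
    var_b1 = sum((a10 * s0 + a11 * s1) ** 2 for s0, s1 in scores.values())
    wald = b1 * b1 / var_b1
    p_value = math.erfc(math.sqrt(wald / 2.0))  # upper tail of chi-square, 1 df
    return b1, var_b1, wald, p_value


# Simulated clustered data (assumed values, not NHANES III).
random.seed(0)
x, y, cluster = [], [], []
for c in range(40):
    u = random.gauss(0.0, 0.5)  # shared cluster effect induces correlation
    for _ in range(50):
        x.append(1 if random.random() < 0.2 else 0)
        y.append(1 if random.random() < sigmoid(-3.0 + u) else 0)
        cluster.append(c)

b1, var_b1, wald, p_value = cluster_wald_test(x, y, cluster)
print(f"b1={b1:.3f}  robust var={var_b1:.4f}  Wald={wald:.3f}  p={p_value:.3f}")
```

Repeating the simulation over many seeds and recording how often p falls below 0.05 gives an empirical level for this simplified test, which is the kind of check the abstract describes when it reports inflated or conservative levels with few positive outcomes.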