Title: The use of the risk percentile curve in the analysis of epidemiologic data
Authors: Chatterjee N,  Graubard BI,  Gastwirth JL
Journal: Statistics and Its Interface
Date: 2009
Branches: BB
PubMed ID:
PMC ID: not available
Abstract: Economists and social scientists have used percentile-based curves, e.g., the Lorenz curve, to summarize data from positive random variables, especially skewed data such as income. Measures of interest, e.g., the Gini index of relative inequality, correspond to areas defined by the curves. In this paper we explore the usefulness of risk-percentile and related curves in epidemiology, especially when the exposure data is skewed. These curves are defined and risk measures, e.g., the population attributable risk, are related to areas under them for data from either a cohort or a case-control study. Regression spline methods of estimating these curves are used as they do not require a pre-specified risk model. The concepts are illustrated by analyzing data from a cohort study of dietary red meat consumption and all-cause mortality and a case-control study of serum homocysteine level and colorectal cancer. These examples show that the risk-percentile curves often are more useful than presenting the risk as a function of the raw exposure data, as the latter graph is often dominated by the tails when the data is skewed. Furthermore, the risk-percentile curve is more informative than the commonly used method of presenting the average risk in categories defined by several fixed percentiles such as quartiles or quintiles. Indeed, the risk averages for these categories can be obtained from the risk-percentile curve.
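
The closing point of the abstract, that category averages can be recovered from the risk-percentile curve, can be illustrated with a minimal sketch. This is not the paper's method: the abstract describes regression spline estimation, whereas the sketch below swaps in a crude binned estimate (event proportion within each percentile of exposure). The simulated cohort, the logistic risk model, and the function names `risk_percentile_curve` and `category_averages` are all assumptions made for illustration only.

```python
import math
import random


def risk_percentile_curve(exposure, outcome, n_bins=100):
    """Crude binned estimate of risk as a function of exposure percentile.

    Subjects are ranked by exposure and split into n_bins equal-size groups;
    the observed event proportion in each group is the curve value.
    Assumes len(exposure) is a multiple of n_bins.
    """
    order = sorted(range(len(exposure)), key=lambda i: exposure[i])
    size = len(order) // n_bins
    curve = []
    for b in range(n_bins):
        idx = order[b * size:(b + 1) * size]
        curve.append(sum(outcome[i] for i in idx) / size)
    return curve


def category_averages(curve, n_categories=5):
    """Average the curve over equal-width percentile ranges (e.g. quintiles)."""
    width = len(curve) // n_categories
    return [sum(curve[k * width:(k + 1) * width]) / width
            for k in range(n_categories)]


# Hypothetical simulated cohort: a log-normal (right-skewed) exposure and a
# logistic risk model, both chosen only for illustration.
rng = random.Random(0)
n = 10_000
exposure = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
risk = [1.0 / (1.0 + math.exp(-(-3.0 + 0.3 * x))) for x in exposure]
outcome = [1 if rng.random() < r else 0 for r in risk]

curve = risk_percentile_curve(exposure, outcome)
quintile_risk = category_averages(curve, 5)

# Mean height of the curve, i.e. the area under it on the percentile scale
# [0, 1]; with equal-size bins this equals the overall event proportion.
overall_risk = sum(curve) / len(curve)
```

Because the percentile bins have equal size, averaging 20 consecutive curve values reproduces the event proportion within the corresponding quintile exactly, and averaging all 100 values reproduces the overall event proportion. A smoothed estimate, such as the regression splines mentioned in the abstract, would satisfy these identities only approximately.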