Feature

Statistical Methods for Cohort Studies of CKD: Prediction Modeling

Jason Roy, Haochang Shou, Dawei Xie, Jesse Y. Hsu, Wei Yang, Amanda H. Anderson, J. Richard Landis, Christopher Jepson, Jiang He, Kathleen D. Liu, Chi-yuan Hsu and Harold I. Feldman
CJASN June 2017, 12 (6) 1010-1017; DOI: https://doi.org/10.2215/CJN.06210616
Author Affiliations: *Department of Biostatistics and Epidemiology and †Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania (J. Roy, H. Shou, D. Xie, J.Y. Hsu, W. Yang, A.H. Anderson, J.R. Landis, C. Jepson, H.I. Feldman); ‡Department of Epidemiology, Tulane University, New Orleans, Louisiana (J. He); §Department of Medicine, University of California, San Francisco, California (K.D. Liu, C.-y. Hsu); and ‖Division of Research, Kaiser Permanente Northern California, Oakland, California (C.-y. Hsu)

Abstract

Prediction models are often developed in and applied to CKD populations. These models can be used to inform patients and clinicians about the potential risks of disease development or progression. With the increasing availability of large datasets from CKD cohorts, there is an opportunity to develop better prediction models that will lead to more informed treatment decisions. It is important that prediction modeling be done using appropriate statistical methods to achieve the highest accuracy while avoiding overfitting and poor calibration. In this paper, we review prediction modeling methods in general, from model building to assessment of model performance and application of models to new patient populations. Throughout, the methods are illustrated using data from the Chronic Renal Insufficiency Cohort Study.

  • Calibration
  • C-statistic
  • ROC curve
  • Sensitivity
  • Specificity
  • Cohort Studies
  • Disease Progression
  • Humans
  • Risk
  • Renal Insufficiency, Chronic

Introduction

Predictive models and risk assessment tools are intended to influence clinical practice and have been a topic of scientific research for decades. A PubMed search of "prediction model" yields over 40,000 papers. In CKD, research has focused on predicting CKD progression (1,2), cardiovascular events (3), and mortality (4–6) among many other outcomes (7). Interest in developing prediction models will continue to grow with the emerging focus on personalized medicine and the availability of large electronic databases of clinical information. Researchers carrying out prediction modeling studies need to think carefully about design, development, validation, interpretation, and the reporting of results. This methodologic review article will discuss these key aspects of prediction modeling. We illustrate the concepts using an example from the Chronic Renal Insufficiency Cohort (CRIC) Study (8,9) as described below.

Motivating Example: Prediction of CKD Progression

The motivating example focuses on the development of prediction models for CKD progression. In addition to the general goal of finding a good prediction model, we explore whether a novel biomarker improves prediction of CKD progression over established predictors. In this case, urine neutrophil gelatinase–associated lipocalin (NGAL) was identified as a potential risk factor for CKD progression on the basis of a growing literature that showed elevated levels in humans and animals with CKD or kidney injury (2). The question of interest was whether baseline urine NGAL would provide additional predictive information beyond the information captured by established predictors.

The CRIC Study is a multicenter cohort study of adults with moderate to advanced CKD. The design and characteristics of the CRIC Study have been described previously (8,9). In total, 3386 CRIC Study participants had valid urine NGAL test data and were included in the prediction modeling. Details of the procedures for obtaining urine NGAL are provided elsewhere (2,10).

Established predictors include sociodemographic characteristics (age, sex, race/ethnicity, and education), eGFR (in milliliters per minute per 1.73 m²), proteinuria (in grams per day), systolic BP, body mass index, history of cardiovascular disease, diabetes, and use of angiotensin–converting enzyme inhibitors/angiotensin II receptor blockers. In this example, all were measured at baseline.

The outcome was progressive CKD, which was defined as a composite end point of incident ESRD or halving of eGFR from baseline using the Modification of Diet in Renal Disease Study equation (11). ESRD was considered to have occurred when a patient underwent kidney transplantation or began chronic dialysis. For the purposes of this paper, in lieu of a broader examination of NGAL that was part of the original reports (2,10), we will focus on the occurrence of progressive CKD within 2 years from baseline (a yes/no variable).

Among the 3386 participants, 10% had progressive CKD within 2 years. The median value of urine NGAL was 17.2 ng/ml, with an interquartile range of 8.1–39.2 ng/ml. Detailed characteristics of the study population are given in the work by Liu et al. (2).

We excluded patients with missing predictors (n=119), leaving a total of 3033 with a valid NGAL measurement, no missing predictors, and an observed outcome. We made this decision, because the percentage of missing data was low, the characteristics of those with observed and missing data were similar (data not shown), and we wanted to focus on prediction and not missing data issues. In practice, however, multiple imputation is generally recommended for handling missing predictors (12,13).

Prediction Models

It is important to distinguish between two major types of modeling that are found in medical research—associative modeling and prediction modeling. In associative modeling, the goal is typically to identify population-level relationships between independent variables (e.g., exposures) and dependent variables (e.g., clinical outcomes). Although associative modeling does not necessarily establish causal relationships, it is often used in an effort to improve our understanding of the mechanisms through which outcomes occur. In prediction modeling, by contrast, the goal is typically to generate the best possible estimate of the value of the outcome variable for each individual. These models are often developed for use in clinical settings to help inform treatment decisions.

Prediction models use data from current patients for whom both outcomes and predictors are available to learn about the relationship between the predictors and outcomes. The models can then be applied to new patients for whom only predictors are available—to make educated guesses about what their future outcome will be. Prediction modeling as a field involves both the development of the models and the evaluation of their performance.

In the era of big data, there has been an increased interest in prediction modeling. In fact, an entire field, machine learning, is now devoted to developing better algorithms for prediction. As a result of so much research activity focused on prediction, many new algorithms have been developed (14). For continuous outcomes, options include linear regression, generalized additive models (15), Gaussian process regression (16), regression trees (17), and k-nearest neighbor (18) among others. For binary outcomes, prediction is also known as classification, because the goal is to classify an individual into one of two categories on the basis of their set of predictors (features in machine learning terminology). Popular options for binary outcome prediction include logistic regression, classification tree (17), support vector machine (19), and k-nearest neighbor (20). Different machine learning algorithms have various strengths and weaknesses, discussion of which is beyond the scope of this paper. In the CRIC Study example, we use the standard logistic regression model of the binary outcome (occurrence of progressive CKD within 2 years).
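To make the basic workflow concrete, the following is a minimal sketch (in Python, using scikit-learn) of fitting a logistic regression prediction model to simulated data and obtaining predicted probabilities and risk scores for new patients. The data, variable names, and coefficients are entirely hypothetical; this illustrates the general approach and is not the CRIC Study analysis code.

```python
# Minimal sketch of a logistic regression prediction model on simulated data.
# All variables and data are hypothetical; this is not the CRIC Study analysis.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
egfr = rng.normal(45, 15, n)          # hypothetical baseline eGFR
proteinuria = rng.gamma(1.0, 1.0, n)  # hypothetical proteinuria (g/d)
age = rng.normal(60, 10, n)

# Simulate a binary "progressive CKD within 2 years" outcome
logit = -2.0 - 0.05 * (egfr - 45) + 0.8 * proteinuria + 0.01 * (age - 60)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([egfr, proteinuria, age])
model = LogisticRegression(max_iter=1000).fit(X, y)

# Predicted probabilities (and risk scores on the log-odds scale) for new patients
new_X = np.array([[30.0, 2.5, 65.0], [70.0, 0.1, 50.0]])
probs = model.predict_proba(new_X)[:, 1]
risk_scores = np.log(probs / (1 - probs))
print(probs, risk_scores)
```

The risk score here is simply the log odds of the predicted probability, which is the quantity used in the examples that follow.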

Variable Selection

Regardless of which type of prediction model is used, a variable selection strategy will need to be chosen. If we are interested in the incremental improvement in prediction due to a novel biomarker (like urine NGAL), then it is reasonable to start with a set of established predictors and assess what improvement, if any, the biomarker adds to the model. Variable selection is, therefore, knowledge driven. Alternatively, if the goal is simply to use all available data to find the best prediction model, then a data-driven approach can be applied. Data-driven methods are typically automated—the researcher provides a list of a large set of possible predictors, and the method will select from that a shorter list of predictors to include in a final model. Data-driven methods include criterion methods, such as maximizing Bayesian information criterion (21), regularization methods, such as Lasso (22), and dimension reduction methods, such as principal components analysis (23), when there is a large set of predictors. Which type of variable selection approach to use depends on the purpose of the prediction model and its envisioned use in clinical practice. For example, if portability is a goal, then restricting to commonly collected variables might be important.
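As an illustration of the data-driven option, the sketch below applies L1 (Lasso) regularized logistic regression to simulated data to select a subset of predictors. The penalty strength `C` and the predictor set are hypothetical choices for illustration; in practice the penalty would typically be tuned by cross-validation.

```python
# Sketch of data-driven variable selection via L1 (Lasso) penalized logistic regression.
# Simulated data; predictor set and penalty strength are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n, p = 2000, 20
X = rng.normal(size=(n, p))
# Only the first three predictors truly matter in this simulation
logit = -2.0 + 1.0 * X[:, 0] - 0.8 * X[:, 1] + 0.5 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_std = StandardScaler().fit_transform(X)  # penalization assumes comparable scales
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X_std, y)

selected = np.flatnonzero(lasso.coef_[0] != 0)
print("Selected predictor indices:", selected)
```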

Performance Evaluation

After a model is developed, it is important to quantify how well it performs. In this section, we describe several methods for assessing performance. Our emphasis is on performance metrics for binary outcomes (classification problems); some of the metrics can be used for continuous outcomes as well.

Typically, a prediction model for a binary outcome produces a risk score for each individual, denoting their predicted risk of experiencing the outcome given their observed values on the predictor variables. For example, logistic regression yields a risk score, which is the log odds (logit) of the predicted probability of an individual experiencing the outcome of interest. A good prediction model for a binary outcome should lead to good discrimination (i.e., good separation in risk scores between individuals who will, in fact, develop the outcome and those who will not). Consider the CRIC Study example. We fitted three logistic regression models of progressive CKD (within 2 years from baseline). The first included only age, sex, and race as predictors. The second model also included eGFR and proteinuria. Finally, the third model also included other established predictors: angiotensin–converting enzyme inhibitor/angiotensin II receptor blocker, an indicator for any history of cardiovascular disease, diabetes, educational level, systolic BP, and body mass index. In Figure 1, the risk scores from each of the logistic regression models are plotted against the observed outcome. The plots show that, as more predictors were added to the model, the separation in risk scores between participants who did or did not experience the progressive CKD outcome increased. In model 1, for example, the distribution of the risk score was very similar for both groups. However, in model 3, those not experiencing progressive CKD tended to have much lower risk scores than those who did.

Figure 1.

Plot of risk score against CKD progression within 2 years in the Chronic Renal Insufficiency Cohort (CRIC) Study data. The plots are for three different models, with increasing numbers of established predictors included from left to right. The risk score is log odds (logit) of the predicted probability. The outcome, progressive CKD (1= yes, 0= no), is jittered in the plots to make the points easier to see. ACE/ARB, angiotensin-converting enzyme/angiotensin II receptor blocker; BMI, body mass index; CVD, cardiovascular disease; diab., diabetes; educ., education; SBP, systolic BP.

Sensitivity and Specificity

On the basis of the risk score, we can classify patients as high or low risk by choosing some threshold, values above which are considered high risk. A good prediction model tends to classify those who will, in fact, develop the outcome as high risk and those who will not as low risk. Thus, we can describe the performance of a test using sensitivity, the probability that a patient who will develop the outcome is classified as high risk, and specificity, the probability that a patient who will not develop the outcome is classified as low risk (24).

In the CRIC Study example, we focus on the model that included all of the established predictors. To obtain sensitivity and specificity, we need to pick a threshold risk score, above which patients are classified as high risk. Figure 2 illustrates the idea using two risk score thresholds—a low value that leads to high sensitivity (96%) but moderate specificity (50%) and a high value that leads to lower sensitivity (43%) but a high specificity (98%). Thus, from the same model, one could have a classification that is highly sensitive (patients who will go on to CKD progression almost always screen positive) and moderately specific (only one half of patients who will not go on to CKD progression will screen negative) or one that is highly specific but only moderately sensitive.
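The sensitivity and specificity calculations illustrated in Figure 2 amount to counting the cells of a 2×2 table at a chosen threshold. A small sketch on hypothetical data (the outcome and risk-score arrays below are invented for illustration):

```python
# Sketch of computing sensitivity and specificity at a chosen risk-score threshold.
# y_true and risk_score are hypothetical illustrative arrays.
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])
risk_score = np.array([0.5, -3.0, 1.2, -5.0, -4.5, -1.0, -6.0, -2.5, -0.2, -4.8])

threshold = -4.0                      # classify as high risk if risk score > threshold
high_risk = risk_score > threshold

tp = np.sum(high_risk & (y_true == 1))    # events correctly flagged as high risk
fn = np.sum(~high_risk & (y_true == 1))   # events missed (classified low risk)
tn = np.sum(~high_risk & (y_true == 0))   # nonevents correctly flagged as low risk
fp = np.sum(high_risk & (y_true == 0))    # nonevents flagged as high risk

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```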

Figure 2.

Plots of progressive CKD within 2 years by risk score from the model with all established predictors. The plot in the left panel is on the basis of a risk score cutoff of −4, and the plot in the right panel uses a cutoff of zero. In each panel, the vertical line separates low-risk patients (to its left) from high-risk patients (to its right). The horizontal line separates the groups with and without progressive CKD (1= yes, 0= no). By counting the number of subjects in each of the quadrants, we obtain a standard 2×2 table. For example, using the cutoff of −4, there were 12 patients who were classified as low risk and ended up with CKD progression within 2 years. The sensitivity of this classification approach can be estimated by taking the number of true positives (294) and dividing by the total number of patients who experienced progressive CKD (294+12=306). Thus, sensitivity is 96%. Similar calculations can be used to find the specificity of 50%. Given the same prediction model, a different classification cutpoint could be chosen. In the plot in the right panel, in which a risk score of zero is chosen as the cutpoint, sensitivity decreases to 43%, whereas specificity increases to 98%.

Receiver Operating Characteristic Curves and c Statistic

By changing the classification threshold, one can choose to increase sensitivity at the cost of decreased specificity or vice versa. For a given prediction model, there is no way to increase both simultaneously. However, both potentially can be increased if the prediction model itself is improved (e.g., by adding an important new variable to the model). Thus, sensitivity and specificity can be useful for comparing models. However, one model might seem better than the other at one classification threshold and worse than the other at a different threshold. We would like to compare prediction models in a way that is not dependent on the choice of risk threshold.

Receiver operating characteristic (ROC) curves display sensitivity and specificity over the entire range of possible classification thresholds. Consider again Figure 2. By using two different thresholds, we had two different pairs of values of sensitivity and specificity. We could choose dozens or more additional thresholds and record equally many pairs of sensitivity and specificity data points. These data points can be used to construct ROC curves. In particular, ROC curves are plots of true positive rate (sensitivity) on the vertical axis against the false positive rate (1− specificity) on the horizontal axis. Theoretically, the perfect prediction model would simply be represented by a horizontal line at 100% (perfect true positive rate, regardless of threshold). A 45° line represents a prediction model equivalent to random guessing.

One way to summarize the information in an ROC curve is with the area under the curve (AUC). This is also known as the c statistic (25). A perfect model would have a c statistic of one, which is the upper bound of the c statistic, whereas the random guessing model would have a c statistic of 0.5. Thus, one way to compare prediction models is with the c statistic—larger values being better. The c statistic also has another interpretation. Given a randomly selected case and a randomly selected control (in the CRIC Study example, a CKD progressor and a nonprogressor), the probability that the risk score is higher for the case than for the control is equal to the value of the c statistic (26).
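A minimal sketch of constructing an ROC curve and computing the c statistic with scikit-learn, again on simulated rather than CRIC Study data:

```python
# Sketch of an ROC curve and c statistic (AUC) for a fitted prediction model.
# Simulated data; not the CRIC Study analysis.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(2)
n = 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + 1.2 * x1 + 0.6 * x2))))

model = LogisticRegression().fit(np.column_stack([x1, x2]), y)
p_hat = model.predict_proba(np.column_stack([x1, x2]))[:, 1]

fpr, tpr, thresholds = roc_curve(y, p_hat)   # (1 - specificity, sensitivity) pairs
auc = roc_auc_score(y, p_hat)                # c statistic
print(f"c statistic = {auc:.3f}")
```

Note that the c statistic is computed in-sample here for brevity; as discussed under Validation below, out-of-sample assessment is preferred.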

In Figure 3A, the ROC curve is displayed from a prediction model that includes urine NGAL as the only predictor variable. The c statistic for this model is 0.8. If our goal was simply to determine whether urine NGAL has prognostic value for CKD progression, the answer would be yes. If, however, we were interested in the incremental value of the biomarker beyond established predictors, we would need to take additional steps. In Figure 3B, we compare ROC curves derived from two prediction models—one including only demographics and the other including demographics plus urine NGAL. From Figure 3B, we can see that the ROC curve for the model with NGAL (red curve in Figure 3B) (AUC=0.82) dominates (is above) the one without NGAL (blue curve in Figure 3B) (AUC=0.69). The c statistic for the model with urine NGAL is larger by 0.13. Thus, there seems to be incremental improvement over a model with demographic variables alone. However, the primary research question in the work by Liu et al. (2) was whether NGAL had prediction value beyond that of established predictors, which include additional factors beyond demographics. The blue curve in Figure 3C is the ROC curve for the model that included all of the established predictors. That model had a c statistic of 0.9—a very large value for a prediction model. When NGAL is added to this logistic regression model, the resulting ROC curve is the red curve in Figure 3C. These two curves are nearly indistinguishable and have the same c statistic (to two decimal places). Therefore, on the basis of this metric, urine NGAL does not add prediction value beyond established predictors. It is worth noting that urine NGAL was a statistically significant predictor in the full model (P<0.01), which illustrates the point that statistical significance does not necessarily imply added prediction value.

Figure 3.

Receiver operating characteristic (ROC) curves for three different prediction models. A is the ROC curve when urine neutrophil gelatinase–associated lipocalin (NGAL) is the only predictor variable included in the model. The c statistic is 0.8. B and C each include two ROC curves. In B, the blue curve (area under the curve [AUC] =0.69) is from the model that includes only demographic variables. The red curve (AUC=0.82) additionally includes urine NGAL. In C, the blue curve is for the model that includes all established predictors. The red curve includes all established predictors plus urine NGAL. The c statistics in C are both 0.9. These plots show that urine NGAL is a relatively good predictor variable alone and has incremental improvement over the demographics-only model but does not help to improve prediction beyond established predictors.

It is also important to consider uncertainty in the estimate of the ROC curve and c statistic. Confidence bands for the ROC curve and confidence intervals for the c statistic can be obtained from available software. For comparing two models, a confidence interval for the difference in c statistics could be obtained via, for example, nonparametric bootstrap resampling (27). However, this will generally be less powerful than the standard chi–squared test for comparing two models (28).
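One way such a bootstrap confidence interval could be computed is sketched below; the predicted probabilities `p1` and `p2` for the two models and the outcome `y` are hypothetical arrays constructed only to make the example run.

```python
# Sketch of a nonparametric bootstrap confidence interval for the difference in
# c statistics between two models. p1 and p2 are predicted probabilities from
# the two models; y is the observed binary outcome (all hypothetical here).
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_diff(y, p1, p2, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample patients with replacement
        yb = y[idx]
        if yb.min() == yb.max():             # skip resamples with only one class
            continue
        diffs.append(roc_auc_score(yb, p2[idx]) - roc_auc_score(yb, p1[idx]))
    return np.percentile(diffs, [2.5, 97.5])

# Hypothetical example data
rng = np.random.default_rng(3)
y = rng.binomial(1, 0.1, 1000)
p1 = np.clip(0.1 + 0.2 * (y - 0.1) + rng.normal(0, 0.1, 1000), 0.001, 0.999)
p2 = np.clip(0.1 + 0.4 * (y - 0.1) + rng.normal(0, 0.1, 1000), 0.001, 0.999)
print(bootstrap_auc_diff(y, p1, p2))
```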

A limitation of comparing models on the basis of the c statistic is that it tends to be insensitive to improvements in prediction that occur when a new predictor (such as a novel biomarker) is added to a model that already has a high c statistic (29–31). It also should not be used without additionally assessing calibration, which we next briefly describe.

Calibration

A well calibrated model is one for which predicted probabilities closely match the observed rates of the outcome over the range of predicted values. A poorly calibrated model might perform well overall on the basis of measures, like the c statistic, but would perform poorly for some subpopulations.

Calibration can be checked in a variety of ways. A standard method is the Hosmer–Lemeshow test, in which predicted and observed counts within percentiles of predicted probabilities are compared (32). In the CRIC Study example with the full model (including NGAL), the Hosmer–Lemeshow test on the basis of deciles of the predicted probabilities has a P value of 0.14. Rejection of the test (P<0.05) would suggest a poor fit (poor calibration), but that is not the case here. Another useful tool is the calibration plot, in which observed and expected rates are estimated using smoothing methods rather than within percentiles. Figure 4 displays a calibration plot for the CRIC Study example, with established predictors and urine NGAL included in the model. The plot suggests that the model is well calibrated, because the observed and predicted values tend to fall near the 45° line.
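A simple decile-based calibration check, in the spirit of the Hosmer–Lemeshow approach, can be sketched as follows. The data are simulated and the ten-group split mirrors the deciles used above; this is an illustration, not the formal test statistic.

```python
# Sketch of a decile-based calibration check: compare observed event rates with
# mean predicted probabilities within deciles of predicted risk.
# y and p_hat are hypothetical arrays standing in for a fitted model's output.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
p_hat = rng.uniform(0.01, 0.5, 3000)          # hypothetical predicted probabilities
y = rng.binomial(1, p_hat)                    # simulate a well calibrated outcome

deciles = pd.qcut(p_hat, 10, labels=False)    # decile of predicted risk
summary = (
    pd.DataFrame({"decile": deciles, "observed": y, "predicted": p_hat})
    .groupby("decile")
    .agg(n=("observed", "size"),
         observed_rate=("observed", "mean"),
         mean_predicted=("predicted", "mean"))
)
print(summary)   # well calibrated if observed_rate tracks mean_predicted
```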

Figure 4.

Calibration plot for the model that includes established predictors and urine neutrophil gelatinase–associated lipocalin. The straight line is where a perfectly calibrated model would fall. The curve is the estimated relationship between the predicted and observed values. The shaded region is ±2 SEMs.

Brier Score

A measure that takes into account both calibration and discrimination is the Brier score (33,34). We can estimate the Brier score for a model by simply taking the average squared difference between the actual binary outcome and the predicted probability of that outcome for each individual. A low value of this metric indicates a model that performs well across the range of risk scores. The perfect model would have a Brier score of zero. The difference in this score between two models can be used to compare the models. In the CRIC Study example, the Brier scores were 0.087, 0.075, 0.057, and 0.056 for the demographics-only, demographics plus NGAL, established predictors, and established predictors plus NGAL models, respectively. Thus, there was improvement in adding NGAL to the demographics model (Brier score difference of 0.012—a 14% improvement). There was also improvement by moving from the demographics-only model to the established predictors model (Brier score difference of 0.030). However, adding NGAL to the established predictors model only decreased the Brier score by 0.001 (a 2% improvement). Although how large a change constitutes a meaningful improvement is subjective, an improvement in Brier score of at least 10% would be difficult to dismiss, whereas a 2% improvement does not seem as convincing.
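Because the Brier score is just the mean squared difference between the observed binary outcome and the predicted probability, it is straightforward to compute. A short sketch on hypothetical data:

```python
# Sketch of the Brier score: the mean squared difference between the binary
# outcome and the predicted probability. y and the p_model arrays are hypothetical.
import numpy as np

def brier_score(y, p_hat):
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    return np.mean((y - p_hat) ** 2)

# Comparing two hypothetical models: a smaller Brier score is better.
y = np.array([0, 0, 1, 0, 1, 0, 0, 0, 1, 0])
p_model_a = np.array([0.05, 0.10, 0.40, 0.08, 0.55, 0.12, 0.03, 0.07, 0.60, 0.09])
p_model_b = np.array([0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10])
print(brier_score(y, p_model_a), brier_score(y, p_model_b))
```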

Net Reclassification Improvement and Integrated Discrimination Improvement

Pencina et al. (35,36) developed several new methods for assessing the incremental improvement of prediction due to a new biomarker. These include net reclassification improvement (NRI), both categorical and category free, and integrated discrimination improvement. These methods are reviewed in detail in a previous Clinical Journal of the American Society of Nephrology paper (37), and therefore, here we briefly illustrate the main idea of just one of the methods (categorical NRI).

The categorical NRI approach assesses the change in predictive performance that results from adding one or more predictors to a model by comparing how the two models classify participants into risk categories. In this approach, therefore, we begin by defining risk categories as we did when calculating specificity and sensitivity—that is, we choose cutpoints of risk for the outcome variable that define categories of lower and higher risks. NRI is calculated separately for the group experiencing the outcome event (those with progressive CKD within 2 years in the CRIC Study example) and the group not experiencing the event. To calculate NRI, study participants are assigned a score of zero if they were not reclassified (i.e., if both models place the participant in the same risk category), a score of one if they were reclassified in the right direction, and a score of −1 if they were reclassified in the wrong direction. An example of the right direction is a participant who experienced an event who was classified as low risk under the old model but classified as high risk under the new model. Within each of the two groups, NRI is calculated as the sum of scores across all patients in the group divided by the total number of patients in the group. These NRI scores are bounded between −1 and one, with a score of zero implying no difference in classification accuracy between the models, a negative score indicating worse performance by the new model, and a positive score showing improved performance. The larger the score, the better the new model classified participants compared with the old model on the basis of this criterion.
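A sketch of the categorical NRI computation described above, using hypothetical predicted risks and the same scoring rule (+1 for reclassification in the right direction, −1 for the wrong direction, zero otherwise). The cutpoints and data are illustrative only.

```python
# Sketch of categorical NRI, following the scoring rule described in the text.
# risk_old and risk_new are predicted probabilities from the old and new models;
# cutpoints define the risk categories. All inputs here are hypothetical.
import numpy as np

def categorical_nri(y, risk_old, risk_new, cutpoints=(0.05, 0.10)):
    cat_old = np.digitize(risk_old, cutpoints)   # 0=low, 1=medium, 2=high
    cat_new = np.digitize(risk_new, cutpoints)
    move = np.sign(cat_new - cat_old)            # +1 moved up, -1 moved down, 0 unchanged
    events, nonevents = (y == 1), (y == 0)
    # Events: moving up is the right direction; nonevents: moving down is right.
    nri_events = move[events].sum() / events.sum()
    nri_nonevents = (-move[nonevents]).sum() / nonevents.sum()
    return nri_events, nri_nonevents

# Hypothetical example
rng = np.random.default_rng(5)
y = rng.binomial(1, 0.1, 500)
risk_old = np.clip(0.08 + 0.05 * (y - 0.1) + rng.normal(0, 0.03, 500), 0, 1)
risk_new = np.clip(0.08 + 0.10 * (y - 0.1) + rng.normal(0, 0.03, 500), 0, 1)
print(categorical_nri(y, risk_old, risk_new))
```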

We will now consider a three–category classification model, where participants were classified as low risk if their predicted probability was <0.05, medium risk if it was between 0.05 and 0.10, and high risk if it was above 0.10. These cutpoints were chosen a priori on the basis of what the investigators considered to be clinically meaningful differences in the event rate (2). The results for the comparison of the demographics-only model with the demographics and urine NGAL model are given in Figure 5, left panel. The NRIs for events and nonevents were 3.6% (95% confidence interval, −3.3% to 9.2%) and 33% (95% confidence interval, 30.8% to 35.8%), respectively. Thus, there was a large improvement in risk prediction for nonevents when going from the demographics-only model to the demographics and NGAL model. Next, we compared the model with all established predictors to the same model with urine NGAL added as a predictor. The reclassification data are given in Figure 5, right panel. Overall, there was little reclassification for both events and nonevents, indicating no discernible improvement in prediction when NGAL was added to the established predictors model, which is consistent with the c statistic and Brier score findings described above.

Figure 5.

Classification of patients in the event and nonevent groups in models with and without urine neutrophil gelatinase–associated lipocalin (NGAL). The counts in red are from patients who were reclassified in the wrong direction. The left panel shows the reclassification that occurs when urine NGAL is added to a model that includes only demographics. For events, 3+8+37=48 were reclassified in the right direction, whereas 16+16+5=37 were reclassified in the wrong direction. The net reclassification improvement (NRI) for events was (48/306)−(37/306)=3.6%. The NRI for nonevents was (1153/2727)−(243/2727)=33%. Thus, there was a large improvement in risk prediction for nonevents when going from the demographics-only model to the demographics and NGAL model. In the right panel, urine NGAL is added to a model that includes all established predictors. Overall, there was little reclassification for both events and nonevents. The NRI for events is simply (6/306)−(8/306)=−0.65%. The NRI for nonevents is (94/2727)−(79/2727)=0.55%.

The categorical NRI depends on classification cutpoints. There is a category-free version of NRI that avoids actual classification when assessing the models. Although NRI has practical interpretations, there is concern that it can be biased when the null is true (when a new biomarker is not actually predictive) (38,39). In particular, for poorly fitting models (especially poorly calibrated models), NRI tends to be inflated, making the new biomarker seem to add more predictive value than it actually does. As a result, bias-corrected methods have been proposed (40). In general, however, it is not recommended to quantify incremental improvement by NRI alone.

Decision Analyses

The methods discussed above tend to treat sensitivity and specificity as equally important. However, in clinical practice, the relative cost of each will vary. An approach that allows one to compare models after assigning weights to these tradeoffs is decision analysis (41,42). That is, given the relative weight of a false positive versus a false negative, the net benefit of one model over another can be calculated. A decision curve can be used to allow different individuals who have different preferences to make informed decisions.
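For concreteness, the net benefit of a model at a given threshold probability p_t can be computed as TP/n − (FP/n) × p_t/(1 − p_t), where patients with predicted risk above p_t are flagged for intervention; this is the quantity plotted across thresholds in a decision curve (42). A sketch on hypothetical data:

```python
# Sketch of net benefit from decision curve analysis: at a threshold probability
# p_t, net benefit = TP/n - (FP/n) * p_t / (1 - p_t), where patients are flagged
# when predicted risk is at least p_t. y and p_hat are hypothetical arrays.
import numpy as np

def net_benefit(y, p_hat, p_t):
    n = len(y)
    treat = p_hat >= p_t
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - (fp / n) * p_t / (1 - p_t)

rng = np.random.default_rng(6)
y = rng.binomial(1, 0.1, 2000)
p_hat = np.clip(0.1 + 0.3 * (y - 0.1) + rng.normal(0, 0.05, 2000), 0.001, 0.999)
for p_t in (0.05, 0.10, 0.20):    # a few illustrative threshold probabilities
    print(p_t, round(net_benefit(y, p_hat, p_t), 4))
```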

Continuous or Survival Outcomes

We have focused our CRIC Study example primarily on a binary outcome thus far (classification problems). However, most of the principles described above apply to continuous or survival data. Indeed, some of the same model assessment tools can also be applied. For example, for censored survival data, methods have been developed to estimate the c statistic (29,43–45). For continuous outcomes, plots of predicted versus observed outcomes and measures such as R² can be useful. Calibration methods for survival outcomes have been described in detail elsewhere but are generally straightforward to implement (46).

Validation

Internal

A prediction model has good internal validity or reproducibility if it performs as expected on the population from which it was derived. In general, we expect that the performance of a prediction model will be overestimated if it is assessed on the same sample from which it was developed. That is, overfitting is a major concern. A method that assesses internal validity, or one that guards against overfitting, should therefore be used. Split samples (training and test data) can be used to avoid overestimating performance. Alternatively, methods such as bootstrapping and cross-validation can be used at the model development phase to avoid overfitting (47,48), as sketched below.
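A sketch of the split-sample and cross-validation approaches on simulated data; the predictors are hypothetical and scikit-learn utilities are assumed.

```python
# Sketch of internal validation on simulated data: estimate out-of-sample
# discrimination with a split sample and with 5-fold cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 10))
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + X[:, 0] + 0.5 * X[:, 1]))))

# Split-sample (training/test) validation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test-set c statistic:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# 5-fold cross-validated c statistic, estimated during model development
cv_auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=5, scoring="roc_auc")
print("cross-validated c statistic:", cv_auc.mean())
```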

External

External validity or transportability is the ability to translate a model to different populations. We expect the performance of a model to degrade when it is evaluated in new populations (49–51). Poor transportability of a model can occur because of underfitting. This would occur, for example, when important predictors are either unknown or not included in the original model. Even if the associations between the predictors and outcome stay the same, there is still a possibility that the baseline risk may be different in the new populations. It is, therefore, important to check (via performance metrics described above) external validity and possibly recalibrate the model when applying the model to a new population. Suppose, for example, that our prediction model is a logistic regression, and we have adequately captured the relationship between predictors and the outcome. However, the baseline risk in the new population might be higher or lower. Baseline risk is represented by the intercept term. Therefore, a simple recalibration method is to re-estimate the intercept term. Similarly, for survival data, re-estimation of the baseline hazard function can be used for recalibration.
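A sketch of the intercept-only recalibration just described: the original coefficients are held fixed by entering the original linear predictor as an offset, and only a new intercept is estimated in the external data. The data, coefficients, and the use of statsmodels are hypothetical illustration, not a prescribed implementation.

```python
# Sketch of recalibrating a logistic model's intercept for a new population:
# keep the original coefficients, enter the original linear predictor as an
# offset, and re-estimate only the intercept. All inputs are hypothetical.
import numpy as np
import statsmodels.api as sm

# Hypothetical coefficients from the original (development) population
beta = np.array([-2.5, 0.8, -0.03])          # intercept, proteinuria, eGFR

# Hypothetical new-population data with a higher baseline risk
rng = np.random.default_rng(8)
n = 1500
proteinuria = rng.gamma(1.0, 1.0, n)
egfr = rng.normal(40, 15, n)
lin_pred = beta[0] + beta[1] * proteinuria + beta[2] * egfr
y_new = rng.binomial(1, 1 / (1 + np.exp(-(lin_pred + 0.7))))   # shifted baseline risk

# Re-estimate only the intercept: intercept-only design, original lin_pred as offset
recal = sm.GLM(y_new, np.ones((n, 1)), family=sm.families.Binomial(),
               offset=lin_pred).fit()
intercept_update = recal.params[0]
recalibrated_risk = 1 / (1 + np.exp(-(lin_pred + intercept_update)))
print("estimated intercept shift:", round(intercept_update, 3))
```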

Dynamic Prediction

In this paper, we have focused on situations where a prediction model is developed using available data that would potentially be applied unchanged to the assessment of new patients. However, a growing area of research is dynamic prediction models, where models are frequently updated over time as new data become available (52,53). The major points of the article still hold for dynamic prediction, but these models have the potential to be more transportable in that they can rapidly adapt to new populations.

Reporting

After the prediction model has been developed, evaluated, and possibly, validated, the next step is to report the findings. A set of recommended guidelines for the reporting of prediction modeling studies, the TRIPOD statement, was recently published and includes a 22-item checklist (54,55). For each item in the checklist, there are detailed examples and explanations. We highly recommend reviewing this document before publishing any results. The document places a strong emphasis on being very specific about the aims and design of the study, the definition of the outcome, the study population, the statistical methods used, and discussion of limitations.

An important part of reporting is a discussion of the clinical implications. Typically, adoption of the prediction model should not be recommended if it has not been validated. After the prediction model has been validated, it could be used or further studied in a variety of ways. For example, a model that accurately predicts CKD progression across a wide range of populations would be helpful to provide clinicians and patients with prognostic information. Such models could inform clinical decision making. For example, high-risk patients might need to be followed more intensely (e.g., clinician visits every 3 months rather than every 12 months) and have evidence-based interventions to slow CKD rigorously applied (e.g., BP<130/80 mmHg for those with proteinuria). Another example is the decision to have surgery for arteriovenous fistula creation if ESRD is imminent. If such models are used for decision making, the relative costs of false positives and false negatives need to be assessed. Along the same lines, whether knowing the risk score improves patient care would need to be studied, possibly in a randomized trial.

Concluding Remarks

Prediction modeling is likely to become increasingly important in CKD research and clinical practice, with richer data becoming available at a rapid rate. This paper described many of the key methods in the development, assessment, and application of prediction models. Other papers in this CKD methods series will focus on complex modeling of CKD data, such as longitudinal data, competing risks, time-dependent confounding, and recurrent event analysis.

Disclosures

None.

Acknowledgments

Funding for the Chronic Renal Insufficiency Cohort Study was obtained under a cooperative agreement from the National Institute of Diabetes and Digestive and Kidney Diseases (grants U01DK060990, U01DK060984, U01DK061022, U01DK061021, U01DK061028, U01DK60980, U01DK060963, and U01DK060902). Additional funding was provided by grants K01DK092353 and U01DK85649 (CKD Biocon).

Footnotes

  • Published online ahead of print. Publication date available at www.cjasn.org.

  • Copyright © 2017 by the American Society of Nephrology

References

1. Lennartz CS, Pickering JW, Seiler-Mußler S, Bauer L, Untersteller K, Emrich IE, Zawada AM, Radermacher J, Tangri N, Fliser D, Heine GH: External validation of the kidney failure risk equation and re-calibration with addition of ultrasound parameters. Clin J Am Soc Nephrol 11: 609–615, 2016
2. Liu KD, Yang W, Anderson AH, Feldman HI, Demirjian S, Hamano T, He J, Lash J, Lustigova E, Rosas SE, Simonson MS, Tao K, Hsu CY; Chronic Renal Insufficiency Cohort (CRIC) Study Investigators: Urine neutrophil gelatinase-associated lipocalin levels do not improve risk prediction of progressive chronic kidney disease. Kidney Int 83: 909–914, 2013
3. Solak Y, Yilmaz MI, Siriopol D, Saglam M, Unal HU, Yaman H, Gok M, Cetinkaya H, Gaipov A, Eyileten T, Sari S, Yildirim AO, Tonbul HZ, Turk S, Covic A, Kanbay M: Serum neutrophil gelatinase-associated lipocalin is associated with cardiovascular events in patients with chronic kidney disease. Int Urol Nephrol 47: 1993–2001, 2015
4. Deo R, Shou H, Soliman EZ, Yang W, Arkin JM, Zhang X, Townsend RR, Go AS, Shlipak MG, Feldman HI: Electrocardiographic measures and prediction of cardiovascular and noncardiovascular death in CKD. J Am Soc Nephrol 27: 559–569, 2016
5. Weiss JW, Platt RW, Thorp ML, Yang X, Smith DH, Petrik A, Eckstrom E, Morris C, O'Hare AM, Johnson ES: Predicting mortality in older adults with kidney disease: A pragmatic prediction model. J Am Geriatr Soc 63: 508–515, 2015
6. Bansal N, Katz R, De Boer IH, Peralta CA, Fried LF, Siscovick DS, Rifkin DE, Hirsch C, Cummings SR, Harris TB, Kritchevsky SB, Sarnak MJ, Shlipak MG, Ix JH: Development and validation of a model to predict 5-year risk of death without ESRD among older adults with CKD. Clin J Am Soc Nephrol 10: 363–371, 2015
7. Tangri N, Kitsios GD, Inker LA, Griffith J, Naimark DM, Walker S, Rigatto C, Uhlig K, Kent DM, Levey AS: Risk prediction models for patients with chronic kidney disease: A systematic review. Ann Intern Med 158: 596–603, 2013
8. Feldman HI, Appel LJ, Chertow GM, Cifelli D, Cizman B, Daugirdas J, Fink JC, Franklin-Becker ED, Go AS, Hamm LL, He J, Hostetter T, Hsu CY, Jamerson K, Joffe M, Kusek JW, Landis JR, Lash JP, Miller ER, Mohler ER 3rd, Muntner P, Ojo AO, Rahman M, Townsend RR, Wright JT; Chronic Renal Insufficiency Cohort (CRIC) Study Investigators: The Chronic Renal Insufficiency Cohort (CRIC) Study: Design and methods. J Am Soc Nephrol 14[Suppl 2]: S148–S153, 2003
9. Lash JP, Go AS, Appel LJ, He J, Ojo A, Rahman M, Townsend RR, Xie D, Cifelli D, Cohan J, Fink JC, Fischer MJ, Gadegbeku C, Hamm LL, Kusek JW, Landis JR, Narva A, Robinson N, Teal V, Feldman HI; Chronic Renal Insufficiency Cohort (CRIC) Study Group: Chronic Renal Insufficiency Cohort (CRIC) Study: Baseline characteristics and associations with kidney function. Clin J Am Soc Nephrol 4: 1302–1311, 2009
10. Liu KD, Yang W, Go AS, Anderson AH, Feldman HI, Fischer MJ, He J, Kallem RR, Kusek JW, Master SR, Miller ER 3rd, Rosas SE, Steigerwalt S, Tao K, Weir MR, Hsu CY; CRIC Study Investigators: Urine neutrophil gelatinase-associated lipocalin and risk of cardiovascular disease and death in CKD: Results from the Chronic Renal Insufficiency Cohort (CRIC) Study. Am J Kidney Dis 65: 267–274, 2015
11. Levey AS, Coresh J, Greene T, Marsh J, Stevens LA, Kusek JW, Van Lente F; Chronic Kidney Disease Epidemiology Collaboration: Expressing the Modification of Diet in Renal Disease Study equation for estimating glomerular filtration rate with standardized serum creatinine values. Clin Chem 53: 766–772, 2007
12. Rubin DB: Multiple Imputation for Nonresponse in Surveys, New York, John Wiley & Sons, 2004
13. Carpenter J, Kenward M: Multiple Imputation and Its Application, 1st Ed., Chichester, United Kingdom, Wiley, 2013
14. James G, Witten D, Hastie T, Tibshirani R: An Introduction to Statistical Learning, Vol. 103, New York, Springer, 2013
15. Hastie T, Tibshirani R: Generalized additive models. Stat Sci 1: 297–310, 1986
16. Rasmussen CE: Gaussian Processes for Machine Learning, Cambridge, MIT Press, 2006
17. Breiman L, Friedman J, Stone CJ, Olshen RA: Classification and Regression Trees, Belmont, Wadsworth/CRC Press, 1984
18. Altman NS: An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat 46: 175–185, 1992
19. Vapnik V: The support vector method of function estimation. In: Nonlinear Modeling, edited by Suykens JAK, Vandewalle J, Boston, Springer, 1998, pp 55–85
20. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning, New York, Springer, 2009
21. Schwarz G: Estimating the dimension of a model. Ann Stat 6: 461–464, 1978
22. Tibshirani R: Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol 58: 267–288, 1994
23. Bair E, Hastie T, Paul D, Tibshirani R: Prediction by supervised principal components. J Am Stat Assoc 101: 119–137, 2006
24. Pepe MS: The Statistical Evaluation of Medical Tests for Classification and Prediction, Oxford, Oxford University Press, 2004
25. Bamber D: The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 12: 387–415, 1975
26. Hanley JA, McNeil BJ: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143: 29–36, 1982
27. Carpenter J, Bithell J: Bootstrap confidence intervals: When, which, what? A practical guide for medical statisticians. Stat Med 19: 1141–1164, 2000
28. Vickers AJ, Cronin AM, Begg CB: One statistical test is sufficient for assessing new predictive markers. BMC Med Res Methodol 11: 13, 2011
29. Harrell FE Jr, Lee KL, Mark DB: Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15: 361–387, 1996
30. Cook NR: Statistical evaluation of prognostic versus diagnostic models: Beyond the ROC curve. Clin Chem 54: 17–23, 2008
31. Pepe MS, Feng Z, Huang Y, Longton G, Prentice R, Thompson IM, Zheng Y: Integrating the predictiveness of a marker with its performance as a classifier. Am J Epidemiol 167: 362–368, 2008
32. Hosmer D, Lemeshow S, Sturdivant R: Applied Logistic Regression, 3rd Ed., New York, Wiley, 2013
33. Brier GW: Verification of forecasts expressed in terms of probability. Mon Wea Rev 78: 1–3, 1950
34. Gerds TA, Cai T, Schumacher M: The performance of risk prediction models. Biom J 50: 457–479, 2008
35. Pencina MJ, D'Agostino RB Sr, Steyerberg EW: Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med 30: 11–21, 2011
36. Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS: Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat Med 27: 157–172, 2008
37. Kerr KF, Meisner A, Thiessen-Philbrook H, Coca SG, Parikh CR: Developing risk prediction models for kidney injury and assessing incremental value for novel biomarkers. Clin J Am Soc Nephrol 9: 1488–1496, 2014
38. Pepe MS, Fan J, Feng Z, Gerds T, Hilden J: The net reclassification index (NRI): A misleading measure of prediction improvement even with independent test data sets. Stat Biosci 7: 282–295, 2015
39. Hilden J, Gerds TA: A note on the evaluation of novel biomarkers: Do not rely on integrated discrimination improvement and net reclassification index. Stat Med 33: 3405–3414, 2014
40. Paynter NP, Cook NR: A bias-corrected net reclassification improvement for clinical subgroups. Med Decis Making 33: 154–162, 2013
41. Hunink MM, Glasziou PP, Siegel J, Weeks JC, Pliskin JS, Elstein AS, Weinstein MC: Decision Making in Health and Medicine: Integrating Evidence and Values, Cambridge, Cambridge University Press, 2001
42. Vickers AJ, Elkin EB: Decision curve analysis: A novel method for evaluating prediction models. Med Decis Making 26: 565–574, 2006
43. Heagerty PJ, Zheng Y: Survival model predictive accuracy and ROC curves. Biometrics 61: 92–105, 2005
44. Uno H, Cai T, Pencina MJ, D'Agostino RB, Wei LJ: On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30: 1105–1117, 2011
45. Pencina MJ, D'Agostino RB: Overall C as a measure of discrimination in survival analysis: Model specific population value and confidence interval estimation. Stat Med 23: 2109–2123, 2004
46. D'Agostino RB, Nam B-H: Evaluation of the performance of survival analysis models: Discrimination and calibration measures. In: Handbook of Statistics, edited by Balakrishnan N, Rao CR, Amsterdam, Elsevier, 2004, pp 1–25
47. Borra S, Di Ciaccio A: Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods. Comput Stat Data Anal 54: 2976–2989, 2010
48. Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD: Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 54: 774–781, 2001
49. König IR, Malley JD, Weimar C, Diener H-C, Ziegler A; German Stroke Study Collaboration: Practical experiences on the necessity of external validation. Stat Med 26: 5499–5511, 2007
50. Altman DG, Vergouwe Y, Royston P, Moons KGM: Prognosis and prognostic research: Validating a prognostic model. BMJ 338: b605, 2009
51. Moons KGM, Kengne AP, Grobbee DE, Royston P, Vergouwe Y, Altman DG, Woodward M: Risk prediction models: II. External validation, model updating, and impact assessment. Heart 98: 691–698, 2012
52. Finkelman BS, French B, Kimmel SE: The prediction accuracy of dynamic mixed-effects models in clustered data. BioData Min 9: 5, 2016
53. McCormick TH, Raftery AE, Madigan D, Burd RS: Dynamic logistic regression and dynamic model averaging for binary classification. Biometrics 68: 23–30, 2012
54. Collins GS, Reitsma JB, Altman DG, Moons KGM: Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD statement. Ann Intern Med 162: 55–63, 2015
55. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS: Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): Explanation and elaboration. Ann Intern Med 162: W1–W73, 2015