## Summary

**Background and objectives** The estimated GFR (eGFR) is important in clinical practice. To find the best formula for eGFR, this study assessed the best model of correlation between sinistrin clearance (iGFR) and models derived from cystatin C (CysC) alone, serum creatinine (SCreat) alone, or both combined. It also evaluated the accuracy of the combined Schwartz formula across all GFR levels.

**Design, setting, participants, & measurements** Two hundred thirty-eight iGFRs performed between January 2012 and April 2013 for 238 children were analyzed. Regression techniques were used to fit the different equations used for eGFR (*i.e.*, logarithmic, inverse, linear, and quadratic). The performance of each model was evaluated using the Cohen κ correlation coefficient and the percentage reaching 30% accuracy was calculated.

**Results** The best model of correlation between iGFRs and CysC is linear; however, it presents a low κ coefficient (0.24) and is far below the Kidney Disease Outcomes Quality Initiative targets to be validated, with only 84% of eGFRs reaching accuracy of 30%. SCreat and iGFRs showed the best correlation in a fitted quadratic model with a κ coefficient of 0.53 and 93% accuracy. Adding CysC significantly (*P*<0.001) increased the κ coefficient to 0.56 and the quadratic model accuracy to 97%. Therefore, a combined SCreat and CysC quadratic formula was derived and internally validated using the cross-validation technique. This quadratic formula significantly outperformed the combined Schwartz formula, which was biased for an iGFR≥91 ml/min per 1.73 m^{2}.

**Conclusions** This study allowed deriving a new combined SCreat and CysC quadratic formula that could replace the combined Schwartz formula, which is accurate only for children with moderate chronic kidney disease.

## Introduction

GFR is the best measure to assess kidney function, and the most accurate manner for its determination requires the use of exogenous markers (1–3) that are not easily available in clinical practice. Therefore, GFR is often estimated using endogenous markers, such as serum creatinine (SCreat) and serum cystatin C (CysC), with conflicting results regarding their performance (4–7).

This study assessed the best model of correlation between sinistrin clearance (iGFR), which is the gold standard method for determining measured GFR (mGFR) and CysC- or SCreat-derived formulas for determining estimated GFR (eGFR) in children. We first analyzed the best correlation model between iGFRs and CysC by evaluating the diagnostic accuracy of all previously published equation models based only on CysC for estimation of GFR. We then studied the best correlation between iGFRs and SCreat by evaluating the accuracy of the two recently published SCreat-based models by Schwartz *et al.* (8) and Gao *et al.* (9). We also analyzed the diagnostic accuracy of adding CysC and BUN to the best-fitted model of correlation. These led to deriving and validating the best combined SCreat and CysC accurate model/formula for eGFR in children. Finally, we compared the accuracy and validity of the combined SCreat and CysC Schwartz model and equation (10) to our combined SCreat and CysC fitted formula in a cohort of children with renal failure, as well as in children with normal renal function, about whom limited data are available in the literature.

## Materials and Methods

### Population

Two hundred forty-three iGFRs performed between January 2012 and April 2013 for 243 children were evaluated. All patients age 2–18.5 years referred to our laboratory unit for GFR measurement were included. Children with bladder dysfunction, those unable to void spontaneously, and those in whom bladder catheterization failed were excluded. Patients’ renal disorders and CKD classification are presented in Table 1. This study was approved by the local research ethics board and was conducted in accordance with the ethical standards of the Declaration of Helsinki.

### Laboratory Analyses

SCreat was measured using the kinetic colorimetric compensated Jaffe method, as reported by the manufacturer, Roche Modular P system (Roche Diagnostics, Mannheim, Germany), which was standardized to the isotope-dilution mass spectrometry reference.

CysC was measured by particle-enhanced nephelometric immunoassay on a BN ProSpec analyzer (Siemens Healthcare Diagnostics). The results were multiplied by 1.174, as indicated in the Siemens customer bulletin, to adjust to the values obtained with the assay standardized to the new traceable International Reference Preparation-ERM-DA471/IFCC, as recommended by Kidney Disease: Improving Global Outcomes (11).

The iGFR (Inutest SPC, Fresenius Kabi Pharma, Austria) was measured using the anthrone method by the manufacturer: Wright automatic method by Wright and Gann (12), using an Autoanalyzer 3 system (high-resolution digital colorimeter, SEAL Bran Luebbe, Norderstedt, Germany).

### Statistical Analyses

Statistical analyses were performed using R software, version 2.15.2 (R Foundation for Statistical Computing, Vienna, Austria). Linear regression techniques (function “lm” in R software) were used to fit the different equations to eGFR. We considered SCreat, CysC, age, and sex as predictor variables for iGFR. In total, nine models adjusted for age and sex with iGFR as the dependent variable and CysC or SCreat as the independent variable were evaluated. To assess the performance of the different models, we calculated for each model the Cohen κ coefficient, which measures agreement between two measurements; values below and above 90 ml/min per 1.73 m^{2} were used to categorize iGFR and eGFR values. To compare κ coefficients between two models, we used a permutation procedure by randomly permuting the estimated values from both models. We also calculated the accuracy (*i.e.*, the percentage of estimated values within 10% and 30% of observed values). Type of correlation between iGFR and CysC was also assessed using a graphical representation (lowess function). Likelihood ratio tests were performed to compare the fit of the combined SCreat and CysC–based logarithmic Schwartz model and the combined SCreat and CysC–based quadratic model. The cutoff of the applicability of the new combined Schwartz formula was determined by applying the circular binary segmentation method (9,13,14) to formula residuals. The circular binary segmentation method allows segmenting of data through change-point detection using a likelihood ratio statistic, which tests the null hypothesis that there is no change, against the alternative hypothesis that there is exactly one change at each location.
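The two performance measures described above (the Cohen κ coefficient at the 90 ml/min per 1.73 m^{2} cutoff and the percentage of estimates within 10% or 30% of the measured value) can be sketched in Python as follows. This is a minimal illustration, not the R code used in the study: the function names are hypothetical, values of exactly 90 are treated as normal (an assumption), and the sample values are invented.

```python
def cohen_kappa(measured, estimated, cutoff=90.0):
    """Cohen's kappa after dichotomizing both series at `cutoff`."""
    n = len(measured)
    m_normal = [m >= cutoff for m in measured]   # assumption: >= cutoff is "normal"
    e_normal = [e >= cutoff for e in estimated]
    observed = sum(a == b for a, b in zip(m_normal, e_normal)) / n
    p_m = sum(m_normal) / n
    p_e = sum(e_normal) / n
    # Agreement expected by chance from the marginal proportions.
    expected = p_m * p_e + (1 - p_m) * (1 - p_e)
    if expected == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is a convention of this sketch
    return (observed - expected) / (1 - expected)


def accuracy_within(measured, estimated, pct):
    """Percentage of estimates lying within `pct` percent of the measured value."""
    hits = sum(abs(e - m) <= pct / 100 * m for m, e in zip(measured, estimated))
    return 100 * hits / len(measured)


# Invented values in ml/min per 1.73 m^2, for illustration only.
igfr = [60, 80, 95, 110, 120, 50]
egfr = [65, 92, 100, 85, 118, 70]
print(round(cohen_kappa(igfr, egfr), 3))
print(round(accuracy_within(igfr, egfr, 10), 1))
print(round(accuracy_within(igfr, egfr, 30), 1))
```

Dichotomizing before computing κ means that the coefficient reflects only agreement on the normal versus abnormal classification, whereas the accuracy percentages reflect how close each individual estimate is to its measured value.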

To test for normality of model residuals, we used two approaches: The first one involved the D’Agostino and Pearson omnibus normality test (package fBasics, R software), and the second one graphically assessed the normality assumption by plotting residuals on a Q-Q plot. The Q-Q plot shows the quantile of the distribution of the new quadratic formula residuals (*y*-axis) versus the quantile of a Gaussian distribution (*x*-axis) with a mean of 0 and an SD of 1.
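The coordinates of such a Q-Q plot can be computed with the Python standard library, as sketched below. The plotting positions (i − 0.5)/n are an assumption of this sketch (R's `qqnorm` uses a slightly different rule for small samples), the function name is hypothetical, and the residuals are invented.

```python
from statistics import NormalDist


def qq_points(residuals):
    """Return (theoretical quantile, observed residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    std_normal = NormalDist(mu=0.0, sigma=1.0)  # Gaussian with mean 0 and SD 1
    ordered = sorted(residuals)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Invented residuals (iGFR minus eGFR), for illustration only.
for x, y in qq_points([4.2, -1.3, 0.5, -7.8, 2.9, 11.0, -3.4]):
    print(f"{x:7.3f} {y:7.2f}")
```

Points that fall close to a straight line support the normality assumption; systematic curvature in the tails suggests a departure from it.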

To check for internal validity of our models' estimates, we performed a cross-validation technique called repeated random subsampling validation, also known as Monte Carlo cross-validation (15). This involves randomly dividing the data into two samples: the training set (two thirds of the sample), on which the model was fitted, and the testing set (one third of the sample), on which the model was evaluated. The cross-validation was done with 1000 bootstrap replications. For each replication, the model was fit to the training data; the root mean square error (RMSE) was calculated using this fitted model for the training set and then for the testing set. We therefore developed for each replication our estimating equation from the training data and validated it on the validation data. The distribution (mean; median; first quartile, third quartile; minimum and maximum) of RMSE values was reported. Additional details on the methods and population are available in the Supplemental Material.
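The repeated random subsampling procedure can be illustrated with the following Python sketch. It is deliberately simplified: a one-predictor straight-line model stands in for the study's regression models, all function names are hypothetical, and the data are simulated. Each replication draws a fresh two-thirds/one-third split without replacement.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rmse(xs, ys, intercept, slope):
    """Root mean square error of the fitted line on (xs, ys)."""
    sq = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


def monte_carlo_cv(xs, ys, n_rep=1000, train_frac=2 / 3, seed=0):
    """Return lists of training and testing RMSE over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = round(train_frac * len(xs))
    train_rmse, test_rmse = [], []
    for _ in range(n_rep):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        train_rmse.append(rmse([xs[i] for i in train], [ys[i] for i in train], b0, b1))
        test_rmse.append(rmse([xs[i] for i in test], [ys[i] for i in test], b0, b1))
    return train_rmse, test_rmse


def summarize(values):
    """Mean, median, quartiles, minimum, and maximum of a list of RMSE values."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    return {"mean": statistics.mean(values), "median": med,
            "q1": q1, "q3": q3, "min": min(values), "max": max(values)}


# Simulated data: a linear signal plus Gaussian noise (illustration only).
noise = random.Random(1)
x = [50 + 5 * i for i in range(60)]
y = [10 + 0.4 * xi + noise.gauss(0, 12) for xi in x]
train, test = monte_carlo_cv(x, y)
print(summarize(train))
print(summarize(test))
```

A testing RMSE that stays close to the training RMSE across replications indicates that the model is not overfitting the sample on which it was derived.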

## Results

Two hundred forty-three patients were enrolled. In two patients bladder catheterization failed, and three patients had an allergic reaction and could not finish the sinistrin clearance test. They were therefore excluded, which left 238 patients for final analysis. Patients’ demographic characteristics are reported in Table 2. Ten children presented growth retardation for body weight (4.2% of all patients), 17 presented growth retardation for height (7.1% of all patients), and 20 presented combined growth retardation for both body weight and height (8.4% of all patients).

### Correlation between iGFR and CysC

The correlation between iGFR and CysC was analyzed in the whole population using logarithmic, inverse, quadratic, and simple linear fitted models, as follows: log[iGFR] = α + βlog(CysC) + a log(age) + b(sex), iGFR = α + β(1/CysC) + a(age) + b(sex), iGFR = α + β(CysC) + γ(CysC)^{2} + a(age) + b(sex), and iGFR = α + β(CysC) + a(age) + b(sex), respectively. Regression coefficients were calculated independently of the published coefficient equations (5,6,16–19). As shown in Figure 1, the scatter plot that graphically appears to be the closest to a straight line was the one performed without transformation of CysC, in favor of a linear relationship between CysC and iGFR. Furthermore, the κ coefficient for the linear model was higher than for the inverse and logarithmic models (Table 3). However, accuracies of all these different models were poor: We found that 42%, 42%, 42%, and 41% of eGFRs were within 10% of iGFR and 84%, 83%, 84%, and 84% of eGFRs were within 30% of iGFR for the linear, inverse, logarithmic, and quadratic correlations, respectively.

### Correlation between iGFR and SCreat

Schwartz *et al.* proposed a linear formula for eGFR using SCreat (8) as follows: eGFR = 0.413 × height (Ht)/SCreat. We first analyzed, in the whole population, the correlation between iGFR and this linear model independent of the defined formula’s constant (0.413) as follows: iGFR = α + β(Ht/SCreat) + a(age) + b(sex). This linear model achieved accuracies within 10% and 30% of the observed values of 49% and 90%, respectively, compared with 45% and 88%, respectively, with the 0.413 Schwartz coefficient formula. Second, we fit the relationship between iGFR and the variable ratio of Ht/SCreat in a quadratic model, as proposed by Gao *et al.* (9), as follows: iGFR = α + β(Ht/SCreat) + γ(Ht/SCreat)^{2} + a(age) + b(sex). We obtained a better model performance than the one obtained from the linear model, with 93% of eGFR values being within 30% of iGFR values and a highly significant (*P*<0.001) likelihood ratio test between these two models (Table 3). The κ coefficient was also increased for the quadratic model compared with the linear one (0.53 versus 0.46). Therefore, adding a quadratic term to the linear SCreat-based model increased the fit of the linear model.
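The comparison between the linear and quadratic SCreat-based models can be sketched in Python as follows. This is an illustrative simplification under stated assumptions: age and sex are omitted from the design, the data are simulated, the function names are hypothetical, and the likelihood ratio statistic assumes Gaussian errors with one added parameter (1 degree of freedom). Only the 0.413 × Ht/SCreat formula is taken from the text.

```python
import math


def schwartz_bedside(height_cm, screat_mg_dl):
    """SCreat-based Schwartz formula quoted in the text: 0.413 x Ht / SCreat."""
    return 0.413 * height_cm / screat_mg_dl


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients and residual sum of squares via normal equations."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    return beta, rss


def lr_test_1df(rss_small, rss_big, n):
    """Likelihood ratio test for nested Gaussian linear models, 1 added parameter."""
    stat = max(0.0, n * math.log(rss_small / rss_big))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, 1 df
    return stat, p_value


# Simulated Ht/SCreat ratios and iGFR values with curvature (illustration only).
ratio = [60 + 10 * i for i in range(25)]
igfr = [20 + 0.6 * x - 0.0008 * x * x + ((7 * i) % 5 - 2) * 1.5
        for i, x in enumerate(ratio)]
_, rss_lin = ols([[1.0, x] for x in ratio], igfr)
_, rss_quad = ols([[1.0, x, x * x] for x in ratio], igfr)
print(lr_test_1df(rss_lin, rss_quad, len(ratio)))
print(round(schwartz_bedside(140, 0.6), 1))
```

A small *P* value indicates that the quadratic term improves the fit beyond what would be expected by chance. For real data, a dedicated regression routine is preferable to normal equations, which can be numerically fragile when predictors are on very different scales.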

### Correlation between iGFR and SCreat-Based Quadratic Model with CysC

With use of the whole population, adding CysC to the SCreat-based quadratic model significantly increased model fit with a significant (*P*<0.001) likelihood ratio test. The analysis of the accuracy of this new quadratic SCreat and CysC–based model (Table 3) showed that 53% and 97% of eGFRs were within 10% and 30% of iGFR, respectively. This combined SCreat and CysC quadratic model also achieved a better κ coefficient (0.56) than did the SCreat-only quadratic model.

### Correlation between iGFR and New Quadratic SCreat and CysC–Based Model with BUN

With use of the whole population, adding BUN to the combined quadratic SCreat and CysC–based model did not increase model fit, with a nonsignificant (*P*=0.23) log-likelihood ratio chi-square difference between the two models with and without BUN. Table 3 shows that the performances of the models with or without BUN were equal. Therefore, the BUN variable was not added to the final chosen model.

### Quadratic SCreat and CysC–Based Model Validation and New Combined Quadratic Formula: Internal Model Validation

A Monte Carlo cross-validation technique (15) as described in the statistical analyses section was performed. The cross-validation was done with 1000 bootstrap replications. In each replication, the RMSE was calculated using the fitted quadratic model for the training set and for the testing set. This allows comparing the RMSE between the training set and testing set (Table 4). The mean of RMSE obtained for the testing set was close to the one obtained for the training set: 12.8 for the new combined quadratic model in the testing set versus 12.04 for the combined quadratic model in the training set. This showed good internal consistency of our estimates.

On the basis of results described above, we derived from the whole population a new quadratic SCreat and CysC–based formula for eGFR. Estimated coefficients and 95% confidence intervals of this new formula are reported in Table 5. The new quadratic formula for female and male patients is as follows:

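Female patients: [equation not reproduced in this text; see Table 5 for the coefficients]

Male patients: [equation not reproduced in this text; see Table 5 for the coefficients]

(where Ht is in cm, SCreat in mg/dl, CysC in mg/L, and age in years).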

While normal distribution for residuals was rejected according to the D’Agostino and Pearson omnibus normality test, the residual Q-Q plot did not show a strong departure from normality assumption (Figure 2). Of note, no observation achieved a Cook distance >4/*n* (*n* is the number of observations), which is classically (20) admitted as the cutoff value to use for spotting highly influential points.

### Is the Combined SCreat and CysC Schwartz Equation (2012) Valid in Our Study Sample?

We fit the combined SCreat, CysC, and BUN–based logarithmic model proposed by Schwartz *et al.* in 2012, which includes six variables (height, age, sex, and three endogenous biomarkers: SCreat, CysC, and BUN) as follows: log(iGFR) = α + βlog(Ht/SCreat) + γlog(Ht) + δlog(CysC) + θlog(BUN) + a log(age) + b(sex). We observed that in the whole population, the combined logarithmic Schwartz model had a significantly lower κ coefficient than did the new combined SCreat and CysC quadratic model (0.49 versus 0.56; *P*=0.01 using a permutation procedure). Both combined models had very similar accuracies; 96% and 97% of eGFRs were within 30% of iGFR, respectively. With use of estimated coefficients proposed by Schwartz *et al.* (10) for the entire population, a much lower κ coefficient was obtained (0.15); for an accuracy of 30%, only 86% of eGFRs reached this accuracy level (Table 3). Meanwhile, when we restricted our population to individuals with iGFR values ranging from 15 to 75 ml/min per 1.73 m^{2}, accuracy increased for the Schwartz formula; 90% of estimated values were within 30% of the iGFRs.
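A permutation comparison of two κ coefficients can be sketched in Python as shown below. The exact permutation scheme is not detailed in the text, so this sketch assumes a paired design in which, for each child, the estimates from the two models are randomly swapped. The function names and sample values are hypothetical.

```python
import random


def kappa(measured, estimated, cutoff=90.0):
    """Cohen's kappa after dichotomizing both series at `cutoff`."""
    n = len(measured)
    a = [m >= cutoff for m in measured]
    b = [e >= cutoff for e in estimated]
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    # Kappa is undefined when expected agreement is 1; 0.0 is this sketch's convention.
    return 0.0 if expected == 1.0 else (observed - expected) / (1 - expected)


def permutation_kappa_diff(measured, est_a, est_b, n_perm=2000, seed=0):
    """Two-sided permutation P value for kappa(model A) minus kappa(model B)."""
    rng = random.Random(seed)
    observed = kappa(measured, est_a) - kappa(measured, est_b)
    extreme = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(est_a, est_b):
            if rng.random() < 0.5:  # assumed scheme: swap the pair at random
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        diff = kappa(measured, perm_a) - kappa(measured, perm_b)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated P value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


# Invented values in ml/min per 1.73 m^2, for illustration only.
igfr = [60, 80, 95, 110, 120, 50, 100, 70]
model_a = [65, 85, 100, 105, 118, 55, 95, 75]
model_b = [95, 92, 85, 80, 118, 70, 88, 91]
print(permutation_kappa_diff(igfr, model_a, model_b))
```

Under the null hypothesis that the two models agree equally well with iGFR, swapping their estimates within a child should not change the expected difference in κ, which is what justifies the random swaps.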

We also compared internal consistency of the Schwartz combined logarithmic model to that of the quadratic combined model (Table 4) using the training and the testing datasets. The lowest RMSE, and therefore the best fit, was obtained using the new combined SCreat and CysC–based quadratic model. Mean RMSE between the training and the testing set for the logarithmic model increased by the same percentage as for the new combined quadratic model (approximately 6%), meaning that the logarithmic model suggested by Schwartz (without the defined Schwartz coefficients) was not biased in our study population.

To define the cutoff of applicability of the combined Schwartz formula, we applied the circular binary segmentation method for residual values (iGFR − eGFR). This method demonstrates the presence of three points of change corresponding to iGFRs of 75, 91, and 126 ml/min per 1.73 m^{2}; as a result, four segments were identified (Figure 3). When we applied the circular binary segmentation method to residual values from the new combined quadratic formula, we also detected (Figure 4) three points of change corresponding to iGFRs of 81, 96, and 126 ml/min per 1.73 m^{2}. In consequence, four segments were identified. Detailed results for each segment regarding the mean difference between iGFRs and eGFRs and the accuracy within 30% of iGFRs are reported in Table 6. We observed that mean absolute value of the segment (*i.e.*, the mean bias) was greater for the combined Schwartz formula than for the combined quadratic formula and that accuracy was also better for the latter formula in all segments except for iGFRs<75 ml/min per 1.73 m^{2}, corresponding to the population from which the combined Schwartz formula was derived. In addition, results showed that the combined Schwartz formula was biased for iGFR≥91 ml/min per 1.73 m^{2}. For patients with iGFRs above that precise cutoff, the mean difference between iGFRs and eGFRs was significantly wider and the accuracy within 30% of iGFRs was significantly lower compared with patients with iGFRs<91 ml/min per 1.73 m^{2}.
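Once change points are available, the per-segment mean bias and accuracy within 30% can be tabulated as in the Python sketch below. It does not implement circular binary segmentation; it simply takes the change points reported above for the combined Schwartz formula (75, 91, and 126 ml/min per 1.73 m^{2}) as given. The function name and sample values are hypothetical, and a value equal to a change point is assigned to the upper segment (an assumption).

```python
from bisect import bisect_right


def segment_summary(measured, estimated, breakpoints=(75, 91, 126)):
    """Mean residual (iGFR - eGFR) and 30% accuracy within each iGFR segment."""
    groups = [[] for _ in range(len(breakpoints) + 1)]
    for m, e in zip(measured, estimated):
        # bisect_right places a value equal to a breakpoint in the upper segment.
        groups[bisect_right(breakpoints, m)].append((m, e))
    summary = []
    for g in groups:
        if not g:
            summary.append(None)  # no observation in this segment
            continue
        mean_bias = sum(m - e for m, e in g) / len(g)
        p30 = 100 * sum(abs(e - m) <= 0.3 * m for m, e in g) / len(g)
        summary.append({"n": len(g), "mean_bias": mean_bias, "p30": p30})
    return summary


# Invented values in ml/min per 1.73 m^2, for illustration only.
igfr = [50, 80, 100, 130, 60]
egfr = [55, 70, 140, 100, 90]
for row in segment_summary(igfr, egfr):
    print(row)
```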

### Is There a Practical Reason for Clinicians to Use the New Combined Quadratic Formula?

From a practical point of view, it is most important for clinicians to differentiate children with normal GFR (*i.e.*, ≥90 ml/min per 1.73 m^{2}) from those with abnormal GFR. We divided our cohort into two categories: below and above 90 ml/min per 1.73 m^{2}. The performance of all fitted models and of the combined Schwartz formula was evaluated by calculating the sensitivity, specificity, and area under the curve (AUC) (Table 7). Using a permutation procedure, we also calculated the difference in the κ concordance coefficient between the new combined quadratic model/formula and the SCreat-based Schwartz model. Results show that although the AUC of the linear SCreat-based model is similar to that of the new combined quadratic model and formula, its specificity of 60% is much lower than the one obtained with the new combined quadratic model or formula; as a result, it could lead to misclassification of children with normal GFRs into abnormal GFR categories. In addition, the κ coefficient obtained from the new combined quadratic model or formula was significantly better than the one obtained from the linear SCreat-based model (*P*<0.001). Therefore, in clinical practice, the new combined quadratic formula achieves the best agreement with iGFR.
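Sensitivity, specificity, and AUC for the 90 ml/min per 1.73 m^{2} classification can be sketched in Python as follows. This sketch assumes that an abnormal GFR (<90) is the positive class, so that low specificity corresponds to children with normal GFR being flagged as abnormal. The AUC is computed by pairwise comparison of eGFR values between the two classes. Function names and sample values are hypothetical.

```python
def sensitivity_specificity(measured, estimated, cutoff=90.0):
    """Sensitivity and specificity for detecting abnormal GFR (< cutoff)."""
    tp = fn = tn = fp = 0
    for m, e in zip(measured, estimated):
        truly_abnormal = m < cutoff
        flagged_abnormal = e < cutoff
        if truly_abnormal and flagged_abnormal:
            tp += 1
        elif truly_abnormal:
            fn += 1
        elif flagged_abnormal:
            fp += 1
        else:
            tn += 1
    # Both classes must be present, otherwise a denominator is zero.
    return tp / (tp + fn), tn / (tn + fp)


def auc(measured, estimated, cutoff=90.0):
    """Probability that an abnormal case has a lower eGFR than a normal case."""
    abnormal = [e for m, e in zip(measured, estimated) if m < cutoff]
    normal = [e for m, e in zip(measured, estimated) if m >= cutoff]
    # Ties between an abnormal and a normal eGFR count as half a win.
    wins = sum(1.0 if a < b else 0.5 if a == b else 0.0
               for a in abnormal for b in normal)
    return wins / (len(abnormal) * len(normal))


# Invented values in ml/min per 1.73 m^2, for illustration only.
igfr = [60, 80, 95, 110, 120, 50]
egfr = [65, 92, 100, 85, 118, 70]
print(sensitivity_specificity(igfr, egfr))
print(auc(igfr, egfr))
```

Two formulas can share a similar AUC yet differ in specificity at the fixed 90 cutoff, because the AUC summarizes ranking over all possible thresholds whereas specificity depends on calibration at one threshold.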

To facilitate the reader’s understanding and interpretation of our findings, Table 8 summarizes the main performances of the SCreat-based Schwartz equation (8), the SCreat-based quadratic formula proposed by Gao *et al.* (9), the combined logarithmic Schwartz model with coefficients fitted to our population, the combined logarithmic Schwartz formula with its original coefficients (10), and the new combined quadratic formula.

## Discussion

Several formulas for eGFR in children were recently developed on the basis of SCreat (8,9), CysC (5,6,16–19), or combined SCreat and CysC measurements (10). They all show limitations in accuracy. We found that the best-fit model of correlation between iGFR and CysC is linear, but it is still far below the KDOQI targets to be validated, with <90% of estimated GFRs reaching 30% accuracy (21). Regarding the solely SCreat-based formulas for eGFR, the linear model proposed by Schwartz *et al.* is inaccurate according to the KDOQI target, in agreement with Bacchetta *et al*. (22) and Pottel *et al*. (23), and the quadratic model better fits data, in agreement with Gao *et al.* (9). Adding CysC to the SCreat-based quadratic model significantly increased the model fit, with better Cohen κ concordance coefficient and accuracy, leading us to derive a new combined SCreat and CysC–based quadratic formula that provides the most precise and accurate estimate of GFR across all GFR values. This new combined SCreat and CysC quadratic formula performed as well as the combined logarithmic Schwartz model with estimated coefficients derived from our data but was significantly better than the combined Schwartz formula with its original coefficients, which was biased for iGFR≥91 ml/min per 1.73 m^{2}. Note that the combined SCreat, CysC, and BUN Schwartz formula was derived from a group of children with a measured GFR between 15 and 75 ml/min per 1.73 m^{2} (10). When we restricted our population to individuals with iGFR values ranging from 15 to 75 ml/min per 1.73 m^{2}, the combined Schwartz formula with its original coefficients presented a better correlation to iGFR with a lower mean bias compared with the overall GFR cohort values. In essence, coefficients/formula proposed by Schwartz and developed in a population with defined mGFRs are not indiscriminately applicable to children with varying mGFRs, potentially falling outside the Schwartz selected GFRs. 
Unlike the combined Schwartz formula with its original coefficients, the new combined quadratic formula demonstrates a high accuracy for all iGFR values in our study, which includes mostly patients with CKD stages I, II, and III. This finding highlights the importance of GFR in the performance of various formulas, as shown recently (24). All these findings have important implications for public health and clinical practice, particularly in conditions of less renal impairment or normal GFR.

To our knowledge, this study is the first to evaluate all published CysC-only–based formulas, independent of their published coefficients. This approach allows one to better evaluate model performance and avoids any bias resulting from the difference in the gold standard mGFR method used in this study (*i.e.*, iGFR) and other methods used for deriving all these CysC-based formulas. A new combined SCreat and CysC–based quadratic formula, applicable across all GFR values, was derived. This new combined quadratic formula improves the accuracy of the previously published SCreat-only–based quadratic formula and represents an advance over currently available equations for eGFR. We believe this study is also the first to externally validate the combined Schwartz formula and to define a precise cutoff of 91 ml/min per 1.73 m^{2} for the applicability of this combined Schwartz formula.

To facilitate physicians’ use of this new combined SCreat and CysC quadratic formula in daily care practice, we have developed a JavaScript application that can be downloaded for free (http://www.chuv.ch/combinedquadraticformula); the previously published SCreat quadratic equation (9) link can also be found at this site.

Of note, this combined quadratic formula adds CysC measurement, which is obviously more costly than formulas based only on creatinine measurement. Although this could be a limitation to its wide application, the potential avoidance of unnecessary true GFR measurement should easily overcome this drawback.

This study has some limitations. First, very few patients had CKD stage IV and V. However, the circular binary segmentation method applied for all GFR values using the new combined quadratic formula did not detect any point of change in patients with CKD stage III or higher. Additional studies should confirm the validity of the new combined quadratic formula in children with CKD stages IV and V. Second, we found that the best correlation between iGFR and CysC was linear. However, few patients in our study had a CysC>2.5 mg/L; therefore, we cannot exclude the presence of a one-phase exponential decay or a hyperbola that our data did not demonstrate. Third, Schwartz *et al.* (10) found an increased bias in the eGFR for heavier patients. The small number of overweight and obese children or children with growth retardation prevented any meaningful evaluation of the new quadratic formula in this selective population. Fourth, the new combined quadratic formula does not overcome limitations of SCreat measurements in some patients, which depend on muscle mass levels. It is important to note that muscle mass was not evaluated in our population, but no child was diagnosed with myopathy or malnutrition. Finally, we cannot rule out that a decrease in the performance of the solely SCreat or the combined Schwartz formulas could occur secondary to the different SCreat and/or CysC measurement methods. However, in our study, SCreat measurement was performed using the compensated Jaffe technique standardized against isotope-dilution mass spectrometry method, with inter- and intra-assay coefficients of variation far below the current recommendations of the Laboratory Working Group of the National Kidney Disease Education Program (25).
We also compared the compensated Jaffe technique to the enzymatic technique used by Schwartz *et al.* and found an average difference between the enzymatic and the Jaffe methods of 0.99 µmol/L (95% confidence interval, −6.586 to 8.566 µmol/L), which shows that both methods are closely aligned. In our opinion, the differences in SCreat measurements must have a small error that should minimally affect the performance of the SCreat-based Schwartz formulas in our population. Regarding CysC measurement, and contrary to our standardized method of analysis, Schwartz *et al.* in their cohort could not use the standardized CysC calibrators, which could decrease the performance of their formula in a population such as ours.

## Disclosures

None.

## Acknowledgments

The authors would like to thank all colleagues at the Lausanne University Hospital who graciously contributed to this study. We are grateful to Mr. Christian Lehmann, Mr. Joel Stauber, Mr. Ziad Daher, Mr. Manuel Petter, and Mrs. Valerie Blanc for developing the combined quadratic formula website.

## Footnotes

Published online ahead of print. Publication date available at www.cjasn.org.

This article contains supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.2215/CJN.00940113/-/DCSupplemental.

- Received February 15, 2013.
- Accepted September 1, 2013.

- Copyright © 2014 by the American Society of Nephrology