Abstract
Background and objectives Use of small changes in serum creatinine to diagnose AKI allows for earlier detection but may increase diagnostic false–positive rates because of inherent laboratory and biologic variabilities of creatinine.
Design, setting, participants, & measurements We examined serum creatinine measurement characteristics in a prospective observational clinical reference cohort of 2267 adult patients with AKI by Kidney Disease Improving Global Outcomes creatinine criteria and used these data to create a simulation cohort to model AKI false–positive rates. We simulated up to seven successive blood draws on an equal population of hypothetical patients with unchanging true serum creatinine values. Error terms generated from laboratory and biologic variabilities were added to each simulated patient’s true serum creatinine value to obtain the simulated measured serum creatinine for each blood draw. We determined the proportion of patients who would be erroneously diagnosed with AKI by Kidney Disease Improving Global Outcomes creatinine criteria.
Results Within the clinical cohort, 75.0% of patients received four serum creatinine draws within at least one 48-hour period during hospitalization. After four simulated creatinine measurements that accounted for laboratory variability calculated from assay characteristics and 4.4% of biologic variability determined from the clinical cohort and publicly available data, the overall false–positive rate for AKI diagnosis was 8.0% (interquartile range =7.9%–8.1%), whereas patients with true serum creatinine ≥1.5 mg/dl (representing 21% of the clinical cohort) had a false–positive AKI diagnosis rate of 30.5% (interquartile range =30.1%–30.9%) versus 2.0% (interquartile range =1.9%–2.1%) in patients with true serum creatinine values <1.5 mg/dl (P<0.001).
Conclusions Use of small serum creatinine changes to diagnose AKI is limited by high false–positive rates caused by inherent variability of serum creatinine at higher baseline values, potentially misclassifying patients with CKD in AKI studies.
Introduction
Consensus definitions of AKI have been critical to facilitating comparisons of clinical studies of AKI. The design of AKI diagnostic criteria has been heavily informed by epidemiologic data showing powerful associations between small changes in serum creatinine (SCr) and mortality (1–4). This has led to inclusion of a 0.3-mg/dl rise in SCr within 48 hours as part of the Kidney Disease Improving Global Outcomes (KDIGO) guidelines for AKI, the most recent AKI criteria developed (5).
The strong association of small SCr changes with mortality might also suggest that earlier intervention in AKI would improve outcomes. However, randomized trials attempting to intervene on early AKI have been largely unsuccessful (6–9). This may, in part, result from misclassification of AKI under frameworks that do not reflect true GFR reduction. Inclusion of patients without true renal dysfunction in AKI research will dilute observed effect sizes, potentially leading to the false conclusion that certain interventions are ineffective. However, a rigorous assessment of the false-positive (FP) rate of AKI diagnosis caused by random laboratory and biologic variations has not been performed.
SCr, in its role as an AKI biomarker, may be suboptimal because AKI is diagnosed from a relative change in a continuous variable rather than from the crossing of a particular threshold (1–4,10). Because only a small increase in SCr is needed to meet AKI criteria, random variation in SCr may be a significant contributor to AKI diagnosis in the absence of a true reduction in GFR. Factors predisposing to FP AKI diagnoses could include a higher degree of variation in laboratory measurement of SCr, a higher degree of biologic variation in creatinine levels (1–5,11,12), and a higher number of creatinine assays performed. Like all laboratory tests, SCr measurements are affected by within– and between–sample coefficients of variation (CoVs), which, for the widely used modified Jaffe rate reaction (5–9,13), range from 0.6% to as high as 9% depending on the true SCr concentration and assay characteristics (6–9,14). SCr measurements are also associated with intraindividual biologic variation and can be affected by changes in volume status, medications, and creatinine generation during critical illness (1–4,10,11,15,16). In addition, increased SCr measurement frequency can further compound the effects of laboratory and biologic variabilities, because each additional draw is another opportunity for random error to meet AKI criteria. These factors can be modeled with simulation techniques.
To test our hypothesis that AKI can be erroneously diagnosed on the basis of laboratory and biologic variations without a true decrease in GFR, we simulated the potential effects of these sources of variation on the FP rate of AKI diagnosis using the KDIGO stage 1 creatinine criteria.
Materials and Methods
Reference Cohort
To provide a reasonable distribution of baseline SCr values for simulation, we identified all individuals admitted to the Hospital of the University of Pennsylvania from July of 2013 to May of 2014 who developed stage 1 AKI according to the KDIGO creatinine criteria (Table 1). Methods to define this cohort have been previously published (17). After approval from the University of Pennsylvania’s Institutional Review Board, we obtained data abstracted from the electronic medical record, including demographic and clinical information (admission and discharge International Classification of Disease-9 codes), all SCr values, discharge disposition, and orders for intermittent hemodialysis and continuous venovenous hemodialysis. Patients with SCr≥4.0 mg/dl or documented ESRD were excluded from the reference cohort.
Table 1. Clinical characteristics of the reference cohort
AKI was classified according to the KDIGO criteria (5). In primary analyses, any creatinine value that exceeded a prior creatinine value by either 0.3 mg/dl or 50% would define AKI.
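The rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the study's Stata code: the function name `first_aki_index` is hypothetical, the thresholds are assumed to be inclusive (≥), and the 48-hour and 7-day time windows of the KDIGO criteria are not modeled. Comparing each value against the lowest prior value is equivalent to comparing it against every prior value.

```python
from typing import Optional, Sequence


def first_aki_index(values: Sequence[float],
                    abs_rise: float = 0.3,
                    rel_rise: float = 0.5) -> Optional[int]:
    """Return the index of the first SCr value (mg/dl) that exceeds any
    prior value by abs_rise mg/dl or by the fraction rel_rise.

    Returns None if no value meets either criterion.
    Assumption: inclusive thresholds; time windows are ignored.
    """
    if not values:
        return None
    lowest_prior = values[0]
    for i in range(1, len(values)):
        v = values[i]
        meets_absolute = v - lowest_prior >= abs_rise
        meets_relative = v >= (1.0 + rel_rise) * lowest_prior
        if meets_absolute or meets_relative:
            return i
        # Track the lowest value seen so far; it is the most permissive
        # comparator among all prior values.
        lowest_prior = min(lowest_prior, v)
    return None
```

Note that a dip followed by a return to the starting level can satisfy the rule: in the series 1.0, 0.8, 1.15 mg/dl, the third value is 0.35 mg/dl above the second even though it is only 0.15 mg/dl above the first.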
Simulation Dataset
The simulation dataset design is shown in Figure 1. Full Stata code to recapitulate the simulation under varying conditions can be found at http://patr.yale.edu/resources/index.aspx#page4. For each of 2267 simulated study participants, an unchanging true SCr was generated. The value of this true SCr was drawn from the distribution of baseline values in the reference cohort of actual patients described above. Each simulated participant then underwent up to seven virtual blood draws. The SCr obtained for each blood draw was equal to the sum of the true SCr and the error terms for laboratory and biologic variabilities (Figure 1). Error terms were generated by randomly sampling from a distribution with a mean of zero and an SD equal to the CoV (considered independently for laboratory and biologic variabilities) multiplied by the true SCr. Simulations of 2267 patients were performed 100 times, and we averaged the FP rate identified in each to give the final rate and used the method by Rubin (18) to provide the 95% confidence interval.
Figure 1. The simulated dataset uses baseline information from the reference cohort and simulates multiple creatinine measurements. Overview of the simulation dataset. True serum creatinine (SCr) values were assigned to simulated patients in a distribution similar to SCr values from the clinical reference cohort. True SCr values did not change in simulated patients, reflective of patients without true AKI. Random error in simulated measured SCr values was generated on the basis of laboratory and biologic variabilities in SCr measurements determined by data from the reference cohort. The error term was randomly sampled from a normal distribution of mean =0. Repeated simulated measured SCr values were generated, and the number of iterations to the first false positive was determined. Bio, biologic; Cr, creatinine; KDIGO, Kidney Disease Improving Global Outcomes; Lab, laboratory; Var, variation.
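The simulation design described above can be illustrated with a short Python sketch. It is not the published Stata code and rests on several stated assumptions: the function names are hypothetical, the laboratory CoV is a constant placeholder (`lab_cov=0.03`) rather than the concentration-dependent model fitted to manufacturers' data, and the list of true SCr values is supplied by the caller instead of being drawn from the reference cohort.

```python
import random
from typing import List, Sequence


def simulate_draws(true_scr: float, n_draws: int, bio_cov: float,
                   lab_cov: float, rng: random.Random) -> List[float]:
    """Measured SCr = true SCr + biologic error + laboratory error.

    Each error term is normal with mean 0 and SD = CoV * true SCr,
    sampled independently for the two sources of variability.
    """
    sd_bio = bio_cov * true_scr
    sd_lab = lab_cov * true_scr
    return [true_scr + rng.gauss(0.0, sd_bio) + rng.gauss(0.0, sd_lab)
            for _ in range(n_draws)]


def meets_stage1(values: Sequence[float], abs_rise: float = 0.3,
                 rel_rise: float = 0.5) -> bool:
    """True if any value exceeds a prior value by 0.3 mg/dl or 50%."""
    lowest_prior = values[0]
    for v in values[1:]:
        if v - lowest_prior >= abs_rise or v >= (1.0 + rel_rise) * lowest_prior:
            return True
        lowest_prior = min(lowest_prior, v)
    return False


def false_positive_rate(true_values: Sequence[float], n_draws: int = 4,
                        bio_cov: float = 0.044, lab_cov: float = 0.03,
                        seed: int = 0) -> float:
    """Fraction of simulated patients (unchanging true SCr) flagged as AKI.

    lab_cov=0.03 is an illustrative placeholder, not a value from the study.
    """
    rng = random.Random(seed)
    flagged = sum(
        meets_stage1(simulate_draws(t, n_draws, bio_cov, lab_cov, rng))
        for t in true_values
    )
    return flagged / len(true_values)
```

Calling `false_positive_rate` on lists of low and high true SCr values shows the qualitative pattern: because the SD of the error scales with the true SCr while the 0.3-mg/dl threshold does not, a larger share of high-baseline patients is flagged.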
Laboratory Variability
Laboratory variation changes depending on the concentration of the analyte. We used available manufacturers’ data for the CoVs of three commonly used SCr assays and fit linear or quadratic models to these data (Supplemental Figure 1). In the primary analysis, an intermediate level of laboratory variation (Supplemental Figure 1B), consistent with Penn’s creatinine assay, was used. We performed sensitivity analyses with the other two levels of laboratory variation.
Biologic Variability
We examined both published data and variation in our cohort to arrive at an estimate of biologic variability. Published values for biologic variability of SCr determined from repeated measurements in outpatients range from 4.1% to 7.6% (19,20). Because these values were not obtained in the inpatient setting, we also examined variability in the reference cohort. We identified patients with at least three inpatient SCr values before the first value used to define an AKI event. Each patient was assigned an individual CoV equal to the SD of all pre-AKI values divided by the mean of these values. We subtracted the expected laboratory variation (see above) (Supplemental Figure 1) to obtain a biologic variation term for each patient. To keep our estimate of biologic variation conservative, we used the 25th percentile of mean variation in our reference cohort (4.4%) in our primary analysis, which fell within the range of variability found in the cited outpatient studies (see above and Results). To validate further that 4.4% biologic variability in SCr measurements is not an overestimation in the inpatient setting, we calculated biologic variability among 792 patients without AKI enrolled in the Translational Research Investigating Biomarker Endpoints in AKI (TRIBE-AKI) Study (design of the TRIBE-AKI Study has been previously described) (21–23) and found that our chosen value was lower than the 25th percentile mean variation calculated in the TRIBE-AKI Study. As a sensitivity analysis, we also used the 10th percentile of mean variation (2.1%) in the reference cohort.
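The per-patient calculation described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the sample SD is assumed (the text does not say whether the sample or population SD was used), the laboratory term is subtracted directly as the text describes (subtraction in quadrature would be an alternative), and flooring the result at zero is an added assumption.

```python
import statistics
from typing import Sequence


def total_cov(pre_aki_values: Sequence[float]) -> float:
    """Individual CoV: SD of all pre-AKI SCr values divided by their mean.

    Requires at least three values, matching the cohort selection rule.
    """
    if len(pre_aki_values) < 3:
        raise ValueError("need at least three pre-AKI SCr values")
    return statistics.stdev(pre_aki_values) / statistics.mean(pre_aki_values)


def biologic_cov(pre_aki_values: Sequence[float], lab_cov: float) -> float:
    """Total CoV minus the expected laboratory CoV, floored at zero.

    lab_cov would come from the assay model at this patient's SCr level;
    here it is passed in by the caller.
    """
    return max(total_cov(pre_aki_values) - lab_cov, 0.0)


def lower_quartile(per_patient_covs: Sequence[float]) -> float:
    """25th percentile of the per-patient biologic CoVs."""
    return statistics.quantiles(per_patient_covs, n=4)[0]
```

Choosing a low percentile of the per-patient values, rather than the median, biases the simulation toward fewer false positives.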
Primary Outcome
The primary outcome was the FP rate of AKI diagnosis after four simulated SCr draws in the simulation dataset. Examples of repeated blood draws in two simulated patients appear in Figure 2. We chose four SCr draws as our end point, because 75% of our reference cohort received this number of SCr assays within at least one 48-hour period before AKI onset. Given our a priori hypothesis that higher true SCr values are more susceptible to variation, leading to higher FP rates, an SCr of 1.5 mg/dl was selected as a cut point to evaluate patients with elevated baseline SCr.
Figure 2. Random changes in serum creatinine can result in AKI misclassification. Two simulated patients undergoing repeated creatinine measurement. Dotted lines represent true creatinine values. Red arrows indicate the creatinine that would lead to a diagnosis of AKI under Kidney Disease Improving Global Outcomes (KDIGO) criteria. Note that neither patient would receive a diagnosis of AKI under the fixed baseline criteria. C0, baseline creatinine value that allowed KDIGO AKI criteria to be met within the relevant time interval (this is the clinical equivalent of the true serum creatinine value of the simulation cohort).
Sensitivity Analyses
We performed three other simulations under the alternative conditions listed in Table 2. Our primary analysis did not fix a lower limit for an individual patient’s simulated creatinine. In clinical practice, where the baseline creatinine value is known, values lower than the baseline may not be considered in the diagnosis of AKI. The fixed baseline simulation ignores all creatinine values lower than the initial admission value when categorizing AKI so that random variation below baseline levels does not contribute to FP rates. The sustained increase simulation required that one creatinine value subsequent to the value meeting AKI criteria remain elevated to allow diagnosis. Finally, we examined a model wherein AKI could only be diagnosed by a 50% increase in creatinine. These alternative criteria were first applied to the simulation dataset to determine their FP rates. The rates of clinically relevant renal events (achieving higher stages of AKI, dialysis, and death during hospitalization) were examined in the reference cohort in participants who did or did not have AKI by each criterion.
Table 2. Sensitivity analyses
Although our primary analysis used rates of biologic and laboratory variations on the lower end of published ranges, we explored the effect of higher and lower levels of variation on FP rates. We examined the effect of frequency of blood draws by simulating up to three additional blood draws beyond the four blood draws in our primary analysis.
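The three alternative criteria can be contrasted with the primary rule in a short Python sketch. The function names are hypothetical, thresholds are assumed to be inclusive, and the sustained increase rule is interpreted here as requiring the immediately following value to also meet criteria against the same comparator; the text does not specify these details, so they are assumptions.

```python
from typing import Optional, Sequence


def _rises(value: float, baseline: float,
           abs_rise: Optional[float], rel_rise: float) -> bool:
    """True if value meets the absolute (if enabled) or relative criterion."""
    by_abs = abs_rise is not None and value - baseline >= abs_rise
    by_rel = value >= (1.0 + rel_rise) * baseline
    return by_abs or by_rel


def primary_rule(values: Sequence[float],
                 abs_rise: Optional[float] = 0.3,
                 rel_rise: float = 0.5) -> bool:
    """Any value versus the lowest prior value (no fixed lower limit)."""
    lowest = values[0]
    for v in values[1:]:
        if _rises(v, lowest, abs_rise, rel_rise):
            return True
        lowest = min(lowest, v)
    return False


def fifty_percent_only(values: Sequence[float]) -> bool:
    """AKI only by a 50% increase; the 0.3-mg/dl criterion is disabled."""
    return primary_rule(values, abs_rise=None)


def fixed_baseline(values: Sequence[float]) -> bool:
    """Values below the first (admission) value are ignored, so every
    later value is compared with the first value only."""
    baseline = values[0]
    return any(_rises(v, baseline, 0.3, 0.5) for v in values[1:])


def sustained_increase(values: Sequence[float]) -> bool:
    """The qualifying value and the next value must both meet criteria."""
    lowest = values[0]
    for i in range(1, len(values) - 1):
        if (_rises(values[i], lowest, 0.3, 0.5)
                and _rises(values[i + 1], lowest, 0.3, 0.5)):
            return True
        lowest = min(lowest, values[i])
    return False
```

For the series 1.6, 1.3, 1.65, 1.4 mg/dl, only the primary rule flags AKI: the rise from 1.3 to 1.65 meets the 0.3-mg/dl criterion, but it is not 0.3 mg/dl above the first value, is not a 50% increase, and is not sustained on the next draw.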
Statistical Analyses
Descriptive statistics were used to characterize the reference cohort. In the simulation dataset, we calculated the proportion of participants diagnosed with AKI after four SCr measurements. Chi-squared testing was used to compare FP rates among groups defined by different AKI criteria and true SCr values as well as compare the proportions of outcomes identified by the various criteria. Log rank testing was used to compare the time to AKI diagnosis by the alternative criteria. Associations between frequency of creatinine measurement and pre–AKI variability and clinical outcomes were assessed with logistic regression. Analyses were performed using STATA software, version 13.0 (College Station, TX).
Results
Characteristics of the Reference Cohort
The reference cohort included 2267 patients with a mean (SD) age of 60.5 (16.3) years; 45% were women, and 27% were patients with African ancestry. Median admission SCr was 1.09 mg/dl (interquartile range [IQR] =0.76–1.60), and the median baseline SCr before meeting AKI criteria was 0.91 mg/dl (IQR=0.59–1.39; P<0.001) (Table 1); 707 patients (31.2%) in the reference cohort were critically ill, and 225 patients (9.9%) experienced inpatient death.
Variability of SCr in Reference Cohort
In total, 1513 participants had at least three SCr values before the SCr level diagnostic of AKI. The median CoV of creatinine was 11.2% (IQR=7.5%–16.1%). After accounting for expected laboratory variation on the basis of published manufacturers’ data, median biologic variation was estimated at 7.6% (IQR=4.4%–12.6%). We used the 25th percentile of this variation (4.4%), which was within range of published values for outpatients (4.1%–7.6%) (19,20), in our primary analysis. For sensitivity analysis, we also used the 10th percentile of the above variation (2.1%). Higher pre–AKI variation in SCr was not associated with inpatient mortality (P=0.53) or dialysis (P=0.39).
Frequency of Creatinine Measurement in Reference Cohort
There was a median of four creatinine measurements before AKI diagnosis, with 75.0% of patients receiving four SCr draws within any 48-hour period. SCr measurements were more frequent earlier in the hospital course, with patients receiving a median of 3 (IQR=3–4) measurements in the first 48 hours and 9 (IQR=7–11) measurements in the first 7 days of hospitalization (P<0.001). Higher pre–AKI frequency of creatinine measurement was associated with higher inpatient mortality (odds ratio for each additional 1 hour between measurements, 0.97; 95% confidence interval, 0.96 to 0.99; P=0.003) and dialysis (odds ratio for each additional 1 hour between measurements, 0.95; 95% confidence interval, 0.93 to 0.97; P<0.001).
FP Rates of AKI in the Simulation Dataset
Our primary outcome was the proportion of simulated patients erroneously diagnosed with AKI after four creatinine measurements in the relevant time interval under the condition of 25th percentile biologic and moderate laboratory variabilities (calculated in Supplemental Figure 1). Under these assumptions, KDIGO creatinine criteria incorrectly classified 8.0% (IQR=7.9%–8.1%) of individuals with AKI (Figure 3A, Table 3). This effect was higher, with a higher true SCr: the rate of FP AKI diagnosis was 30.5% (IQR=30.1%–30.9%) among those with a true SCr ≥1.5. Conversely, those with true SCr <1.5 had an FP rate of 2.0% (IQR=1.9%–2.1%).
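A back-of-the-envelope calculation helps explain why the FP rate rises with true SCr. The sketch below is a simplification under stated assumptions, not a reproduction of the simulation: it considers only two draws, only the absolute 0.3-mg/dl criterion, and only the chance that the second draw exceeds the first; the function name is hypothetical, and the total CoV is set to the 4.4% biologic value alone, ignoring laboratory variation. Under these assumptions, the difference between two draws is normal with mean 0 and SD equal to √2 × CoV × true SCr.

```python
from math import sqrt
from statistics import NormalDist


def two_draw_false_positive(true_scr: float, total_cov: float,
                            abs_rise: float = 0.3) -> float:
    """Probability that draw 2 exceeds draw 1 by at least abs_rise mg/dl
    when the true SCr is constant and each draw has SD = CoV * true SCr."""
    sd_single = total_cov * true_scr
    sd_difference = sqrt(2.0) * sd_single  # SD of (draw 2 - draw 1)
    return 1.0 - NormalDist(mu=0.0, sigma=sd_difference).cdf(abs_rise)


if __name__ == "__main__":
    for scr in (0.8, 1.5, 2.5):
        p = two_draw_false_positive(scr, total_cov=0.044)
        print(f"true SCr {scr} mg/dl: P(rise >= 0.3) = {p:.6f}")
```

Because the threshold is fixed at 0.3 mg/dl while the SD grows in proportion to the true SCr, the threshold sits fewer SDs from zero at higher baselines, and the probability of crossing it by chance rises steeply.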
Figure 3. The likelihood of AKI misclassification is higher with higher creatinine levels, greater assay variability, and more frequent blood draws. (A) Likelihood (percentage) of false–positive AKI diagnoses after four creatinine draws using Kidney Disease Improving Global Outcomes criteria for the simulation dataset with 1+ (low), 2+ (medium), and 3+ (high) coefficients of variation for laboratory (Lab) assays of serum creatinine (SCr) and 1+ (2.1%) and 2+ (4.4%) biologic (Bio) variabilities. Supplemental Figure 1 has more details on the different levels of laboratory variation. (B) Likelihood (percentage) of false–positive AKI diagnoses after repeated simulated measurements.
Table 3. False-positive rate of alternative AKI diagnostic criteria versus Kidney Disease Improving Global Outcomes on simulated dataset
Sensitivity Analyses
Outcomes under alternative assumptions regarding the biologic and laboratory variabilities of creatinine are in Figure 3A. Even with biologic variability of 2.1% and moderate laboratory variability, the FP rate was 9.4% among simulated patients with a true SCr >1.5 mg/dl. At higher levels of biologic and laboratory variabilities, the FP rate was higher. More frequent laboratory draws beyond the four draws used for the primary analysis led to higher FP rates (Figure 3B, Supplemental Table 1).
To test whether alternative methods for AKI diagnosis might lead to lower FP rates, we performed simulations using the 50% increase, fixed baseline, and sustained increase criteria (Materials and Methods, Table 2). We simulated the fixed baseline under the same degree of variation as our primary analysis and found that, after four tests, the FP rate was 4.5% (IQR=4.4%–4.6%) in the full cohort, 1.1% (IQR=1.0%–1.1%) in those with true SCr <1.5 mg/dl, and 17.9% (IQR=17.6%–18.2%) among those with true SCr ≥1.5 mg/dl. These rates are significantly lower than the FP rate seen when the local nadir creatinine is used as the baseline value (Table 3) (P<0.001 for all comparisons). The 50% increase and sustained increase simulations also led to lower FP rates (Table 3).
We tested whether the alternative criteria that lower FP rates can also capture adverse clinical end points, such as dialysis or death. Among patients in the reference cohort, all of whom had at least KDIGO stage 1 AKI, 32% subsequently developed KDIGO stage 2 or higher, 13.6% developed KDIGO stage 3, 7.2% started inpatient dialysis, and 6.9% died during hospitalization. Table 4 shows the percentages of these events captured by the alternate AKI criteria. Each alternate criterion resulted in roughly one-third fewer AKI diagnoses, although only 3%–20% of the adverse clinical end points were missed.
Table 4. Comparing clinical outcomes of the reference cohort using Kidney Disease Improving Global Outcomes and alternative AKI diagnostic criteria
Time to Diagnosis of AKI
Reference cohort patients were diagnosed with AKI by the KDIGO criteria a median of 2.74 days after admission. Under the alternative diagnostic frameworks, the time to diagnosis was longer (0.0; 95% confidence interval, 0.0 to 8.9 hours after KDIGO diagnosis for 50% increase; 0.0; 95% confidence interval, 0.0 to 0.0 hours for fixed baseline; and 17.6; 95% confidence interval, 8.2 to 24.4 hours for sustained increase; all P<0.001) (Supplemental Figure 2).
Discussion
An ideal AKI biomarker would accurately reflect true renal dysfunction, predict relevant outcomes, and be detectable early in the course of disease to allow for timely intervention. Although SCr concentration remains a central diagnostic tool and is the standard to which novel biomarkers are compared, our study provides additional evidence that SCr is an imperfect surrogate marker.
In this study, we show that inherent variability of SCr measurement contributes to a high FP rate, especially in the setting of higher baseline SCr values. Baseline SCr values <1.5 mg/dl seem less likely to be affected by laboratory and biologic variabilities, consistent with a prior study showing that small incremental increases in SCr (0.1–0.4 mg/dl) independently predicted severe AKI when admission SCr was <1.5 mg/dl (24). Although random variability in SCr measurements might also be expected to increase the number of false-negative diagnoses, simulation of a rising true SCr with random error showed that false negatives are a very minor contribution, because the signal of an increasing SCr outpaces the noise of random error.
We also show that more frequent pre–AKI SCr measurement, not the degree of pre–AKI SCr variation, was associated with worse clinical outcomes. Thus, some of the observed association between small creatinine changes and clinical outcomes in other studies may be the result of confounding by other clinical factors, such as severity of illness, that may influence the number of assays performed (25). Future studies examining the prognostic significance of small changes in SCr must account for frequency of measurement in the study design. Finally, in our clinical cohort, we show that alternative methods for AKI diagnosis can capture many relevant clinical events.
Although earlier detection of injury is an important metric in assessing diagnostic performance for AKI, increasing earlier detection while increasing diagnostic FP rates can misclassify patients and lead to erroneous study conclusions. For example, prior studies have yielded conflicting results on whether baseline CKD is protective in the critically ill. The Program to Improve Care in Acute Renal Disease Study suggested that baseline CKD was associated with decreased mortality in an AKI cohort (26). However, in a separate study using a different AKI definition, critically ill patients with CKD did not experience improved mortality over their non-CKD counterparts after adjustment for different covariates (4). Our simulation suggests that patients with CKD may be at risk for AKI misclassification because of the high FP rates of AKI diagnosis surrounding elevated baseline SCr values. Inclusion in AKI studies of patients with CKD who were misclassified as having AKI would, thus, cause CKD to seem to impart a protective advantage in the setting of AKI when it may not in actuality confer mortality benefit in critical illness.
Our study has several strengths. This study offers the first simulation estimating the FP rate of current AKI definitions under different assumptions of laboratory and biologic variations, showing the pitfalls of creatinine-based criteria for diagnosing AKI. Our determination of baseline SCr mirrors clinical practice, allowing our cohort to reflect how AKI definitions may be applied more broadly. Finally, we show how current KDIGO criteria may misclassify patients with higher baseline SCr values.
The study has several limitations worth consideration. One major limitation of this study is the absence of non-AKI controls in the reference cohort; thus, we were not able to compare outcomes with patients without AKI. Limited data exist to inform the biologic CoVs of creatinine in acutely ill individuals, and therefore, we derived our CoV values for simulation from pre–AKI creatinine measurements. We acknowledge that, without inulin clearance data in our clinical reference cohort, we cannot guarantee that the pre–AKI SCr variation did not reflect GFR changes. However, our use of the 25th percentile of biologic variation (4.4%), which is less than that reported in outpatient datasets (19,20) and less than calculated mean variation of SCr values in patients without AKI in the TRIBE-AKI Study cohort, was very conservative. The lack of association between pre–AKI biologic CoV and clinical outcomes suggests that our estimates do not reflect variation associated with early sustained AKI. In addition, our simulation was not able to incorporate a time component (only serial tests over an unspecified amount of time), and therefore, we could not model the effect of reduced creatinine generation from muscle over time, thereby possibly underestimating the number of false–negative AKI diagnoses in the acutely ill who are at increased risk for decreasing muscle mass (12). Another limitation is that our study, like the KDIGO criteria, may have missed some patients with community-acquired AKI, which may explain why, in our sensitivity analyses, the SCr–based alternative algorithms for AKI detection missed a few patients with acute dialysis if RRT was required for other clinical indications in the absence of robust SCr changes. Finally, this study uses simulation and a single-center cohort; broader applicability of our findings would require a larger multicenter study with non-AKI controls and a noncreatinine-based biomarker for additional analysis.
There are several paths forward to ameliorate the issues presented herein. First, although we do not necessarily advocate that fewer laboratory tests in the acutely ill be drawn, more stringent SCr criteria may be needed to define AKI in the setting of repeated measurements to avoid misclassification of patients. Second, the 0.3-mg/dl absolute increase may not be appropriate to apply to those with elevated creatinine at baseline. Third, better use of readily available clinical data (such as those used by kinetic creatinine models [27,28] or integrating creatinine with other relevant laboratory values) may improve diagnostic and prognostic performance. Fourth, biomarkers with greater sensitivity and specificity than creatinine should continue to be developed, tested, and deployed.
In conclusion, use of SCr–based KDIGO criteria to diagnose AKI is limited by laboratory and biologic variabilities, especially at higher baseline SCr values. This study brings to light the challenges of using SCr, an imperfect surrogate for renal function, in defining AKI. Thus, modifications to current consensus criteria accounting for patients with higher baseline SCr may be warranted.
Disclosures
None.
Acknowledgments
J.L. is supported by National Institutes of Health (NIH) Grant 5T32DK00700640. H.F. is supported by NIH Grant 3T32DK007785. M.G.S.S. is supported by NIH Grant K23DK097307. J.M.T. is supported by NIH Grants K23HL114868 and L30HL115790. C.R.P. is supported by NIH Grant K24DK090203. F.P.W. is supported by NIH Grant K23DK097201.
Footnotes
Published online ahead of print. Publication date available at www.cjasn.org.
This article contains supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.2215/CJN.02430315/-/DCSupplemental.
- Received March 3, 2015.
- Accepted July 22, 2015.
- Copyright © 2015 by the American Society of Nephrology