Abstract
The past few years have seen major advances in genome-wide association studies (GWAS) of CKD and kidney function–related traits in several areas: increases in sample size from >100,000 to >1 million, enabling the discovery of >250 associated genetic loci that are highly reproducible; the inclusion of participants not only of European but also of non-European ancestries; and the use of advanced computational methods to integrate additional genomic and other unbiased, high-dimensional data to characterize the underlying genetic architecture and prioritize potentially causal genes and variants. Together with other large-scale biobank and genetic association studies of complex traits, these GWAS of kidney function–related traits have also provided novel insight into the relationship of kidney function to other diseases with respect to their genetic associations, genetic correlation, and directional relationships. A number of studies also included functional experiments using model organisms or cell lines to validate prioritized potentially causal genes and/or variants. In this review article, we will summarize these recent GWAS of CKD and kidney function–related traits, explain approaches for downstream characterization of associated genetic loci and the value of such computational follow-up analyses, and discuss related challenges along with potential solutions to ultimately enable improved treatment and prevention of kidney diseases through genetics.
- chronic kidney disease
- genetic renal disease
- Kidney Genomics Series
- Genome-Wide Association Study
- Multifactorial Inheritance
- Sample Size
- Biological Specimen Banks
- Follow-Up Studies
- Genetic Loci
- Genome
- Genomics
- Genetic Association Studies
- Cell Line
- Renal Insufficiency, Chronic
Introduction
The genetic contribution to both kidney function in the healthy range and to kidney diseases is supported by significant heritability estimates and a long line of familial aggregation and linkage studies (1,2). Over the past decade, the contribution of hundreds of genes to kidney health has become increasingly clear: there are currently >600 genes implicated in monogenic diseases of the kidney (3), and genome-wide association studies (GWAS) of complex kidney function measures and diseases are in agreement with a proposed model in which hundreds of genes contribute to complex traits (4). GWAS is a mapping method for identifying genetic variants associated with an outcome across the genome in an unbiased manner. It tests for a statistical association between genotype at a genetic marker—typically a single nucleotide polymorphism (SNP)—and the outcome, typically a human trait or disease. By performing this test for millions of SNPs genome-wide, this study design holds the promise of uncovering biologic mechanisms related to the outcome through identifying the genes and variants that drive association signals with the outcome. When the genes or variants causing the association are amenable to modulation, they may represent potential therapeutic targets. In addition, genome-wide association statistics can inform an individual’s cumulative genetic predisposition for a disease, and may potentially be used for motivating lifestyle modifications and personalized medicine.
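The per-SNP test described above can be sketched as a simple linear regression of a quantitative trait on allele dosage. This is a minimal illustration only: real analyses adjust for covariates such as age, sex, and ancestry components, which are omitted here. The function name, the toy data, and the significance threshold of 5 × 10⁻⁸ (a widely used convention, not stated in this article) are assumptions.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold; an assumption, not from the text


def snp_association(dosages, trait):
    """Regress a quantitative trait on allele dosage (0, 1, or 2) for one SNP.

    Returns (beta, se, p_value). The p-value uses a normal approximation,
    which is only reasonable for large samples.
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, trait))
    beta = sxy / sxx                      # per-allele effect estimate
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(dosages, trait))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of beta
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, p_value


# Toy data (hypothetical): nine individuals, trait rises with allele count.
dosages = [0, 0, 0, 1, 1, 1, 2, 2, 2]
trait = [1.0, 1.2, 0.8, 2.1, 1.9, 2.0, 3.0, 3.2, 2.8]
beta, se, p_value = snp_association(dosages, trait)
significant = p_value < GENOME_WIDE_ALPHA
```

In a GWAS, this test would be repeated for each of the millions of SNPs, which is why a stringent threshold is applied to each individual test.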
GWAS and their meta-analyses are commonly used for locus discovery of complex diseases and traits using data from largely population-based cohorts. The underlying methods, as well as their applications, benefits, and limitations in general and with respect to kidney traits, have been reviewed recently (1,5). The analytical workflows for locus discovery have been standardized and are now routinely used, such that the main focus of recent large-scale gene discovery efforts has shifted to the downstream characterization of the identified trait-associated loci. As outlined in Figure 1, such follow-up analyses include (1) enrichment testing and (2) colocalization analyses to identify trait-relevant tissues, cell types, and pathways; (3) fine-mapping and genomic feature annotation to prioritize the variants most likely to cause the association signals; (4) genetic correlation and Mendelian randomization (MR) analyses to assess a shared genetic basis between traits and their directional and causal relations; (5) genetic risk score construction to evaluate the variants’ combined effect, their usefulness for risk prediction or, when combined with phenome-wide association studies (PheWAS), to discover additional associated traits and diseases; and (6) functional experiments to generate mechanistic insights and validate causal genes, variants, and pathways.
Methods and objectives for the downstream characterization of findings from large-scale meta-analyses of GWAS. (A) Tissue and/or cell type enrichment analyses can identify the organs or cell types that affect the trait and inform on its genetic architecture. (B) Colocalization of gene association patterns between the trait and gene expression can reveal the potential causal gene in a locus and its tissues of action. (C) Fine-mapping and functional annotation focus on narrowing down the set of potential causal variants in a locus. (D) Genetic correlation analysis can reveal the shared genetic basis between traits, whereas Mendelian randomization aims to assess their causal relation using genetic information. (E) Polygenic risk scores can provide an estimate of disease risk. When used in the context of a phenome-wide association study, they can discover new genetic relations between diseases. (F) Experimental follow-up studies provide biologic evidence for causal genes and variants.
In this review article, we will summarize recent GWAS of CKD and kidney function–related traits from largely population-based cohorts since 2016, with earlier articles having been reviewed previously (6). Findings from GWAS of diabetic nephropathy have recently been summarized elsewhere and are not reviewed in this article (7,8). Genetics of kidney disease with specific causes, such as IgA nephropathy, membranous nephropathy, steroid-sensitive nephrotic syndrome, and APOL1 risk variants, are covered in other review articles in this series. We will explain approaches to downstream characterization of associated genetic loci and their value, and discuss challenges for these downstream analyses along with potential solutions, to ultimately enable improved treatment and prevention of kidney diseases through genetics.
Overview of Published GWAS of Kidney Function Traits and CKD
To illuminate the genetic underpinnings of complex human characteristics, the study of continuous outcomes, such as a biomarker, provides higher statistical power than that of binary outcomes, such as the presence or absence of a disease (9). As a result, eGFR and the urinary albumin-to-creatinine ratio (UACR), continuous measures used for CKD classification (10), have become the main outcomes of many kidney-related GWAS. The narrow-sense heritability of eGFR (i.e., the proportion of variance of eGFR explained by additive genetic effects) was estimated as 29% in the large, population-based UK Biobank study and as 39% in a large, European pedigree study (6,11). The corresponding heritability estimates of UACR are lower, with 4% for UACR to 9% for microalbumin in urine in large European cohorts (11,12). A number of GWAS have additionally included CKD, ESKD, or microalbuminuria as binary outcomes (6,12–15). Other kidney function biomarkers have been studied, either to overcome limitations related to creatinine-based eGFR such as cystatin C and BUN, or as a readout of the function of specific nephron parts, such as serum urate for the proximal tubule. An overview of GWAS of CKD and kidney-related traits is shown in Table 1.
Summary of genome-wide and exome-wide association studies of kidney function traits and CKD
Importantly, findings from the study of continuous kidney function traits are applicable to the diseases that they define or cause, such that genetic loci associated with eGFR are also associated with CKD, loci associated with UACR are also associated with albuminuria, defined on the basis of a dipstick test or UACR≥30 mg/g, and serum urate-associated loci are also related to gout (Figure 2).
Studies of continuous kidney traits in the general population deliver insights that are relevant to clinical phenotypes. (A) Genetic variants associated with lower eGFR are also associated with higher odds of acute and chronic kidney disease. (B) Genetic variants associated with higher UACR are also associated with higher odds of microalbuminuria. (C) Genetic variants associated with higher serum urate are also associated with higher odds of gout, with a 100-fold difference across the range of a genetic risk score. GRS, genetic risk score; OR, odds ratio; UACR, urinary albumin-to-creatinine ratio. *Logistic regression two-sided P<0.05; **P<5 × 10⁻¹⁰; ***P<5 × 10⁻¹⁰⁰.
GWAS of CKD and kidney function–related traits published since 2016 share many of the following characteristics: large sample size (mostly >100,000 up to >1 million), resulting in the discovery of many significantly associated genetic loci that are highly reproducible; the inclusion of participants not only of European but also of non-European ancestries; and the use of advanced computational methods for the integration of additional data, such as genome-wide gene expression levels for locus characterization. A number of studies also conducted functional experiments using model organisms or cell lines to validate causal genes and/or variants (Table 1).
Of the 16 GWAS of CKD and kidney-related traits since 2016, 13 had sample size >100,000 (6,11–13,16–23,26), with the largest sample size of 1,046,070 (6). Nine studies included participants from two or more ancestries (6,11–14,16,17,19,20). For eGFR, as expected from the genetic architecture of complex traits, multiple studies reported the identification of >100 highly reproducible loci (6,17,19,20). The largest reported number of replicated loci was 264 (6). For UACR, three studies had a sample size of >100,000 since 2016 (12,18,21), with the largest number of reported loci being 68 (12). Significant GWAS loci exhibited little heterogeneity across ancestries, i.e., the underlying variants may be common across ancestry groups (6,16,17,20). Across strata of diabetes or hypertension, the index SNPs of eGFR loci largely showed similar effect sizes, suggesting shared underlying pathways in the presence and absence of diabetes or hypertension (17,26).
Investigations of kidney function genetics have also started to expand, using the statistical power provided by the large sample sizes to study not only common (minor allele frequency >5%) but also low-frequency and rare genetic variants (Table 1) (11,24). Moreover, novel variant aggregation tests have been developed to increase the power for detecting associations with a group of rare variants, which may be combined at the gene level and/or by predicted functional categories (27).
Studies used a range of computational methods to prioritize the likely causal genes within significant loci, a central challenge of GWAS. As outlined in Figure 1A, one technique uses gene expression data to identify relevant tissues or cell types. eGFR-associated loci were shown to be enriched for high expression mainly in the kidney and liver (6,11), and urate-associated loci in the kidney (16). Gene expression levels indicate whether a gene may be active, can inform on the direction of its relationship with the investigated kidney trait, and show whether a gene product is specific to a particular tissue or cell type. Enrichment of the expression of kidney disease–associated genes in specific tissues and cell types can inform on the genetic architecture and guide the design of future experimental studies with respect to relevant target tissues. In addition to enrichment, colocalization can reveal the potential causal gene within a significant genetic locus and its potential tissue(s) of importance (Figure 1B). This method matches the regional patterns of two genetic association studies (e.g., genetic association of eGFR with genetic association of gene expression in the kidney [28]) and is discussed below in more detail.
Each GWAS locus can contain a large number of trait-associated and correlated genetic variants. Functional annotation and statistical fine-mapping (Figure 1C) are two methods that have been used in GWAS of kidney-function traits (6,12,14,16,20) to home in on the potential causal variants. Functional annotation provides information on the genomic context of a variant; for example, whether a variant leads to an amino acid substitution in the encoded protein (“missense”) and thus may alter protein function. Statistical fine-mapping estimates the probability that a variant causes the detected association signal.
To evaluate whether two traits that are correlated according to observational studies also share some common genetic basis, many GWAS used genetic correlation analysis (6,12,16,18,22,29) (Figure 1D). For example, a recent GWAS meta-analysis detected significant genetic correlations of eGFR, the trait of interest, with cystatin C and serum citrate, which likely reflect a shared genetic component to glomerular filtration (6). MR is a technique that can be used to determine whether the correlations are likely to be causal in one or both directions. This approach uses exposure-associated genetic variants as “instrumental variables” of the exposure. The effects of these genetic instruments on the exposure represent lifelong differences in exposure levels. Given the random assignment of genetic variants during gamete formation, the difference in exposure levels owing to genetics mimics the randomly assigned intervention in a randomized controlled trial. When certain assumptions are met, the genetic instrument can therefore be used to estimate the causal effect of the exposure on an outcome (30). MR studies have been used as follow-up investigations in the context of PheWAS or GWAS (11,20–22,31); for example, a recent large-scale GWAS identified significant genetic correlations between UACR and measures of hypertension, with MR studies supporting bidirectional causal relationships between UACR and hypertension (12,21).
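To make the instrumental-variable idea concrete, the following sketch computes two common summary-statistic MR estimators: the Wald ratio for a single instrument and the fixed-effect inverse-variance weighted (IVW) estimate across several instruments. The studies cited above may have used other estimators; the function names and toy effect sizes are hypothetical, and the instruments are assumed to be independent and valid.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one instrument.

    The standard error is a first-order approximation that ignores
    uncertainty in the SNP-exposure effect.
    """
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)


def ivw_estimate(instruments):
    """Fixed-effect inverse-variance weighted estimate.

    instruments: list of (beta_exposure, beta_outcome, se_outcome) tuples,
    one per genetic variant, assumed not to be in linkage disequilibrium.
    """
    num = sum(bx * by / se ** 2 for bx, by, se in instruments)
    den = sum(bx ** 2 / se ** 2 for bx, by, se in instruments)
    return num / den, math.sqrt(1.0 / den)


# Toy instruments (hypothetical): every ratio beta_outcome / beta_exposure is 0.5.
instruments = [(0.10, 0.05, 0.01), (0.20, 0.10, 0.02), (0.05, 0.025, 0.01)]
causal_effect, causal_se = ivw_estimate(instruments)
```

A bidirectional analysis, as described for UACR and hypertension, would run the same calculation twice with the roles of exposure and outcome exchanged, each time using instruments selected for the respective exposure.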
Polygenic risk scores (PRS) summarize the combined effect of the many trait-associated genetic variants (Figure 1E) (32). When generated from genome-wide genetic variants rather than only the lead variants identified from GWAS, they are also referred to as “genome-wide polygenic scores” (33). Different forms of PRS of CKD and related traits have been used to assess genetic links with risk outcomes (6,11,12,16,17,19,21). For example, PRS of eGFR have been associated with International Classification of Diseases–coded chronic renal failure, glomerular diseases, acute renal failure, and hypertensive kidney diseases in large cohorts (6,17). A PRS of serum urate showed >100-fold difference for gout prevalence across its range between those in the lowest and highest decile (16). PheWAS assess the genetic association of a candidate set of genetic variants, such as those in a PRS, with a large number of outcomes (Figure 1E). Large electronic health record systems with genotype data and International Classification of Diseases–coded diagnoses and procedures have enabled the development of PheWAS (34). Coupling of PRS to PheWAS has revealed associations of a PRS of eGFR with calculus of kidney, calculus of ureter, and urinary calculus (17), and of a PRS of UACR with proteinuria, hyperlipidemia, gout, and hypertension (12). PheWAS are also informative for the in silico evaluation of potential side effects upon modulation of a genetic target (35). For instance, if CKD-associated genetic variants in the UMOD gene are associated with higher uromodulin levels in urine and lower risk of kidney stone disease (36,37), then pharmacologic lowering of uromodulin concentrations in urine may result in increased kidney stone risk.
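As a minimal sketch of how a PRS is computed, the score of one individual can be written as the sum of effect-allele dosages weighted by the per-allele effect estimates from a GWAS. The variant identifiers, weights, and helper names below are hypothetical, and treating a missing genotype as a dosage of zero is a simplification made only for brevity.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict of variant id -> dosage (0 to 2).
    weights: dict of variant id -> per-allele effect estimate from a GWAS.
    Variants missing from `dosages` contribute zero (a simplification).
    """
    return sum(weights[snp] * dosages.get(snp, 0.0) for snp in weights)


def decile(score, reference_scores):
    """Decile (1 to 10) of a score relative to a reference distribution."""
    below = sum(1 for s in reference_scores if s < score)
    return min(10, below * 10 // len(reference_scores) + 1)


# Hypothetical weights and genotypes for three variants.
weights = {"variant_a": 0.2, "variant_b": -0.1, "variant_c": 0.05}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = polygenic_score(person, weights)
```

Comparing disease prevalence between individuals in the lowest and highest deciles of such a score is the kind of contrast reported for serum urate and gout above.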
Lastly, studies using model organisms and cell culture can reveal the causal mechanisms underlying the genetic association. For instance, the Drosophila nephrocyte model revealed that lower tubular albumin reabsorption rather than higher glomerular albumin filtration was the likely mechanism underlying the association at the OAF locus discovered in a GWAS of UACR (12). In another GWAS of urate, a kidney epithelial cell line was used to validate a potentially causal variant in the gene encoding the transcription factor HNF4A by showing altered transactivation of the gene encoding the urate transporter ABCG2 (16).
Challenges in Kidney Trait GWAS and How to Address Them
Understanding and Refining the Phenotype
The overarching goal of GWAS of kidney function traits is to increase our understanding of their genetic architecture and reveal their causal genes and variants. Kidney function traits are estimated by biomarkers instead of being measured directly. Understanding the investigated trait and its relations with the biomarkers is crucial for interpreting GWAS findings. The best overall index of kidney function is GFR (38). GFR is impractical to measure and therefore most commonly estimated by serum creatinine (39), which is an end product of creatine metabolism. Its blood concentrations are mainly affected by its excretion via glomerular filtration, with a small amount of active tubular secretion in healthy individuals (40). Creatinine has its own genetic determinants. GWAS of creatinine-based eGFR discover not only genetic loci related to the kidney’s filtration function, but also those related to creatinine metabolism, such as its generation. To prioritize loci that are likely related to true GFR, recent studies have used GWAS of complementary kidney function biomarkers, such as BUN and cystatin C (6,41,42). A genetic locus truly related to kidney filtration function, given appropriate statistical power, should show associations in GWAS of all of these GFR markers. Conversely, a genetic locus associated with creatinine-based eGFR because of its effect on creatinine generation or secretion would not be associated with cystatin C–based eGFR and/or BUN (Figure 3A). This intuition was used in the largest GWAS of eGFR to date (6), which used the association with BUN to prioritize 147 of 264 eGFR loci as most likely to be relevant to kidney function.
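The reasoning above can be sketched as a simple decision rule: an allele that truly lowers filtration should lower eGFR and raise BUN. The specific criterion used here (opposite direction of effect with a one-sided P<0.05 for BUN) is an assumption made for illustration, and the function name and example effect sizes are hypothetical.

```python
import math


def supports_kidney_function(beta_egfr, beta_bun, se_bun, alpha=0.05):
    """Return True if the BUN association supports a filtration-related locus.

    Both effect estimates must refer to the same effect allele. The rule
    (opposite direction plus one-sided significance) is an assumption made
    for this sketch.
    """
    if beta_egfr == 0 or se_bun <= 0:
        return False
    # Orient the BUN z-score so that positive means "opposite to eGFR".
    z = -math.copysign(1.0, beta_egfr) * beta_bun / se_bun
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal p-value
    return p_one_sided < alpha


# Hypothetical loci: the first lowers eGFR and raises BUN, the second
# lowers eGFR and also lowers BUN (more consistent with creatinine metabolism).
filtration_like = supports_kidney_function(-0.010, 0.008, 0.002)
metabolism_like = supports_kidney_function(-0.010, -0.008, 0.002)
```

A locus that fails this check is not necessarily unrelated to the kidney; limited statistical power for the second biomarker can produce the same result.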
Necessity of using additional kidney function markers for understanding genetic associations with eGFR and with the UACR. (A) BUN, (B) Cystatin C, and (C) both components of the ratio. Cys, cystatin C; UKBB, UK Biobank.
For UACR, three aspects of the biomarker require attention. First, UACR is a ratio; therefore, UACR-associated genetic loci can originate from an association with the urine concentration of its numerator, albumin, or its denominator, creatinine. Nephrologists and kidney disease researchers will likely be more interested in the associations with albumin concentrations, which reflect damage to the glomerular filter and/or impaired tubular albumin reabsorption. A recent large-scale study investigated UACR-associated genetic variants with respect to their associations with urine albumin and creatinine concentrations separately (12) (Figure 3B) and found that some reported UACR loci, such as GATM and/or TCF4 (12,18), are likely driven by their association with urinary creatinine. This has important implications for the experimental follow-up and interpretation of the associated loci. Second, this issue is further compounded by the detection limit of assays for the quantification of albumin in urine. Commonly used assays are not very sensitive, with as many as two thirds of participants in the UK Biobank study having values below the limit of detection (18,21). Setting these values to the lower limit of detection before deriving UACR, a common practice in GWAS (12,18,21), augments the contribution of urinary creatinine to UACR in the lower range and can strengthen genetic associations with urinary creatinine. We therefore recommend that future GWAS of UACR either include separate evaluations of urinary albumin and creatinine, or focus on the binary phenotype of albuminuria. Third, kidney function biomarkers measured in urine are not only influenced by their glomerular filtration, but also by their handling along the nephron. Some loci discovered in GWAS of UACR can be linked to the glomerulus via experimental studies and/or monogenic diseases (e.g., COL4A4), whereas others are connected to the tubular reabsorption of filtered albumin (e.g., CUBN). When no prior biologic evidence or cell type–specific gene expression can be used to distinguish these two possibilities, follow-up studies using experimental models that can differentiate between these two aspects are a practical and elegant solution, as recently reported in a large-scale GWAS of UACR (12).
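The effect of the detection limit can be illustrated with a small calculation: once albumin values below the limit are set to the limit, the numerator is constant, so all remaining variation in log(UACR) among these participants comes from urinary creatinine. The detection limit and concentrations below are hypothetical values chosen for illustration.

```python
import math


def log_uacr(albumin_mg_l, creatinine_g_l, lod_mg_l):
    """log(UACR in mg/g), with albumin below the detection limit set to the limit."""
    albumin = max(albumin_mg_l, lod_mg_l)
    return math.log(albumin / creatinine_g_l)


LOD_MG_L = 6.7  # hypothetical detection limit for urinary albumin

# Two hypothetical participants, both with albumin below the detection limit.
a = log_uacr(2.0, 0.5, LOD_MG_L)
b = log_uacr(5.0, 2.0, LOD_MG_L)

# Their difference in log(UACR) equals the difference in log(creatinine)
# with opposite sign; the true albumin values no longer contribute.
difference = a - b
creatinine_only = math.log(2.0) - math.log(0.5)
```

A genetic variant that changes urinary creatinine would therefore shift UACR in this group even if it had no effect on albumin.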
These challenges underline the continued need to assess and identify additional kidney function biomarkers to refine genetic associations, thereby enabling a better understanding of the genetic programs underlying reduced kidney function and CKD. Making the full genome-wide summary statistics of large-scale GWAS publicly available, as done by the CKDGen Consortium (6,12,15,23,25,26,42,43), is an important step to enable investigators who may have only one kidney function biomarker measured in their own study to incorporate GWAS of complementary biomarkers.
It also needs to be pointed out that GWAS of kidney-related traits in population-based studies have mostly been conducted using a single measure of the trait as the outcome to maximize sample size. Only rarely have they used repeated measures to reduce measurement error or to investigate disease progression. The imprecision in eGFR and UACR measurements, including biologic variability and the low sensitivity of the assay for urine albumin discussed above (39,44), likely reduces the power of GWAS. The current emergence of biobanks with repeated measurements will enable the definition of more precise phenotypes for large-scale studies of disease incidence or progression, and for subgroup analyses by age or disease subtypes.
Lastly, many GWAS have defined CKD as the presence of an eGFR <60 ml/min per 1.73 m². This definition of CKD ignores heterogeneity with respect to its cause and reduces power to identify subgroup-specific risk genes. Indeed, validated risk genes for IgA nephropathy (45,46) and membranous nephropathy (47) are not detected in GWAS of eGFR-based CKD (6,15,41,42). The genetic basis of these more specific kidney diseases is covered by other articles in this series. The eGFR-based definition allows for identifying loci in the general population, consistent with the reported absence of subgroup-specific effects for the great majority of detected loci (15).
Difficulties in Pinpointing Potentially Causal Genes and New Methods to Address This
Similar to other complex traits, many genome-wide significant loci of kidney-related traits contain multiple genes, and the associated variants are often common variants located outside the coding part of the genome. The causal variants may affect their target genes over a distance. It can thus be difficult to determine which gene in the locus is most likely to causally affect kidney function on the basis of its genomic location. For example, one of the eGFR loci with the largest effect size contains two independent signals (6,48): the first index SNP was located between the neighboring genes UMOD and PDILT on chromosome 16, and the other index SNP in an intron of PDILT (6). Colocalization is a method for identifying the potentially causal gene by matching the patterns of genetic associations of the trait at that locus with the genetic associations of another trait-related measure, e.g., the expression of each gene in the locus (28). If colocalization with the expression of only one gene is observed, this suggests the colocalized gene is likely the causal gene (Figure 4). This approach greatly benefits from the public availability of genome-wide gene expression data, such as those from the Genotype-Tissue Expression Project (49) and the kidney-specific NephQTL resource (50).
Concept of colocalization of genetic associations to prioritize genes underlying the association with UACR. The genetic associations with UACR in a chromosome 11 region with several genes (A) colocalized with gene expression of OAF (but none of the other genes) in tubulointerstitial kidney portions (B), as well as with plasma protein levels of OAF (but none of the other gene products) (C). The region of interest on chromosome 11 contains several genes of interest, but colocalization was only observed with transcript and protein levels of one of the genes, OAF. This implicates OAF as the gene in the region that underlies the association signal and a regulatory variant acting through altered gene expression as the most likely mechanism.
At the UMOD/PDILT locus, colocalization analysis of eGFR with gene expression in the kidney tubulointerstitial compartment and with urine uromodulin levels identified UMOD as the most likely causal gene, although neither of the two index SNPs mapped into the UMOD gene sequence (6). The OAF locus of UACR is another example, where the index SNP is located between the genes TRIM29 and OAF. Colocalization analysis of UACR and gene expression from the kidney cortex identified OAF as the likely causal gene. Further colocalization analysis of UACR with plasma protein levels found colocalization with plasma concentrations of OAF, further substantiating OAF as the likely causal gene over the other genes in the locus (12). Similar to genetic associations with gene expression, the public availability of genetic associations with the concentrations of hundreds of plasma proteins represents an important community resource (51).
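The pattern matching behind colocalization can be sketched with one widely used Bayesian formulation, which compares five hypotheses for a locus: no association (H0), association with the trait only (H1), with gene expression only (H2), with both through different causal variants (H3), or with both through a shared causal variant (H4). This article does not specify the method used in each study, so the formulation, the prior probabilities, and the toy Bayes factors below are assumptions. The sketch further assumes at most one causal variant per trait in the region.

```python
import math


def _logsumexp(values):
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def _logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if a <= b:
        return -math.inf
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(log_abf_trait, log_abf_expr, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one locus.

    Inputs are per-variant log approximate Bayes factors for the trait and
    for gene expression, in the same variant order. The priors are
    illustrative defaults, not values taken from the text.
    """
    l1 = _logsumexp(log_abf_trait)
    l2 = _logsumexp(log_abf_expr)
    l12 = _logsumexp([a + b for a, b in zip(log_abf_trait, log_abf_expr)])
    log_h = [
        0.0,                                                      # H0
        math.log(p1) + l1,                                        # H1
        math.log(p2) + l2,                                        # H2
        math.log(p1) + math.log(p2) + _logdiffexp(l1 + l2, l12),  # H3
        math.log(p12) + l12,                                      # H4
    ]
    norm = _logsumexp(log_h)
    return [math.exp(h - norm) for h in log_h]


# Toy loci (hypothetical): signals peak at the same variant or at different ones.
shared = coloc_posteriors([0.0, 12.0, 1.0], [0.5, 10.0, 0.0])
distinct = coloc_posteriors([12.0, 0.0, 0.0], [0.0, 0.0, 10.0])
```

In the first toy locus the posterior mass concentrates on H4, which corresponds to the situation described for OAF; in the second, H3 receives more support than H4.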
Challenge of Detecting Potentially Causal Variants and Approaches to Prioritize Them
In addition to identifying potentially causal genes, determining the potentially causal variants in GWAS is challenging. A number of statistical fine-mapping approaches have been used to identify the variants most likely to cause the association signal. These approaches aim to identify the set of variants with >99% posterior probability of causing the association signals (“99% credible set”) using association summary statistics or leveraging different patterns of linkage disequilibrium across ancestries. For example, a recent trans-ethnic GWAS of eGFR found that 40 of 93 loci contained a single variant with >50% posterior probability of causing the association signals (20). The largest GWAS of eGFR to date found that 58 of 228 replicated loci among European ancestry participants had small 99% credible sets with five or fewer variants (6). The continued generation of genetic data in non-European ancestry populations is important and will likely improve statistical fine-mapping.
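A 99% credible set can be sketched from summary statistics as follows. The example converts each variant's effect estimate and standard error into an approximate Bayes factor, normalizes these into posterior probabilities under the assumption of exactly one causal variant in the locus, and then adds variants in decreasing order of probability until the requested coverage is reached. The studies cited above may have used different fine-mapping methods; the prior standard deviation, function names, and variant identifiers are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor in favour of association (Wakefield's form).

    prior_sd is the assumed standard deviation of true effect sizes; the
    value 0.15 is an illustrative choice.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    return 0.5 * math.log(v / (v + w)) + 0.5 * z ** 2 * w / (v + w)


def credible_set(variants, coverage=0.99):
    """Return (credible set, posterior probabilities) for one locus.

    variants: dict of variant id -> (beta, se). Assumes a single causal
    variant in the locus and equal prior probability for every variant.
    """
    labf = {vid: log_abf(b, s) for vid, (b, s) in variants.items()}
    m = max(labf.values())
    total = sum(math.exp(l - m) for l in labf.values())
    pp = {vid: math.exp(l - m) / total for vid, l in labf.items()}
    chosen, cumulative = [], 0.0
    for vid in sorted(pp, key=pp.get, reverse=True):
        chosen.append(vid)
        cumulative += pp[vid]
        if cumulative >= coverage:
            break
    return chosen, pp


# Hypothetical locus with one clearly strongest variant.
locus = {
    "variant_1": (0.10, 0.01),
    "variant_2": (0.08, 0.01),
    "variant_3": (0.01, 0.01),
}
set_99, posterior = credible_set(locus)
```

When several variants are in strong linkage disequilibrium, their posterior probabilities are similar and the credible set grows, which is why differing linkage disequilibrium patterns across ancestries can help to shrink it.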
Annotating the prioritized genetic variants with their functional genomic features can further narrow down the set of potentially causal variants. These features include, for instance, the variant consequence (e.g., missense), its degree of evolutionary conservation, and the mapping into gene regulatory regions. For example, the urate-associated index variant at HNF4A mentioned above is a missense variant and was prioritized as the potential causal variant at this locus with a posterior probability of >99% (16). These combined pieces of evidence made this missense variant a highly promising candidate for subsequent experimental studies, which confirmed its functional effect and suggested that this variant affected urate levels through altering the transactivation of the ABCG2 gene. Another example of the workflow from GWAS to potential therapeutic target is illustrated in the vignette in Figure 5.
Vignette illustrates how genome-wide association studies (GWAS) followed by experimental validation and characterization can reveal previously unknown and important biological mechanisms for potential therapeutic use. Genome-wide significant signals at ABCG2 in a GWAS of serum urate levels were followed up with experimental studies that revealed the previously unknown function of ABCG2 as a urate transporter and the GWAS index SNP, rs2231142, as the causal variant. Each copy of the T allele of rs2231142 is associated with 0.2 mg/dl higher serum urate levels and two-fold higher odds of gout. ABCG2 therefore represents a potential therapeutic target for lowering urate levels and preventing gout. (A) Regional association plot from Woodward et al. 2008, with genomic location on the x axis and –log10(P value) on the y axis. (B) Transport assays using radioactively labeled urate in Xenopus oocytes: over a wide range of extracellular urate concentrations (x axis), oocytes expressing wild-type ABCG2 (red) showed significantly lower intracellular accumulation of urate (y axis) than water-injected control oocytes (blue) or a loss-of-function mutation of ABCG2 (black). This indicates that the function of ABCG2 is the active export of urate out of oocytes. (C) Immunofluorescence of polarized porcine renal epithelial cells (LLC-PK1) shows expression of ABCG2 at the apical brush border membrane. Together with urate accumulation in the cells after application of an ABCG2 inhibitor, these experiments establish that ABCG2 is a secretory urate transporter in the proximal tubule. (D) The ABCG2 Q141K variant encoded by rs2231142 results in reduced urate transport as indicated by higher intracellular urate accumulation, establishing rs2231142 as the causal variant. (E) Conceptual model of the role of ABCG2 in urate handling in the kidney in the context of other urate transport proteins. WT, wild type.
Making Genetic Scores of Kidney Function–Related Traits Clinically Useful
The clinical utility of genetic scores for complex traits is an active area of research, both in terms of identifying individuals with a high genetic predisposition for a disease and, in combination with clinical measures, to optimize prevention and treatments (52). Novel methods for computing genetic scores using a large number of variants across the genome beyond the index variants from GWAS, termed genome-wide polygenic scores, may increase prediction accuracy (33,53–55). The challenges discussed above also influence the potential clinical utility of genetic scores of kidney function–related traits. First, the inclusion of genetic loci driven by association with kidney function biomarker metabolism rather than kidney function itself into a genetic score will lower its prediction accuracy. Second, difficulties in identifying causal variants result in the incorporation of correlated rather than causal variants in the genetic score, thereby increasing imprecision of the prediction (52). Finally, the trait variances explained by GWAS loci are still modest, limiting the prediction utility. For example, the 308 index SNPs from the largest GWAS of eGFR to date explained 7% of eGFR variance (6). The challenges related to making PRS clinically relevant are likely to be addressed with a multimarker approach to estimate kidney function, improved fine-mapping and genomic annotation, the discovery of low-frequency variants with larger effect and genetic interactions, and the identification of genetic variants more relevant to specific CKD causes.
Other challenges to eventually making kidney function–related genetic scores clinically useful are similar to those for other complex traits, including the need for absolute risk prediction for a given time frame, which is more relevant for clinical decision-making and requires the integration of prospective disease outcomes with other patient characteristics (54). In addition, the availability of large datasets from non-European ancestry populations for the assessment of prediction accuracy has been limited (52). For example, in European ancestry participants of the UK Biobank study, individuals in the highest decile of a PRS of serum urate on the basis of a recent GWAS had over ten times the odds of having developed gout compared with those in the most common (fifth) decile (16). It is worth investigating whether similar risk differences translate to populations of non-European ancestries, and whether communication of this genetic risk predisposition has clinical utility; for example, by encouraging individuals to adopt urate-lowering lifestyle measures or treatment for the prevention of gout.
Outlook and Conclusions
There are several aspects not covered in this review that we believe to be of importance in future GWAS of CKD and kidney function traits. First, the evaluation of genetic variants other than biallelic SNP markers, including structural variation, may uncover additional genetic determinants of kidney function. Second, the role of epigenetic variation in kidney function and CKD (56), and how it integrates with our knowledge from GWAS, deserves further study given that epigenetic mechanisms regulate gene expression and may mediate genetic effects on phenotypes. Third, interactions of genetic risk variants with each other and with environmental factors are largely not studied in current GWAS, owing in part to the high multiple testing burden, computational limitations, and the need for well-defined measures of environmental factors for gene-by-environment interaction testing. Fourth, the continued generation of tissue-, cell type–, and state-specific annotation resources, such as gene expression or transcription factor binding profiles, is a prerequisite to translate findings from GWAS into a mechanistic understanding. Methods to perform integrative analyses of summary results from GWAS datasets with other functional genomics datasets are an area of active development that is likely to benefit from advances in machine-learning approaches; for example, to identify regulatory variants. Lastly, the development of experimental high-throughput screening tools for identifying regulatory functional variants (57,58) is necessary to close the loop from association to causation.
With respect to the kidney, it will be equally important to zoom in to understand the contribution of the many individual cell types to kidney function and CKD, as well as to zoom out to understand the function of the kidney in the systemic context. The latter will be important for the study of kidney disease and other biomarkers with elimination by the kidneys (11). As our understanding of the biologic mechanisms underlying GWAS findings increases, more causal genes and variants will be identified. We are cautiously optimistic that some of them will become therapeutic targets. With respect to risk prediction and personalized prevention, genetic scores on the basis of genome-wide variants look promising, but additional research on the generalizability and clinical relevance is necessary. We are looking forward to these future developments and the advances that they will bring to improve kidney health in populations.
Disclosures
All authors have nothing to disclose.
Funding
A. Tin is supported by National Institute of Arthritis and Musculoskeletal and Skin Diseases grant R01AR073178. A. Köttgen was funded by German Research Foundation grant KO 3598/5-1.
Footnotes
Published online ahead of print. Publication date available at www.cjasn.org.
- Copyright © 2020 by the American Society of Nephrology