Abstract
Background and objectives In steroid-resistant nephrotic syndrome (SRNS), >21 single-gene causes are known. However, mutation analysis of all known SRNS genes is time and cost intensive. This report describes a new high-throughput method of mutation analysis using a PCR-based microfluidic technology that allows rapid simultaneous mutation analysis of 21 single-gene causes of SRNS in a large number of individuals.
Design, setting, participants, & measurements This study screened individuals with SRNS; samples were submitted for mutation analysis from international sources between 1996 and 2012. For proof of principle, a pilot cohort of 48 individuals who harbored known mutations in known SRNS genes was evaluated. After improvements to the method, 48 individuals with an unknown cause of SRNS were then examined in a subsequent diagnostic study. The analysis included 16 recessive SRNS genes and 5 dominant SRNS genes. A 10-fold primer multiplexing was applied, allowing PCR-based amplification of 474 amplicons in 21 genes for 48 DNA samples simultaneously. Forty-eight individuals were indexed in a barcode PCR, and high-throughput sequencing was performed. All disease-causing variants were confirmed via Sanger sequencing.
Results The pilot study identified the genetic cause of disease in 42 of 48 (87.5%) affected individuals. The diagnostic study detected the genetic cause of disease in 16 of 48 (33%) affected individuals with a previously unknown cause of SRNS. Seven novel disease-causing mutations in PLCE1 (n=5), NPHS1 (n=1), and LAMB2 (n=1) were identified in <3 weeks. Use of this method could reduce costs to 1/29th of the cost of Sanger sequencing.
Conclusion This highly parallel approach allows rapid (<3 weeks) mutation analysis of 21 genes known to cause SRNS at a greatly reduced cost (1/29th) compared with traditional mutation analysis techniques. It detects mutations in about 33% of childhood-onset SRNS cases.
- nephrotic syndrome
- focal segmental glomerulosclerosis
- genetic renal disease
- human genetics
- molecular genetics
Introduction
Nephrotic syndrome (NS) is characterized by edema, proteinuria, hypoalbuminemia, and hypertriglyceridemia. Cases of NS can be separated into two classes: steroid-sensitive NS and steroid-resistant NS (SRNS). In childhood-onset NS, 80% of patients are steroid sensitive with the histologic feature of minimal-change NS. Approximately 63%–73% of patients with childhood-onset SRNS have renal histologic features of FSGS (1,2). In congenital NS in particular, histologic examination of the kidney biopsy specimen can show diffuse mesangial sclerosis, representing a renal developmental form of SRNS. SRNS may lead to ESRD (3), requiring RRT and/or renal transplantation within a few years after onset. SRNS recurs in 30% of renal transplant recipients and causes about 15% of all ESRD in children (4).
SRNS is a genetically heterogeneous disorder. To date, >21 single-gene causes of NS (or SRNS) have been identified (5–8) and display dominant and recessive inheritance. Recessive genetic causes of SRNS include the genes NPHS1 (nephrin) (9), NPHS2 (podocin) (10), LAMB2 (laminin subunit β2) (11), PLCE1 (phospholipase C, ε1) (12), COQ6 (coenzyme Q10 biosynthesis monooxygenase 6) (13), SMARCAL1 (Swi/Snf-related matrix–associated actin-dependent regulator of chromatin subfamily a-like protein) (14), COQ2 (coenzyme Q10 biosynthesis monooxygenase 2) (15), PDSS2 (decaprenyl-diphosphate synthase subunit 2) (16), CD2AP (CD2-associated protein) (17), SCARB2 (scavenger receptor class b member 2) (18), CFH (complement factor H) (19), CUBN (cubilin) (20), PTPRO (protein-tyrosine phosphatase receptor-type O) (21), MYO1E (myosin 1E) (22,23), NEIL1 (endonuclease VIII–like 1) (23), and ITGA3 (integrin-α3) (24).
Dominant single-gene causes of SRNS have been described for the genes ACTN4 (actinin-α4) (25), LMX1B (LIM homeobox transcription factor 1-β) (26), TRPC6 (transient receptor potential cation channel, subfamily C, member 6) (27), and INF2 (inverted formin 2) (28). The dominant gene WT1 (Wilms’ tumor 1) (29) also causes childhood-onset SRNS.
Mutation analysis is usually performed following a genotype-phenotype algorithm in which genes known to be mutated in a specific phenotype are prioritized before others (30). For example, congenital NS is defined by onset within the first 90 days of life (5,30,31), and mutation analysis in NPHS1 would be prioritized before screening for NPHS2 or PLCE1. Similarly, if a patient presented with Pierson syndrome, LAMB2 would be preferentially selected for mutation analysis over other SRNS genes. In addition to phenotypic variation between different genes, varying phenotypes exist between different alleles within the same gene. Mutations of NPHS2 cause 10%–28% of all nonfamilial SRNS cases in childhood (32), and different recessive alleles determine the age of onset of SRNS and ESRD (33–35). For example, the homozygous missense mutation R138Q leads to onset of SRNS at a median age of 4.7 years, yet a truncating mutation at this same position leads to an earlier onset of SRNS at 1.7 years (34). Another example of allele-dependent phenotypic variance is the variant R229Q, which is common in people of European ancestry and has an allele frequency of 2.292% (Exome Variant Server, http://evs.gs.washington.edu/EVS/). When this variant is found in trans with another heterozygous mutation in podocin, R229Q causes adult-onset SRNS in up to 15% of all cases (36).
The heterogeneous nature of SRNS makes mutation analysis cost intensive and time consuming. We have developed a new method for mutation analysis using a PCR-based microfluidic technology (48.48 Access Array System; Fluidigm, South San Francisco, CA) followed by next-generation sequencing (NGS), which allows rapid simultaneous mutation analysis of 21 single-gene causes of SRNS in a large number of individuals at a much lower cost per sample than traditional Sanger sequencing (37,38).
Materials and Methods
This mutation analysis was performed in 96 patients with SRNS using a high-throughput technique of the 48×48 Fluidigm Access Array system followed by bidirectional next-generation resequencing on one lane of a flow cell of either an Illumina Genome Analyzer II or a HiSeq2000 NGS instrument. Samples from the first 48 individuals were run as a proof of principle to test the method’s efficacy, and samples for the next 48 individuals were received for new mutation analysis (following an initial screen for NPHS2 and WT1 [exon 8 and 9] mutations).
We combined 10 primer pairs in each PCR to allow simultaneous amplification of 474 PCR products. As a result, we obtained 474 different PCR products for each of the 48 DNA samples, for a total of 22,752 (474×48) distinct PCR products per array run. The last step before sequencing was to add a unique 8-bp barcode to each sample in order to distinguish the sequences of each individual after resequencing. To find the disease-causing mutations among the multitude of variants obtained from resequencing, we developed a strategy that reduced the number of variants by combining various filtering and validation steps with subsequent segregation analysis (Supplemental Figures 1 and 2). We applied this method in families affected by SRNS and could identify genetic causes in a subset of the families.
Study Participants
In this study, we screened individuals with SRNS; samples were received for mutation analysis from international sources between 1996 and 2012. After receiving informed consent, we obtained clinical data, blood samples, and pedigrees from individuals with NS. Approval for research on humans was obtained from the University of Michigan Institutional Review Board. Adult or pediatric nephrologists diagnosed NS on the basis of standardized clinical and renal histologic criteria (39). Renal pathologists evaluated renal biopsy specimens. Clinical data were obtained using a standardized questionnaire (http://www.renalgenes.org).
Primer Design
For 21 genes, 474 target-specific primer pairs were designed. These target-specific primers amplified 424 coding exons. The amplicon size was designed in a range from 150 to 300 bp as described elsewhere (37).
Multiplex PCR Using the Fluidigm 48.48 Access Array Integrated Fluidic Circuit System
To amplify all 474 amplicons for 21 genes implicated in nephrotic syndrome, we pooled 10 primer pairs per well (10-plex primer pools) for each of the 48 “primer inlets” in the 48×48 Access Array. The master mixes containing the DNA samples were loaded into the 48 “sample inlets.” Afterward, the Access Array was placed into the pre-PCR integrated fluidic circuit controller AX. In this step, all 48 DNA samples and the 48 10-plex primer pools were distributed across the array to combine each sample with each 10-plex primer pool, resulting in 2304 (48×48) separate PCR reactions on a 2-cm² area. PCR was performed using a dedicated thermocycler (FC1 Cycler) from Fluidigm. The PCR products were harvested using the post-PCR integrated fluidic circuit controller AX and were transferred to a 96-well plate. In the next step, PCR products were barcoded per individual with Illumina-specific barcodes in a PCR with 15 cycles.
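The array layout described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' pooling procedure: the round-robin assignment of amplicons to inlets and the function name `pool_primers` are illustrative assumptions (in practice primer pairs would be pooled by compatibility), and only the counts 474, 10, and 48 come from the text.

```python
from itertools import product

N_AMPLICONS = 474      # target-specific primer pairs (from the text)
PLEX = 10              # primer pairs pooled per primer inlet (from the text)
N_PRIMER_INLETS = 48   # primer inlets on the 48.48 array
N_SAMPLES = 48         # sample inlets on the 48.48 array


def pool_primers(n_amplicons, plex, n_inlets):
    """Assign amplicon indices to primer inlets, at most `plex` per inlet.

    Round-robin assignment is a placeholder assumption, not the published scheme.
    """
    if n_amplicons > plex * n_inlets:
        raise ValueError("more amplicons than the array can hold")
    pools = [[] for _ in range(n_inlets)]
    for amplicon in range(n_amplicons):
        pools[amplicon % n_inlets].append(amplicon)
    return pools


pools = pool_primers(N_AMPLICONS, PLEX, N_PRIMER_INLETS)

# Every sample meets every primer pool in its own reaction chamber.
reactions = list(product(range(N_SAMPLES), range(N_PRIMER_INLETS)))

# Each sample yields one product per amplicon.
products_per_run = N_SAMPLES * N_AMPLICONS

print(len(reactions))               # 2304
print(products_per_run)             # 22752
print(max(len(p) for p in pools))   # 10
```

With 474 amplicons and a capacity of 480 (48 inlets × 10), six inlets in this sketch hold nine primer pairs instead of ten.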
The primer sequences required to tag for bidirectional amplicon sequencing (requiring two separate PCRs) compatible with Illumina NGS were as follows: PE1-CS1: 5′-AATGATACGGCGACCACCGAGATCTACACTGACGACATGGTTCTACA-3′, PE2-BC-CS2: 5′-CAAGCAGAAGACGGCATACGAGAT-[BARCODE]-TACGGTAGCAGAGACTTGGTCT-3′, PE1-CS2: 5′-AATGATACGGCGACCACCGAGATCTTACGGTAGCAGAGACTTGGTCT-3′, and PE2-BC-CS1: 5′-CAAGCAGAAGACGGCATACGAGAT-[BARCODE]-ACACTGACGACATGGTTCTACA-3′. We pooled all 48 different barcoded samples for next-generation sequencing on an Illumina Genome Analyzer II or HiSeq2000 platform as described previously by Halbritter et al. (37).
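To make the primer structure concrete, the sketch below assembles a barcoded primer from the three parts listed above: the PE2 adapter, an 8-bp sample barcode, and a CS1 or CS2 tag. The function name `barcoded_primer` and the example barcode `ACGTACGT` are hypothetical; the actual barcode sequences are not given in the text.

```python
# Sequence parts copied from the primer definitions in the text.
PE2 = "CAAGCAGAAGACGGCATACGAGAT"
CS1 = "ACACTGACGACATGGTTCTACA"
CS2 = "TACGGTAGCAGAGACTTGGTCT"

BARCODE_LENGTH = 8  # 8-bp barcodes, as stated in the methods overview


def barcoded_primer(barcode, tag):
    """Return PE2 + barcode + tag (5' to 3') after validating the barcode."""
    if len(barcode) != BARCODE_LENGTH or set(barcode) - set("ACGT"):
        raise ValueError(f"barcode must be {BARCODE_LENGTH} bases of A/C/G/T")
    if tag not in (CS1, CS2):
        raise ValueError("tag must be CS1 or CS2")
    return PE2 + barcode + tag


EXAMPLE_BARCODE = "ACGTACGT"  # hypothetical placeholder, not a real barcode

pe2_bc_cs2 = barcoded_primer(EXAMPLE_BARCODE, CS2)  # one sequencing direction
pe2_bc_cs1 = barcoded_primer(EXAMPLE_BARCODE, CS1)  # the opposite direction

print(pe2_bc_cs2)
print(len(pe2_bc_cs2))  # 54
```

Because the barcode sits at a fixed offset after the PE2 adapter, it can be read back out by position (`primer[24:32]` in this sketch).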
Sequencing Using the Genome Analyzer II or HiSeq2000 Platform
Pooled and indexed PCR products were sequenced on an Illumina Genome Analyzer II or a single lane of an Illumina HiSeq2000 instrument as a single 1×150-base run (v2.5 reagents) following standard Illumina protocols with modifications as described elsewhere (37).
Variant Calling and Mutation Analysis
Using the CASAVA v1.7 demultiplex.dp script (Illumina), sequence reads were separated according to their barcodes. Using CLC Genomics Workbench software (CLC-bio, Aarhus, Denmark), a concatenated reference sequence was built from the genomic sequences of the 21 known NS genes downloaded from NCBI (hg19 build). For each individual barcode, the sequence reads were aligned to this reference sequence. All donor and acceptor splice sites of all exons were annotated within that reference sequence.
The alignment parameters were as described previously (37). Common variants listed in dbSNP (version 132) and not previously reported as disease causing (e.g., R229Q in podocin) and synonymous variants with an allele frequency >1% were excluded before further investigation. Truncating variants (nonsense, frameshift, and obligatory splice variants) and missense variants, insertions, and deletions were called and further investigated.
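The barcode-based separation of reads can be illustrated with a small sketch. This is not the CASAVA script: it assumes each read arrives as an (index sequence, read sequence) pair, uses exact barcode matching only, and the barcodes and sample names shown are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group (index_sequence, read_sequence) pairs by sample.

    Reads whose index does not exactly match a known barcode are
    returned separately rather than guessed at.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for index_seq, read_seq in reads:
        sample = barcode_to_sample.get(index_seq)
        if sample is None:
            unassigned.append(read_seq)
        else:
            by_sample[sample].append(read_seq)
    return dict(by_sample), unassigned


# Hypothetical barcodes and sample labels, for illustration only.
barcode_to_sample = {"ACGTACGT": "sample_01", "TGCATGCA": "sample_02"}
reads = [
    ("ACGTACGT", "TTGACC"),
    ("TGCATGCA", "GGATCC"),
    ("ACGTACGT", "CCATGG"),
    ("NNNNNNNN", "AAAAAA"),  # unreadable index, stays unassigned
]

by_sample, unassigned = demultiplex(reads, barcode_to_sample)
print(by_sample)
print(unassigned)
```

Production demultiplexers usually tolerate one mismatch in the index; exact matching is used here only to keep the sketch short.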
Missense variants were further investigated for evolutionary conservation, and web-based programs were used to predict the effect of these variants at the protein level. A flowchart showing the variant calling and analysis is shown in Supplemental Figures 1 and 2.
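The exclusion rules in this section can be written as a small predicate. The sketch below is one reading of the filter, not the CLC Genomics Workbench configuration: the `Variant` fields, the function name `keep_for_review`, and the treatment of R229Q as a common variant that is retained because it has been reported as disease causing are assumptions. The allele frequency of R229Q (2.292%) is taken from the Introduction; the other example variants are invented.

```python
from dataclasses import dataclass

MAX_COMMON_AF = 0.01  # the 1% allele-frequency cutoff from the text


@dataclass(frozen=True)
class Variant:
    gene: str
    effect: str               # "missense", "nonsense", "frameshift", "splice", "synonymous", ...
    allele_frequency: float   # population frequency as a fraction; 0.0 if unknown
    in_dbsnp: bool
    reported_pathogenic: bool


def keep_for_review(v):
    """Return True if the variant passes the exclusion rules and needs follow-up."""
    common = v.allele_frequency > MAX_COMMON_AF
    if v.effect == "synonymous" and common:
        return False  # common synonymous change
    if v.in_dbsnp and common and not v.reported_pathogenic:
        return False  # common dbSNP variant with no disease report
    return True


variants = [
    Variant("NPHS2", "missense", 0.02292, True, True),     # R229Q, frequency from the text
    Variant("NPHS1", "synonymous", 0.15, True, False),     # hypothetical common silent change
    Variant("PLCE1", "missense", 0.0, False, False),       # hypothetical novel missense
    Variant("LAMB2", "missense", 0.08, True, False),       # hypothetical common polymorphism
]

kept = [v.gene for v in variants if keep_for_review(v)]
print(kept)  # ['NPHS2', 'PLCE1']
```

Variants that pass this step would still need conservation analysis, effect prediction, Sanger confirmation, and segregation analysis before being called disease causing.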
Sanger Sequencing Confirmation and Segregation Analysis
Mutations that were found with the CLC Genomics Workbench software after NGS were confirmed by Sanger sequencing using the affected individuals’ DNA and, if available, the affected/unaffected family members’ DNA (siblings and parents). Mutations were examined to determine whether they segregated with affected status. Confirmation PCR was performed using a touchdown protocol described previously (40). Sanger sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing Kit on an ABI 3730 XL sequencer (Applied Biosystems). Sequence traces were analyzed using Sequencher (version 4.8) software (Gene Codes Corporation, Ann Arbor, MI) (37).
Results
Pilot Study
Amplification of targeted genes with a total of 424 exons (474 amplicons) in 48 individuals was performed using the Fluidigm 48×48 Access Array System. Following the initial multiplex PCR amplification, each sample was indexed with a unique barcode tag in an additional PCR. Following purification and quantification of PCR products, resequencing was performed on a Genome Analyzer II sequencing platform. This generated a total of 18,611,063 raw reads, of which 12,088,656 (64.9%) aligned to the targeted exonic regions in the selected NS genes. When assessing and comparing the mean coverage of the 474 amplicons, we found that 409 amplicons (86.3%) showed a mean coverage depth >20-fold, a threshold that we consider sufficient to consistently call heterozygous mutations (Figure 1).
Figure 1. Minimum coverage of targeted amplicons. The y axis indicates the number of sequencing reads (coverage). The x axis indicates each amplicon from 1 to 474, sorted by coverage in descending order. Pilot study coverage data are shown by blue triangles, and diagnostic study coverage data are shown by red triangles. Note that in the pilot study, 409 of 474 amplicons (86.3%) had a minimum coverage depth (horizontal black line) >20-fold (blue triangles). In the diagnostic study, 462 of 474 amplicons (97.5%) had a minimum coverage depth >20-fold (red triangles).
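The proportion of amplicons above the 20-fold threshold can be recomputed as follows. This is a minimal sketch: the per-amplicon depth values are synthetic placeholders chosen only to reproduce the reported counts (409 and 462 of 474), and the function name `amplicons_above` is an assumption.

```python
COVERAGE_THRESHOLD = 20  # depth considered sufficient to call heterozygous variants


def amplicons_above(depths, threshold=COVERAGE_THRESHOLD):
    """Return (count, percentage) of amplicons with depth strictly above threshold."""
    if not depths:
        raise ValueError("no amplicons supplied")
    passing = sum(1 for depth in depths if depth > threshold)
    return passing, 100.0 * passing / len(depths)


# Synthetic depths: only the pass/fail counts are taken from the text.
pilot_depths = [100] * 409 + [5] * 65
diagnostic_depths = [100] * 462 + [5] * 12

pilot_n, pilot_pct = amplicons_above(pilot_depths)
diag_n, diag_pct = amplicons_above(diagnostic_depths)

print(pilot_n, round(pilot_pct, 1))  # 409 86.3
print(diag_n, round(diag_pct, 1))    # 462 97.5
```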
Raw sequencing data were further analyzed with CLC Genomics Workbench (CLC) software (CLC-bio). For each barcoded sample, single-nucleotide polymorphism and deletion/insertion polymorphism variants were called, with filter parameters as described previously (37). An overview of the data and filtering steps is shown in Supplemental Figure 1.
In 48 individuals with 71 mutations among 10 different genes, we detected 65 of the 71 mutations (91.5%). These 71 mutations represented missense, deletion, insertion, and splice-site variants, in both homozygous and heterozygous states (Supplemental Table 1). By detecting both recessive mutant alleles, 42 of 48 individuals could be solved in our pilot study (87.5%). In these 48 individuals, no competing variants in known NS genes were detected. One individual (A1537), who harbored only one heterozygous mutation in the recessive gene NPHS1, was included as a negative control.
Of the six mutations that could not be detected, low-coverage amplicons accounted for two (in individuals A1537 and A1193) (Supplemental Table 1). Another undetected mutation in ITGA3 (patient A3113) resulted from a faulty primer pair in exon 14, causing the initial multiplex PCR amplification of that particular exon to fail. An undetected splice-site mutation in ITGA3 (patient A3503) resulted from a low-quality DNA sample (see Supplemental Figure 3, #8 in pilot study [blue bars]). An insertion mutation in NPHS2 (patient A355) was not detected because high-throughput sequencing cannot reliably detect insertions or deletions within homopolymeric regions. Finally, a large deletion (23 bp) in NPHS2 could not be detected because the CLC software cannot detect deletions of more than approximately 8 bp. One general limitation of this technique is at the initial multiplex PCR step: Some primer pairs do not amplify their target sequence because of GC-rich regions, particularly in the first exons of selected genes (additional material for all genes and coverage statistics for all exons are available from the authors upon request). Despite these shortcomings, our pilot study nonetheless confirmed 65 of 71 (91.5%) mutations, and 42 of 48 (87.5%) individuals could be solved for a single-gene cause of NS by this method.
To improve upon our pilot experiment, we tested the primer pairs corresponding to low-coverage amplicons and improved them by (1) redesigning and replacing inefficient primer pairs or (2) repooling functional single-plex primers in a different multiplex reaction. To further improve coverage, we ran the following experiment on an Illumina HiSeq2000 platform instead of the Genome Analyzer II platform to increase the raw number of sequencing reads, mitigating the risk of missing a mutation due to low coverage. As a further modification, we used original blood-extracted genomic DNA instead of whole-genome-amplified DNA to reduce variants caused by amplification artifacts. The refined methods, developed from the pilot study, were used in the diagnostic study.
Diagnostic Study
The data for the diagnostic study were generated and analyzed with the same parameters as for our pilot study. This sequencing run generated a total of 88,992,683 reads that aligned to the genomic reference sequence, of which 81,533,024 (91.6%) aligned to the targeted regions in the 21 selected NS genes. The total reads prorated for each patient resulted in mean±SEM reads per amplicon of 3583±108 (Supplemental Figure 3), a 6- to 7-fold increase from the pilot experiment.
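The per-amplicon read depth and the fold increase quoted above follow from the on-target read counts. The sketch below assumes that reads are spread evenly over 48 samples × 474 amplicons, which is a simplification (the reported SEM shows that depth varied between samples); the function name is an assumption, and the read counts are taken from the text.

```python
N_SAMPLES = 48
N_AMPLICONS = 474

PILOT_ON_TARGET_READS = 12_088_656       # Genome Analyzer II run
DIAGNOSTIC_ON_TARGET_READS = 81_533_024  # HiSeq2000 run


def mean_reads_per_amplicon(on_target_reads, n_samples=N_SAMPLES, n_amplicons=N_AMPLICONS):
    """Average reads per amplicon per sample, assuming an even spread."""
    return on_target_reads / (n_samples * n_amplicons)


pilot_mean = mean_reads_per_amplicon(PILOT_ON_TARGET_READS)
diagnostic_mean = mean_reads_per_amplicon(DIAGNOSTIC_ON_TARGET_READS)
fold_increase = diagnostic_mean / pilot_mean

print(f"{pilot_mean:.1f}")       # 531.3
print(f"{diagnostic_mean:.1f}")  # 3583.6
print(f"{fold_increase:.2f}")    # 6.74
```

The diagnostic-study value is close to the reported mean of 3583 reads per amplicon, and the ratio falls within the stated 6- to 7-fold range.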
Coverage statistics showed that 462 of 474 (97.5%) amplicons and 89.3% of targeted bases had a mean coverage >20×. The HiSeq2000 sequencing platform provided a significant increase in coverage (see Figure 1), and the number of low-coverage amplicons was greatly reduced.
For the diagnostic study, raw sequencing data were generated and analyzed with the same parameters as applied for the pilot study. An overview of the data and filtering steps is shown in Supplemental Figure 2.
In the diagnostic study, we examined a new cohort (with unknown molecular cause) of 48 individuals with NS (Supplemental Table 2). A single-gene cause of NS was found in 16 of 48 (33%) individuals. Homozygous mutations were found in 12 individuals/families and compound heterozygous mutations in 2 individuals/families; single heterozygous mutations were considered causative only when found in a dominant gene (Table 1, Figure 2). Disease-causing variants were found in the following genes: NPHS1 (n=8), NPHS2 (n=1), PLCE1 (n=4), WT1 (n=2), and LAMB2 (n=1) (numbers after the gene symbols represent the total number of individuals with mutations in the respective gene) (Table 1, Figure 2). Of all identified mutations, 7 were novel disease-causing mutations in the following genes: PLCE1 (n=5), NPHS1 (n=1), and LAMB2 (n=1) (Table 1, Figure 2).
Table 1. Genotypes and phenotypes of novel mutations in NPHS1, NPHS2, PLCE1, WT1, and LAMB2 detected by high-throughput sequencing in 16 of 48 families with steroid-resistant nephrotic syndrome
Figure 2. Relative proportion of solved cases in 48 families with steroid-resistant nephrotic syndrome (SRNS) in the diagnostic study. In the diagnostic study, 48 individuals from 48 different families with an unknown cause of SRNS were screened for variants in 21 genes known to cause SRNS. In 16 of 48 individuals, the underlying molecular cause was identified. Numbers in parentheses following the gene symbols represent the number of patients with mutations in each respective gene.
Individuals in whom we found a disease-causing mutation had a younger age of onset (median, 2 months; range, 0–48 months; interquartile range, 1–9.5 months) than those in whom we did not find a mutation (median, 36 months; range, 0–196 months; interquartile range, 7.25–93 months) (P<0.01) (Supplemental Figure 4).
Discussion
Here we describe a high-throughput approach that allows the rapid (<3 weeks) mutation analysis of 21 genes that cause SRNS at a greatly reduced cost (1/29th) compared with traditional mutation analysis techniques (see Supplemental Methods). In the pilot study, performed in a positive control cohort, we identified the disease-causing mutation in 42 of 48 individuals (87.5%) who had known mutations in 1 of the 21 known NS genes. In the subsequent diagnostic study, we were able to identify a single-gene cause in 16 of 48 (33%) individuals of previously unknown cause (Supplemental Table 2).
We included in the pilot study mostly individuals with recessive mutations. The only dominant mutations that were included were in genes WT1 and TRPC6. Individuals were considered as genetically “solved” only if both recessive mutant alleles in a recessive gene or a single mutated allele in a dominant gene was detected. In the pilot study, 42 of 48 individuals (87.5%) were identified as genetically “solved.” Sixty-five of 71 mutations (91.5%) were detected (Supplemental Table 1).
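The criterion for calling an individual genetically “solved” can be stated as a short rule. The sketch below is illustrative only: the gene sets are taken from the Introduction, while the function name `is_solved` and the use of a simple count of confirmed mutant alleles (which ignores phase, so two heterozygous variants are assumed to lie in trans) are simplifying assumptions.

```python
# Gene lists as given in the Introduction (16 recessive, 5 dominant).
RECESSIVE_GENES = {
    "NPHS1", "NPHS2", "LAMB2", "PLCE1", "COQ6", "SMARCAL1", "COQ2", "PDSS2",
    "CD2AP", "SCARB2", "CFH", "CUBN", "PTPRO", "MYO1E", "NEIL1", "ITGA3",
}
DOMINANT_GENES = {"ACTN4", "LMX1B", "TRPC6", "INF2", "WT1"}


def is_solved(gene, mutant_alleles):
    """Apply the 'solved' rule: two mutant alleles in a recessive gene,
    or at least one in a dominant gene.

    `mutant_alleles` counts confirmed disease-causing alleles in `gene`
    (2 for homozygous or compound heterozygous findings).
    """
    if gene in DOMINANT_GENES:
        return mutant_alleles >= 1
    if gene in RECESSIVE_GENES:
        return mutant_alleles >= 2
    raise ValueError(f"{gene} is not one of the 21 screened genes")


print(is_solved("NPHS1", 2))  # True: both recessive alleles found
print(is_solved("NPHS1", 1))  # False: e.g. the single heterozygous NPHS1 allele in A1537
print(is_solved("WT1", 1))    # True: one allele suffices in a dominant gene
```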
Although the diagnostic study included both recessive and dominant SRNS genes, most of the mutations we found were in recessive genes. The low number of dominant mutations is most likely due to the fact that 13 of the 16 individuals were younger than age 6 years (Supplemental Figure 4). An age distribution of the identified mutations is shown in Supplemental Figure 5. Because there is a strong phenotype-genotype correlation for patients with mutations in NS (41,42), the clinical diagnosis for each patient was considered to determine whether the putative genetic cause found in our mutation analysis was truly disease-causing. In a previously published high-throughput analysis, the decision of whether a genetic variant was disease-causing was not sufficiently clarified (43).
Genotype–Phenotype Correlations
Seven of the eight NPHS1 mutations were found in individuals with congenital NS, and all mutations but one had been published previously. We detected five PLCE1 mutations in four patients, all of whom were diagnosed with FSGS and had early onset of disease (before 6 years). All mutations discovered in PLCE1 were novel findings (Table 1) and showed high prediction scores in programs that predict the possible impact of an amino acid substitution on the structure and function of the protein. All novel mutated PLCE1 amino acids were evolutionarily highly conserved and identical (e.g., in zebrafish [Danio rerio]). The individual in whom we found a homozygous LAMB2 mutation presented with eye involvement together with FSGS, which is consistent with the phenotype caused by LAMB2 mutations.
In a heterogeneous disease such as NS, we revealed the molecular cause of SRNS in 16 of 48 (33%) families in <3 weeks. This highly effective approach will help to reduce the cost and time for genetic diagnostics in pediatric and adult NS. Additional single-gene causes of NS, such as ITGB4 (integrin-β4), DGKE (diacylglycerol kinase-ε, 64 kD) (44), MTTL1 (mitochondrially encoded tRNA leucine 1), and ARHGAP24 (Rho guanosine triphosphatase activating protein 24) (45) have been published since we established this method. These additional genes can be accommodated in a new multiplex setup for future diagnostics. Our study represents a major advance over traditional mutation analysis: Instead of prioritizing specific genes for mutation analysis in order to match very specific clinical phenotypes, all known disease genes for a general phenotype (i.e., NS) can be examined simultaneously. As costs continue to fall for the sequencing of whole genomes, eventually it may become feasible to sequence a patient’s entire exome or genome for clinical diagnostic purposes. However, currently the use of targeted resequencing techniques remains a powerful tool to identify single-gene causes of disease.
Disclosures
None.
Acknowledgments
The authors thank the affected individuals and their families who contributed to this study.
This work was supported by a grant from the National Institutes of Health to F.H. (DK076683). F.H. is an Investigator of the Howard Hughes Medical Institute, a Doris Duke Distinguished Clinical Scientist, and a Warren E. Grupe Professor.
Members of the Nephrotic Syndrome Study Group are as follows: V. Feygina (Brooklyn, New York), A. Zolotnitskaya (Valhalla, New York), H. Fathy (Alexandria, Egypt), R. Cohn (Chicago, Illinois), R. Sinha (Kolkata, India), I. Agarwal (Tamilnadu State, India), A. Bagga (New Delhi, India), Y. Yang (Ann Arbor, Michigan), S. Nampoothiri (Cochin, India), K. Moorani (Karachi, Pakistan), S.A. Bakkaloglu (Ankara, Turkey), E. Isiyel (Ankara, Turkey), I. Dursun (Kayseri, Turkey), E. Comak (Antalya, Turkey), A. Soylu (Izmir, Turkey), F. Gok (Ankara, Turkey), A. Nayir (Istanbul, Turkey), E. Serdaroglu (Izmir, Turkey), Y. Frishberg (Jerusalem, Israel), I. Eisenstein (Haifa, Israel), and R. Cleper (Petah Tiqwa, Israel).
Footnotes
Published online ahead of print. Publication date available at www.cjasn.org.
This article contains supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.2215/CJN.09010813/-/DCSupplemental.
- Received August 29, 2013.
- Accepted January 15, 2014.
- Copyright © 2014 by the American Society of Nephrology