Edited by: Berit Kerner, University of California Los Angeles, USA
Reviewed by: Dawn Thiselton, Virginia Commonwealth University, USA; Maria R. Dauvermann, University of Edinburgh, UK
*Correspondence: Marquis P. Vawter, Functional Genomics Laboratory, GNRF Room 2119, Department of Psychiatry and Human Behavior, School of Medicine, Irvine, CA 92697-4260, USA. e-mail:
†Adolfo Sequeira, Maureen V. Martin, and Brandi Rollins have contributed equally to this work.
This article was submitted to Frontiers in Behavioral and Psychiatric Genetics, a specialty of Frontiers in Genetics.
This is an open-access article distributed under the terms of the
Mitochondrial deficiencies with unknown causes have been observed in schizophrenia (SZ) and bipolar disorder (BD) in imaging and postmortem studies. Polymorphisms and somatic mutations in mitochondrial DNA (mtDNA) were investigated as potential causes with next-generation sequencing of mtDNA (mtDNA-Seq) and genotyping arrays in subjects with SZ, BD, major depressive disorder (MDD), and controls. The common deletion of 4,977 bp in mtDNA was compared between SZ and controls in 11 different vulnerable brain regions and in blood samples, and in dorsolateral prefrontal cortex (DLPFC) of BD, SZ, and controls. In a separate analysis, association of mitochondrial SNPs (mtSNPs) with SZ and BD in European ancestry individuals (
Mitochondria are subcellular organelles enriched in energetic tissues, such as muscle and brain, and located in the cytoplasm. Although there are over 1500 human mitochondrial genes (Wallace,
One possible indication of a mitochondrial disorder (Wallace et al.,
Somatic mutations that arise during aging in brain and variation in length of mtDNA have been reported for select regions of the mtDNA genome (Cortopassi and Arnheim,
Previously, our group and others have studied the mtDNA association with SZ and BD cases in a small number of subjects, and these case-control analyses of mtDNA SNPs have focused on rare mutations or haplogroup-defining SNPs and have not found compelling evidence for association (Shao et al.,
The relationships between a common somatic mtDNA variation, a 4977 bp deletion (Cortopassi et al.,
Informed consent was obtained from the next of kin for each subject and the study was reviewed and approved by the University of California, Irvine (UCI) Institutional Review Board. Three brain tissue cohorts were processed. Cohort 1 consisted of 11 subjects (6 controls and 5 cases with schizophrenia) with 11 brain regions per subject (121 samples in total; Table
Cohort | Diagnosis | Sex (male/female) | Age in years (mean ± SD) |
---|---|---|---|
1 | SZ | 2/3 | 53 ± 7 |
1 | Control | 2/4 | 57 ± 27 |
1 | Total | 4/7 | |
2 | BD | 9/3 | 50 ± 17 |
2 | MDD | 11/4 | 51 ± 15 |
2 | SZ | 11/3 | 44 ± 10 |
2 | Control | 29/6 | 53 ± 12 |
2 | Total | 60/16 | |
3 | BD | 3/1 | 57 ± 8 |
3 | MDD | 4/1 | 41 ± 14 |
3 | SZ | 2/2 | 41 ± 9 |
3 | Control | 7/3 | 52 ± 23 |
3 | Total | 16/7 | |
For cohort 1, DNA was extracted from the following tissues: anterior cingulate cortex (ACC), amygdala (AMY), caudate nucleus (CAUN), cerebellum (CB), DLPFC, hippocampus (HIPP), nucleus accumbens (NACC), orbitofrontal cortex (OFC), putamen (PUT), substantia nigra (SN), and thalamus (THAL) for 11 subjects; whole blood was obtained for three of those subjects. Briefly, prefrontal cortex samples were dissected from the orbital gyrus (OFC) and from the DLPFC at the level of the anterior end of the corpus callosum. The ACC was sampled just above the most anterior part of the corpus callosum. Three striatal structures (CAUN, putamen, and NACC) were dissected from the same coronal slice at a level where the three structures were visible anterior to the anterior commissure. The amygdala was sampled from the temporal lobe, ventral to the putamen and just anterior to the hippocampus. The whole hippocampus and parahippocampal gyrus were dissected from the subsequent posterior coronal slices as a block and sampled together. The SN and the whole thalamus (from the lateral ventricle to the medial lemniscus and from the internal capsule to the midline of the brain) were individually dissected from the same coronal slice. Finally, a cortical sample of the lateral left cerebellum was dissected for processing and quality control. DNA was extracted from 25 mg of dissected brain tissue using a DNeasy Blood and Tissue Kit (Qiagen), according to the manufacturer’s protocol. For cohort 2, DNA was extracted from 76 DLPFC samples using the phenol phase of a Trizol protocol and precipitated with ethanol (Shao et al.,
Primers were designed to target either deleted or non-deleted mtDNA molecules at the common deletion spanning mt8224–13501. As described previously (Shao et al.,
Quantitative polymerase chain reaction (qPCR) was performed using the 7900 Sequence Detection System (Applied Biosystems) with cloned standards that ranged from 1,000,000 to 100 copies/μl, and the SYBR Green chemistry (Applied Biosystems). Two separate reactions were run to detect the deleted and non-deleted amplicons. The ABI Prism 7900 Sequence Detection System default thermal cycler program was used for each reaction: 10 min of pre-incubation at 95°C followed by 40 cycles of 15 s at 95°C and 1 min at 60°C. Individual real-time PCR reactions were performed in duplicate in a total volume of 12.5 μl in 384-well plates (Applied Biosystems) containing 6.25 μl SYBR Green (Applied Biosystems), 2.5 μl DNA (10 ng/μl) or 2.5 μl standards, 0.25 μl each of forward and reverse primers (10 pmol/μl), and 3.25 μl H2O. The cycle threshold (CT) was manually set at the level that reflected the best kinetic PCR parameters. Using the copy number standard curve, CT values were used to calculate copy number for the deleted and non-deleted amplicons: percent common deletion = (deletion copy number)/(deletion copy number + non-deletion copy number) × 100 (Shao et al.,
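The quantification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the standard-curve CT values, the sample CT values, and the averaging of duplicate CTs before conversion are all assumptions made for the example. Only the 10-fold standard range (1,000,000 to 100 copies/μl) and the percent common deletion formula come from the text.

```python
import math

def fit_standard_curve(copies_per_ul, ct_values):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a copy number from a CT value."""
    return 10 ** ((ct - intercept) / slope)

def percent_common_deletion(deleted_copies, non_deleted_copies):
    """deleted / (deleted + non-deleted) x 100, as defined in the text."""
    return deleted_copies / (deleted_copies + non_deleted_copies) * 100.0

# Standards span 1,000,000 to 100 copies/ul (from the text).
STANDARD_COPIES = [1_000_000, 100_000, 10_000, 1_000, 100]
# Hypothetical CT values for the two cloned standards (illustration only).
DELETED_STANDARD_CT = [18.2, 21.5, 24.8, 28.1, 31.4]
NON_DELETED_STANDARD_CT = [17.9, 21.2, 24.5, 27.8, 31.1]

del_slope, del_intercept = fit_standard_curve(STANDARD_COPIES, DELETED_STANDARD_CT)
non_slope, non_intercept = fit_standard_curve(STANDARD_COPIES, NON_DELETED_STANDARD_CT)

# Hypothetical sample: duplicate CTs are averaged (an assumption) per amplicon.
deleted = copies_from_ct((29.0 + 29.2) / 2, del_slope, del_intercept)
non_deleted = copies_from_ct((20.1 + 20.3) / 2, non_slope, non_intercept)
print(f"percent common deletion: {percent_common_deletion(deleted, non_deleted):.3f}")
```

Fitting a separate curve for each amplicon mirrors the two separate reactions described in the text, so that differences in amplification efficiency between the deleted and non-deleted assays do not bias the ratio.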
Non-deleted and deletion-specific fragments were amplified by PCR and gel purified according to the QIAquick Gel Extraction Kit protocol (Qiagen) and TA cloned using the Original TA Cloning Kit (Invitrogen). The ligation reaction consisted of 1 μl PCR product (2.60 ng), 1 μl 10× ligation buffer, 2 μl pCR 2.1 vector (25 ng/μl), 5 μl H2O, and 1 μl T4 DNA ligase run at 14°C overnight. Two μl of each ligation reaction was pipetted into a 50 μl vial of frozen One Shot competent cells (Invitrogen) per sample and heat shocked for 30 s at 42°C. After cooling the vials on ice, S.O.C. medium (Invitrogen) was added to each vial and 10–50 μl from each transformation vial was spread on LB agar plates containing 60 μg/ml X-gal and 50 μg/ml of kanamycin (Teknova). Plates were incubated overnight at 37°C and white colonies were selected and grown overnight at 37°C in a 2–5 ml LB broth tube containing 50 μg/ml kanamycin (Teknova). The plasmid containing the amplicon was extracted using a PureLink Quick Plasmid MiniPrep Kit (Qiagen). The plasmid was amplified and the samples were run on a 1% agarose gel and sequenced to confirm the presence of the correct amplicons. To further validate the qPCR amplicons, randomly selected qPCR amplicon reactions from the deletion and non-deletion assay plates were sequenced (Genewiz, Inc.; Ann Arbor, MI, USA). qPCR products for sequencing were run on a 1% agarose gel, purified using a QIAquick Gel Extraction Kit (Qiagen), and 10 ng DNA was mixed with 5 μl of the deletion-specific or non-deletion-specific forward primer (5 μM) in a total volume of 15 μl for capillary sequencing.
Mutations were identified in 11 brain regions and blood (cohort 1) and in a separate cohort (cohort 3) of a single brain region (DLPFC) in subjects with SZ, BD, MDD, and controls by Illumina GAII sequencing. DNA (50 ng) was used to amplify the entire mitochondrial genome in two overlapping fragments of 9,289 bp (I) and 7,626 bp (II) in length. The primer sequences used to amplify fragment (I) are 5′-AACCAAACCCCAAAGACACC-3′ and 5′-GCCAATAATGACGTGAAGTCC-3′. The primer sequences for fragment (II) are 5′-TCCCACTCCTAAACACATCC-3′ and 5′-TTTATGGGGTGATGTGAGCC-3′. PCR reactions, 50 μl using TaKaRa LA Taq polymerase, were performed with the following cycling conditions: 95°C 2 min; 35 cycles of 95°C 20 s, 59°C 30 s, 68°C 10 min; 68°C 20 min. PCR reactions were electrophoresed on 0.8% TAE agarose gels at 120 V. Fragments were gel extracted using the QIAquick gel extraction kit (Qiagen) and eluted in 30 μl elution buffer. Eluted samples were measured using the SpectraMax Plus 384 microplate spectrophotometer (Molecular Devices). The parallel DNA sequencing was performed using the Illumina Genome Analyzer II at Ambry Genetics Corporation (Aliso Viejo, CA, USA) according to the manufacturer’s protocol. Briefly, pooled, amplified mtDNA samples (fragments I and II) were sonicated and fragment ends were repaired and phosphorylated using Klenow T4 DNA polymerase and T4 polynucleotide kinase. An A base was added to the 3′ end of the blunted fragments and the resulting fragments were ligated to custom index adapters. The ligated products were size selected by gel purification and then PCR amplified. Using the Agilent Bioanalyzer (Santa Clara, CA, USA), each library size and concentration was determined so that samples were combined at equal molar ratios and multiplexed at 16 libraries per lane in a flow cell. This approach allows the purification of mtDNA from genomic DNA but has the disadvantage of excluding deleted mtDNA molecules.
Illumina GAII single reads were aligned to the mitochondrial revised Cambridge Reference Sequence (rCRS; GenBank accession number NC_012920; Andrews et al.,
The Illumina sequencing data were screened for novel or rare homoplasmic variants using the following criteria: (1) coverage at the position was greater than 500× and the called allele was not the reference allele, (2) the percent call at the variant position was >96%, and (3) the mutation had a reported incidence less than 1% in PhyloTree Build 12 (van Oven and Kayser,
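The three screening criteria above amount to a simple per-position filter. The sketch below is one way to express it in Python. The `VariantCall` record, its field names, and the example calls are hypothetical and are not taken from the authors' pipeline; only the thresholds (coverage >500×, call >96%, incidence <1%) come from the text.

```python
from dataclasses import dataclass

@dataclass
class VariantCall:
    """Hypothetical per-position summary of aligned reads."""
    position: int                # rCRS coordinate
    ref_allele: str              # reference base at this position
    called_allele: str           # most frequent base in the reads
    coverage: int                # read depth at this position
    call_fraction: float         # fraction of reads supporting called_allele
    population_frequency: float  # reported incidence of the variant (0 if novel)

def is_rare_or_novel_homoplasmic(call,
                                 min_coverage=500,
                                 min_call_fraction=0.96,
                                 max_frequency=0.01):
    """Apply criteria (1)-(3) from the text to a single call."""
    return (
        call.coverage > min_coverage                    # (1) depth above 500x
        and call.called_allele != call.ref_allele       # (1) non-reference call
        and call.call_fraction > min_call_fraction      # (2) call above 96%
        and call.population_frequency < max_frequency   # (3) incidence below 1%
    )

# Hypothetical calls for illustration only.
calls = [
    VariantCall(100, "A", "G", 2400, 0.995, 0.000),  # passes all criteria
    VariantCall(200, "T", "C", 450, 0.990, 0.000),   # fails coverage
    VariantCall(300, "C", "T", 3000, 0.700, 0.002),  # heteroplasmic, fails (2)
    VariantCall(400, "G", "A", 3000, 0.990, 0.150),  # common variant, fails (3)
]
kept = [c.position for c in calls if is_rare_or_novel_homoplasmic(c)]
print(kept)
```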
The GAIN datasets were obtained from the database of Genotypes and Phenotypes (dbGaP). Data access was granted by the GAIN Data Access Committee for analysis of an approved project “Genetic Variation in Mitochondria in Schizophrenia and Bipolar Disorder.” The WTCCC2 cohort was obtained by Fabio Macciardi and Sara Lupoli. Some control samples were shared between the GAIN cohorts; we ensured that there was no overlap between the cohorts and that only unique control subjects were included in the final analysis. Further, we used a common control group consisting of pooled WTCCC2 and GAIN samples to compare against BD and SZ subjects from GAIN. The subjects from GAIN chosen for inclusion in the analysis were controls (general research consent use, GRU), subjects with schizophrenia and related conditions (SARC), and bipolar subjects (bipolar disorder only, BDO; bipolar and related disorders, BARD). Demographic details and distribution per cohort are shown in Table S1 in Supplementary Material. To minimize population-specific mtDNA variants that would confound case-control analysis, African Americans were not included in the mtDNA analysis and only individuals of European ancestry (EA) were analyzed. Affymetrix 6.0 SNP CEL files were extracted for 465 mtSNPs using Affymetrix Power Tools scripts (available upon request from author) and annotated at NetAffx
We measured the levels of the common 4977 bp deletion across all brain regions in cohort 1. We observed, in a multivariate analysis, that age was significantly correlated with common deletion levels,
The highest levels of the common deletion were observed in regions that have origins in the mesencephalon (SN) and telencephalon (CAUN). Overall, common deletion levels were greatest in mesencephalon (SN) > telencephalon (CAUN, putamen, NACC, cortical regions) > diencephalon (thalamus) > rhombencephalon (cerebellum). An ongoing study that includes over 500 samples will be used to independently retest these correlations of the common deletion across all brain regions. Common deletion levels in the OFC were significantly correlated with those in all 10 other brain regions, the highest number of significant correlations for any region (Table S3 in Supplementary Material).
A repeated measures ANCOVA was also performed for the common deletion with age as a covariate, region as a repeated measure factor, and sex and diagnosis as between-subjects factors for cohort 1. There was a significant overall effect of region [
Results of the common deletion study in DLPFC only (cohort 2) also showed a significant relationship between age and common deletion levels (
There were 149 homoplasmic variants that were either novel or rare found by Illumina re-sequencing (Table S5 in Supplementary Material): 88% (131/149) were transitions, 11% (17/149) were transversions (Table S6 in Supplementary Material). Of these 149 homoplasmic variants, seven were novel and not reported in four databases: dbSNP (Sherry et al.,
Position | Gene | Function | rCRS base | Mutation | AA change | Syn/non-syn | Tissue | Diagnosis | In Mitomap | Validated by sequencing |
---|---|---|---|---|---|---|---|---|---|---|
224 | D-loop | nc | T | T > C | | | DLPFC | MDD | No | |
5686 | tRNA Asn | ps | A | A > T | | | ACC, AMY, CAUN, CB, DLPFC, HIPP, NACC, OFC, PUT, SN, THAL | C | No | |
6578 | COI | pp | A | A > G | 225G > G | SYN | DLPFC | C | No | Yes |
7834 | COII | pp | C | C > T | 83I > I | SYN | DLPFC | SZ | No | |
9126 | ATPase 6 | pp | T | T > A | 200T > T | SYN | DLPFC | C | T > C | |
10858 | ND4 | pp | T | T > C | 33I > I | SYN | DLPFC | BD | No | |
11026 | ND4 | pp | A | A > G | 89L > L | SYN | DLPFC | C | No | Yes |
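As a small worked example of the transition/transversion tally used in this section, the seven novel substitutions in the table above can be classified programmatically. The function name and data layout are illustrative choices; the positions and base changes are copied from the table.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine)."""
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"

# (position, rCRS base, mutant base) for the seven novel variants in the table.
NOVEL_MUTATIONS = [
    (224, "T", "C"),
    (5686, "A", "T"),
    (6578, "A", "G"),
    (7834, "C", "T"),
    (9126, "T", "A"),
    (10858, "T", "C"),
    (11026, "A", "G"),
]

counts = {"transition": 0, "transversion": 0}
for position, ref, alt in NOVEL_MUTATIONS:
    counts[classify_substitution(ref, alt)] += 1
print(counts)
```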
Only one homoplasmic deletion was detected, an AC deletion at mt523-524, present in 11 brain regions sampled from two control subjects that have distant haplogroups (L1cb2 and H5, Figure
Considering further the rate of novel coding and non-coding mutations in cohort 3 (23 DLPFC subjects), there was one novel mutation in the non-coding region of the mitochondrial genome, giving a novel mutation frequency of 3.87 × 10⁻⁵ (23 genomes × 1,121 bp non-coding = 25,783 total non-coding bp; one mutation/25,783 bp = 3.87 × 10⁻⁵ or 1:25,783; Table S7 in Supplementary Material). There were six novel mutations in the coding region for these subjects, giving a novel mutation frequency of 1.69 × 10⁻⁵ (23 genomes × 15,447 bp total coding sequence = 355,281 total coding bp; six mutations/355,281 bp = 1.69 × 10⁻⁵ or 1:59,214). These calculations show that novel mutations occurred more frequently in the non-coding region of the mitochondrial genome than in the coding region.
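The arithmetic behind these two frequencies is easy to reproduce. The short Python sketch below uses only the numbers given in the paragraph above (23 genomes, 1,121 bp non-coding, 15,447 bp coding, one and six novel mutations); the function name is an illustrative choice.

```python
def novel_mutation_frequency(n_mutations, n_genomes, region_length_bp):
    """Return (mutations per bp, total bp surveyed) for a genome region."""
    total_bp = n_genomes * region_length_bp
    return n_mutations / total_bp, total_bp

N_GENOMES = 23  # cohort 3 DLPFC subjects

noncoding_freq, noncoding_bp = novel_mutation_frequency(1, N_GENOMES, 1121)
coding_freq, coding_bp = novel_mutation_frequency(6, N_GENOMES, 15447)

print(f"non-coding: {noncoding_bp} bp surveyed, {noncoding_freq:.3e} per bp")
print(f"coding:     {coding_bp} bp surveyed, {coding_freq:.3e} per bp")
print(f"non-coding / coding ratio: {noncoding_freq / coding_freq:.2f}")
```

With a single non-coding event the ratio is necessarily imprecise, so it is best read as descriptive rather than as a formal test of the difference.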
A total of 142 of the 149 variants were rare (Table S5 in Supplementary Material) with minor allele frequency <1% based upon mtDB (Ingman and Gyllensten,
We conducted phylogenetic analysis using the consensus sequence of the whole mitochondrial genome which allowed us to determine the specific haplogroup for each sample (van Oven and Kayser,
PLINK logistic regression was run with genotypes for 362 mtSNPs from 6,040 subjects using MDS covariates. In all analyses, control subjects shared between the GAIN SZ and BD studies were removed before undertaking association or normalization.
The GAIN and WTCCC2 controls were pooled into one common control group for comparison to BD and SZ subjects from GAIN. There were no significant mtSNP associations with psychiatric disorders after genome-wide correction. However, logistic regression showed nominally significant associations with SZ, BD, or the pooled SZ + BD group for nine different mtSNP alleles that passed a nominal threshold of
Case-control | SNP Affymetrix ID | dbSNP RS ID | rCRS pos | Mutation | Risk allele | N | OR | STAT | p |
---|---|---|---|---|---|---|---|---|---|
BD | SNP_A-8574923 | rs28357968 | 3666 | G3666A | A | 4900 | 1.81 | 2.328 | 0.020 |
BD | SNP_A-8574914 | rs28357375 | 15784 | T15784C | C | 4902 | 1.49 | 1.982 | 0.048 |
SZ | SNP_A-8574733 | rs2857291 | 195 | T195C | T | 4965 | 0.82 | −3.547 | 0.000 |
SZ | SNP_A-8574778 | rs3937033 | 16519 | T16519C | C | 5039 | 0.92 | −2.131 | 0.033 |
SZ + BD | SNP_A-8574733 | rs2857291 | 195 | T195C | T | 5896 | 0.87 | −3.375 | 0.001 |
SZ + BD | SNP_A-8574923 | rs28357968 | 3666 | G3666A | A | 6037 | 1.58 | 2.051 | 0.040 |
SZ + BD | SNP_A-8574991 | rs28380140 | 9377 | A9377G | A | 6040 | 0.46 | −2.046 | 0.041 |
SZ + BD | SNP_A-8574741 | rs3088053 | 11812 | A11812G | A | 6015 | 0.86 | −2.086 | 0.037 |
SZ + BD | SNP_A-8574553 | rs2853497 | 12007 | G12007A | G | 6040 | 0.77 | −2.098 | 0.036 |
BD vs. SZ | SNP_A-8574692 | rs2853515 | 263 | A263G | A | 2094 | 1.76 | 2.3 | 0.021 |
BD vs. SZ | SNP_A-8574530 | rs1599988 | 4216 | T4216C | C | 1994 | 1.71 | 2.451 | 0.014 |
BD vs. SZ | SNP_A-8574778 | rs3937033 | 16519 | T16519C | T | 2098 | 1.11 | 2.02 | 0.043 |
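The STAT column and the final (p-value) column of the association table above can be cross-checked against each other. The sketch below assumes that STAT is the Wald statistic of the logistic regression coefficient and that the p-value is its two-sided asymptotic (standard normal) tail probability. That reading is an assumption about the output format, not something stated in the text. Under it, each reported p-value is recovered to within rounding.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# (comparison, rCRS position, STAT, reported p) copied from the table above.
ASSOCIATIONS = [
    ("BD", 3666, 2.328, 0.020),
    ("BD", 15784, 1.982, 0.048),
    ("SZ", 195, -3.547, 0.000),
    ("SZ", 16519, -2.131, 0.033),
    ("SZ + BD", 195, -3.375, 0.001),
    ("SZ + BD", 3666, 2.051, 0.040),
    ("SZ + BD", 9377, -2.046, 0.041),
    ("SZ + BD", 11812, -2.086, 0.037),
    ("SZ + BD", 12007, -2.098, 0.036),
    ("BD vs. SZ", 263, 2.3, 0.021),
    ("BD vs. SZ", 4216, 2.451, 0.014),
    ("BD vs. SZ", 16519, 2.02, 0.043),
]

for comparison, position, stat, reported_p in ASSOCIATIONS:
    p = two_sided_p_from_z(stat)
    print(f"{comparison:9s} mt{position:<6d} STAT={stat:+.3f} "
          f"p={p:.5f} (reported {reported_p:.3f})")
```

Printing more decimal places is useful for the SZ T195C row, whose reported value of 0.000 is a rounded figure rather than an exact zero.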
Case-control | Risk allele | Posn. | Base | A | G | C | T | Location | Amino Change | Syn? |
---|---|---|---|---|---|---|---|---|---|---|
BD | A | 3666 | G | 2646 | ND1 | Gly → Gly | Yes | |||
BD | C | 15784 | T | 2619 | Cytb | Pro → Pro | Yes | |||
SZ | T | 195 | T | 11 | 1574 | D-Loop | ||||
SZ | C | 16519 | T | 1115 | D-Loop | |||||
SZ + BD | T | 195 | T | 11 | 280 (P) | 1574 | D-Loop | |||
SZ + BD | A | 3666 | G | 58 (R) | 2646 | ND1 | Gly → Gly | Yes | ||
SZ + BD | A | 9377 | A | 2661 | COIII | Trp → Trp | Yes | |||
SZ + BD | A | 11812 | A | 2616 | 88 (P) | ND4 | Leu → Leu | Yes | ||
SZ + BD | G | 12007 | G | 96 (P) | 2608 | ND4 | Trp → Trp | Yes | ||
BD vs. SZ | A | 263 | A | 6 (R) | 1861 | D-Loop | ||||
BD vs. SZ | C | 4216 | T | 2460 | ND1 | Tyr → His | No | |||
BD vs. SZ | T | 16519 | T | 1115 | D-Loop |
The haplogroup-defining SNP (A12308G) and the T195C allele in D-Loop region were combined in an exploratory
12308 allele | 195 allele | Control | SZ | BD |
---|---|---|---|---|
A | T | 2600 | 761 | 629 |
A | C | 417 | 82 | 81 |
G | T | 701 | 193 | 180 |
G | C | 168 | 37 | 40 |
Total N | | 3886 | 1073 | 930 |
CMH | | | 10.62 | 2.69 |
df | | | 2 | 2 |
CMH p | | | | 0.1007 |
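The counts in the table above are sufficient to recompute a Cochran-Mantel-Haenszel (CMH) statistic. The sketch below treats the 12308 allele (A or G) as the stratum and tests the 195 allele (T or C) against case status within each stratum, without a continuity correction. This layout is an assumption about how the exploratory test was set up. The p-value is computed on 1 degree of freedom, the usual choice for stratified 2 × 2 tables, which differs from the df row in the table.

```python
import math

def cmh_statistic(strata):
    """CMH chi-square (no continuity correction) for a list of 2x2 tables.

    Each table is ((a, b), (c, d)) with rows = 195 allele (T, C)
    and columns = (control, case).
    """
    diff_sum = 0.0
    var_sum = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        expected_a = row1 * col1 / n
        variance_a = row1 * row2 * col1 * col2 / (n * n * (n - 1))
        diff_sum += a - expected_a
        var_sum += variance_a
    return diff_sum ** 2 / var_sum

def chi2_p_value_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Strata: 12308 A, then 12308 G. Counts copied from the table above.
SZ_STRATA = [((2600, 761), (417, 82)), ((701, 193), (168, 37))]
BD_STRATA = [((2600, 629), (417, 81)), ((701, 180), (168, 40))]

cmh_sz = cmh_statistic(SZ_STRATA)
cmh_bd = cmh_statistic(BD_STRATA)
print(f"SZ vs control: CMH = {cmh_sz:.2f}, p (1 df) = {chi2_p_value_1df(cmh_sz):.4f}")
print(f"BD vs control: CMH = {cmh_bd:.2f}, p (1 df) = {chi2_p_value_1df(cmh_bd):.4f}")
```

Under these assumptions the recomputed statistics agree with the CMH row of the table to two decimal places, which supports reading the two CMH values as the SZ and BD comparisons against controls.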
In summary, when SZ and BD were compared to combined controls from the GAIN and WTCCC2 datasets, there was weak evidence of association after correction for MDS covariates and after removal of the stratification effect of a major haplogroup-defining SNP. The present findings suggest that the major haplogroup-specific EA mtSNPs occur at similar frequencies in SZ, BD, and controls, while more sporadically occurring mtSNPs in the hypervariable region, T195C and T16519C, showed preliminary associations with SZ and with pooled SZ and BD.
We confirmed an increase in brain levels of the 4,977 bp mtDNA common deletion previously reported in BD, aging, and sex (Kato et al.,
The levels of common deletion vary significantly across brain regions, and are particularly high in mesostriatal and mesolimbic regions such as the SN, putamen, NACC, CAUN, and amygdala (Figure
A caveat to measurements of homogenized dissected post-mortem brain regions is the lack of specific cellular phenotypes for the mtDNA common deletion, although it is unlikely that cells with deleterious mutations survived with huge burdens of heteroplasmy. Another issue we could not address in the current study is sub-regional, layer-specific, or cell-specific differences in the mtDNA common deletion. While it is possible to use laser capture microdissection in post-mortem brain to examine individual cells for possible clonal expansion, we did not initially perform this analysis, as we were interested in the brain regional distribution of somatic mutations. In the future, this approach could be fruitful in regions that show very high common deletion rates, although initial studies capturing cell nuclei are not promising as very few mitochondria are captured from cell bodies (Mamdani, F., personal communication). Important details about patients are not precisely known. For example, the exact age of onset is unavailable, and there is only partial information regarding the decade of life when psychotic symptoms began for SZ or mood symptoms for BD. The lifetime medication histories are lengthy, with most patients receiving at least 5–10 different medications, and we are not aware of the cumulative effects of those medications on common deletion levels in brain, although we have proposed to study this relationship. These factors could contribute false negatives or false positives to the results of this study.
This study found 7 novel and 142 rare homoplasmic mutations that appeared in 23 mtDNA genomes derived from DLPFC (Table S5 in Supplementary Material) and in blood samples also analyzed. Thus, we did not find somatic homoplasmic mutations that differed in brain compared to blood. In this study we focused on homoplasmic mutations with high coverage. Although heteroplasmy was observed, we are currently in the process of validating and analyzing our heteroplasmy findings, to be reported in a future manuscript. Five homoplasmic mutations selected for validation by direct sequencing were confirmed. However, we did observe some false positives with this sequencing approach at some heteroplasmic loci, such as the previously reported A > C at mtDNA 3492 position, a known sequencing error hotspot (Li et al.,
Our findings support prior reports that the non-coding hypervariable mtDNA regions have more novel mutations than coding regions (Jazin et al.,
In this study we tested the largest sample to date for common mtSNP association with either SZ or BD (
While not directly affecting mitochondrial function by altering the protein sequence, several of the nine mtSNPs in the hypervariable region that showed preliminary association, such as 16519, have the highest world-wide mutation rates (Soares et al.,
A recent paper did not find an association of T16519C with SZ in a Spanish EA haplogroup case-control analysis (Mosquera-Miguel et al.,
Differences between the current study and Mosquera-Miguel et al. (
In conclusion, the large somatic common deletion was the most variable result in post-mortem brain, highly age-dependent, and replicates many prior reports (Corral-Debrinski et al.,
We also tested whether single point mutations would be found in different brain regions from the same brain. In this initial study we were unable to find evidence for homoplasmic mutations occurring independently across brain regions. We found the NGS technique to work reliably with some notable exceptions, and have since improved our throughput by multiplexing 24 samples per lane on an Illumina HiSeq instrument, maintaining excellent coverage with 100 bp single reads. We also found evidence that some mtDNA variants are associated with SZ and BD protection or susceptibility at the population level using a large sample of patients and controls and available genotype data.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at
We would like to thank Anja Kammesheidt and Wei Guo at Ambry Genetics for their efforts in mtDNA sequencing and alignment. We appreciate the contributions of the UC Irvine Davis Brain Bank personnel: Preston Cartagena, Psy.D, David Walsh, Psy.D, Richard Stein, Ph.D., Kathleen Burke, and Claudia Cervantes, as well as Jacque Berndt and the investigators and medical examiners at the Orange County Coroner’s Office. The collection of brain tissue was supported by funding from the NIMH Conte Center Grant P50 MH60398, Pritzker Family Philanthropic Fund, and NIMH Mitochondria Grant R01MH085801 (MPV). We thank Drs. Taosheng Huang and Sha Tang for discussion and early access to their data for heteroplasmy detection. M.v.O. received financial support by the Netherlands Forensic Institute (NFI) and Erasmus MC via the Department of Forensic Molecular Biology of Erasmus MC, and by a grant from the Netherlands Genomics Initiative (NGI)/Netherlands Organization for Scientific Research (NWO) within the framework of the Forensic Genomics Consortium Netherlands (FGCN). The research was further supported by the William Lion Penzner Foundation (Dept of Psychiatry), Della Martin Foundation (WEB, AS), and NIMH Grant R01MH085801 (MPV). This study makes use of data generated by the Wellcome Trust Case-Control Consortium. A full list of the investigators who contributed to the generation of the data is available from
Funding support for the whole genome association study of bipolar disorder was provided by the National Institute of Mental Health (NIMH) and the genotyping of samples was provided through the GAIN. The datasets used for the analyses described in this manuscript were obtained from the dbGaP found at
Funding support for the genome-wide association of schizophrenia study was provided by the National Institute of Mental Health (R01 MH67257, R01 MH59588, R01 MH59571, R01 MH59565, R01 MH59587, R01 MH60870, R01 MH59566, R01 MH59586, R01 MH61675, R01 MH60879, R01 MH81800, U01 MH46276, U01 MH46289 U01 MH46318, U01 MH79469, and U01 MH79470) and the genotyping of samples was provided through the Genetic Association Information Network (GAIN). The datasets used for the analyses described in this manuscript were obtained from the database of Genotypes and Phenotypes (dbGaP) found at