Original Research ARTICLE

Front. Genet., 01 April 2014 | http://dx.doi.org/10.3389/fgene.2014.00062

Evaluating the impact of genotype errors on rare variant tests of association

Kaitlyn Cook¹, Alejandra Benitez², Casey Fu³ and Nathan Tintle⁴*
  • ¹Department of Mathematics, Carleton College, Northfield, MN, USA
  • ²Department of Applied Mathematics, Brown University, Providence, RI, USA
  • ³Department of Mathematics, Massachusetts Institute of Technology, Boston, MA, USA
  • ⁴Department of Mathematics, Statistics and Computer Science, Dordt College, Sioux Center, IA, USA

The new class of rare variant tests has usually been evaluated assuming perfect genotype information. In reality, rare variant genotypes may be incorrect, and so rare variant tests should be robust to imperfect data. Errors and uncertainty in SNP genotyping are already known to dramatically reduce statistical power for single-marker tests on common variants and, in some cases, to inflate the type I error rate. Recent results show that uncertainty in genotype calls derived from sequencing reads depends on several factors, including read depth, calling algorithm, number of alleles present in the sample, and the frequency at which an allele segregates in the population. We have recently proposed a general framework for the evaluation and investigation of rare variant tests of association, classifying most rare variant tests into one of two broad categories (length or joint tests). We use this framework to relate factors affecting genotype uncertainty to the power and type I error rate of rare variant tests. We find that non-differential genotype errors (an error process that occurs independently of phenotype) decrease power, with larger decreases for extremely rare variants and for common homozygote to heterozygote errors. Differential genotype errors (an error process that is associated with phenotype status) lead to inflated type I error rates, and this inflation is more likely at sites with common homozygote to heterozygote errors than at sites with heterozygote to common homozygote errors. Finally, our work suggests that certain rare variant tests and study designs may be more robust to the inclusion of genotype errors. Further work is needed to directly integrate genotype calling algorithm decisions, study costs and test statistic choices in order to provide comprehensive design and analysis advice that appropriately accounts for the impact of genotype errors.
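
The distinction between non-differential and differential genotype errors can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the simulation used in this article: the function names (`add_genotype_errors`, `simulate_null_site`), the two-genotype coding (rare homozygotes are ignored), and all rates and sample sizes are hypothetical choices made only for illustration. Both groups are drawn from the same carrier frequency, so any difference in observed carrier counts beyond sampling noise comes from the error process.

```python
import random


def add_genotype_errors(genotypes, p_hom_to_het, p_het_to_hom, rng):
    """Return a copy of `genotypes` with miscalls applied independently per call.

    Genotypes are coded 0 (common homozygote) or 1 (heterozygote).
    Rare homozygotes are ignored here, an illustrative simplification.
    """
    observed = []
    for g in genotypes:
        if g == 0 and rng.random() < p_hom_to_het:
            observed.append(1)  # common homozygote miscalled as heterozygote
        elif g == 1 and rng.random() < p_het_to_hom:
            observed.append(0)  # heterozygote miscalled as common homozygote
        else:
            observed.append(g)
    return observed


def simulate_null_site(n_per_group, carrier_freq, case_rates, control_rates, seed=0):
    """Simulate one rare variant site with no true association.

    `case_rates` and `control_rates` are (p_hom_to_het, p_het_to_hom) pairs.
    Equal pairs give a non-differential error process; unequal pairs give a
    differential one. Returns true and observed carrier counts per group.
    """
    rng = random.Random(seed)
    true_cases = [1 if rng.random() < carrier_freq else 0 for _ in range(n_per_group)]
    true_controls = [1 if rng.random() < carrier_freq else 0 for _ in range(n_per_group)]
    obs_cases = add_genotype_errors(true_cases, case_rates[0], case_rates[1], rng)
    obs_controls = add_genotype_errors(true_controls, control_rates[0], control_rates[1], rng)
    return {
        "true_cases": sum(true_cases),
        "true_controls": sum(true_controls),
        "obs_cases": sum(obs_cases),
        "obs_controls": sum(obs_controls),
    }


if __name__ == "__main__":
    # Hypothetical settings: 1% carrier frequency, 1% hom-to-het error rate.
    same = simulate_null_site(5000, 0.01, (0.01, 0.0), (0.01, 0.0))
    diff = simulate_null_site(5000, 0.01, (0.01, 0.0), (0.0, 0.0))
    print("non-differential:", same)
    print("differential (errors in cases only):", diff)
```

When the variant is very rare, nearly all true genotypes are common homozygotes, so even a small common homozygote to heterozygote error rate can add many false carriers relative to the number of true carriers. In this sketch, applying that error to cases only shifts the observed case carrier count upward while leaving controls unchanged, which is the kind of spurious case-control difference a differential error process can create.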

Keywords: SKAT, gene-based, genotype uncertainty, misclassification, dosage

Citation: Cook K, Benitez A, Fu C and Tintle N (2014) Evaluating the impact of genotype errors on rare variant tests of association. Front. Genet. 5:62. doi: 10.3389/fgene.2014.00062

Received: 02 October 2013; Accepted: 11 March 2014;
Published online: 01 April 2014.

Edited by:

Joanna Biernacka, Mayo Clinic, USA

Reviewed by:

Zheyang Wu, Worcester Polytechnic Institute, USA
Nicholas Larson, Mayo Clinic, USA

Copyright © 2014 Cook, Benitez, Fu and Tintle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nathan Tintle, Department of Mathematics, Statistics and Computer Science, Dordt College, 498 4th Ave. NE, Sioux Center, IA 51250, USA; e-mail: nathan.tintle@dordt.edu