Project Summary

De novo DNA sequence mutations are mutations that are not inherited from a parent. They have been identified as playing an important role in many disorders, including autism, schizophrenia, and heart conditions. However, de novo mutations can be difficult to identify because sequencing errors may be just as common as these rare mutations. Current approaches to analyzing DNA sequence data are inadequate for identifying de novo mutations at a genome scale because each potential de novo mutation must be confirmed through a costly and time-consuming validation process. Our goal is to improve the identification of de novo mutations so that it will be possible to understand their role in genetic disorders. We will develop a novel statistical approach to identify de novo mutations, and we will implement it in software to make our method readily available to other researchers.

Our first objective is to determine the probability that an apparent DNA sequence change is due to a de novo point mutation. To determine this probability, we will integrate over other possible sources of error and noise, including sequencing error, population diversity, and chromosome segregation. Second, we will expand this model to detect complex de novo mutations, including insertions, deletions, and copy-number variations. Third, we will develop new models to handle data from single-cell sequencing, which generates different probabilities of error from those of the sequencing data considered in the first two objectives. These methods can be applied to study mutation variation during germ-line development and the nature of human disease at the cellular level. Finally, we will implement these methods in an easy-to-use software package that will make the identification of de novo mutations readily available. We expect it to be used by scientists working on subjects ranging from variation in mutation rates to the effects of aging.
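To make the first objective concrete, the following is a minimal toy sketch of the kind of calculation it describes, not the project's actual model or software. It assumes a single biallelic site in a mother-father-child trio, with read counts given as (reference reads, alternate reads). All function names, parameter names, and default values (`alt_freq`, `mutation_rate`, `error_rate`) are hypothetical and chosen only for illustration. The sketch sums over every combination of parental genotypes (population diversity), transmitted alleles (chromosome segregation), and read outcomes (sequencing error) to obtain the posterior probability that at least one transmitted allele mutated.

```python
from itertools import product
from math import comb


def read_likelihood(n_ref, n_alt, genotype, error_rate):
    """Binomial likelihood of the read counts given a genotype (0, 1, or 2 alt alleles)."""
    # Probability that a single read shows the alt allele, allowing for sequencing error.
    p_alt = (genotype / 2) * (1 - error_rate) + (1 - genotype / 2) * error_rate
    return comb(n_ref + n_alt, n_alt) * p_alt**n_alt * (1 - p_alt) ** n_ref


def genotype_prior(genotype, alt_freq):
    """Hardy-Weinberg prior on a parental genotype (assumed population model)."""
    return comb(2, genotype) * alt_freq**genotype * (1 - alt_freq) ** (2 - genotype)


def transmit_prob(allele, parent_genotype):
    """Mendelian probability that a parent passes on the given allele (1 = alt)."""
    p_alt = parent_genotype / 2
    return p_alt if allele == 1 else 1 - p_alt


def de_novo_posterior(mother, father, child,
                      alt_freq=0.001, mutation_rate=1e-8, error_rate=0.005):
    """Posterior probability that at least one transmitted allele mutated.

    mother, father, child are (n_ref, n_alt) read-count tuples.
    The default rates are illustrative assumptions, not estimates.
    """
    total = 0.0
    with_mutation = 0.0
    for gm, gf in product(range(3), repeat=2):
        # Joint weight of the parental genotypes and the parents' reads.
        parents = (genotype_prior(gm, alt_freq) * genotype_prior(gf, alt_freq)
                   * read_likelihood(*mother, gm, error_rate)
                   * read_likelihood(*father, gf, error_rate))
        # Enumerate transmitted alleles and whether each one mutated.
        for am, af, mut_m, mut_f in product((0, 1), repeat=4):
            p = transmit_prob(am, gm) * transmit_prob(af, gf)
            p *= mutation_rate if mut_m else 1 - mutation_rate
            p *= mutation_rate if mut_f else 1 - mutation_rate
            # A mutation flips the transmitted allele; the child carries the sum.
            child_genotype = (am ^ mut_m) + (af ^ mut_f)
            weight = parents * p * read_likelihood(*child, child_genotype, error_rate)
            total += weight
            if mut_m or mut_f:
                with_mutation += weight
    return with_mutation / total
```

Under these assumptions, a call such as `de_novo_posterior((30, 0), (30, 0), (15, 15))` weighs a new mutation against the alternative that a parent carries the variant but happened to show no supporting reads, whereas a single alternate read in the child is far more readily explained by sequencing error.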
Effective start/end date: 5/8/14 → 5/29/18
- HHS: National Institutes of Health (NIH): $949,132.00