A wide variety of functional domains exist within human genes. Because domains differ in their contribution to overall gene function, the likelihood that a mutation in a given gene region produces disease varies among domains. We tested two hypotheses regarding the distribution of mutations among functional domains by using (1) sets of single-nucleotide disease mutations for six genes (CFTR, TSC2, G6PD, PAX6, RS1, and PAH) and (2) sets of polymorphic replacement and silent mutations found in two genes (CFTR and TSC2). First, we tested the null hypothesis that sets of mutations are uniformly distributed among functional domains within genes. Second, we tested the null hypothesis that disease mutations are distributed among gene regions according to expectations derived from the distribution of evolutionarily conserved and variable amino acid sites throughout each gene. In contrast to the mainly uniform distribution of the sets of silent and polymorphic mutations, the sets of disease mutations generally rejected the null hypotheses of both uniform and evolutionarily influenced distributions. Although the disease mutation data showed better agreement with the evolutionarily derived expectations, disease mutations were statistically overabundant in conserved domains and under-represented in variable regions, even after accounting for the amino acid site variability of domains over long-term evolutionary history. This finding suggests that amino acid site conservation has a non-additive influence on the observed intragenic distribution of disease mutations, and it underscores the importance of understanding the patterns of neutral amino acid substitutions permitted in a gene over long-term evolutionary history.
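The two null hypotheses can be illustrated with a goodness-of-fit comparison of observed and expected mutation counts per domain. The sketch below is only an illustration, not the authors' procedure: the abstract does not name the test used, so a Pearson chi-square statistic is assumed. It also assumes that "uniform" means expected counts proportional to domain length, and that the evolutionary expectation is proportional to the number of conserved sites in each domain. All numbers and names (`domain_lengths`, `conserved_sites`, `observed`) are hypothetical.

```python
def expected_counts(weights, total):
    """Distribute `total` mutations across domains in proportion to `weights`."""
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


def chi_square(observed, expected):
    """Pearson chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical three-domain gene (illustrative values, not data from the study).
domain_lengths = [100, 200, 100]   # amino acid sites per domain (assumed)
conserved_sites = [80, 60, 60]     # conserved sites per domain (assumed)
observed = [30, 10, 20]            # disease mutations observed per domain (assumed)

total = sum(observed)

# Null hypothesis 1: mutations fall uniformly, i.e. in proportion to domain length.
uniform_expected = expected_counts(domain_lengths, total)
chi_uniform = chi_square(observed, uniform_expected)

# Null hypothesis 2: mutations fall in proportion to conserved sites per domain.
conserved_expected = expected_counts(conserved_sites, total)
chi_conserved = chi_square(observed, conserved_expected)

print(f"uniform expectation:   {uniform_expected}, chi-square = {chi_uniform:.3f}")
print(f"conserved expectation: {conserved_expected}, chi-square = {chi_conserved:.3f}")
```

Under these assumptions, each statistic would be compared against a chi-square distribution with one fewer degree of freedom than the number of domains. A smaller statistic under the second expectation corresponds to the "better agreement" with evolutionarily derived expectations described above.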