Using dendritic heat maps to simultaneously display genotype divergence with phenotype divergence

Matthew Kellom, Jason Raymond

Research output: Contribution to journal › Article

Abstract

The advancement of techniques to visualize and analyze large-scale sequencing datasets is an area of active research and is rooted in traditional techniques such as heat maps and dendrograms. We introduce dendritic heat maps that display heat map results over aligned DNA sequence clusters for a range of clustering cutoffs. Dendritic heat maps aid in visualizing the effects of group differences on clustering hierarchy and relative abundance of sampled sequences. Here, we artificially generate two separate datasets with simplified mutation and population growth procedures with GC content group separation to use as example phenotypes. In this work, we use the term phenotype to represent any feature by which groups can be separated. These sequences were clustered in a fractional identity range of 0.75 to 1.0 using agglomerative minimum-, maximum-, and average-linkage algorithms, as well as a divisive centroid-based algorithm. We demonstrate that dendritic heat maps give freedom to scrutinize specific clustering levels across a range of cutoffs, track changes in phenotype inequity across multiple levels of sequence clustering specificity, and easily visualize how deeply rooted changes in phenotype inequity are in a dataset. As genotypes diverge in sample populations, clusters are shown to break apart into smaller clusters at higher identity cutoff levels, similar to a dendrogram. Phenotype divergence, which is shown as a heat map of relative abundance bin response, may or may not follow genotype divergences. This joined view highlights the relationship between genotype and phenotype divergence for treatment groups. We discuss the minimum-, maximum-, average-, and centroid-linkage algorithm approaches to building dendritic heat maps and make a case for the divisive "top-down" centroid-based clustering methodology as being the best option to visualize the effects of changing factors on clustering hierarchy and relative abundance.
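
The agglomerative part of the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal, pure-Python illustration that assumes pre-aligned sequences of equal length, uses a naive greedy merge, and covers only the minimum-, maximum-, and average-linkage variants (the divisive centroid-based algorithm is not shown). The names `identity`, `cluster`, and `group_fractions`, the toy sequences, and the `low_gc`/`high_gc` labels are all hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(seqs, cutoff, linkage="average"):
    """Merge the most similar pair of clusters until no pair reaches the cutoff.

    Linkage names refer to distance, so they are inverted for identity:
    minimum-linkage scores a pair of clusters by its highest pairwise
    identity, maximum-linkage by its lowest, average-linkage by the mean.
    """
    score = {
        "minimum": max,
        "maximum": min,
        "average": lambda values: sum(values) / len(values),
    }[linkage]
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            values = [identity(seqs[p], seqs[q])
                      for p in clusters[i] for q in clusters[j]]
            s = score(values)
            if s > best:
                best, pair = s, (i, j)
        if best < cutoff:
            break
        i, j = pair  # i < j, so deleting j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def group_fractions(clusters, groups):
    """Relative abundance of each group label within each cluster."""
    fractions = []
    for members in clusters:
        counts = {}
        for idx in members:
            counts[groups[idx]] = counts.get(groups[idx], 0) + 1
        fractions.append({g: n / len(members) for g, n in counts.items()})
    return fractions


# Hypothetical toy data: five aligned sequences in two GC-content groups.
seqs = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT", "GGGGGGGG", "GGGGGGGC"]
groups = ["low_gc", "low_gc", "low_gc", "high_gc", "high_gc"]

if __name__ == "__main__":
    # Sweep a few cutoffs inside the 0.75 to 1.0 range named in the abstract.
    for cutoff in (0.75, 0.875, 1.0):
        result = cluster(seqs, cutoff, linkage="average")
        print(cutoff, result, group_fractions(result, groups))
```

Sweeping the cutoff upward splits clusters into smaller ones, which is the dendrogram-like behavior the abstract describes; the per-cluster group fractions are the kind of quantity a heat map cell could encode.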

Original language: English (US)
Article number: e0161292
Journal: PLoS One
Volume: 11
Issue number: 8
DOI: 10.1371/journal.pone.0161292
State: Published - Aug 1 2016

Fingerprint

Hot Temperature
Cluster Analysis
Genotype
Phenotype
heat
phenotype
genotype
linkage (genetics)
group effect
Population Growth
DNA sequences
Base Composition
Bins
population growth
methodology
mutation
nucleotide sequences
Mutation
Research
Population

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Using dendritic heat maps to simultaneously display genotype divergence with phenotype divergence. / Kellom, Matthew; Raymond, Jason.

In: PLoS One, Vol. 11, No. 8, e0161292, 01.08.2016.

@article{96d02935306e40818f87983776e7af56,
title = "Using dendritic heat maps to simultaneously display genotype divergence with phenotype divergence",
abstract = "The advancement of techniques to visualize and analyze large-scale sequencing datasets is an area of active research and is rooted in traditional techniques such as heat maps and dendrograms. We introduce dendritic heat maps that display heat map results over aligned DNA sequence clusters for a range of clustering cutoffs. Dendritic heat maps aid in visualizing the effects of group differences on clustering hierarchy and relative abundance of sampled sequences. Here, we artificially generate two separate datasets with simplified mutation and population growth procedures with GC content group separation to use as example phenotypes. In this work, we use the term phenotype to represent any feature by which groups can be separated. These sequences were clustered in a fractional identity range of 0.75 to 1.0 using agglomerative minimum-, maximum-, and average-linkage algorithms, as well as a divisive centroid-based algorithm. We demonstrate that dendritic heat maps give freedom to scrutinize specific clustering levels across a range of cutoffs, track changes in phenotype inequity across multiple levels of sequence clustering specificity, and easily visualize how deeply rooted changes in phenotype inequity are in a dataset. As genotypes diverge in sample populations, clusters are shown to break apart into smaller clusters at higher identity cutoff levels, similar to a dendrogram. Phenotype divergence, which is shown as a heat map of relative abundance bin response, may or may not follow genotype divergences. This joined view highlights the relationship between genotype and phenotype divergence for treatment groups. We discuss the minimum-, maximum-, average-, and centroid-linkage algorithm approaches to building dendritic heat maps and make a case for the divisive {"}top-down{"} centroid-based clustering methodology as being the best option to visualize the effects of changing factors on clustering hierarchy and relative abundance.",
author = "Matthew Kellom and Jason Raymond",
year = "2016",
month = "8",
day = "1",
doi = "10.1371/journal.pone.0161292",
language = "English (US)",
volume = "11",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "8",

}

TY - JOUR

T1 - Using dendritic heat maps to simultaneously display genotype divergence with phenotype divergence

AU - Kellom, Matthew

AU - Raymond, Jason

PY - 2016/8/1

Y1 - 2016/8/1

N2 - The advancement of techniques to visualize and analyze large-scale sequencing datasets is an area of active research and is rooted in traditional techniques such as heat maps and dendrograms. We introduce dendritic heat maps that display heat map results over aligned DNA sequence clusters for a range of clustering cutoffs. Dendritic heat maps aid in visualizing the effects of group differences on clustering hierarchy and relative abundance of sampled sequences. Here, we artificially generate two separate datasets with simplified mutation and population growth procedures with GC content group separation to use as example phenotypes. In this work, we use the term phenotype to represent any feature by which groups can be separated. These sequences were clustered in a fractional identity range of 0.75 to 1.0 using agglomerative minimum-, maximum-, and average-linkage algorithms, as well as a divisive centroid-based algorithm. We demonstrate that dendritic heat maps give freedom to scrutinize specific clustering levels across a range of cutoffs, track changes in phenotype inequity across multiple levels of sequence clustering specificity, and easily visualize how deeply rooted changes in phenotype inequity are in a dataset. As genotypes diverge in sample populations, clusters are shown to break apart into smaller clusters at higher identity cutoff levels, similar to a dendrogram. Phenotype divergence, which is shown as a heat map of relative abundance bin response, may or may not follow genotype divergences. This joined view highlights the relationship between genotype and phenotype divergence for treatment groups. We discuss the minimum-, maximum-, average-, and centroid-linkage algorithm approaches to building dendritic heat maps and make a case for the divisive "top-down" centroid-based clustering methodology as being the best option to visualize the effects of changing factors on clustering hierarchy and relative abundance.

AB - The advancement of techniques to visualize and analyze large-scale sequencing datasets is an area of active research and is rooted in traditional techniques such as heat maps and dendrograms. We introduce dendritic heat maps that display heat map results over aligned DNA sequence clusters for a range of clustering cutoffs. Dendritic heat maps aid in visualizing the effects of group differences on clustering hierarchy and relative abundance of sampled sequences. Here, we artificially generate two separate datasets with simplified mutation and population growth procedures with GC content group separation to use as example phenotypes. In this work, we use the term phenotype to represent any feature by which groups can be separated. These sequences were clustered in a fractional identity range of 0.75 to 1.0 using agglomerative minimum-, maximum-, and average-linkage algorithms, as well as a divisive centroid-based algorithm. We demonstrate that dendritic heat maps give freedom to scrutinize specific clustering levels across a range of cutoffs, track changes in phenotype inequity across multiple levels of sequence clustering specificity, and easily visualize how deeply rooted changes in phenotype inequity are in a dataset. As genotypes diverge in sample populations, clusters are shown to break apart into smaller clusters at higher identity cutoff levels, similar to a dendrogram. Phenotype divergence, which is shown as a heat map of relative abundance bin response, may or may not follow genotype divergences. This joined view highlights the relationship between genotype and phenotype divergence for treatment groups. We discuss the minimum-, maximum-, average-, and centroid-linkage algorithm approaches to building dendritic heat maps and make a case for the divisive "top-down" centroid-based clustering methodology as being the best option to visualize the effects of changing factors on clustering hierarchy and relative abundance.

UR - http://www.scopus.com/inward/record.url?scp=84985040970&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84985040970&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0161292

DO - 10.1371/journal.pone.0161292

M3 - Article

C2 - 27536963

AN - SCOPUS:84985040970

VL - 11

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 8

M1 - e0161292

ER -