🍩 Database of Original & Non-Theoretical Uses of Topology

(found 2 matches in 0.001576s)
  1. Topological Data Analysis for Genomics and Evolution: Topology in Biology (2019)

    Raul Rabadan, Andrew J. Blumberg
    Abstract Biology has entered the age of Big Data. A technical revolution has transformed the field, and extracting meaningful information from large biological data sets is now a central methodological challenge. Algebraic topology is a well-established branch of pure mathematics that studies qualitative descriptors of the shape of geometric objects. It aims to reduce comparisons of shape to a comparison of algebraic invariants, such as numbers, which are typically easier to work with. Topological data analysis is a rapidly developing subfield that leverages the tools of algebraic topology to provide robust multiscale analysis of data sets. This book introduces the central ideas and techniques of topological data analysis and its specific applications to biology, including the evolution of viruses, bacteria and humans, genomics of cancer, and single cell characterization of developmental processes. Bridging two disciplines, the book is for researchers and graduate students in genomics and evolutionary biology as well as mathematicians interested in applied topology.
  2. Fast Estimation of Recombination Rates Using Topological Data Analysis (2019)

    Devon P. Humphreys, Melissa R. McGuirl, Michael Miyagi, Andrew J. Blumberg
    Abstract Accurate estimation of recombination rates is critical for studying the origins and maintenance of genetic diversity. Because the inference of recombination rates under a full evolutionary model is computationally expensive, we developed an alternative approach using topological data analysis (TDA) on genome sequences. We find that this method can analyze datasets larger than what can be handled by any existing recombination inference software, and has accuracy comparable to commonly used model-based methods with significantly less processing time. Previous TDA methods used information contained solely in the first Betti number (\textlessimg class="highwire-embed" alt="Embedded Image" src="http://www.genetics.org/sites/default/files/highwire/genetics/211/4/1191/embed/mml-math-1.gif"/\textgreater) of a set of genomes, which aims to capture the number of loops that can be detected within a genealogy. These explorations have proven difficult to connect to the theory of the underlying biological process of recombination, and, consequently, have unpredictable behavior under perturbations of the data. We introduce a new topological feature, which we call ψ, with a natural connection to coalescent models, and present novel arguments relating \textlessimg class="highwire-embed" alt="Embedded Image" src="http://www.genetics.org/sites/default/files/highwire/genetics/211/4/1191/embed/mml-math-2.gif"/\textgreater to population genetic models. Using simulations, we show that ψ and \textlessimg class="highwire-embed" alt="Embedded Image" src="http://www.genetics.org/sites/default/files/highwire/genetics/211/4/1191/embed/mml-math-3.gif"/\textgreater are differentially affected by missing data, and package our approach as TREE (Topological Recombination Estimator). TREE’s efficiency and accuracy make it well suited as a first-pass estimator of recombination rate heterogeneity or hotspots throughout the genome. Our work empirically and theoretically justifies the use of topological statistics as summaries of genome sequences and describes a new, unintuitive relationship between topological features of the distribution of sequence data and the footprint of recombination on genomes.