🍩 Database of Original & Non-Theoretical Uses of Topology

(found 78 matches in 0.010504s)
  1. Diverse 3D Cellular Patterns Underlie the Development of Cardamine Hirsuta and Arabidopsis Thaliana Ovules (2023)

    Tejasvinee Atul Mody, Alexander Rolle, Nico Stucki, Fabian Roll, Ulrich Bauer, Kay Schneitz
    Abstract A fundamental question in biology is how organ morphogenesis comes about. The ovules of Arabidopsis thaliana have been established as a successful model to study numerous aspects of tissue morphogenesis; however, little is known regarding the relative contributions and dynamics of differential tissue and cellular growth and architecture in establishing ovule morphogenesis in different species. To address this issue, we generated a 3D digital atlas of Cardamine hirsuta ovule development with full cellular resolution. We combined quantitative comparative morphometrics and topological analysis to explore similarities and differences in the 3D cellular architectures underlying ovule development of the two species. We discovered that they show diversity in the way the three radial cell layers of the primordium contribute to its growth, in the formation of a new cell layer in the inner integument and, in certain cases, in the topological properties of the 3D cell architectures of homologous tissues despite their similar shape. Our work demonstrates the power of comparative 3D cellular morphometry and the importance of internal tissues and their cellular architecture in organ morphogenesis. Summary Statement Quantitative morphometric comparison of 3D digital ovules at full cellular resolution reveals diversity in internal 3D cellular architectures between similarly shaped ovules of Cardamine hirsuta and Arabidopsis thaliana.
  2. Feature Detection and Hypothesis Testing for Extremely Noisy Nanoparticle Images Using Topological Data Analysis (2023)

    Andrew M. Thomas, Peter A. Crozier, Yuchen Xu, David S. Matteson
    Abstract We propose a flexible algorithm for feature detection and hypothesis testing in images with ultra-low signal-to-noise ratio using cubical persistent homology. Our main application is in the identification of atomic columns and other features in Transmission Electron Microscopy (TEM). Cubical persistent homology is used to identify local minima and their size in subregions in the frames of nanoparticle videos, which are hypothesized to correspond to relevant atomic features. We compare the performance of our algorithm to other employed methods for the detection of columns and their intensity. Additionally, Monte Carlo goodness-of-fit testing using real-valued summaries of persistence diagrams derived from smoothed images (generated from pixels residing in the vacuum region of an image) is developed and employed to identify whether or not the proposed atomic features generated by our algorithm are due to noise. Using these summaries derived from the generated persistence diagrams, one can produce univariate time series for the nanoparticle videos, thus, providing a means for assessing fluxional behavior. A guarantee on the false discovery rate for multiple Monte Carlo testing of identical hypotheses is also established.

    Community Resources

  3. Pattern Characterization Using Topological Data Analysis: Application to Piezo Vibration Striking Treatment (2023)

    Max M. Chumley, Melih C. Yesilli, Jisheng Chen, Firas A. Khasawneh, Yang Guo
    Abstract Quantifying patterns in visual or tactile textures provides important information about the process or phenomena that generated these patterns. In manufacturing, these patterns can be intentionally introduced as a design feature, or they can be a byproduct of a specific process. Since surface texture has significant impact on the mechanical properties and the longevity of the workpiece, it is important to develop tools for quantifying surface patterns and, when applicable, comparing them to their nominal counterparts. While existing tools may be able to indicate the existence of a pattern, they typically do not provide more information about the pattern structure, or how much it deviates from a nominal pattern. Further, prior works do not provide automatic or algorithmic approaches for quantifying other pattern characteristics such as depths’ consistency, and variations in the pattern motifs at different level sets. This paper leverages persistent homology from Topological Data Analysis (TDA) to derive noise-robust scores for quantifying motifs’ depth and roundness in a pattern. Specifically, sublevel persistence is used to derive scores that quantify the consistency of indentation depths at any level set in Piezo Vibration Striking Treatment (PVST) surfaces. Moreover, we combine sublevel persistence with the distance transform to quantify the consistency of the indentation radii, and to compare them with the nominal ones. Although the tool in our PVST experiments had a semi-spherical profile, we present a generalization of our approach to tools/motifs of arbitrary shapes thus making our method applicable to other pattern-generating manufacturing processes.
  4. Topological Data Analysis of Spatial Patterning in Heterogeneous Cell Populations: Clustering and Sorting With Varying Cell-Cell Adhesion (2023)

    Dhananjay Bhaskar, William Y. Zhang, Alexandria Volkening, Björn Sandstede, Ian Y. Wong
    Abstract Different cell types aggregate and sort into hierarchical architectures during the formation of animal tissues. The resulting spatial organization depends (in part) on the strength of adhesion of one cell type to itself relative to other cell types. However, automated and unsupervised classification of these multicellular spatial patterns remains challenging, particularly given their structural diversity and biological variability. Recent developments based on topological data analysis are intriguing to reveal similarities in tissue architecture, but these methods remain computationally expensive. In this article, we show that multicellular patterns organized from two interacting cell types can be efficiently represented through persistence images. Our optimized combination of dimensionality reduction via autoencoders, combined with hierarchical clustering, achieved high classification accuracy for simulations with constant cell numbers. We further demonstrate that persistence images can be normalized to improve classification for simulations with varying cell numbers due to proliferation. Finally, we systematically consider the importance of incorporating different topological features as well as information about each cell type to improve classification accuracy. We envision that topological machine learning based on persistence images will enable versatile and robust classification of complex tissue architectures that occur in development and disease.
  5. Relational Persistent Homology for Multispecies Data With Application to the Tumor Microenvironment (2023)

    Bernadette J. Stolz, Jagdeep Dhesi, Joshua A. Bull, Heather A. Harrington, Helen M. Byrne, Iris H. R. Yoon
    Abstract Topological data analysis (TDA) is an active field of mathematics for quantifying shape in complex data. Standard methods in TDA such as persistent homology (PH) are typically focused on the analysis of data consisting of a single entity (e.g., cells or molecular species). However, state-of-the-art data collection techniques now generate exquisitely detailed multispecies data, prompting a need for methods that can examine and quantify the relations among them. Such heterogeneous data types arise in many contexts, ranging from biomedical imaging, geospatial analysis, to species ecology. Here, we propose two methods for encoding spatial relations among different data types that are based on Dowker complexes and Witness complexes. We apply the methods to synthetic multispecies data of a tumor microenvironment and analyze topological features that capture relations between different cell types, e.g., blood vessels, macrophages, tumor cells, and necrotic cells. We demonstrate that relational topological features can extract biological insight, including the dominant immune cell phenotype (an important predictor of patient prognosis) and the parameter regimes of a data-generating model. The methods provide a quantitative perspective on the relational analysis of multispecies spatial data, overcome the limits of traditional PH, and are readily computable.
  6. A Topological Machine Learning Pipeline for Classification (2022)

    Francesco Conti, Davide Moroni, Maria Antonietta Pascali
    Abstract In this work, we develop a pipeline that associates Persistence Diagrams to digital data via the most appropriate filtration for the type of data considered. Using a grid search approach, this pipeline determines optimal representation methods and parameters. The development of such a topological pipeline for Machine Learning involves two crucial steps that strongly affect its performance: firstly, digital data must be represented as an algebraic object with a proper associated filtration in order to compute its topological summary, the Persistence Diagram. Secondly, the persistence diagram must be transformed with suitable representation methods in order to be introduced in a Machine Learning algorithm. We assess the performance of our pipeline, and in parallel, we compare the different representation methods on popular benchmark datasets. This work is a first step toward both an easy and ready-to-use pipeline for data classification using persistent homology and Machine Learning, and to understand the theoretical reasons why, given a dataset and a task to be performed, a pair (filtration, topological representation) is better than another.
  7. Topological Early Warning Signals: Quantifying Varying Routes to Extinction in a Spatially Distributed Population Model (2022)

    Laura S. Storch, Sarah L. Day
    Abstract Understanding and predicting critical transitions in spatially explicit ecological systems is particularly challenging due to their complex spatial and temporal dynamics and high dimensionality. Here, we explore changes in population distribution patterns during a critical transition (an extinction event) using computational topology. Computational topology allows us to quantify certain features of a population distribution pattern, such as the level of fragmentation. We create population distribution patterns via a simple coupled patch model with Ricker map growth and nearest neighbors dispersal on a two dimensional lattice. We observe two dominant paths to extinction within the explored parameter space that depend critically on the dispersal rate d and the rate of parameter drift, Δϵ. These paths to extinction are easily topologically distinguishable, so categorization can be automated. We use this population model as a theoretical proof-of-concept for the methodology, and argue that computational topology is a powerful tool for analyzing dynamical changes in systems with noisy data that are coarsely resolved in space and/or time. In addition, computational topology can provide early warning signals for chaotic dynamical systems where traditional statistical early warning signals would fail. For these reasons, we envision this work as a helpful addition to the critical transitions prediction toolbox.
  8. Exploring Surface Texture Quantification in Piezo Vibration Striking Treatment (PVST) Using Topological Measures (2022)

    Melih C. Yesilli, Max M. Chumley, Jisheng Chen, Firas A. Khasawneh, Yang Guo
    Abstract Abstract. Surface texture influences wear and tribological properties of manufactured parts, and it plays a critical role in end-user products. Therefore, quantifying the order or structure of a manufactured surface provides important information on the quality and life expectancy of the product. Although texture can be intentionally introduced to enhance aesthetics or to satisfy a design function, sometimes it is an inevitable byproduct of surface treatment processes such as Piezo Vibration Striking Treatment (PVST). Measures of order for surfaces have been characterized using statistical, spectral, and geometric approaches. For nearly hexagonal lattices, topological tools have also been used to measure the surface order. This paper explores utilizing tools from Topological Data Analysis for measuring surface texture. We compute measures of order based on optical digital microscope images of surfaces treated using PVST. These measures are applied to the grid obtained from estimating the centers of tool impacts, and they quantify the grid’s deviations from the nominal one. Our results show that TDA provides a convenient framework for characterization of pattern type that bypasses some limitations of existing tools such as difficult manual processing of the data and the need for an expert user to analyze and interpret the surface images.
  9. Persistent Homology for Breast Tumor Classification Using Mammogram Scans (2022)

    Aras Asaad, Dashti Ali, Taban Majeed, Rasber Rashid
    Abstract An Important tool in the field topological data analysis is known as persistent Homology (PH) which is used to encode abstract representation of the homology of data at different resolutions in the form of persistence diagram (PD). In this work we build more than one PD representation of a single image based on a landmark selection method, known as local binary patterns, that encode different types of local textures from images. We employed different PD vectorizations using persistence landscapes, persistence images, persistence binning (Betti Curve) and statistics. We tested the effectiveness of proposed landmark based PH on two publicly available breast abnormality detection datasets using mammogram scans. Sensitivity of landmark based PH obtained is over 90% in both datasets for the detection of abnormal breast scans. Finally, experimental results give new insights on using different types of PD vectorizations which help in utilising PH in conjunction with machine learning classifiers.
  10. Capturing Shape Information With Multi-Scale Topological Loss Terms For 3D Reconstruction (2022)

    Dominik J. E. Waibel, Scott Atwell, Matthias Meier, Carsten Marr, Bastian Rieck
    Abstract Reconstructing 3D objects from 2D images is both challenging for our brains and machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets.
  11. Barcodes Distinguish Morphology of Neuronal Tauopathy (2022)

    David Beers, Despoina Goniotaki, Diane P. Hanger, Alain Goriely, Heather A. Harrington
    Abstract The geometry of neurons is known to be important for their functions. Hence, neurons are often classified by their morphology. Two recent methods, persistent homology and the topological morphology descriptor, assign a morphology descriptor called a barcode to a neuron equipped with a given function, such as the Euclidean distance from the root of the neuron. These barcodes can be converted into matrices called persistence images, which can then be averaged across groups. We show that when the defining function is the path length from the root, both the topological morphology descriptor and persistent homology are equivalent. We further show that persistence images arising from the path length procedure provide an interpretable summary of neuronal morphology. We introduce \topological morphology functions\, a class of functions similar to Sholl functions, that can be recovered from the associated topological morphology descriptor. To demonstrate this topological approach, we compare healthy cortical and hippocampal mouse neurons to those affected by progressive tauopathy. We find a significant difference in the morphology of healthy neurons and those with a tauopathy at a postsymptomatic age. We use persistence images to conclude that the diseased group tends to have neurons with shorter branches as well as fewer branches far from the soma.
  12. A Novel Quality Clustering Methodology on Fab-Wide Wafer Map Images in Semiconductor Manufacturing (2022)

    Yuan-Ming Hsu, Xiaodong Jia, Wenzhe Li, Jay Lee
    Abstract Abstract. In semiconductor manufacturing, clustering the fab-wide wafer map images is of critical importance for practitioners to understand the subclusters of wafer defects, recognize novel clusters or anomalies, and develop fast reactions to quality issues. However, due to the high-mix manufacturing of diversified wafer products of different sizes and technologies, it is difficult to cluster the wafer map images across the fab. This paper addresses this challenge by proposing a novel methodology for fab-wide wafer map data clustering. In the proposed methodology, a well-known deep learning technique, vision transformer with multi-head attention is first trained to convert binary wafer images of different sizes into condensed feature vectors for efficient clustering. Then, the Topological Data Analysis (TDA), which is widely used in biomedical applications, is employed to visualize the data clusters and identify the anomalies. The TDA yields a topological representation of high-dimensional big data as well as its local clusters by creating a graph that shows nodes corresponding to the clusters within the data. The effectiveness of the proposed methodology is demonstrated by clustering the public wafer map dataset WM-811k from the real application which has a total of 811,457 wafer map images. We further demonstrate the potential applicability of topology data analytics in the semiconductor area by visualization.
  13. Continuous Indexing of Fibrosis (CIF): Improving the Assessment and Classification of MPN Patients (2022)

    Hosuk Ryou, Korsuk Sirinukunwattana, Alan Aberdeen, Gillian Grindstaff, Bernadette Stolz, Helen Byrne, Heather A. Harrington, Nikolaos Sousos, Anna L. Godfrey, Claire N. Harrison, Bethan Psaila, Adam J. Mead, Gabrielle Rees, Gareth D. H. Turner, Jens Rittscher, Daniel Royston
    Abstract The detection and grading of fibrosis in myeloproliferative neoplasms (MPN) is an important component of disease classification, prognostication and disease monitoring. However, current fibrosis grading systems are only semi-quantitative and fail to capture sample heterogeneity. To improve the detection, quantitation and representation of reticulin fibrosis, we developed a machine learning (ML) approach using bone marrow trephine (BMT) samples (n = 107) from patients diagnosed with MPN or a reactive / nonneoplastic marrow. The resulting Continuous Indexing of Fibrosis (CIF) enhances the detection and monitoring of fibrosis within BMTs, and aids the discrimination of MPN subtypes. When combined with megakaryocyte feature analysis, CIF discriminates between the frequently challenging differential diagnosis of essential thrombocythemia (ET) and pre-fibrotic myelofibrosis (pre-PMF) with high predictive accuracy [area under the curve = 0.94]. CIF also shows significant promise in the identification of MPN patients at risk of disease progression; analysis of samples from 35 patients diagnosed with ET and enrolled in the Primary Thrombocythemia-1 (PT-1) trial identified features predictive of post-ET myelofibrosis (area under the curve = 0.77). In addition to these clinical applications, automated analysis of fibrosis has clear potential to further refine disease classification boundaries and inform future studies of the micro-environmental factors driving disease initiation and progression in MPN and other stem cell disorders. The image analysis methods used to generate CIF can be readily integrated with those of other key morphological features in MPNs, including megakaryocyte morphology, that lie beyond the scope of conventional histological assessment. Key PointsMachine learning enables an objective and quantitative description of reticulin fibrosis within the bone marrow of patients with myeloproliferative neoplasms (MPN),Automated analysis and Continuous Indexing of Fibrosis (CIF) captures heterogeneity within MPN samples and has utility in refined classification and disease monitoringQuantitative fibrosis assessment combined with topological data analysis may help to predict patients at increased risk of progression to post-ET myelofibrosis, and assist in the discrimination of ET and pre-fibrotic PMF (pre-PMF)
  14. A Multi-Parameter Persistence Framework for Mathematical Morphology (2021)

    Yu-Min Chung, Sarah Day, Chuan-Shen Hu
    Abstract The field of mathematical morphology offers well-studied techniques for image processing. In this work, we view morphological operations through the lens of persistent homology, a tool at the heart of the field of topological data analysis. We demonstrate that morphological operations naturally form a multiparameter filtration and that persistent homology can then be used to extract information about both topology and geometry in the images as well as to automate methods for optimizing the study and rendering of structure in images. For illustration, we apply this framework to analyze noisy binary, grayscale, and color images.
  15. Topological Data Analysis Distinguishes Parameter Regimes in the Anderson-Chaplain Model of Angiogenesis (2021)

    John T. Nardini, Bernadette J. Stolz, Kevin B. Flores, Heather A. Harrington, Helen M. Byrne
    Abstract Angiogenesis is the process by which blood vessels form from pre-existing vessels. It plays a key role in many biological processes, including embryonic development and wound healing, and contributes to many diseases including cancer and rheumatoid arthritis. The structure of the resulting vessel networks determines their ability to deliver nutrients and remove waste products from biological tissues. Here we simulate the Anderson-Chaplain model of angiogenesis at different parameter values and quantify the vessel architectures of the resulting synthetic data. Specifically, we propose a topological data analysis (TDA) pipeline for systematic analysis of the model. TDA is a vibrant and relatively new field of computational mathematics for studying the shape of data. We compute topological and standard descriptors of model simulations generated by different parameter values. We show that TDA of model simulation data stratifies parameter space into regions with similar vessel morphology. The methodologies proposed here are widely applicable to other synthetic and experimental data including wound healing, development, and plant biology.
  16. Data-Driven and Automatic Surface Texture Analysis Using Persistent Homology (2021)

    Melih C. Yesilli, Firas A. Khasawneh
    Abstract Surface roughness plays an important role in analyzing engineering surfaces. It quantifies the surface topography and can be used to determine whether the resulting surface finish is acceptable or not. Nevertheless, while several existing tools and standards are available for computing surface roughness, these methods rely heavily on user input thus slowing down the analysis and increasing manufacturing costs. Therefore, fast and automatic determination of the roughness level is essential to avoid costs resulting from surfaces with unacceptable finish, and user-intensive analysis. In this study, we propose a Topological Data Analysis (TDA) based approach to classify the roughness level of synthetic surfaces using both their areal images and profiles. We utilize persistent homology from TDA to generate persistence diagrams that encapsulate information on the shape of the surface. We then obtain feature matrices for each surface or profile using Carlsson coordinates, persistence images, and template functions. We compare our results to two widely used methods in the literature: Fast Fourier Transform (FFT) and Gaussian filtering. The results show that our approach yields mean accuracies as high as 97%. We also show that, in contrast to existing surface analysis tools, our TDA-based approach is fully automatable and provides adaptive feature extraction.
  17. Classification of COVID-19 via Homology of CT-SCAN (2021)

    Sohail Iqbal, H. Fareed Ahmed, Talha Qaiser, Muhammad Imran Qureshi, Nasir Rajpoot
    Abstract In this worldwide spread of SARS-CoV-2 (COVID-19) infection, it is of utmost importance to detect the disease at an early stage especially in the hot spots of this epidemic. There are more than 110 Million infected cases on the globe, sofar. Due to its promptness and effective results computed tomography (CT)-scan image is preferred to the reverse-transcription polymerase chain reaction (RT-PCR). Early detection and isolation of the patient is the only possible way of controlling the spread of the disease. Automated analysis of CT-Scans can provide enormous support in this process. In this article, We propose a novel approach to detect SARS-CoV-2 using CT-scan images. Our method is based on a very intuitive and natural idea of analyzing shapes, an attempt to mimic a professional medic. We mainly trace SARS-CoV-2 features by quantifying their topological properties. We primarily use a tool called persistent homology, from Topological Data Analysis (TDA), to compute these topological properties. We train and test our model on the "SARS-CoV-2 CT-scan dataset" i̧tep\soares2020sars\, an open-source dataset, containing 2,481 CT-scans of normal and COVID-19 patients. Our model yielded an overall benchmark F1 score of \$99.42\% \$, accuracy \$99.416\%\$, precision \$99.41\%\$, and recall \$99.42\%\$. The TDA techniques have great potential that can be utilized for efficient and prompt detection of COVID-19. The immense potential of TDA may be exploited in clinics for rapid and safe detection of COVID-19 globally, in particular in the low and middle-income countries where RT-PCR labs and/or kits are in a serious crisis.
  18. Topological Data Analysis of C. Elegans Locomotion and Behavior (2021)

    Ashleigh Thomas, Kathleen Bates, Alex Elchesen, Iryna Hartsock, Hang Lu, Peter Bubenik
    Abstract Video of nematodes/roundworms was analyzed using persistent homology to study locomotion and behavior. In each frame, an organism's body posture was represented by a high-dimensional vector. By concatenating points in fixed-duration segments of this time series, we created a sliding window embedding (sometimes called a time delay embedding) where each point corresponds to a sequence of postures of an organism. Persistent homology on the points in this time series detected behaviors and comparisons of these persistent homology computations detected variation in their corresponding behaviors. We used average persistence landscapes and machine learning techniques to study changes in locomotion and behavior in varying environments.
  19. Euler Characteristic Surfaces (2021)

    Gabriele Beltramo, Rayna Andreeva, Ylenia Giarratano, Miguel O. Bernabeu, Rik Sarkar, Primoz Skraba
    Abstract We study the use of the Euler characteristic for multiparameter topological data analysis. Euler characteristic is a classical, well-understood topological invariant that has appeared in numerous applications, including in the context of random fields. The goal of this paper is to present the extension of using the Euler characteristic in higher-dimensional parameter spaces. While topological data analysis of higher-dimensional parameter spaces using stronger invariants such as homology continues to be the subject of intense research, Euler characteristic is more manageable theoretically and computationally, and this analysis can be seen as an important intermediary step in multi-parameter topological data analysis. We show the usefulness of the techniques using artificially generated examples, and a real-world application of detecting diabetic retinopathy in retinal images.
  20. Topological Regularization for Dense Prediction (2021)

    Deqing Fu, Bradley J. Nelson
    Abstract Dense prediction tasks such as depth perception and semantic segmentation are important applications in computer vision that have a concrete topological description in terms of partitioning an image into connected components or estimating a function with a small number of local extrema corresponding to objects in the image. We develop a form of topological regularization based on persistent homology that can be used in dense prediction tasks with these topological descriptions. Experimental results show that the output topology can also appear in the internal activations of trained neural networks which allows for a novel use of topological regularization to the internal states of neural networks during training, reducing the computational cost of the regularization. We demonstrate that this topological regularization of internal activations leads to improved convergence and test benchmarks on several problems and architectures.
  21. The Euler Characteristic: A General Topological Descriptor for Complex Data (2021)

    Alexander Smith, Victor Zavala
    Abstract Datasets are mathematical objects (e.g., point clouds, matrices, graphs, images, fields/functions) that have shape. This shape encodes important knowledge about the system under study. Topology is an area of mathematics that provides diverse tools to characterize the shape of data objects. In this work, we study a specific tool known as the Euler characteristic (EC). The EC is a general, low-dimensional, and interpretable descriptor of topological spaces defined by data objects. We revise the mathematical foundations of the EC and highlight its connections with statistics, linear algebra, field theory, and graph theory. We discuss advantages offered by the use of the EC in the characterization of complex datasets; to do so, we illustrate its use in different applications of interest in chemical engineering such as process monitoring, flow cytometry, and microscopy. We show that the EC provides a descriptor that effectively reduces complex datasets and that this reduction facilitates tasks such as visualization, regression, classification, and clustering.
  22. TDAExplore: Quantitative Analysis of Fluorescence Microscopy Images Through Topology-Based Machine Learning (2021)

    Parker Edwards, Kristen Skruber, Nikola Milićević, James B. Heidings, Tracy-Ann Read, Peter Bubenik, Eric A. Vitriol
    Abstract Recent advances in machine learning have greatly enhanced automatic methods to extract information from fluorescence microscopy data. However, current machine-learning-based models can require hundreds to thousands of images to train, and the most readily accessible models classify images without describing which parts of an image contributed to classification. Here, we introduce TDAExplore, a machine learning image analysis pipeline based on topological data analysis. It can classify different types of cellular perturbations after training with only 20–30 high-resolution images and performs robustly on images from multiple subjects and microscopy modes. Using only images and whole-image labels for training, TDAExplore provides quantitative, spatial information, characterizing which image regions contribute to classification. Computational requirements to train TDAExplore models are modest and a standard PC can perform training with minimal user input. TDAExplore is therefore an accessible, powerful option for obtaining quantitative information about imaging data in a wide variety of applications.
  23. TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection From Chest X-Ray Images (2021)

    Mustafa Hajij, Ghada Zamzmi, Fawwaz Batayneh
    Abstract Topological Data Analysis (TDA) has emerged recently as a robust tool to extract and compare the structure of datasets. TDA identifies features in data (e.g., connected components and holes) and assigns a quantitative measure to these features. Several studies reported that topological features extracted by TDA tools provide unique information about the data, discover new insights, and determine which feature is more related to the outcome. On the other hand, the overwhelming success of deep neural networks in learning patterns and relationships has been proven on various data applications including images. To capture the characteristics of both worlds, we propose TDA-Net, a novel ensemble network that fuses topological and deep features for the purpose of enhancing model generalizability and accuracy. We apply the proposed TDA-Net to a critical application, which is the automated detection of COVID-19 from CXR images. Experimental results showed that the proposed network achieved excellent performance and suggested the applicability of our method in practice.
  24. Determining Structural Properties of Artificial Neural Networks Using Algebraic Topology (2021)

    David Pérez Fernández, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marta Villegas
    Abstract Artificial Neural Networks (ANNs) are widely used for approximating complex functions. The process that is usually followed to define the most appropriate architecture for an ANN given a specific function is mostly empirical. Once this architecture has been defined, weights are usually optimized according to the error function. On the other hand, we observe that ANNs can be represented as graphs and their topological 'fingerprints' can be obtained using Persistent Homology (PH). In this paper, we describe a proposal focused on designing more principled architecture search procedures. To do this, different architectures for solving problems related to a heterogeneous set of datasets have been analyzed. The results of the evaluation corroborate that PH effectively characterizes the ANN invariants: when ANN density (layers and neurons) or sample feeding order is the only difference, PH topological invariants appear; in the opposite direction in different sub-problems (i.e. different labels), PH varies. This approach based on topological analysis helps towards the goal of designing more principled architecture search procedures and having a better understanding of ANNs.
  25. Topological Data Analysis of Collective and Individual Epithelial Cells Using Persistent Homology of Loops (2021)

    Dhananjay Bhaskar, William Y. Zhang, Ian Y. Wong
    Abstract Interacting, self-propelled particles such as epithelial cells can dynamically self-organize into complex multicellular patterns, which are challenging to classify without a priori information. Classically, different phases and phase transitions have been described based on local ordering, which may not capture structural features at larger length scales. Instead, topological data analysis (TDA) determines the stability of spatial connectivity at varying length scales (i.e. persistent homology), and can compare different particle configurations based on the “cost” of reorganizing one configuration into another. Here, we demonstrate a topology-based machine learning approach for unsupervised profiling of individual and collective phases based on large-scale loops. We show that these topological loops (i.e. dimension 1 homology) are robust to variations in particle number and density, particularly in comparison to connected components (i.e. dimension 0 homology). We use TDA to map out phase diagrams for simulated particles with varying adhesion and propulsion, at constant population size as well as when proliferation is permitted. Next, we use this approach to profile our recent experiments on the clustering of epithelial cells in varying growth factor conditions, which are compared to our simulations. Finally, we characterize the robustness of this approach at varying length scales, with sparse sampling, and over time. Overall, we envision TDA will be broadly applicable as a model-agnostic approach to analyze active systems with varying population size, from cytoskeletal motors to motile cells to flocking or swarming animals.
  26. The Shape of Cancer Relapse: Topological Data Analysis Predicts Recurrence in Paediatric Acute Lymphoblastic Leukaemia (2021)

    Salvador Chulián, Bernadette J. Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F. Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María Victoria Martínez Sánchez, María Rosa, Víctor M. Pérez-García, Helen Byrne
    Abstract Acute Lymphoblastic Leukaemia (ALL) is the most frequent paediatric cancer. Modern therapies have improved survival rates, but approximately 15-20 % of patients relapse. At present, patients’ risk of relapse are assessed by projecting high-dimensional flow cytometry data onto a subset of biomarkers and manually estimating the shape of this reduced data. Here, we apply methods from topological data analysis (TDA), which quantify shape in data via features such as connected components and loops, to pre-treatment ALL datasets with known outcomes. We combine these fully unsupervised analyses with machine learning to identify features in the pre-treatment data that are prognostic for risk of relapse. We find significant topological differences between relapsing and non-relapsing patients and confirm the predictive power of CD10, CD20, CD38, and CD45. Further, we are able to use the TDA descriptors to predict patients who relapsed. We propose three prognostic pipelines that readily extend to other haematological malignancies. Teaser Topology reveals features in flow cytometry data which predict relapse of patients with acute lymphoblastic leukemia
  27. Measuring Hidden Phenotype: Quantifying the Shape of Barley Seeds Using the Euler Characteristic Transform (2021)

    Erik J. Amézquita, Michelle Y. Quigley, Tim Ophelders, Jacob B. Landis, Daniel Koenig, Elizabeth Munch, Daniel H. Chitwood
    Abstract Shape plays a fundamental role in biology. Traditional phenotypic analysis methods measure some features but fail to measure the information embedded in shape comprehensively. To extract, compare, and analyze this information embedded in a robust and concise way, we turn to Topological Data Analysis (TDA), specifically the Euler Characteristic Transform. TDA measures shape comprehensively using mathematical representations based on algebraic topology features. To study its use, we compute both traditional and topological shape descriptors to quantify the morphology of 3121 barley seeds scanned with X-ray Computed Tomography (CT) technology at 127 micron resolution. The Euler Characteristic Transform measures shape by analyzing topological features of an object at thresholds across a number of directional axes. A Kruskal-Wallis analysis of the information encoded by the topological signature reveals that the Euler Characteristic Transform picks up successfully the shape of the crease and bottom of the seeds. Moreover, while traditional shape descriptors can cluster the seeds based on their accession, topological shape descriptors can cluster them further based on their panicle. We then successfully train a support vector machine (SVM) to classify 28 different accessions of barley based exclusively on the shape of their grains. We observe that combining both traditional and topological descriptors classifies barley seeds better than using just traditional descriptors alone. This improvement suggests that TDA is thus a powerful complement to traditional morphometrics to comprehensively describe a multitude of “hidden” shape nuances which are otherwise not detected.
  28. Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology (2021)

    Nicole Bussola, Bruno Papa, Ombretta Melaiu, Aurora Castellano, Doriana Fruci, Giuseppe Jurman
    Abstract We introduce here a novel machine learning (ML) framework to address the issue of the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on tissue digital slides stained with the CD3 T-cell marker. The training set consists of 3782 images extracted from an original collection of 54 whole slide images (WSIs), manually annotated for a total of 73,751 lymphocytes. Resampling strategies, data augmentation, and transfer learning approaches are adopted to warrant reproducibility and to reduce the risk of overfitting and selection bias. Topological data analysis (TDA) is then used to define activation maps from different layers of the neural network at different stages of the training process, described by persistence diagrams (PD) and Betti curves. TDA is further integrated with the uniform manifold approximation and projection (UMAP) dimensionality reduction and the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm for clustering, by the deep features, the relevant subgroups and structures, across different levels of the neural network. Finally, the recent TwoNN approach is leveraged to study the variation of the intrinsic dimensionality of the U-Net model. As the main task, the proposed pipeline is employed to evaluate the density of lymphocytes over the whole tissue area of the WSIs. The model achieves good results with mean absolute error 3.1 on test set, showing significant agreement between densities estimated by our EUNet model and by trained pathologists, thus indicating the potentialities of a promising new strategy in the quantification of the immune content in NB specimens. Moreover, the UMAP algorithm unveiled interesting patterns compatible with pathological characteristics, also highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process. All the experiments were run on the Microsoft Azure cloud platform.
  29. Cubical Ripser: Software for Computing Persistent Homology of Image and Volume Data (2020)

    Shizuo Kaji, Takeki Sudo, Kazushi Ahara
    Abstract We introduce Cubical Ripser for computing persistent homology of image and volume data. To our best knowledge, Cubical Ripser is currently the fastest and the most memory-efficient program for computing persistent homology of image and volume data. We demonstrate our software with an example of image analysis in which persistent homology and convolutional neural networks are successfully combined. Our open source implementation is available at [14].
  30. Automatic Tree Ring Detection Using Jacobi Sets (2020)

    Kayla Makela, Tim Ophelders, Michelle Quigley, Elizabeth Munch, Daniel Chitwood, Asia Dowtin
    Abstract Tree ring widths are an important source of climatic and historical data, but measuring these widths typically requires extensive manual work. Computer vision techniques provide promising directions towards the automation of tree ring detection, but most automated methods still require a substantial amount of user interaction to obtain high accuracy. We perform analysis on 3D X-ray CT images of a cross-section of a tree trunk, known as a tree disk. We present novel automated methods for locating the pith (center) of a tree disk, and ring boundaries. Our methods use a combination of standard image processing techniques and tools from topological data analysis. We evaluate the efficacy of our method for two different CT scans by comparing its results to manually located rings and centers and show that it is better than current automatic methods in terms of correctly counting each ring and its location. Our methods have several parameters, which we optimize experimentally by minimizing edit distances to the manually obtained locations.
  31. Can Neural Networks Learn Persistent Homology Features? (2020)

    Guido Montúfar, Nina Otter, Yuguang Wang
    Abstract Topological data analysis uses tools from topology -- the mathematical area that studies shapes -- to create representations of data. In particular, in persistent homology, one studies one-parameter families of spaces associated with data, and persistence diagrams describe the lifetime of topological invariants, such as connected components or holes, across the one-parameter family. In many applications, one is interested in working with features associated with persistence diagrams rather than the diagrams themselves. In our work, we explore the possibility of learning several types of features extracted from persistence diagrams using neural networks.
  32. Hypothesis Testing for Shapes Using Vectorized Persistence Diagrams (2020)

    Chul Moon, Nicole A. Lazar
    Abstract Topological data analysis involves the statistical characterization of the shape of data. Persistent homology is a primary tool of topological data analysis, which can be used to analyze those topological features and perform statistical inference. In this paper, we present a two-stage hypothesis test for vectorized persistence diagrams. The first stage filters elements in the vectorized persistence diagrams to reduce false positives. The second stage consists of multiple hypothesis tests, with false positives controlled by false discovery rates. We demonstrate applications of the proposed procedure on simulated point clouds and three-dimensional rock image data. Our results show that the proposed hypothesis tests can provide flexible and informative inferences on the shape of data with lower computational cost compared to the permutation test.
  33. The Weighted Euler Curve Transform for Shape and Image Analysis (2020)

    Qitong Jiang, Sebastian Kurtek, Tom Needham
    Abstract The Euler Curve Transform (ECT) of Turner et al. is a complete invariant of an embedded simplicial complex, which is amenable to statistical analysis. We generalize the ECT to provide a similarly convenient representation for weighted simplicial complexes, objects which arise naturally, for example, in certain medical imaging applications. We leverage work of Ghrist et al. on Euler integral calculus to prove that this invariant—dubbed the Weighted Euler Curve Transform (WECT)—is also complete. We explain how to transform a segmented region of interest in a grayscale image into a weighted simplicial complex and then into a WECT representation. This WECT representation is applied to study Glioblastoma Multiforme brain tumor shape and texture data. We show that the WECT representation is effective at clustering tumors based on qualitative shape and texture features and that this clustering correlates with patient survival time.
  34. Localization in the Crowd With Topological Constraints (2020)

    Shahira Abousamra, Minh Hoai, Dimitris Samaras, Chao Chen
    Abstract We address the problem of crowd localization, i.e., the prediction of dots corresponding to people in a crowded scene. Due to various challenges, a localization method is prone to spatial semantic errors, i.e., predicting multiple dots within a same person or collapsing multiple dots in a cluttered region. We propose a topological approach targeting these semantic errors. We introduce a topological constraint that teaches the model to reason about the spatial arrangement of dots. To enforce this constraint, we define a persistence loss based on the theory of persistent homology. The loss compares the topographic landscape of the likelihood map and the topology of the ground truth. Topological reasoning improves the quality of the localization algorithm especially near cluttered regions. On multiple public benchmarks, our method outperforms previous localization methods. Additionally, we demonstrate the potential of our method in improving the performance in the crowd counting task.
  35. Topological Data Analysis Quantifies Biological Nano-Structure From Single Molecule Localization Microscopy (2020)

    Jeremy A. Pike, Abdullah O. Khan, Chiara Pallini, Steven G. Thomas, Markus Mund, Jonas Ries, Natalie S. Poulter, Iain B. Styles
    Abstract AbstractMotivation. Localization microscopy data is represented by a set of spatial coordinates, each corresponding to a single detection, that form a point cl
  36. Classification of Skin Lesions by Topological Data Analysis Alongside With Neural Network (2020)

    Naiereh Elyasi, Mehdi Hosseini Moghadam
    Abstract In this paper we use TDA mapper alongside with deep convolutional neural networks in the classification of 7 major skin diseases. First we apply kepler mapper with neural network as one of its filter steps to classify the dataset HAM10000. Mapper visualizes the classification result by a simplicial complex, where neural network can not do this alone, but as a filter step neural network helps to classify data better. Furthermore we apply TDA mapper and persistent homology to understand the weights of layers of mobilenet network in different training epochs of HAM10000. Also we use persistent diagrams to visualize the results of analysis of layers of mobilenet network.
  37. Quantitative and Interpretable Order Parameters for Phase Transitions From Persistent Homology (2020)

    Alex Cole, Gregory J. Loges, Gary Shiu
    Abstract We apply modern methods in computational topology to the task of discovering and characterizing phase transitions. As illustrations, we apply our method to four two-dimensional lattice spin models: the Ising, square ice, XY, and fully-frustrated XY models. In particular, we use persistent homology, which computes the births and deaths of individual topological features as a coarse-graining scale or sublevel threshold is increased, to summarize multiscale and high-point correlations in a spin configuration. We employ vector representations of this information called persistence images to formulate and perform the statistical task of distinguishing phases. For the models we consider, a simple logistic regression on these images is sufficient to identify the phase transition. Interpretable order parameters are then read from the weights of the regression. This method suffices to identify magnetization, frustration, and vortex-antivortex structure as relevant features for phase transitions in our models. We also define "persistence" critical exponents and study how they are related to those critical exponents usually considered.
  38. From Trees to Barcodes and Back Again: Theoretical and Statistical Perspectives (2020)

    Lida Kanari, Adélie Garin, Kathryn Hess
    Abstract Methods of topological data analysis have been successfully applied in a wide range of fields to provide useful summaries of the structure of complex data sets in terms of topological descriptors, such as persistence diagrams. While there are many powerful techniques for computing topological descriptors, the inverse problem, i.e., recovering the input data from topological descriptors, has proved to be challenging. In this article we study in detail the Topological Morphology Descriptor (TMD), which assigns a persistence diagram to any tree embedded in Euclidean space, and a sort of stochastic inverse to the TMD, the Topological Neuron Synthesis (TNS) algorithm, gaining both theoretical and computational insights into the relation between the two. We propose a new approach to classify barcodes using symmetric groups, which provides a concrete language to formulate our results. We investigate to what extent the TNS recovers a geometric tree from its TMD and describe the effect of different types of noise on the process of tree generation from persistence diagrams. We prove moreover that the TNS algorithm is stable with respect to specific types of noise.
  39. A Sheaf and Topology Approach to Generating Local Branch Numbers in Digital Images (2020)

    Chuan-Shen Hu, Yu-Min Chung
    Abstract This paper concerns a theoretical approach that combines topological data analysis (TDA) and sheaf theory. Topological data analysis, a rising field in mathematics and computer science, concerns the shape of the data and has been proven effective in many scientific disciplines. Sheaf theory, a mathematics subject in algebraic geometry, provides a framework for describing the local consistency in geometric objects. Persistent homology (PH) is one of the main driving forces in TDA, and the idea is to track changes of geometric objects at different scales. The persistence diagram (PD) summarizes the information of PH in the form of a multi-set. While PD provides useful information about the underlying objects, it lacks fine relations about the local consistency of specific pairs of generators in PD, such as the merging relation between two connected components in the PH. The sheaf structure provides a novel point of view for describing the merging relation of local objects in PH. It is the goal of this paper to establish a theoretic framework that utilizes the sheaf theory to uncover finer information from the PH. We also show that the proposed theory can be applied to identify the branch numbers of local objects in digital images.
  40. Topological Data Analysis of Zebrafish Patterns (2020)

    Melissa R. McGuirl, Alexandria Volkening, Björn Sandstede
    Abstract Self-organized pattern behavior is ubiquitous throughout nature, from fish schooling to collective cell dynamics during organism development. Qualitatively these patterns display impressive consistency, yet variability inevitably exists within pattern-forming systems on both microscopic and macroscopic scales. Quantifying variability and measuring pattern features can inform the underlying agent interactions and allow for predictive analyses. Nevertheless, current methods for analyzing patterns that arise from collective behavior capture only macroscopic features or rely on either manual inspection or smoothing algorithms that lose the underlying agent-based nature of the data. Here we introduce methods based on topological data analysis and interpretable machine learning for quantifying both agent-level features and global pattern attributes on a large scale. Because the zebrafish is a model organism for skin pattern formation, we focus specifically on analyzing its skin patterns as a means of illustrating our approach. Using a recent agent-based model, we simulate thousands of wild-type and mutant zebrafish patterns and apply our methodology to better understand pattern variability in zebrafish. Our methodology is able to quantify the differential impact of stochasticity in cell interactions on wild-type and mutant patterns, and we use our methods to predict stripe and spot statistics as a function of varying cellular communication. Our work provides an approach to automatically quantifying biological patterns and analyzing agent-based dynamics so that we can now answer critical questions in pattern formation at a much larger scale.
  41. PI-Net: A Deep Learning Approach to Extract Topological Persistence Images (2020)

    Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga
    Abstract Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illumination, and more. However, key bottlenecks to their large scale adoption are computational expenditure and difficulty incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We design two separate convolutional neural network architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PINet respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed PI-Net architectures on two applications: human activity recognition using tri-axial accelerometer sensor data and image classification. We demonstrate the ease of fusion of PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net.
  42. Topological Descriptors Help Predict Guest Adsorption in Nanoporous Materials (2020)

    Aditi S. Krishnapriyan, Maciej Haranczyk, Dmitriy Morozov
    Abstract Machine learning has emerged as an attractive alternative to experiments and simulations for predicting material properties. Usually, such an approach relies on specific domain knowledge for feature design: each learning target requires careful selection of features that an expert recognizes as important for the specific task. The major drawback of this approach is that computation of only a few structural features has been implemented so far, and it is difficult to tell a priori which features are important for a particular application. The latter problem has been empirically observed for predictors of guest uptake in nanoporous materials: local and global porosity features become dominant descriptors at low and high pressures, respectively. We investigate a feature representation of materials using tools from topological data analysis. Specifically, we use persistent homology to describe the geometry of nanoporous materials at various scales. We combine our topological descriptor with traditional structural features and investigate the relative importance of each to the prediction tasks. We demonstrate an application of this feature representation by predicting methane adsorption in zeolites, for pressures in the range of 1-200 bar. Our results not only show a considerable improvement compared to the baseline, but they also highlight that topological features capture information complementary to the structural features: this is especially important for the adsorption at low pressure, a task particularly difficult for the traditional features. Furthermore, by investigation of the importance of individual topological features in the adsorption model, we are able to pinpoint the location of the pores that correlate best to adsorption at different pressure, contributing to our atom-level understanding of structure-property relationships.
  43. Steinhaus Filtration and Stable Paths in the Mapper (2020)

    Dustin L. Arendt, Matthew Broussard, Bala Krishnamoorthy, Nathaniel Saul
    Abstract Two central concepts from topological data analysis are persistence and the Mapper construction. Persistence employs a sequence of objects built on data called a filtration. A Mapper produces insightful summaries of data, and has found widespread applications in diverse areas. We define a new filtration called the cover filtration built from a single cover based on a generalized Steinhaus distance, which is a generalization of Jaccard distance. We prove a stability result: the cover filtrations of two covers are \$\alpha/m\$ interleaved, where \$\alpha\$ is a bound on bottleneck distance between covers and \$m\$ is the size of smallest set in either cover. We also show our construction is equivalent to the Cech filtration under certain settings, and the Vietoris-Rips filtration completely determines the cover filtration in all cases. We then develop a theory for stable paths within this filtration. Unlike standard results on stability in topological persistence, our definition of path stability aligns exactly with the above result on stability of cover filtration. We demonstrate how our framework can be employed in a variety of applications where a metric is not obvious but a cover is readily available. First we present a new model for recommendation systems using cover filtration. For an explicit example, stable paths identified on a movies data set represent sequences of movies constituting gentle transitions from one genre to another. As a second application in explainable machine learning, we apply the Mapper for model induction, providing explanations in the form of paths between subpopulations. Stable paths in the Mapper from a supervised machine learning model trained on the FashionMNIST data set provide improved explanations of relationships between subpopulations of images.
  44. Combining Geometric and Topological Information in Image Segmentation (2019)

    Hengrui Luo, Justin Strait
    Abstract A fundamental problem in computer vision is image segmentation, where the goal is to delineate the boundary of an object in the image. The focus of this work is on the segmentation of grayscale images and its purpose is two-fold. First, we conduct an in-depth study comparing active contour and topology-based methods in a statistical framework, two popular approaches for boundary detection of 2-dimensional images. Certain properties of the image dataset may favor one method over the other, both from an interpretability perspective as well as through evaluation of performance measures. Second, we propose the use of topological knowledge to assist an active contour method, which can potentially incorporate prior shape information. The latter is known to be extremely sensitive to algorithm initialization, and thus, we use a topological model to provide an automatic initialization. In addition, our proposed model can handle objects in images with more complex topological structures, including objects with holes and multiple objects within one image. We demonstrate this on artificially-constructed image datasets from computer vision, as well as real medical image data.
  45. Hepatic Tumor Classification Using Texture and Topology Analysis of Non-Contrast-Enhanced Three-Dimensional T1-Weighted MR Images With a Radiomics Approach (2019)

    Asuka Oyama, Yasuaki Hiraoka, Ippei Obayashi, Yusuke Saikawa, Shigeru Furui, Kenshiro Shiraishi, Shinobu Kumagai, Tatsuya Hayashi, Jun’ichi Kotoku
    Abstract The purpose of this study is to evaluate the accuracy for classification of hepatic tumors by characterization of T1-weighted magnetic resonance (MR) images using two radiomics approaches with machine learning models: texture analysis and topological data analysis using persistent homology. This study assessed non-contrast-enhanced fat-suppressed three-dimensional (3D) T1-weighted images of 150 hepatic tumors. The lesions included 50 hepatocellular carcinomas (HCCs), 50 metastatic tumors (MTs), and 50 hepatic hemangiomas (HHs) found respectively in 37, 23, and 33 patients. For classification, texture features were calculated, and also persistence images of three types (degree 0, degree 1 and degree 2) were obtained for each lesion from the 3D MR imaging data. We used three classification models. In the classification of HCC and MT (resp. HCC and HH, HH and MT), we obtained accuracy of 92% (resp. 90%, 73%) by texture analysis, and the highest accuracy of 85% (resp. 84%, 74%) when degree 1 (resp. degree 1, degree 2) persistence images were used. Our methods using texture analysis or topological data analysis allow for classification of the three hepatic tumors with considerable accuracy, and thus might be useful when applied for computer-aided diagnosis with MR images.
  46. Characterising Epithelial Tissues Using Persistent Entropy (2019)

    N. Atienza, L. M. Escudero, M. J. Jimenez, M. Soriano-Trigueros
    Abstract In this paper, we apply persistent entropy, a novel topological statistic, for characterization of images of epithelial tissues. We have found out that persistent entropy is able to summarize topological and geometric information encoded by \$\$\alpha \$\$α-complexes and persistent homology. After using some statistical tests, we can guarantee the existence of significant differences in the studied tissues.
  47. Phase-Field Investigation of the Coarsening of Porous Structures by Surface Diffusion (2019)

    Pierre-Antoine Geslin, Mickaël Buchet, Takeshi Wada, Hidemi Kato
    Abstract Nano and microporous connected structures have attracted increasing attention in the past decades due to their high surface area, presenting interesting properties for a number of applications. These structures generally coarsen by surface diffusion, leading to an enlargement of the structure characteristic length scale. We propose to study this coarsening behavior using a phase-field model for surface diffusion. In addition to reproducing the expected scaling law, our simulations enable to investigate precisely the evolution of the topological and morphological characteristics along the coarsening process. In particular, we show that after a transient regime, the coarsening is self-similar as exhibited by the evolution of both morphological and topological features. In addition, the influence of surface anisotropy is discussed and comparisons with experimental tomographic observations are presented.
  48. Persistent Homology Machine Learning for Fingerprint Classification (2019)

    N. Giansiracusa, R. Giansiracusa, C. Moon
    Abstract The fingerprint classification problem is to sort fingerprints into predetermined groups, such as arch, loop, and whorl. It was asserted in the literature that minutiae points, which are commonly used for fingerprint matching, are not useful for classification. We show that, to the contrary, near state-of-the-art classification accuracy rates can be achieved when applying topological data analysis (TDA) to 3-dimensional point clouds of oriented minutiae points. We also apply TDA to fingerprint ink-roll images, which yields a lower accuracy rate but still shows promise; moreover, combining the two approaches outperforms each one individually. These methods use supervised learning applied to persistent homology and allow us to explore feature selection on barcodes, an important topic at the interface between TDA and machine learning. We test our classification algorithms on the NIST fingerprint database SD-27.
  49. Specimen-Based Analysis of Morphology and the Environment in Ecologically Dominant Grasses: The Power of the Herbarium (2019)

    Christine A. McAllister, Michael R. McKain, Mao Li, Bess Bookout, Elizabeth A. Kellogg
    Abstract Herbaria contain a cumulative sample of the world's flora, assembled by thousands of people over centuries. To capitalize on this resource, we conducted a specimen-based analysis of a major clade in the grass tribe Andropogoneae, including the dominant species of the world's grasslands in the genera Andropogon, Schizachyrium, Hyparrhenia and several others. We imaged 186 of the 250 named species of the clade, georeferenced the specimens and extracted climatic variables for each. Using semi- and fully automated image analysis techniques, we extracted spikelet morphological characters and correlated these with environmental variables. We generated chloroplast genome sequences to correct for phylogenetic covariance and here present a new phylogeny for 81 of the species. We confirm and extend earlier studies to show that Andropogon and Schizachyrium are not monophyletic. In addition, we find all morphological and ecological characters are homoplasious but variable among clades. For example, sessile spikelet length is positively correlated with awn length when all accessions are considered, but when separated by clade, the relationship is positive for three sub-clades and negative for three others. Climate variables showed no correlation with morphological variation in the spikelet pair; only very weak effects of temperature and precipitation were detected on macrohair density. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the Anthropocene'.
  50. Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis (2019)

    Lorin Crawford, Anthea Monod, Andrew X. Chen, Sayan Mukherjee, Raúl Rabadán
    Abstract Glioblastoma multiforme (GBM) is an aggressive form of human brain cancer that is under active study in the field of cancer biology. Its rapid progression and the relative time cost of obtaining molecular data make other readily available forms of data, such as images, an important resource for actionable measures in patients. Our goal is to use information given by medical images taken from GBM patients in statistical settings. To do this, we design a novel statistic—the smooth Euler characteristic transform (SECT)—that quantifies magnetic resonance images of tumors. Due to its well-defined inner product structure, the SECT can be used in a wider range of functional and nonparametric modeling approaches than other previously proposed topological summary statistics. When applied to a cohort of GBM patients, we find that the SECT is a better predictor of clinical outcomes than both existing tumor shape quantifications and common molecular assays. Specifically, we demonstrate that SECT features alone explain more of the variance in GBM patient survival than gene expression, volumetric features, and morphometric features. The main takeaways from our findings are thus 2-fold. First, they suggest that images contain valuable information that can play an important role in clinical prognosis and other medical decisions. Second, they show that the SECT is a viable tool for the broader study of medical imaging informatics. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
  51. RGB Image-Based Data Analysis via Discrete Morse Theory and Persistent Homology (2018)

    Chuan Du, Christopher Szul, Adarsh Manawa, Nima Rasekh, Rosemary Guzman, Ruth Davidson
    Abstract Understanding and comparing images for the purposes of data analysis is currently a very computationally demanding task. A group at Australian National University (ANU) recently developed open-source code that can detect fundamental topological features of a grayscale image in a computationally feasible manner. This is made possible by the fact that computers store grayscale images as cubical cellular complexes. These complexes can be studied using the techniques of discrete Morse theory. We expand the functionality of the ANU code by introducing methods and software for analyzing images encoded in red, green, and blue (RGB), because this image encoding is very popular for publicly available data. Our methods allow the extraction of key topological information from RGB images via informative persistence diagrams by introducing novel methods for transforming RGB-to-grayscale. This paradigm allows us to perform data analysis directly on RGB images representing water scarcity variability as well as crime variability. We introduce software enabling a a user to predict future image properties, towards the eventual aim of more rapid image-based data behavior prediction.
  52. Lung Topology Characteristics in Patients With Chronic Obstructive Pulmonary Disease (2018)

    Francisco Belchi, Mariam Pirashvili, Joy Conway, Michael Bennett, Ratko Djukanovic, Jacek Brodzki
    Abstract Quantitative features that can currently be obtained from medical imaging do not provide a complete picture of Chronic Obstructive Pulmonary Disease (COPD). In this paper, we introduce a novel analytical tool based on persistent homology that extracts quantitative features from chest CT scans to describe the geometric structure of the airways inside the lungs. We show that these new radiomic features stratify COPD patients in agreement with the GOLD guidelines for COPD and can distinguish between inspiratory and expiratory scans. These CT measurements are very different to those currently in use and we demonstrate that they convey significant medical information. The results of this study are a proof of concept that topological methods can enhance the standard methodology to create a finer classification of COPD and increase the possibilities of more personalized treatment.
  53. Pore Geometry Characterization by Persistent Homology Theory (2018)

    Fei Jiang, Takeshi Tsuji, Tomoyuki Shirai
    Abstract Rock pore geometry has heterogeneous characteristics and is scale dependent. This feature in a geological formation differs significantly from artificial materials and makes it difficult to predict hydrologic and elastic properties. To characterize pore heterogeneity, we propose an evaluation method that exploits the recently developed persistent homology theory. In the proposed method, complex pore geometry is first represented as sphere cloud data using a pore-network extraction method. Then, a persistence diagram (PD) is calculated from the point cloud, which represents the spatial distribution of pore bodies. A new parameter (distance index H) derived from the PD is proposed to characterize the degree of rock heterogeneity. Low H value indicates high heterogeneity. A new empirical equation using this index H is proposed to predict the effective elastic modulus of porous media. The results indicate that the proposed PD analysis is very efficient for extracting topological feature of pore geometry.
  54. Revisiting Abnormalities in Brain Network Architecture Underlying Autism Using Topology-Inspired Statistical Inference (2018)

    Sourabh Palande, Vipin Jose, Brandon Zielinski, Jeffrey Anderson, P. Thomas Fletcher, Bei Wang
    Abstract A large body of evidence relates autism with abnormal structural and functional brain connectivity. Structural covariance magnetic resonance imaging (scMRI) is a technique that maps brain regions with covarying gray matter densities across subjects. It provides a way to probe the anatomical structure underlying intrinsic connectivity networks (ICNs) through analysis of gray matter signal covariance. In this article, we apply topological data analysis in conjunction with scMRI to explore network-specific differences in the gray matter structure in subjects with autism versus age-, gender-, and IQ-matched controls. Specifically, we investigate topological differences in gray matter structure captured by structural correlation graphs derived from three ICNs strongly implicated in autism, namely the salience network, default mode network, and executive control network. By combining topological data analysis with statistical inference, our results provide evidence of statistically significant network-specific structural abnormalities in autism.
  55. Topological Data Analysis and Diagnostics of Compressible Magnetohydrodynamic Turbulence (2018)

    Irina Makarenko, Paul Bushby, Andrew Fletcher, Robin Henderson, Nikolay Makarenko, Anvar Shukurov
    Abstract The predictions of mean-field electrodynamics can now be probed using direct numerical simulations of random flows and magnetic fields. When modelling astrophysical magnetohydrodynamics, it is important to verify that such simulations are in agreement with observations. One of the main challenges in this area is to identify robust quantitative measures to compare structures found in simulations with those inferred from astrophysical observations. A similar challenge is to compare quantitatively results from different simulations. Topological data analysis offers a range of techniques, including the Betti numbers and persistence diagrams, that can be used to facilitate such a comparison. After describing these tools, we first apply them to synthetic random fields and demonstrate that, when the data are standardized in a straightforward manner, some topological measures are insensitive to either large-scale trends or the resolution of the data. Focusing upon one particular astrophysical example, we apply topological data analysis to H i observations of the turbulent interstellar medium (ISM) in the Milky Way and to recent magnetohydrodynamic simulations of the random, strongly compressible ISM. We stress that these topological techniques are generic and could be applied to any complex, multi-dimensional random field.
  56. The Persistent Homology Mathematical Framework Provides Enhanced Genotype-to-Phenotype Associations for Plant Morphology (2018)

    Mao Li, Margaret H. Frank, Viktoriya Coneva, Washington Mio, Daniel H. Chitwood, Christopher N. Topp
    Abstract Efforts to understand the genetic and environmental conditioning of plant morphology are hindered by the lack of flexible and effective tools for quantifying morphology. Here, we demonstrate that persistent-homology-based topological methods can improve measurement of variation in leaf shape, serrations, and root architecture. We apply these methods to 2D images of leaves and root systems in field-grown plants of a domesticated introgression line population of tomato (Solanum pennellii). We find that compared with some commonly used conventional traits, (1) persistent-homology-based methods can more comprehensively capture morphological variation; (2) these techniques discriminate between genotypes with a larger normalized effect size and detect a greater number of unique quantitative trait loci (QTLs); (3) multivariate traits, whether statistically derived from univariate or persistent-homology-based traits, improve our ability to understand the genetic basis of phenotype; and (4) persistent-homology-based techniques detect unique QTLs compared to conventional traits or their multivariate derivatives, indicating that previously unmeasured aspects of morphology are now detectable. The QTL results further imply that genetic contributions to morphology can affect both the shoot and root, revealing a pleiotropic basis to natural variation in tomato. Persistent homology is a versatile framework to quantify plant morphology and developmental processes that complements and extends existing methods.
  57. Image-Based Phenotyping for Identification of QTL Determining Fruit Shape and Size in American Cranberry (Vaccinium Macrocarpon L.) (2018)

    Luis Diaz-Garcia, Giovanny Covarrubias-Pazaran, Brandon Schlautman, Edward Grygleski, Juan Zalapa
    Abstract Image-based phenotyping methodologies are powerful tools to determine quality parameters for fruit breeders and processors. The fruit size and shape of American cranberry (Vaccinium macrocarpon L.) are particularly important characteristics that determine the harvests’ processing value and potential end-use products (e.g., juice vs. sweetened dried cranberries). However, cranberry fruit size and shape attributes can be difficult and time consuming for breeders and processors to measure, especially when relying on manual measurements and visual ratings. Therefore, in this study, we implemented image-based phenotyping techniques for gathering data regarding basic cranberry fruit parameters such as length, width, length-to-width ratio, and eccentricity. Additionally, we applied a persistent homology algorithm to better characterize complex shape parameters. Using this high-throughput artificial vision approach, we characterized fruit from 351 progeny from a full-sib cranberry population over three field seasons. Using a covariate analysis to maximize the identification of well-supported quantitative trait loci (QTL), we found 252 single QTL in a 3-year period for cranberry fruit size and shape descriptors from which 20% were consistently found in all years. The present study highlights the potential for the identified QTL and the image-based methods to serve as a basis for future explorations of the genetic architecture of fruit size and shape in cranberry and other fruit crops.
  58. Constructing Shape Spaces From a Topological Perspective (2017)

    Christoph Hofer, Roland Kwitt, Marc Niethammer, Yvonne Höller, Eugen Trinka, Andreas Uhl
    Abstract We consider the task of constructing (metric) shape space(s) from a topological perspective. In particular, we present a generic construction scheme and demonstrate how to apply this scheme when shape is interpreted as the differences that remain after factoring out translation, scaling and rotation. This is achieved by leveraging a recently proposed injective functional transform of 2D/3D (binary) objects, based on persistent homology. The resulting shape space is then equipped with a similarity measure that is (1) by design robust to noise and (2) fulfills all metric axioms. From a practical point of view, analyses of object shape can then be carried out directly on segmented objects obtained from some imaging modality without any preprocessing, such as alignment, smoothing, or landmark selection. We demonstrate the utility of the approach on the problem of distinguishing segmented hippocampi from normal controls vs. patients with Alzheimer’s disease in a challenging setup where volume changes are no longer discriminative.
  59. A Morphometric Analysis of Vegetation Patterns in Dryland Ecosystems (2017)

    Luke Mander, Stefan C. Dekker, Mao Li, Washington Mio, Surangi W. Punyasena, Timothy M. Lenton
    Abstract Vegetation in dryland ecosystems often forms remarkable spatial patterns. These range from regular bands of vegetation alternating with bare ground, to vegetated spots and labyrinths, to regular gaps of bare ground within an otherwise continuous expanse of vegetation. It has been suggested that spotted vegetation patterns could indicate that collapse into a bare ground state is imminent, and the morphology of spatial vegetation patterns, therefore, represents a potentially valuable source of information on the proximity of regime shifts in dryland ecosystems. In this paper, we have developed quantitative methods to characterize the morphology of spatial patterns in dryland vegetation. Our approach is based on algorithmic techniques that have been used to classify pollen grains on the basis of textural patterning, and involves constructing feature vectors to quantify the shapes formed by vegetation patterns. We have analysed images of patterned vegetation produced by a computational model and a small set of satellite images from South Kordofan (South Sudan), which illustrates that our methods are applicable to both simulated and real-world data. Our approach provides a means of quantifying patterns that are frequently described using qualitative terminology, and could be used to classify vegetation patterns in large-scale satellite surveys of dryland ecosystems.
  60. Sheaves Are the Canonical Data Structure for Sensor Integration (2017)

    Michael Robinson
    Abstract A sensor integration framework should be sufficiently general to accurately represent many sensor modalities, and also be able to summarize information in a faithful way that emphasizes important, actionable information. Few approaches adequately address these two discordant requirements. The purpose of this expository paper is to explain why sheaves are the canonical data structure for sensor integration and how the mathematics of sheaves satisfies our two requirements. We outline some of the powerful inferential tools that are not available to other representational frameworks.
  61. Skeletonization and Partitioning of Digital Images Using Discrete Morse Theory (2015)

    Olaf Delgado-Friedrichs, Vanessa Robins, Adrian Sheppard
    Abstract We show how discrete Morse theory provides a rigorous and unifying foundation for defining skeletons and partitions of grayscale digital images. We model a grayscale image as a cubical complex with a real-valued function defined on its vertices (the voxel values). This function is extended to a discrete gradient vector field using the algorithm presented in Robins, Wood, Sheppard TPAMI 33:1646 (2011). In the current paper we define basins (the building blocks of a partition) and segments of the skeleton using the stable and unstable sets associated with critical cells. The natural connection between Morse theory and homology allows us to prove the topological validity of these constructions; for example, that the skeleton is homotopic to the initial object. We simplify the basins and skeletons via Morse-theoretic cancellation of critical cells in the discrete gradient vector field using a strategy informed by persistent homology. Simple working Python code for our algorithms for efficient vector field traversal is included. Example data are taken from micro-CT images of porous materials, an application area where accurate topological models of pore connectivity are vital for fluid-flow modelling.
  62. Fruit Flies and Moduli: Interactions Between Biology and Mathematics (2015)

    Ezra Miller
    Abstract Possibilities for using geometry and topology to analyze statistical problems in biology raise a host of novel questions in geometry, probability, algebra, and combinatorics that demonstrate the power of biology to influence the future of pure mathematics. This expository article is a tour through some biological explorations and their mathematical ramifications. The article starts with evolution of novel topological features in wing veins of fruit flies, which are quantified using the algebraic structure of multiparameter persistent homology. The statistical issues involved highlight mathematical implications of sampling from moduli spaces. These lead to geometric probability on stratified spaces, including the sticky phenomenon for Frechet means and the origin of this mathematical area in the reconstruction of phylogenetic trees.
  63. Persistent Homology for Path Planning in Uncertain Environments (2015)

    S. Bhattacharya, R. Ghrist, V. Kumar
    Abstract We address the fundamental problem of goal-directed path planning in an uncertain environment represented as a probability (of occupancy) map. Most methods generally use a threshold to reduce the grayscale map to a binary map before applying off-the-shelf techniques to find the best path. This raises the somewhat ill-posed question, what is the right (optimal) value to threshold the map? We instead suggest a persistent homology approach to the problem-a topological approach in which we seek the homology class of trajectories that is most persistent for the given probability map. In other words, we want the class of trajectories that is free of obstacles over the largest range of threshold values. In order to make this problem tractable, we use homology in ℤ2 coefficients (instead of the standard ℤ coefficients), and describe how graph search-based algorithms can be used to find trajectories in different homology classes. Our simulation results demonstrate the efficiency and practical applicability of the algorithm proposed in this paper.paper.
  64. Morse Theory and Persistent Homology for Topological Analysis of 3D Images of Complex Materials (2014)

    O. Delgado-Friedrichs, V. Robins, A. Sheppard
    Abstract We develop topologically accurate and compatible definitions for the skeleton and watershed segmentation of a 3D digital object that are computed by a single algorithm. These definitions are based on a discrete gradient vector field derived from a signed distance transform. This gradient vector field is amenable to topological analysis and simplification via For-man's discrete Morse theory and provides a filtration that can be used as input to persistent homology algorithms. Efficient implementations allow us to process large-scale x-ray micro-CT data of rock cores and other materials.
  65. Topological Descriptors of Histology Images (2014)

    Nikhil Singh, Heather D. Couture, J. S. Marron, Charles Perou, Marc Niethammer
    Abstract The purpose of this study is to investigate architectural characteristics of cell arrangements in breast cancer histology images. We propose the use of topological data analysis to summarize the geometric information inherent in tumor cell arrangements. Our goal is to use this information as signatures that encode robust summaries of cell arrangements in tumor tissue as captured through histology images. In particular, using ideas from algebraic topology we construct topological descriptors based on cell nucleus segmentations such as persistency charts and Betti sequences. We assess their performance on the task of discriminating the breast cancer subtypes Basal, Luminal A, Luminal B and HER2. We demonstrate that the topological features contain useful complementary information to image-appearance based features that can improve discriminatory performance of classifiers.
  66. A Klein-Bottle-Based Dictionary for Texture Representation (2014)

    Jose A. Perea, Gunnar Carlsson
    Abstract A natural object of study in texture representation and material classification is the probability density function, in pixel-value space, underlying the set of small patches from the given image. Inspired by the fact that small \$\$n\times n\$\$n×nhigh-contrast patches from natural images in gray-scale accumulate with high density around a surface \$\$\fancyscript\K\\subset \\mathbb \R\\\textasciicircum\n\textasciicircum2\\$\$K⊂Rn2with the topology of a Klein bottle (Carlsson et al. International Journal of Computer Vision 76(1):1–12, 2008), we present in this paper a novel framework for the estimation and representation of distributions around \$\$\fancyscript\K\\$\$K, of patches from texture images. More specifically, we show that most \$\$n\times n\$\$n×npatches from a given image can be projected onto \$\$\fancyscript\K\\$\$Kyielding a finite sample \$\$S\subset \fancyscript\K\\$\$S⊂K, whose underlying probability density function can be represented in terms of Fourier-like coefficients, which in turn, can be estimated from \$\$S\$\$S. We show that image rotation acts as a linear transformation at the level of the estimated coefficients, and use this to define a multi-scale rotation-invariant descriptor. We test it by classifying the materials in three popular data sets: The CUReT, UIUCTex and KTH-TIPS texture databases.
  67. Theory and Algorithms for Constructing Discrete Morse Complexes From Grayscale Digital Images (2011)

    V. Robins, P. J. Wood, A. P. Sheppard
    Abstract We present an algorithm for determining the Morse complex of a two or three-dimensional grayscale digital image. Each cell in the Morse complex corresponds to a topological change in the level sets (i.e., a critical point) of the grayscale image. Since more than one critical point may be associated with a single image voxel, we model digital images by cubical complexes. A new homotopic algorithm is used to construct a discrete Morse function on the cubical complex that agrees with the digital image and has exactly the number and type of critical cells necessary to characterize the topological changes in the level sets. We make use of discrete Morse theory and simple homotopy theory to prove correctness of this algorithm. The resulting Morse complex is considerably simpler than the cubical complex originally used to represent the image and may be used to compute persistent homology.
  68. Persistent Betti Numbers for a Noise Tolerant Shape-Based Approach to Image Retrieval (2011)

    Patrizio Frosini, Claudia Landi
    Abstract In content-based image retrieval a major problem is the presence of noisy shapes. It is well known that persistent Betti numbers are a shape descriptor that admits a dissimilarity distance, the matching distance, stable under continuous shape deformations. In this paper we focus on the problem of dealing with noise that changes the topology of the studied objects. We present a general method to turn persistent Betti numbers into stable descriptors also in the presence of topological changes. Retrieval tests on the Kimia-99 database show the effectiveness of the method.
  69. A Mayer–Vietoris Formula for Persistent Homology With an Application to Shape Recognition in the Presence of Occlusions (2011)

    Barbara Di Fabio, Claudia Landi
    Abstract In algebraic topology it is well known that, using the Mayer–Vietoris sequence, the homology of a space X can be studied by splitting X into subspaces A and B and computing the homology of A, B, and A∩B. A natural question is: To what extent does persistent homology benefit from a similar property? In this paper we show that persistent homology has a Mayer–Vietoris sequence that is generally not exact but only of order 2. However, we obtain a Mayer–Vietoris formula involving the ranks of the persistent homology groups of X, A, B, and A∩B plus three extra terms. This implies that persistent homological features of A and B can be found either as persistent homological features of X or of A∩B. As an application of this result, we show that persistence diagrams are able to recognize an occluded shape by showing a common subset of points.
  70. Finite Topology as Applied to Image Analysis (1989)

    V. A Kovalevsky
    Abstract The notion of a cellular complex which is well known in the topology is applied to describe the structure of images. It is shown that the topology of cellular complexes is the only possible topology of finite sets. Under this topology no contradictions or paradoxes arise when defining connected subsets and their boundaries. Ways of encoding images as cellular complexes are discussed. The process of image segmentation is considered as splitting (in the topological sense) a cellular complex into blocks of cells. The notion of a cell list is introduced as a precise and compact data structure for encoding segmented images. Some applications of this data structure to the image analysis are demonstrated.