🍩 Database of Original & Non-Theoretical Uses of Topology
(found 95 matches in 0.009614s)
-
-
A Multi-Parameter Persistence Framework for Mathematical Morphology (2021)
Yu-Min Chung, Sarah Day, Chuan-Shen HuAbstract
The field of mathematical morphology offers well-studied techniques for image processing. In this work, we view morphological operations through the lens of persistent homology, a tool at the heart of the field of topological data analysis. We demonstrate that morphological operations naturally form a multiparameter filtration and that persistent homology can then be used to extract information about both topology and geometry in the images as well as to automate methods for optimizing the study and rendering of structure in images. For illustration, we apply this framework to analyze noisy binary, grayscale, and color images. -
TILT: Topological Interface Recovery in Limited-Angle Tomography (2024)
Elli Karvonen, Matti Lassas, Pekka Pankka, Samuli SiltanenAbstract
A wavelet-based sparsity-promoting reconstruction method is studied in the context of tomography with severely limited projection data. Such imaging problems are ill-posed inverse problems, or very sensitive to measurement and modeling errors. The reconstruction method is based on minimizing a sum of a data discrepancy term based on an \$\ell\textasciicircum2\$-norm and another term containing an \$\ell\textasciicircum1\$-norm of a wavelet coefficient vector. Depending on the viewpoint, the method can be considered (i) as finding the Bayesian maximum a posteriori (MAP) estimate using a Besov-space \$B_\11\\textasciicircum\1\(\\mathbb T\\textasciicircum\2\)\$ prior, or (ii) as deterministic regularization with a Besov-norm penalty. The minimization is performed using a tailored primal-dual path following interior-point method, which is applicable to problems larger in scale than commercially available general-purpose optimization package algorithms. The choice of “regularization parameter” is done by a novel technique called the S-curve method, which can be used to incorporate a priori information on the sparsity of the unknown target to the reconstruction process. Numerical results are presented, focusing on uniformly sampled sparse-angle data. Both simulated and measured data are considered, and noise-robust and edge-preserving multiresolution reconstructions are achieved. In sparse-angle cases with simulated data the proposed method offers a significant improvement in reconstruction quality (measured in relative square norm error) over filtered back-projection (FBP) and Tikhonov regularization. -
A Study on Topological Descriptors for the Analysis of 3D Surface Texture (2018)
Matthias Zeppelzauer, Bartosz Zieliński, Mateusz Juda, Markus Seidl -
Topo-Cxr: Chest X-Ray TB and Pneumonia Screening With Topological Machine Learning (2023)
Faisal Ahmed, Brighton Nuwagira, Furkan Torlak, Baris CoskunuzerCommunity Resources
-
Cubical Ripser: Software for Computing Persistent Homology of Image and Volume Data (2020)
Shizuo Kaji, Takeki Sudo, Kazushi AharaAbstract
We introduce Cubical Ripser for computing persistent homology of image and volume data. To our best knowledge, Cubical Ripser is currently the fastest and the most memory-efficient program for computing persistent homology of image and volume data. We demonstrate our software with an example of image analysis in which persistent homology and convolutional neural networks are successfully combined. Our open source implementation is available at [14]. -
RGB Image-Based Data Analysis via Discrete Morse Theory and Persistent Homology (2018)
Chuan Du, Christopher Szul, Adarsh Manawa, Nima Rasekh, Rosemary Guzman, Ruth DavidsonAbstract
Understanding and comparing images for the purposes of data analysis is currently a very computationally demanding task. A group at Australian National University (ANU) recently developed open-source code that can detect fundamental topological features of a grayscale image in a computationally feasible manner. This is made possible by the fact that computers store grayscale images as cubical cellular complexes. These complexes can be studied using the techniques of discrete Morse theory. We expand the functionality of the ANU code by introducing methods and software for analyzing images encoded in red, green, and blue (RGB), because this image encoding is very popular for publicly available data. Our methods allow the extraction of key topological information from RGB images via informative persistence diagrams by introducing novel methods for transforming RGB-to-grayscale. This paradigm allows us to perform data analysis directly on RGB images representing water scarcity variability as well as crime variability. We introduce software enabling a a user to predict future image properties, towards the eventual aim of more rapid image-based data behavior prediction. -
Automatic Tree Ring Detection Using Jacobi Sets (2020)
Kayla Makela, Tim Ophelders, Michelle Quigley, Elizabeth Munch, Daniel Chitwood, Asia DowtinAbstract
Tree ring widths are an important source of climatic and historical data, but measuring these widths typically requires extensive manual work. Computer vision techniques provide promising directions towards the automation of tree ring detection, but most automated methods still require a substantial amount of user interaction to obtain high accuracy. We perform analysis on 3D X-ray CT images of a cross-section of a tree trunk, known as a tree disk. We present novel automated methods for locating the pith (center) of a tree disk, and ring boundaries. Our methods use a combination of standard image processing techniques and tools from topological data analysis. We evaluate the efficacy of our method for two different CT scans by comparing its results to manually located rings and centers and show that it is better than current automatic methods in terms of correctly counting each ring and its location. Our methods have several parameters, which we optimize experimentally by minimizing edit distances to the manually obtained locations. -
Topological Data Analysis Distinguishes Parameter Regimes in the Anderson-Chaplain Model of Angiogenesis (2021)
John T. Nardini, Bernadette J. Stolz, Kevin B. Flores, Heather A. Harrington, Helen M. ByrneAbstract
Angiogenesis is the process by which blood vessels form from pre-existing vessels. It plays a key role in many biological processes, including embryonic development and wound healing, and contributes to many diseases including cancer and rheumatoid arthritis. The structure of the resulting vessel networks determines their ability to deliver nutrients and remove waste products from biological tissues. Here we simulate the Anderson-Chaplain model of angiogenesis at different parameter values and quantify the vessel architectures of the resulting synthetic data. Specifically, we propose a topological data analysis (TDA) pipeline for systematic analysis of the model. TDA is a vibrant and relatively new field of computational mathematics for studying the shape of data. We compute topological and standard descriptors of model simulations generated by different parameter values. We show that TDA of model simulation data stratifies parameter space into regions with similar vessel morphology. The methodologies proposed here are widely applicable to other synthetic and experimental data including wound healing, development, and plant biology. -
A Topological Machine Learning Pipeline for Classification (2022)
Francesco Conti, Davide Moroni, Maria Antonietta PascaliAbstract
In this work, we develop a pipeline that associates Persistence Diagrams to digital data via the most appropriate filtration for the type of data considered. Using a grid search approach, this pipeline determines optimal representation methods and parameters. The development of such a topological pipeline for Machine Learning involves two crucial steps that strongly affect its performance: firstly, digital data must be represented as an algebraic object with a proper associated filtration in order to compute its topological summary, the Persistence Diagram. Secondly, the persistence diagram must be transformed with suitable representation methods in order to be introduced in a Machine Learning algorithm. We assess the performance of our pipeline, and in parallel, we compare the different representation methods on popular benchmark datasets. This work is a first step toward both an easy and ready-to-use pipeline for data classification using persistent homology and Machine Learning, and to understand the theoretical reasons why, given a dataset and a task to be performed, a pair (filtration, topological representation) is better than another. -
Skeletonization and Partitioning of Digital Images Using Discrete Morse Theory (2015)
Olaf Delgado-Friedrichs, Vanessa Robins, Adrian SheppardAbstract
We show how discrete Morse theory provides a rigorous and unifying foundation for defining skeletons and partitions of grayscale digital images. We model a grayscale image as a cubical complex with a real-valued function defined on its vertices (the voxel values). This function is extended to a discrete gradient vector field using the algorithm presented in Robins, Wood, Sheppard TPAMI 33:1646 (2011). In the current paper we define basins (the building blocks of a partition) and segments of the skeleton using the stable and unstable sets associated with critical cells. The natural connection between Morse theory and homology allows us to prove the topological validity of these constructions; for example, that the skeleton is homotopic to the initial object. We simplify the basins and skeletons via Morse-theoretic cancellation of critical cells in the discrete gradient vector field using a strategy informed by persistent homology. Simple working Python code for our algorithms for efficient vector field traversal is included. Example data are taken from micro-CT images of porous materials, an application area where accurate topological models of pore connectivity are vital for fluid-flow modelling. -
Dissecting Glial Scar Formation by Spatial Point Pattern and Topological Data Analysis (2024)
Daniel Manrique-Castano, Dhananjay Bhaskar, Ayman ElAliAbstract
Glial scar formation represents a fundamental response to central nervous system (CNS) injuries. It is mainly characterized by a well-defined spatial rearrangement of reactive astrocytes and microglia. The mechanisms underlying glial scar formation have been extensively studied, yet quantitative descriptors of the spatial arrangement of reactive glial cells remain limited. Here, we present a novel approach using point pattern analysis (PPA) and topological data analysis (TDA) to quantify spatial patterns of reactive glial cells after experimental ischemic stroke in mice. We provide open and reproducible tools using R and Julia to quantify spatial intensity, cell covariance and conditional distribution, cell-to-cell interactions, and short/long-scale arrangement, which collectively disentangle the arrangement patterns of the glial scar. This approach unravels a substantial divergence in the distribution of GFAP+ and IBA1+ cells after injury that conventional analysis methods cannot fully characterize. PPA and TDA are valuable tools for studying the complex spatial arrangement of reactive glia and other nervous cells following CNS injuries and have potential applications for evaluating glial-targeted restorative therapies.Community Resources
-
Persistent Homology for the Automatic Classification of Prostate Cancer Aggressiveness in Histopathology Images (2019)
Peter Lawson, Jordan Schupbach, Brittany Terese Fasy, John W. Sheppard -
Multiscale Topology Characterizes Dynamic Tumor Vascular Networks (2022)
Bernadette J. Stolz, Jakob Kaeppler, Bostjan Markelc, Franziska Braun, Florian Lipsmeier, Ruth J. Muschel, Helen M. Byrne, Heather A. Harrington -
Deep Learning With Topological Signatures (2017)
Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl -
Shortened Persistent Homology for a Biomedical Retrieval System With Relevance Feedback (2018)
Alessia Angeli, Massimo Ferri, Eleonora Monti, Ivan Tomba -
Classifying RGB Images With Multi-Colour Persistent Homology (2019)
Wolf Byttner -
A Sheaf and Topology Approach to Generating Local Branch Numbers in Digital Images (2020)
Chuan-Shen Hu, Yu-Min ChungAbstract
This paper concerns a theoretical approach that combines topological data analysis (TDA) and sheaf theory. Topological data analysis, a rising field in mathematics and computer science, concerns the shape of the data and has been proven effective in many scientific disciplines. Sheaf theory, a mathematics subject in algebraic geometry, provides a framework for describing the local consistency in geometric objects. Persistent homology (PH) is one of the main driving forces in TDA, and the idea is to track changes of geometric objects at different scales. The persistence diagram (PD) summarizes the information of PH in the form of a multi-set. While PD provides useful information about the underlying objects, it lacks fine relations about the local consistency of specific pairs of generators in PD, such as the merging relation between two connected components in the PH. The sheaf structure provides a novel point of view for describing the merging relation of local objects in PH. It is the goal of this paper to establish a theoretic framework that utilizes the sheaf theory to uncover finer information from the PH. We also show that the proposed theory can be applied to identify the branch numbers of local objects in digital images. -
Diverse 3D Cellular Patterns Underlie the Development of Cardamine Hirsuta and Arabidopsis Thaliana Ovules (2023)
Tejasvinee Atul Mody, Alexander Rolle, Nico Stucki, Fabian Roll, Ulrich Bauer, Kay SchneitzAbstract
A fundamental question in biology is how organ morphogenesis comes about. The ovules of Arabidopsis thaliana have been established as a successful model to study numerous aspects of tissue morphogenesis; however, little is known regarding the relative contributions and dynamics of differential tissue and cellular growth and architecture in establishing ovule morphogenesis in different species. To address this issue, we generated a 3D digital atlas of Cardamine hirsuta ovule development with full cellular resolution. We combined quantitative comparative morphometrics and topological analysis to explore similarities and differences in the 3D cellular architectures underlying ovule development of the two species. We discovered that they show diversity in the way the three radial cell layers of the primordium contribute to its growth, in the formation of a new cell layer in the inner integument and, in certain cases, in the topological properties of the 3D cell architectures of homologous tissues despite their similar shape. Our work demonstrates the power of comparative 3D cellular morphometry and the importance of internal tissues and their cellular architecture in organ morphogenesis. Summary Statement Quantitative morphometric comparison of 3D digital ovules at full cellular resolution reveals diversity in internal 3D cellular architectures between similarly shaped ovules of Cardamine hirsuta and Arabidopsis thaliana. -
Can Neural Networks Learn Persistent Homology Features? (2020)
Guido Montúfar, Nina Otter, Yuguang WangAbstract
Topological data analysis uses tools from topology -- the mathematical area that studies shapes -- to create representations of data. In particular, in persistent homology, one studies one-parameter families of spaces associated with data, and persistence diagrams describe the lifetime of topological invariants, such as connected components or holes, across the one-parameter family. In many applications, one is interested in working with features associated with persistence diagrams rather than the diagrams themselves. In our work, we explore the possibility of learning several types of features extracted from persistence diagrams using neural networks. -
Morse Theory and Persistent Homology for Topological Analysis of 3D Images of Complex Materials (2014)
O. Delgado-Friedrichs, V. Robins, A. SheppardAbstract
We develop topologically accurate and compatible definitions for the skeleton and watershed segmentation of a 3D digital object that are computed by a single algorithm. These definitions are based on a discrete gradient vector field derived from a signed distance transform. This gradient vector field is amenable to topological analysis and simplification via For-man's discrete Morse theory and provides a filtration that can be used as input to persistent homology algorithms. Efficient implementations allow us to process large-scale x-ray micro-CT data of rock cores and other materials. -
Persistent Betti Numbers for a Noise Tolerant Shape-Based Approach to Image Retrieval (2011)
Patrizio Frosini, Claudia LandiAbstract
In content-based image retrieval a major problem is the presence of noisy shapes. It is well known that persistent Betti numbers are a shape descriptor that admits a dissimilarity distance, the matching distance, stable under continuous shape deformations. In this paper we focus on the problem of dealing with noise that changes the topology of the studied objects. We present a general method to turn persistent Betti numbers into stable descriptors also in the presence of topological changes. Retrieval tests on the Kimia-99 database show the effectiveness of the method. -
Hypothesis Testing for Shapes Using Vectorized Persistence Diagrams (2020)
Chul Moon, Nicole A. LazarAbstract
Topological data analysis involves the statistical characterization of the shape of data. Persistent homology is a primary tool of topological data analysis, which can be used to analyze those topological features and perform statistical inference. In this paper, we present a two-stage hypothesis test for vectorized persistence diagrams. The first stage filters elements in the vectorized persistence diagrams to reduce false positives. The second stage consists of multiple hypothesis tests, with false positives controlled by false discovery rates. We demonstrate applications of the proposed procedure on simulated point clouds and three-dimensional rock image data. Our results show that the proposed hypothesis tests can provide flexible and informative inferences on the shape of data with lower computational cost compared to the permutation test. -
Lung Topology Characteristics in Patients With Chronic Obstructive Pulmonary Disease (2018)
Francisco Belchi, Mariam Pirashvili, Joy Conway, Michael Bennett, Ratko Djukanovic, Jacek BrodzkiAbstract
Quantitative features that can currently be obtained from medical imaging do not provide a complete picture of Chronic Obstructive Pulmonary Disease (COPD). In this paper, we introduce a novel analytical tool based on persistent homology that extracts quantitative features from chest CT scans to describe the geometric structure of the airways inside the lungs. We show that these new radiomic features stratify COPD patients in agreement with the GOLD guidelines for COPD and can distinguish between inspiratory and expiratory scans. These CT measurements are very different to those currently in use and we demonstrate that they convey significant medical information. The results of this study are a proof of concept that topological methods can enhance the standard methodology to create a finer classification of COPD and increase the possibilities of more personalized treatment. -
CCF-GNN: A Unified Model Aggregating Appearance, Microenvironment, and Topology for Pathology Image Classification (2023)
Hongxiao Wang, Gang Huang, Zhuo Zhao, Liang Cheng, Anna Juncker-Jensen, Máté Levente Nagy, Xin Lu, Xiangliang Zhang, Danny Z. ChenAbstract
Pathology images contain rich information of cell appearance, microenvironment, and topology features for cancer analysis and diagnosis. Among such features, topology becomes increasingly important in analysis for cancer immunotherapy. By analyzing geometric and hierarchically structured cell distribution topology, oncologists can identify densely-packed and cancer-relevant cell communities (CCs) for making decisions. Compared to commonly-used pixel-level Convolution Neural Network (CNN) features and cell-instance-level Graph Neural Network (GNN) features, CC topology features are at a higher level of granularity and geometry. However, topological features have not been well exploited by recent deep learning (DL) methods for pathology image classification due to lack of effective topological descriptors for cell distribution and gathering patterns. In this paper, inspired by clinical practice, we analyze and classify pathology images by comprehensively learning cell appearance, microenvironment, and topology in a fine-to-coarse manner. To describe and exploit topology, we design Cell Community Forest (CCF), a novel graph that represents the hierarchical formulation process of big-sparse CCs from small-dense CCs. Using CCF as a new geometric topological descriptor of tumor cells in pathology images, we propose CCF-GNN, a GNN model that successively aggregates heterogeneous features (e.g., appearance, microenvironment) from cell-instance-level, cell-community-level, into image-level for pathology image classification. Extensive cross-validation experiments show that our method significantly outperforms alternative methods on H&E-stained; immunofluorescence images for disease grading tasks with multiple cancer types. Our proposed CCF-GNN establishes a new topological data analysis (TDA) based method, which facilitates integrating multi-level heterogeneous features of point clouds (e.g., for cells) into a unified DL framework. -
The Weighted Euler Curve Transform for Shape and Image Analysis (2020)
Qitong Jiang, Sebastian Kurtek, Tom NeedhamAbstract
The Euler Curve Transform (ECT) of Turner et al. is a complete invariant of an embedded simplicial complex, which is amenable to statistical analysis. We generalize the ECT to provide a similarly convenient representation for weighted simplicial complexes, objects which arise naturally, for example, in certain medical imaging applications. We leverage work of Ghrist et al. on Euler integral calculus to prove that this invariant—dubbed the Weighted Euler Curve Transform (WECT)—is also complete. We explain how to transform a segmented region of interest in a grayscale image into a weighted simplicial complex and then into a WECT representation. This WECT representation is applied to study Glioblastoma Multiforme brain tumor shape and texture data. We show that the WECT representation is effective at clustering tumors based on qualitative shape and texture features and that this clustering correlates with patient survival time. -
Localization in the Crowd With Topological Constraints (2020)
Shahira Abousamra, Minh Hoai, Dimitris Samaras, Chao ChenAbstract
We address the problem of crowd localization, i.e., the prediction of dots corresponding to people in a crowded scene. Due to various challenges, a localization method is prone to spatial semantic errors, i.e., predicting multiple dots within a same person or collapsing multiple dots in a cluttered region. We propose a topological approach targeting these semantic errors. We introduce a topological constraint that teaches the model to reason about the spatial arrangement of dots. To enforce this constraint, we define a persistence loss based on the theory of persistent homology. The loss compares the topographic landscape of the likelihood map and the topology of the ground truth. Topological reasoning improves the quality of the localization algorithm especially near cluttered regions. On multiple public benchmarks, our method outperforms previous localization methods. Additionally, we demonstrate the potential of our method in improving the performance in the crowd counting task. -
Pore Geometry Characterization by Persistent Homology Theory (2018)
Fei Jiang, Takeshi Tsuji, Tomoyuki ShiraiAbstract
Rock pore geometry has heterogeneous characteristics and is scale dependent. This feature in a geological formation differs significantly from artificial materials and makes it difficult to predict hydrologic and elastic properties. To characterize pore heterogeneity, we propose an evaluation method that exploits the recently developed persistent homology theory. In the proposed method, complex pore geometry is first represented as sphere cloud data using a pore-network extraction method. Then, a persistence diagram (PD) is calculated from the point cloud, which represents the spatial distribution of pore bodies. A new parameter (distance index H) derived from the PD is proposed to characterize the degree of rock heterogeneity. Low H value indicates high heterogeneity. A new empirical equation using this index H is proposed to predict the effective elastic modulus of porous media. The results indicate that the proposed PD analysis is very efficient for extracting topological feature of pore geometry. -
Revisiting Abnormalities in Brain Network Architecture Underlying Autism Using Topology-Inspired Statistical Inference (2018)
Sourabh Palande, Vipin Jose, Brandon Zielinski, Jeffrey Anderson, P. Thomas Fletcher, Bei WangAbstract
A large body of evidence relates autism with abnormal structural and functional brain connectivity. Structural covariance magnetic resonance imaging (scMRI) is a technique that maps brain regions with covarying gray matter densities across subjects. It provides a way to probe the anatomical structure underlying intrinsic connectivity networks (ICNs) through analysis of gray matter signal covariance. In this article, we apply topological data analysis in conjunction with scMRI to explore network-specific differences in the gray matter structure in subjects with autism versus age-, gender-, and IQ-matched controls. Specifically, we investigate topological differences in gray matter structure captured by structural correlation graphs derived from three ICNs strongly implicated in autism, namely the salience network, default mode network, and executive control network. By combining topological data analysis with statistical inference, our results provide evidence of statistically significant network-specific structural abnormalities in autism. -
A Mayer–Vietoris Formula for Persistent Homology With an Application to Shape Recognition in the Presence of Occlusions (2011)
Barbara Di Fabio, Claudia LandiAbstract
In algebraic topology it is well known that, using the Mayer–Vietoris sequence, the homology of a space X can be studied by splitting X into subspaces A and B and computing the homology of A, B, and A∩B. A natural question is: To what extent does persistent homology benefit from a similar property? In this paper we show that persistent homology has a Mayer–Vietoris sequence that is generally not exact but only of order 2. However, we obtain a Mayer–Vietoris formula involving the ranks of the persistent homology groups of X, A, B, and A∩B plus three extra terms. This implies that persistent homological features of A and B can be found either as persistent homological features of X or of A∩B. As an application of this result, we show that persistence diagrams are able to recognize an occluded shape by showing a common subset of points. -
TopoGAN: A Topology-Aware Generative Adversarial Network (2020)
Fan Wang, Huidong Liu, Dimitris Samaras, Chao ChenAbstract
Existing generative adversarial networks (GANs) focus on generating realistic images based on CNN-derived image features, but fail to preserve the structural properties of real images. This can be fatal in applications where the underlying structure (e.g.., neurons, vessels, membranes, and road networks) of the image carries crucial semantic meaning. In this paper, we propose a novel GAN model that learns the topology of real images, i.e., connectedness and loopy-ness. In particular, we introduce a new loss that bridges the gap between synthetic image distribution and real image distribution in the topological feature space. By optimizing this loss, the generator produces images with the same structural topology as real images. We also propose new GAN evaluation metrics that measure the topological realism of the synthetic images. We show in experiments that our method generates synthetic images with realistic topology. We also highlight the increased performance that our method brings to downstream tasks such as segmentation.Community Resources
-
Acute Lymphoblastic Leukemia Classification Using Persistent Homology (2024)
Waqar Hussain Shah, Abdullah Baloch, Rider Jaimes-Reátegui, Sohail Iqbal, Syeda Rafia Fatima, Alexander N. PisarchikAbstract
Acute Lymphoblastic Leukemia (ALL) is a prevalent form of childhood blood cancer characterized by the proliferation of immature white blood cells that rapidly replace normal cells in the bone marrow. The exponential growth of these leukemic cells can be fatal if not treated promptly. Classifying lymphoblasts and healthy cells poses a significant challenge, even for domain experts, due to their morphological similarities. Automated computer analysis of ALL can provide substantial support in this domain and potentially save numerous lives. In this paper, we propose a novel classification approach that involves analyzing shapes and extracting topological features of ALL cells. We employ persistent homology to capture these topological features. Our technique accurately and efficiently detects and classifies leukemia blast cells, achieving a recall of 98.2% and an F1-score of 94.6%. This approach has the potential to significantly enhance leukemia diagnosis and therapy. -
Constructing Shape Spaces From a Topological Perspective (2017)
Christoph Hofer, Roland Kwitt, Marc Niethammer, Yvonne Höller, Eugen Trinka, Andreas UhlAbstract
We consider the task of constructing (metric) shape space(s) from a topological perspective. In particular, we present a generic construction scheme and demonstrate how to apply this scheme when shape is interpreted as the differences that remain after factoring out translation, scaling and rotation. This is achieved by leveraging a recently proposed injective functional transform of 2D/3D (binary) objects, based on persistent homology. The resulting shape space is then equipped with a similarity measure that is (1) by design robust to noise and (2) fulfills all metric axioms. From a practical point of view, analyses of object shape can then be carried out directly on segmented objects obtained from some imaging modality without any preprocessing, such as alignment, smoothing, or landmark selection. We demonstrate the utility of the approach on the problem of distinguishing segmented hippocampi from normal controls vs. patients with Alzheimer’s disease in a challenging setup where volume changes are no longer discriminative. -
Topological Detection of Phenomenological Bifurcations With Unreliable Kernel Density Estimates (2024)
Sunia Tanweer, Firas A. KhasawnehAbstract
Phenomenological (P-type) bifurcations are qualitative changes in stochastic dynamical systems whereby the stationary probability density function (PDF) changes its topology. The current state of the art for detecting these bifurcations requires reliable kernel density estimates computed from an ensemble of system realizations. However, in several real world signals such as Big Data, only a single system realization is available—making it impossible to estimate a reliable kernel density. This study presents an approach for detecting P-type bifurcations using unreliable density estimates. The approach creates an ensemble of objects from Topological Data Analysis (TDA) called persistence diagrams from the system’s sole realization and statistically analyzes the resulting set. We compare several methods for replicating the original persistence diagram including Gibbs point process modelling, Pairwise Interaction Point Modelling, and subsampling. We show that for the purpose of predicting a bifurcation, the simple method of subsampling exceeds the other two methods of point process modelling in performance. -
Mapping Geometric and Electromagnetic Feature Spaces With Machine Learning for Additively Manufactured RF Devices (2022)
Deanna Sessions, Venkatesh Meenakshisundaram, Andrew Gillman, Alexander Cook, Kazuko Fuchi, Philip R. Buskohl, Gregory H. HuffAbstract
Multi-material additive manufacturing enables transformative capabilities in customized, low-cost, and multi-functional electromagnetic devices. However, process-specific fabrication anomalies can result in non-intuitive effects on performance; we propose a framework for identifying defect mechanisms and their performance impact by mapping geometric variances to electromagnetic performance metrics. This method can accelerate additive fabrication feedback while avoiding the high computational cost of in-line electromagnetic simulation. We first used dimension reduction to explore the population of geometric manufacturing anomalies and electromagnetic performance. Convolutional neural networks are then trained to predict the electromagnetic performance of the printed geometries. In generating the networks, we explored two inputs: one image-derived geometric description and one using the same description with additional simulated electromagnetic information. Network latent space analysis shows the networks learned both geometric and electromagnetic values even without electromagnetic input. This result demonstrates it is possible to create accelerated additive feedback systems predicting electromagnetic performance without in-line simulation. -
Combining Geometric and Topological Information in Image Segmentation (2019)
Hengrui Luo, Justin StraitAbstract
A fundamental problem in computer vision is image segmentation, where the goal is to delineate the boundary of an object in the image. The focus of this work is on the segmentation of grayscale images and its purpose is two-fold. First, we conduct an in-depth study comparing active contour and topology-based methods in a statistical framework, two popular approaches for boundary detection of 2-dimensional images. Certain properties of the image dataset may favor one method over the other, both from an interpretability perspective as well as through evaluation of performance measures. Second, we propose the use of topological knowledge to assist an active contour method, which can potentially incorporate prior shape information. The latter is known to be extremely sensitive to algorithm initialization, and thus, we use a topological model to provide an automatic initialization. In addition, our proposed model can handle objects in images with more complex topological structures, including objects with holes and multiple objects within one image. We demonstrate this on artificially-constructed image datasets from computer vision, as well as real medical image data. -
Histopathological Cancer Detection With Topological Signatures (2023)
Ankur Yadav, Faisal Ahmed, Ovidiu Daescu, Reyhan Gedik, Baris CoskunuzerAbstract
We present a transformative approach to histopathological cancer detection and grading by introducing a very powerful feature extraction method based on the latest topological data analysis tools. By analyzing the evolution of topological patterns in different color channels, we discovered that every tumor class leaves its own topological footprint in histopathological images, allowing to extract feature vectors that can be used to reliably identify tumor classes.Our topological signatures, even when combined with traditional machine learning methods, provide very fast and highly accurate results in various settings. While most DL models work well for one type of cancer, our model easily adapts to different scenarios, and consistently gives highly competitive results with the state-of-the-art models on benchmark datasets across multiple cancer types including bone, colon, breast, cervical (cytopathology), and prostate cancer. Unlike most DL models, our proposed Topo-ML model does not need any data augmentation or pre-processing steps and works perfectly on small datasets. The model is computationally very efficient, with end-to-end processing taking only a few hours for datasets consisting of thousands of images. -
Topological Data Analysis and Diagnostics of Compressible Magnetohydrodynamic Turbulence (2018)
Irina Makarenko, Paul Bushby, Andrew Fletcher, Robin Henderson, Nikolay Makarenko, Anvar ShukurovAbstract
The predictions of mean-field electrodynamics can now be probed using direct numerical simulations of random flows and magnetic fields. When modelling astrophysical magnetohydrodynamics, it is important to verify that such simulations are in agreement with observations. One of the main challenges in this area is to identify robust quantitative measures to compare structures found in simulations with those inferred from astrophysical observations. A similar challenge is to compare quantitatively results from different simulations. Topological data analysis offers a range of techniques, including the Betti numbers and persistence diagrams, that can be used to facilitate such a comparison. After describing these tools, we first apply them to synthetic random fields and demonstrate that, when the data are standardized in a straightforward manner, some topological measures are insensitive to either large-scale trends or the resolution of the data. Focusing upon one particular astrophysical example, we apply topological data analysis to H i observations of the turbulent interstellar medium (ISM) in the Milky Way and to recent magnetohydrodynamic simulations of the random, strongly compressible ISM. We stress that these topological techniques are generic and could be applied to any complex, multi-dimensional random field. -
Topological Early Warning Signals: Quantifying Varying Routes to Extinction in a Spatially Distributed Population Model (2022)
Laura S. Storch, Sarah L. DayAbstract
Understanding and predicting critical transitions in spatially explicit ecological systems is particularly challenging due to their complex spatial and temporal dynamics and high dimensionality. Here, we explore changes in population distribution patterns during a critical transition (an extinction event) using computational topology. Computational topology allows us to quantify certain features of a population distribution pattern, such as the level of fragmentation. We create population distribution patterns via a simple coupled patch model with Ricker map growth and nearest neighbors dispersal on a two dimensional lattice. We observe two dominant paths to extinction within the explored parameter space that depend critically on the dispersal rate d and the rate of parameter drift, Δϵ. These paths to extinction are easily topologically distinguishable, so categorization can be automated. We use this population model as a theoretical proof-of-concept for the methodology, and argue that computational topology is a powerful tool for analyzing dynamical changes in systems with noisy data that are coarsely resolved in space and/or time. In addition, computational topology can provide early warning signals for chaotic dynamical systems where traditional statistical early warning signals would fail. For these reasons, we envision this work as a helpful addition to the critical transitions prediction toolbox. -
Feature Detection and Hypothesis Testing for Extremely Noisy Nanoparticle Images Using Topological Data Analysis (2023)
Andrew M. Thomas, Peter A. Crozier, Yuchen Xu, David S. MattesonAbstract
We propose a flexible algorithm for feature detection and hypothesis testing in images with ultra-low signal-to-noise ratio using cubical persistent homology. Our main application is in the identification of atomic columns and other features in Transmission Electron Microscopy (TEM). Cubical persistent homology is used to identify local minima and their size in subregions in the frames of nanoparticle videos, which are hypothesized to correspond to relevant atomic features. We compare the performance of our algorithm to other employed methods for the detection of columns and their intensity. Additionally, Monte Carlo goodness-of-fit testing using real-valued summaries of persistence diagrams derived from smoothed images (generated from pixels residing in the vacuum region of an image) is developed and employed to identify whether or not the proposed atomic features generated by our algorithm are due to noise. Using these summaries derived from the generated persistence diagrams, one can produce univariate time series for the nanoparticle videos, thus, providing a means for assessing fluxional behavior. A guarantee on the false discovery rate for multiple Monte Carlo testing of identical hypotheses is also established.Community Resources
-
A Morphometric Analysis of Vegetation Patterns in Dryland Ecosystems (2017)
Luke Mander, Stefan C. Dekker, Mao Li, Washington Mio, Surangi W. Punyasena, Timothy M. LentonAbstract
Vegetation in dryland ecosystems often forms remarkable spatial patterns. These range from regular bands of vegetation alternating with bare ground, to vegetated spots and labyrinths, to regular gaps of bare ground within an otherwise continuous expanse of vegetation. It has been suggested that spotted vegetation patterns could indicate that collapse into a bare ground state is imminent, and the morphology of spatial vegetation patterns, therefore, represents a potentially valuable source of information on the proximity of regime shifts in dryland ecosystems. In this paper, we have developed quantitative methods to characterize the morphology of spatial patterns in dryland vegetation. Our approach is based on algorithmic techniques that have been used to classify pollen grains on the basis of textural patterning, and involves constructing feature vectors to quantify the shapes formed by vegetation patterns. We have analysed images of patterned vegetation produced by a computational model and a small set of satellite images from South Kordofan (South Sudan), which illustrates that our methods are applicable to both simulated and real-world data. Our approach provides a means of quantifying patterns that are frequently described using qualitative terminology, and could be used to classify vegetation patterns in large-scale satellite surveys of dryland ecosystems. -
Data-Driven and Automatic Surface Texture Analysis Using Persistent Homology (2021)
Melih C. Yesilli, Firas A. KhasawnehAbstract
Surface roughness plays an important role in analyzing engineering surfaces. It quantifies the surface topography and can be used to determine whether the resulting surface finish is acceptable or not. Nevertheless, while several existing tools and standards are available for computing surface roughness, these methods rely heavily on user input thus slowing down the analysis and increasing manufacturing costs. Therefore, fast and automatic determination of the roughness level is essential to avoid costs resulting from surfaces with unacceptable finish, and user-intensive analysis. In this study, we propose a Topological Data Analysis (TDA) based approach to classify the roughness level of synthetic surfaces using both their areal images and profiles. We utilize persistent homology from TDA to generate persistence diagrams that encapsulate information on the shape of the surface. We then obtain feature matrices for each surface or profile using Carlsson coordinates, persistence images, and template functions. We compare our results to two widely used methods in the literature: Fast Fourier Transform (FFT) and Gaussian filtering. The results show that our approach yields mean accuracies as high as 97%. We also show that, in contrast to existing surface analysis tools, our TDA-based approach is fully automatable and provides adaptive feature extraction. -
Hepatic Tumor Classification Using Texture and Topology Analysis of Non-Contrast-Enhanced Three-Dimensional T1-Weighted MR Images With a Radiomics Approach (2019)
Asuka Oyama, Yasuaki Hiraoka, Ippei Obayashi, Yusuke Saikawa, Shigeru Furui, Kenshiro Shiraishi, Shinobu Kumagai, Tatsuya Hayashi, Jun’ichi KotokuAbstract
The purpose of this study is to evaluate the accuracy for classification of hepatic tumors by characterization of T1-weighted magnetic resonance (MR) images using two radiomics approaches with machine learning models: texture analysis and topological data analysis using persistent homology. This study assessed non-contrast-enhanced fat-suppressed three-dimensional (3D) T1-weighted images of 150 hepatic tumors. The lesions included 50 hepatocellular carcinomas (HCCs), 50 metastatic tumors (MTs), and 50 hepatic hemangiomas (HHs) found respectively in 37, 23, and 33 patients. For classification, texture features were calculated, and also persistence images of three types (degree 0, degree 1 and degree 2) were obtained for each lesion from the 3D MR imaging data. We used three classification models. In the classification of HCC and MT (resp. HCC and HH, HH and MT), we obtained accuracy of 92% (resp. 90%, 73%) by texture analysis, and the highest accuracy of 85% (resp. 84%, 74%) when degree 1 (resp. degree 1, degree 2) persistence images were used. Our methods using texture analysis or topological data analysis allow for classification of the three hepatic tumors with considerable accuracy, and thus might be useful when applied for computer-aided diagnosis with MR images. -
Exploring Surface Texture Quantification in Piezo Vibration Striking Treatment (PVST) Using Topological Measures (2022)
Melih C. Yesilli, Max M. Chumley, Jisheng Chen, Firas A. Khasawneh, Yang GuoAbstract
Abstract. Surface texture influences wear and tribological properties of manufactured parts, and it plays a critical role in end-user products. Therefore, quantifying the order or structure of a manufactured surface provides important information on the quality and life expectancy of the product. Although texture can be intentionally introduced to enhance aesthetics or to satisfy a design function, sometimes it is an inevitable byproduct of surface treatment processes such as Piezo Vibration Striking Treatment (PVST). Measures of order for surfaces have been characterized using statistical, spectral, and geometric approaches. For nearly hexagonal lattices, topological tools have also been used to measure the surface order. This paper explores utilizing tools from Topological Data Analysis for measuring surface texture. We compute measures of order based on optical digital microscope images of surfaces treated using PVST. These measures are applied to the grid obtained from estimating the centers of tool impacts, and they quantify the grid’s deviations from the nominal one. Our results show that TDA provides a convenient framework for characterization of pattern type that bypasses some limitations of existing tools such as difficult manual processing of the data and the need for an expert user to analyze and interpret the surface images. -
A Topological Framework for Identifying Phenomenological Bifurcations in Stochastic Dynamical Systems (2024)
Sunia Tanweer, Firas A. Khasawneh, Elizabeth Munch, Joshua R. TempelmanAbstract
Changes in the parameters of dynamical systems can cause the state of the system to shift between different qualitative regimes. These shifts, known as bifurcations, are critical to study as they can indicate when the system is about to undergo harmful changes in its behavior. In stochastic dynamical systems, there is particular interest in P-type (phenomenological) bifurcations, which can include transitions from a monostable state to multi-stable states, the appearance of stochastic limit cycles and other features in the probability density function (PDF) of the system’s state. Current practices are limited to systems with small state spaces, cannot detect all possible behaviors of the PDFs and mandate human intervention for visually identifying the change in the PDF. In contrast, this study presents a new approach based on Topological Data Analysis that uses superlevel persistence to mathematically quantify P-type bifurcations in stochastic systems through a “homological bifurcation plot”—which shows the changing ranks of 0th and 1st homology groups, through Betti vectors. Using these plots, we demonstrate the successful detection of P-bifurcations on the stochastic Duffing, Raleigh-Vander Pol and Quintic Oscillators given their analytical PDFs, and elaborate on how to generate an estimated homological bifurcation plot given a kernel density estimate (KDE) of these systems by employing a tool for finding topological consistency between PDFs and KDEs. -
Manifold Learning for Coherent Design Interpolation Based on Geometrical and Topological Descriptors (2023)
D. Muñoz, O. Allix, F. Chinesta, J. J. Ródenas, E. NadalAbstract
In the context of intellectual property in the manufacturing industry, know-how is referred to practical knowledge on how to accomplish a specific task. This know-how is often difficult to be synthesised in a set of rules or steps as it remains in the intuition and expertise of engineers, designers, and other professionals. Today, a new research line in this concern spot-up thanks to the explosion of Artificial Intelligence and Machine Learning algorithms and its alliance with Computational Mechanics and Optimisation tools. However, a key aspect with industrial design is the scarcity of available data, making it problematic to rely on deep-learning approaches. Assuming that the existing designs live in a manifold, in this paper, we propose a synergistic use of existing Machine Learning tools to infer a reduced manifold from the existing limited set of designs and, then, to use it to interpolate between the individuals, working as a generator basis, to create new and coherent designs. For this, a key aspect is to be able to properly interpolate in the reduced manifold, which requires a proper clustering of the individuals. From our experience, due to the scarcity of data, adding topological descriptors to geometrical ones considerably improves the quality of the clustering. Thus, a distance, mixing topology and geometry is proposed. This distance is used both, for the clustering and for the interpolation. For the interpolation, relying on optimal transport appear to be mandatory. Examples of growing complexity are proposed to illustrate the goodness of the method. -
Classification of COVID-19 via Homology of CT-SCAN (2021)
Sohail Iqbal, H. Fareed Ahmed, Talha Qaiser, Muhammad Imran Qureshi, Nasir RajpootAbstract
In this worldwide spread of SARS-CoV-2 (COVID-19) infection, it is of utmost importance to detect the disease at an early stage especially in the hot spots of this epidemic. There are more than 110 Million infected cases on the globe, sofar. Due to its promptness and effective results computed tomography (CT)-scan image is preferred to the reverse-transcription polymerase chain reaction (RT-PCR). Early detection and isolation of the patient is the only possible way of controlling the spread of the disease. Automated analysis of CT-Scans can provide enormous support in this process. In this article, We propose a novel approach to detect SARS-CoV-2 using CT-scan images. Our method is based on a very intuitive and natural idea of analyzing shapes, an attempt to mimic a professional medic. We mainly trace SARS-CoV-2 features by quantifying their topological properties. We primarily use a tool called persistent homology, from Topological Data Analysis (TDA), to compute these topological properties. We train and test our model on the "SARS-CoV-2 CT-scan dataset" i̧tep\soares2020sars\, an open-source dataset, containing 2,481 CT-scans of normal and COVID-19 patients. Our model yielded an overall benchmark F1 score of \$99.42\% \$, accuracy \$99.416\%\$, precision \$99.41\%\$, and recall \$99.42\%\$. The TDA techniques have great potential that can be utilized for efficient and prompt detection of COVID-19. The immense potential of TDA may be exploited in clinics for rapid and safe detection of COVID-19 globally, in particular in the low and middle-income countries where RT-PCR labs and/or kits are in a serious crisis. -
Pattern Characterization Using Topological Data Analysis: Application to Piezo Vibration Striking Treatment (2023)
Max M. Chumley, Melih C. Yesilli, Jisheng Chen, Firas A. Khasawneh, Yang GuoAbstract
Quantifying patterns in visual or tactile textures provides important information about the process or phenomena that generated these patterns. In manufacturing, these patterns can be intentionally introduced as a design feature, or they can be a byproduct of a specific process. Since surface texture has significant impact on the mechanical properties and the longevity of the workpiece, it is important to develop tools for quantifying surface patterns and, when applicable, comparing them to their nominal counterparts. While existing tools may be able to indicate the existence of a pattern, they typically do not provide more information about the pattern structure, or how much it deviates from a nominal pattern. Further, prior works do not provide automatic or algorithmic approaches for quantifying other pattern characteristics such as depths’ consistency, and variations in the pattern motifs at different level sets. This paper leverages persistent homology from Topological Data Analysis (TDA) to derive noise-robust scores for quantifying motifs’ depth and roundness in a pattern. Specifically, sublevel persistence is used to derive scores that quantify the consistency of indentation depths at any level set in Piezo Vibration Striking Treatment (PVST) surfaces. Moreover, we combine sublevel persistence with the distance transform to quantify the consistency of the indentation radii, and to compare them with the nominal ones. Although the tool in our PVST experiments had a semi-spherical profile, we present a generalization of our approach to tools/motifs of arbitrary shapes thus making our method applicable to other pattern-generating manufacturing processes. -
Characterising Epithelial Tissues Using Persistent Entropy (2019)
N. Atienza, L. M. Escudero, M. J. Jimenez, M. Soriano-TriguerosAbstract
In this paper, we apply persistent entropy, a novel topological statistic, for characterization of images of epithelial tissues. We have found out that persistent entropy is able to summarize topological and geometric information encoded by \$\$\alpha \$\$α-complexes and persistent homology. After using some statistical tests, we can guarantee the existence of significant differences in the studied tissues. -
A Method to the Madness: Using Persistent Homology to Measure Plant Morphology (2018)
Emily R. Larson -
Kernel Method for Persistence Diagrams via Kernel Embedding and Weight Factor (2017)
Genki Kusano, Kenji Fukumizu, Yasuaki Hiraoka -
Topological Data Analysis Quantifies Biological Nano-Structure From Single Molecule Localization Microscopy (2020)
Jeremy A. Pike, Abdullah O. Khan, Chiara Pallini, Steven G. Thomas, Markus Mund, Jonas Ries, Natalie S. Poulter, Iain B. StylesAbstract
AbstractMotivation. Localization microscopy data is represented by a set of spatial coordinates, each corresponding to a single detection, that form a point cl -
Topological Data Analysis of Task-Based fMRI Data From Experiments on Schizophrenia (2021)
Bernadette J. Stolz, Tegan Emerson, Satu Nahkuri, Mason A. Porter, Heather A. Harrington -
Sheaves Are the Canonical Data Structure for Sensor Integration (2017)
Michael RobinsonAbstract
A sensor integration framework should be sufficiently general to accurately represent many sensor modalities, and also be able to summarize information in a faithful way that emphasizes important, actionable information. Few approaches adequately address these two discordant requirements. The purpose of this expository paper is to explain why sheaves are the canonical data structure for sensor integration and how the mathematics of sheaves satisfies our two requirements. We outline some of the powerful inferential tools that are not available to other representational frameworks. -
Finite Topology as Applied to Image Analysis (1989)
V. A KovalevskyAbstract
The notion of a cellular complex which is well known in the topology is applied to describe the structure of images. It is shown that the topology of cellular complexes is the only possible topology of finite sets. Under this topology no contradictions or paradoxes arise when defining connected subsets and their boundaries. Ways of encoding images as cellular complexes are discussed. The process of image segmentation is considered as splitting (in the topological sense) a cellular complex into blocks of cells. The notion of a cell list is introduced as a precise and compact data structure for encoding segmented images. Some applications of this data structure to the image analysis are demonstrated. -
Classification of Skin Lesions by Topological Data Analysis Alongside With Neural Network (2020)
Naiereh Elyasi, Mehdi Hosseini MoghadamAbstract
In this paper we use TDA mapper alongside with deep convolutional neural networks in the classification of 7 major skin diseases. First we apply kepler mapper with neural network as one of its filter steps to classify the dataset HAM10000. Mapper visualizes the classification result by a simplicial complex, where neural network can not do this alone, but as a filter step neural network helps to classify data better. Furthermore we apply TDA mapper and persistent homology to understand the weights of layers of mobilenet network in different training epochs of HAM10000. Also we use persistent diagrams to visualize the results of analysis of layers of mobilenet network. -
Fruit Flies and Moduli: Interactions Between Biology and Mathematics (2015)
Ezra MillerAbstract
Possibilities for using geometry and topology to analyze statistical problems in biology raise a host of novel questions in geometry, probability, algebra, and combinatorics that demonstrate the power of biology to influence the future of pure mathematics. This expository article is a tour through some biological explorations and their mathematical ramifications. The article starts with evolution of novel topological features in wing veins of fruit flies, which are quantified using the algebraic structure of multiparameter persistent homology. The statistical issues involved highlight mathematical implications of sampling from moduli spaces. These lead to geometric probability on stratified spaces, including the sticky phenomenon for Frechet means and the origin of this mathematical area in the reconstruction of phylogenetic trees. -
Topological Data Analysis of C. Elegans Locomotion and Behavior (2021)
Ashleigh Thomas, Kathleen Bates, Alex Elchesen, Iryna Hartsock, Hang Lu, Peter BubenikAbstract
Video of nematodes/roundworms was analyzed using persistent homology to study locomotion and behavior. In each frame, an organism's body posture was represented by a high-dimensional vector. By concatenating points in fixed-duration segments of this time series, we created a sliding window embedding (sometimes called a time delay embedding) where each point corresponds to a sequence of postures of an organism. Persistent homology on the points in this time series detected behaviors and comparisons of these persistent homology computations detected variation in their corresponding behaviors. We used average persistence landscapes and machine learning techniques to study changes in locomotion and behavior in varying environments. -
Transfer Learning for Autonomous Chatter Detection in Machining (2022)
Melih C. Yesilli, Firas A. Khasawneh, Brian P. MannAbstract
Large-amplitude chatter vibrations are one of the most important phenomena in machining processes. It is often detrimental in cutting operations causing a poor surface finish and decreased tool life. Therefore, chatter detection using machine learning has been an active research area over the last decade. Three challenges can be identified in applying machine learning for chatter detection at large in industry: an insufficient understanding of the universality of chatter features across different processes, the need for automating feature extraction, and the existence of limited data for each specific workpiece-machine tool combination, e.g., when machining one-off products. These three challenges can be grouped under the umbrella of transfer learning, which is concerned with studying how knowledge gained from one setting can be leveraged to obtain information in new settings. This paper studies automating chatter detection by evaluating transfer learning of prominent as well as novel chatter detection methods. We investigate chatter classification accuracy using a variety of features extracted from turning and milling experiments with different cutting configurations. The studied methods include Fast Fourier Transform (FFT), Power Spectral Density (PSD), the Auto-correlation Function (ACF), and decomposition based tools such as Wavelet Packet Transform (WPT) and Ensemble Empirical Mode Decomposition (EEMD). We also examine more recent approaches based on Topological Data Analysis (TDA) and similarity measures of time series based on Discrete Time Warping (DTW). We evaluate transfer learning potential of each approach by training and testing both within and across the turning and milling data sets. Four supervised classification algorithms are explored: support vector machine (SVM), logistic regression, random forest classification, and gradient boosting. In addition to accuracy, we also comment on the automation potential of feature extraction for each approach which is integral to creating autonomous manufacturing centers. Our results show that carefully chosen time-frequency features can lead to high classification accuracies albeit at the cost of requiring manual pre-processing and the tagging of an expert user. On the other hand, we found that the TDA and DTW approaches can provide accuracies and F1-scores on par with the time-frequency methods without the need for manual preprocessing via completely automatic pipelines. Further, we discovered that the DTW approach outperforms all other methods when trained using the milling data and tested on the turning data. Therefore, TDA and DTW approaches may be preferred over the time-frequency-based approaches for fully automated chatter detection schemes. DTW and TDA also can be more advantageous when pooling data from either limited workpiece-machine tool combinations, or from small data sets of one-off processes. -
Euler Characteristic Surfaces (2021)
Gabriele Beltramo, Rayna Andreeva, Ylenia Giarratano, Miguel O. Bernabeu, Rik Sarkar, Primoz SkrabaAbstract
We study the use of the Euler characteristic for multiparameter topological data analysis. Euler characteristic is a classical, well-understood topological invariant that has appeared in numerous applications, including in the context of random fields. The goal of this paper is to present the extension of using the Euler characteristic in higher-dimensional parameter spaces. While topological data analysis of higher-dimensional parameter spaces using stronger invariants such as homology continues to be the subject of intense research, Euler characteristic is more manageable theoretically and computationally, and this analysis can be seen as an important intermediary step in multi-parameter topological data analysis. We show the usefulness of the techniques using artificially generated examples, and a real-world application of detecting diabetic retinopathy in retinal images. -
Topological Descriptors of Histology Images (2014)
Nikhil Singh, Heather D. Couture, J. S. Marron, Charles Perou, Marc NiethammerAbstract
The purpose of this study is to investigate architectural characteristics of cell arrangements in breast cancer histology images. We propose the use of topological data analysis to summarize the geometric information inherent in tumor cell arrangements. Our goal is to use this information as signatures that encode robust summaries of cell arrangements in tumor tissue as captured through histology images. In particular, using ideas from algebraic topology we construct topological descriptors based on cell nucleus segmentations such as persistency charts and Betti sequences. We assess their performance on the task of discriminating the breast cancer subtypes Basal, Luminal A, Luminal B and HER2. We demonstrate that the topological features contain useful complementary information to image-appearance based features that can improve discriminatory performance of classifiers. -
Topological Regularization for Dense Prediction (2021)
Deqing Fu, Bradley J. NelsonAbstract
Dense prediction tasks such as depth perception and semantic segmentation are important applications in computer vision that have a concrete topological description in terms of partitioning an image into connected components or estimating a function with a small number of local extrema corresponding to objects in the image. We develop a form of topological regularization based on persistent homology that can be used in dense prediction tasks with these topological descriptions. Experimental results show that the output topology can also appear in the internal activations of trained neural networks which allows for a novel use of topological regularization to the internal states of neural networks during training, reducing the computational cost of the regularization. We demonstrate that this topological regularization of internal activations leads to improved convergence and test benchmarks on several problems and architectures. -
Phase-Field Investigation of the Coarsening of Porous Structures by Surface Diffusion (2019)
Pierre-Antoine Geslin, Mickaël Buchet, Takeshi Wada, Hidemi KatoAbstract
Nano and microporous connected structures have attracted increasing attention in the past decades due to their high surface area, presenting interesting properties for a number of applications. These structures generally coarsen by surface diffusion, leading to an enlargement of the structure characteristic length scale. We propose to study this coarsening behavior using a phase-field model for surface diffusion. In addition to reproducing the expected scaling law, our simulations enable to investigate precisely the evolution of the topological and morphological characteristics along the coarsening process. In particular, we show that after a transient regime, the coarsening is self-similar as exhibited by the evolution of both morphological and topological features. In addition, the influence of surface anisotropy is discussed and comparisons with experimental tomographic observations are presented. -
Persistent Homology Machine Learning for Fingerprint Classification (2019)
N. Giansiracusa, R. Giansiracusa, C. MoonAbstract
The fingerprint classification problem is to sort fingerprints into predetermined groups, such as arch, loop, and whorl. It was asserted in the literature that minutiae points, which are commonly used for fingerprint matching, are not useful for classification. We show that, to the contrary, near state-of-the-art classification accuracy rates can be achieved when applying topological data analysis (TDA) to 3-dimensional point clouds of oriented minutiae points. We also apply TDA to fingerprint ink-roll images, which yields a lower accuracy rate but still shows promise; moreover, combining the two approaches outperforms each one individually. These methods use supervised learning applied to persistent homology and allow us to explore feature selection on barcodes, an important topic at the interface between TDA and machine learning. We test our classification algorithms on the NIST fingerprint database SD-27. -
Persistent Homology for Breast Tumor Classification Using Mammogram Scans (2022)
Aras Asaad, Dashti Ali, Taban Majeed, Rasber RashidAbstract
An Important tool in the field topological data analysis is known as persistent Homology (PH) which is used to encode abstract representation of the homology of data at different resolutions in the form of persistence diagram (PD). In this work we build more than one PD representation of a single image based on a landmark selection method, known as local binary patterns, that encode different types of local textures from images. We employed different PD vectorizations using persistence landscapes, persistence images, persistence binning (Betti Curve) and statistics. We tested the effectiveness of proposed landmark based PH on two publicly available breast abnormality detection datasets using mammogram scans. Sensitivity of landmark based PH obtained is over 90% in both datasets for the detection of abnormal breast scans. Finally, experimental results give new insights on using different types of PD vectorizations which help in utilising PH in conjunction with machine learning classifiers. -
The Euler Characteristic: A General Topological Descriptor for Complex Data (2021)
Alexander Smith, Victor ZavalaAbstract
Datasets are mathematical objects (e.g., point clouds, matrices, graphs, images, fields/functions) that have shape. This shape encodes important knowledge about the system under study. Topology is an area of mathematics that provides diverse tools to characterize the shape of data objects. In this work, we study a specific tool known as the Euler characteristic (EC). The EC is a general, low-dimensional, and interpretable descriptor of topological spaces defined by data objects. We revise the mathematical foundations of the EC and highlight its connections with statistics, linear algebra, field theory, and graph theory. We discuss advantages offered by the use of the EC in the characterization of complex datasets; to do so, we illustrate its use in different applications of interest in chemical engineering such as process monitoring, flow cytometry, and microscopy. We show that the EC provides a descriptor that effectively reduces complex datasets and that this reduction facilitates tasks such as visualization, regression, classification, and clustering. -
A Novel Approach for Wafer Defect Pattern Classification Based on Topological Data Analysis (2023)
Seungchan Ko, Dowan KooAbstract
In semiconductor manufacturing, wafer map defect pattern provides critical information for facility maintenance and yield management, so the classification of defect patterns is one of the most important tasks in the manufacturing process. In this paper, we propose a novel way to represent the shape of the defect pattern as a finite-dimensional vector, which will be used as an input for a neural network algorithm for classification. The main idea is to extract the topological features of each pattern by using the theory of persistent homology from topological data analysis (TDA). Through some experiments with a simulated dataset, we show that the proposed method is faster and much more efficient in training with higher accuracy, compared with the method using convolutional neural networks (CNN) which is the most common approach for wafer map defect pattern classification. Moreover, it was shown that our method outperforms the CNN-based method when the number of training data is not enough and is imbalanced. -
Quantitative and Interpretable Order Parameters for Phase Transitions From Persistent Homology (2020)
Alex Cole, Gregory J. Loges, Gary ShiuAbstract
We apply modern methods in computational topology to the task of discovering and characterizing phase transitions. As illustrations, we apply our method to four two-dimensional lattice spin models: the Ising, square ice, XY, and fully-frustrated XY models. In particular, we use persistent homology, which computes the births and deaths of individual topological features as a coarse-graining scale or sublevel threshold is increased, to summarize multiscale and high-point correlations in a spin configuration. We employ vector representations of this information called persistence images to formulate and perform the statistical task of distinguishing phases. For the models we consider, a simple logistic regression on these images is sufficient to identify the phase transition. Interpretable order parameters are then read from the weights of the regression. This method suffices to identify magnetization, frustration, and vortex-antivortex structure as relevant features for phase transitions in our models. We also define "persistence" critical exponents and study how they are related to those critical exponents usually considered. -
TDAExplore: Quantitative Analysis of Fluorescence Microscopy Images Through Topology-Based Machine Learning (2021)
Parker Edwards, Kristen Skruber, Nikola Milićević, James B. Heidings, Tracy-Ann Read, Peter Bubenik, Eric A. VitriolAbstract
Recent advances in machine learning have greatly enhanced automatic methods to extract information from fluorescence microscopy data. However, current machine-learning-based models can require hundreds to thousands of images to train, and the most readily accessible models classify images without describing which parts of an image contributed to classification. Here, we introduce TDAExplore, a machine learning image analysis pipeline based on topological data analysis. It can classify different types of cellular perturbations after training with only 20–30 high-resolution images and performs robustly on images from multiple subjects and microscopy modes. Using only images and whole-image labels for training, TDAExplore provides quantitative, spatial information, characterizing which image regions contribute to classification. Computational requirements to train TDAExplore models are modest and a standard PC can perform training with minimal user input. TDAExplore is therefore an accessible, powerful option for obtaining quantitative information about imaging data in a wide variety of applications. -
TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection From Chest X-Ray Images (2021)
Mustafa Hajij, Ghada Zamzmi, Fawwaz BataynehAbstract
Topological Data Analysis (TDA) has emerged recently as a robust tool to extract and compare the structure of datasets. TDA identifies features in data (e.g., connected components and holes) and assigns a quantitative measure to these features. Several studies reported that topological features extracted by TDA tools provide unique information about the data, discover new insights, and determine which feature is more related to the outcome. On the other hand, the overwhelming success of deep neural networks in learning patterns and relationships has been proven on various data applications including images. To capture the characteristics of both worlds, we propose TDA-Net, a novel ensemble network that fuses topological and deep features for the purpose of enhancing model generalizability and accuracy. We apply the proposed TDA-Net to a critical application, which is the automated detection of COVID-19 from CXR images. Experimental results showed that the proposed network achieved excellent performance and suggested the applicability of our method in practice. -
Persistent Homology for Path Planning in Uncertain Environments (2015)
S. Bhattacharya, R. Ghrist, V. KumarAbstract
We address the fundamental problem of goal-directed path planning in an uncertain environment represented as a probability (of occupancy) map. Most methods generally use a threshold to reduce the grayscale map to a binary map before applying off-the-shelf techniques to find the best path. This raises the somewhat ill-posed question, what is the right (optimal) value to threshold the map? We instead suggest a persistent homology approach to the problem-a topological approach in which we seek the homology class of trajectories that is most persistent for the given probability map. In other words, we want the class of trajectories that is free of obstacles over the largest range of threshold values. In order to make this problem tractable, we use homology in ℤ2 coefficients (instead of the standard ℤ coefficients), and describe how graph search-based algorithms can be used to find trajectories in different homology classes. Our simulation results demonstrate the efficiency and practical applicability of the algorithm proposed in this paper.paper. -
From Trees to Barcodes and Back Again: Theoretical and Statistical Perspectives (2020)
Lida Kanari, Adélie Garin, Kathryn HessAbstract
Methods of topological data analysis have been successfully applied in a wide range of fields to provide useful summaries of the structure of complex data sets in terms of topological descriptors, such as persistence diagrams. While there are many powerful techniques for computing topological descriptors, the inverse problem, i.e., recovering the input data from topological descriptors, has proved to be challenging. In this article we study in detail the Topological Morphology Descriptor (TMD), which assigns a persistence diagram to any tree embedded in Euclidean space, and a sort of stochastic inverse to the TMD, the Topological Neuron Synthesis (TNS) algorithm, gaining both theoretical and computational insights into the relation between the two. We propose a new approach to classify barcodes using symmetric groups, which provides a concrete language to formulate our results. We investigate to what extent the TNS recovers a geometric tree from its TMD and describe the effect of different types of noise on the process of tree generation from persistence diagrams. We prove moreover that the TNS algorithm is stable with respect to specific types of noise. -
Capturing Shape Information With Multi-Scale Topological Loss Terms For 3D Reconstruction (2022)
Dominik J. E. Waibel, Scott Atwell, Matthias Meier, Carsten Marr, Bastian RieckAbstract
Reconstructing 3D objects from 2D images is both challenging for our brains and machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets. -
Determining Structural Properties of Artificial Neural Networks Using Algebraic Topology (2021)
David Pérez Fernández, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marta VillegasAbstract
Artificial Neural Networks (ANNs) are widely used for approximating complex functions. The process that is usually followed to define the most appropriate architecture for an ANN given a specific function is mostly empirical. Once this architecture has been defined, weights are usually optimized according to the error function. On the other hand, we observe that ANNs can be represented as graphs and their topological 'fingerprints' can be obtained using Persistent Homology (PH). In this paper, we describe a proposal focused on designing more principled architecture search procedures. To do this, different architectures for solving problems related to a heterogeneous set of datasets have been analyzed. The results of the evaluation corroborate that PH effectively characterizes the ANN invariants: when ANN density (layers and neurons) or sample feeding order is the only difference, PH topological invariants appear; in the opposite direction in different sub-problems (i.e. different labels), PH varies. This approach based on topological analysis helps towards the goal of designing more principled architecture search procedures and having a better understanding of ANNs. -
A Klein-Bottle-Based Dictionary for Texture Representation (2014)
Jose A. Perea, Gunnar CarlssonAbstract
A natural object of study in texture representation and material classification is the probability density function, in pixel-value space, underlying the set of small patches from the given image. Inspired by the fact that small \$\$n\times n\$\$n×nhigh-contrast patches from natural images in gray-scale accumulate with high density around a surface \$\$\fancyscript\K\\subset \\mathbb \R\\\textasciicircum\n\textasciicircum2\\$\$K⊂Rn2with the topology of a Klein bottle (Carlsson et al. International Journal of Computer Vision 76(1):1–12, 2008), we present in this paper a novel framework for the estimation and representation of distributions around \$\$\fancyscript\K\\$\$K, of patches from texture images. More specifically, we show that most \$\$n\times n\$\$n×npatches from a given image can be projected onto \$\$\fancyscript\K\\$\$Kyielding a finite sample \$\$S\subset \fancyscript\K\\$\$S⊂K, whose underlying probability density function can be represented in terms of Fourier-like coefficients, which in turn, can be estimated from \$\$S\$\$S. We show that image rotation acts as a linear transformation at the level of the estimated coefficients, and use this to define a multi-scale rotation-invariant descriptor. We test it by classifying the materials in three popular data sets: The CUReT, UIUCTex and KTH-TIPS texture databases. -
Topological Data Analysis of Spatial Patterning in Heterogeneous Cell Populations: Clustering and Sorting With Varying Cell-Cell Adhesion (2023)
Dhananjay Bhaskar, William Y. Zhang, Alexandria Volkening, Björn Sandstede, Ian Y. WongAbstract
Different cell types aggregate and sort into hierarchical architectures during the formation of animal tissues. The resulting spatial organization depends (in part) on the strength of adhesion of one cell type to itself relative to other cell types. However, automated and unsupervised classification of these multicellular spatial patterns remains challenging, particularly given their structural diversity and biological variability. Recent developments based on topological data analysis are intriguing to reveal similarities in tissue architecture, but these methods remain computationally expensive. In this article, we show that multicellular patterns organized from two interacting cell types can be efficiently represented through persistence images. Our optimized combination of dimensionality reduction via autoencoders, combined with hierarchical clustering, achieved high classification accuracy for simulations with constant cell numbers. We further demonstrate that persistence images can be normalized to improve classification for simulations with varying cell numbers due to proliferation. Finally, we systematically consider the importance of incorporating different topological features as well as information about each cell type to improve classification accuracy. We envision that topological machine learning based on persistence images will enable versatile and robust classification of complex tissue architectures that occur in development and disease. -
Barcodes Distinguish Morphology of Neuronal Tauopathy (2022)
David Beers, Despoina Goniotaki, Diane P. Hanger, Alain Goriely, Heather A. HarringtonAbstract
The geometry of neurons is known to be important for their functions. Hence, neurons are often classified by their morphology. Two recent methods, persistent homology and the topological morphology descriptor, assign a morphology descriptor called a barcode to a neuron equipped with a given function, such as the Euclidean distance from the root of the neuron. These barcodes can be converted into matrices called persistence images, which can then be averaged across groups. We show that when the defining function is the path length from the root, both the topological morphology descriptor and persistent homology are equivalent. We further show that persistence images arising from the path length procedure provide an interpretable summary of neuronal morphology. We introduce \topological morphology functions\, a class of functions similar to Sholl functions, that can be recovered from the associated topological morphology descriptor. To demonstrate this topological approach, we compare healthy cortical and hippocampal mouse neurons to those affected by progressive tauopathy. We find a significant difference in the morphology of healthy neurons and those with a tauopathy at a postsymptomatic age. We use persistence images to conclude that the diseased group tends to have neurons with shorter branches as well as fewer branches far from the soma. -
Optimizing Porosity Detection in Wire Laser Metal Deposition Processes Through Data-Driven AI Classification Techniques (2023)
Meritxell Gomez-Omella, Jon Flores, Basilio Sierra, Susana Ferreiro, Nicolas Hascoët, Francisco ChinestaAbstract
Additive manufacturing (AM) is an attractive solution for many companies that produce geometrically complex parts. This process consists of depositing material layer by layer following a sliced CAD geometry. It brings several benefits to manufacturing capabilities, such as design freedom, reduced material waste, and short-run customization. However, one of the current challenges faced by users of the process, mainly in wire laser metal deposition (wLMD), is to avoid defects in the manufactured part, especially the porosity. This defect is caused by extreme conditions and metallurgical transformations of the process. And not only does it directly affect the mechanical performance of the parts, especially the fatigue properties, but it also means an increase in costs due to the inspection tasks to which the manufactured parts must be subjected. This work compares three operational solution approaches, product-centric, based on signal-based feature extraction and Topological Data Analysis together with statistical and Machine Learning (ML) techniques, for the early detection and prediction of porosity failure in a wLMD process. The different forecasting and validation strategies demonstrate the variety of conclusions that can be drawn with different objectives in the analysis of the monitored data in AM problems. -
Relational Persistent Homology for Multispecies Data With Application to the Tumor Microenvironment (2023)
Bernadette J. Stolz, Jagdeep Dhesi, Joshua A. Bull, Heather A. Harrington, Helen M. Byrne, Iris H. R. YoonAbstract
Topological data analysis (TDA) is an active field of mathematics for quantifying shape in complex data. Standard methods in TDA such as persistent homology (PH) are typically focused on the analysis of data consisting of a single entity (e.g., cells or molecular species). However, state-of-the-art data collection techniques now generate exquisitely detailed multispecies data, prompting a need for methods that can examine and quantify the relations among them. Such heterogeneous data types arise in many contexts, ranging from biomedical imaging, geospatial analysis, to species ecology. Here, we propose two methods for encoding spatial relations among different data types that are based on Dowker complexes and Witness complexes. We apply the methods to synthetic multispecies data of a tumor microenvironment and analyze topological features that capture relations between different cell types, e.g., blood vessels, macrophages, tumor cells, and necrotic cells. We demonstrate that relational topological features can extract biological insight, including the dominant immune cell phenotype (an important predictor of patient prognosis) and the parameter regimes of a data-generating model. The methods provide a quantitative perspective on the relational analysis of multispecies spatial data, overcome the limits of traditional PH, and are readily computable. -
A Novel Quality Clustering Methodology on Fab-Wide Wafer Map Images in Semiconductor Manufacturing (2022)
Yuan-Ming Hsu, Xiaodong Jia, Wenzhe Li, Jay LeeAbstract
Abstract. In semiconductor manufacturing, clustering the fab-wide wafer map images is of critical importance for practitioners to understand the subclusters of wafer defects, recognize novel clusters or anomalies, and develop fast reactions to quality issues. However, due to the high-mix manufacturing of diversified wafer products of different sizes and technologies, it is difficult to cluster the wafer map images across the fab. This paper addresses this challenge by proposing a novel methodology for fab-wide wafer map data clustering. In the proposed methodology, a well-known deep learning technique, vision transformer with multi-head attention is first trained to convert binary wafer images of different sizes into condensed feature vectors for efficient clustering. Then, the Topological Data Analysis (TDA), which is widely used in biomedical applications, is employed to visualize the data clusters and identify the anomalies. The TDA yields a topological representation of high-dimensional big data as well as its local clusters by creating a graph that shows nodes corresponding to the clusters within the data. The effectiveness of the proposed methodology is demonstrated by clustering the public wafer map dataset WM-811k from the real application which has a total of 811,457 wafer map images. We further demonstrate the potential applicability of topology data analytics in the semiconductor area by visualization. -
The Persistent Homology Mathematical Framework Provides Enhanced Genotype-to-Phenotype Associations for Plant Morphology (2018)
Mao Li, Margaret H. Frank, Viktoriya Coneva, Washington Mio, Daniel H. Chitwood, Christopher N. ToppAbstract
Efforts to understand the genetic and environmental conditioning of plant morphology are hindered by the lack of flexible and effective tools for quantifying morphology. Here, we demonstrate that persistent-homology-based topological methods can improve measurement of variation in leaf shape, serrations, and root architecture. We apply these methods to 2D images of leaves and root systems in field-grown plants of a domesticated introgression line population of tomato (Solanum pennellii). We find that compared with some commonly used conventional traits, (1) persistent-homology-based methods can more comprehensively capture morphological variation; (2) these techniques discriminate between genotypes with a larger normalized effect size and detect a greater number of unique quantitative trait loci (QTLs); (3) multivariate traits, whether statistically derived from univariate or persistent-homology-based traits, improve our ability to understand the genetic basis of phenotype; and (4) persistent-homology-based techniques detect unique QTLs compared to conventional traits or their multivariate derivatives, indicating that previously unmeasured aspects of morphology are now detectable. The QTL results further imply that genetic contributions to morphology can affect both the shoot and root, revealing a pleiotropic basis to natural variation in tomato. Persistent homology is a versatile framework to quantify plant morphology and developmental processes that complements and extends existing methods. -
Specimen-Based Analysis of Morphology and the Environment in Ecologically Dominant Grasses: The Power of the Herbarium (2019)
Christine A. McAllister, Michael R. McKain, Mao Li, Bess Bookout, Elizabeth A. KelloggAbstract
Herbaria contain a cumulative sample of the world's flora, assembled by thousands of people over centuries. To capitalize on this resource, we conducted a specimen-based analysis of a major clade in the grass tribe Andropogoneae, including the dominant species of the world's grasslands in the genera Andropogon, Schizachyrium, Hyparrhenia and several others. We imaged 186 of the 250 named species of the clade, georeferenced the specimens and extracted climatic variables for each. Using semi- and fully automated image analysis techniques, we extracted spikelet morphological characters and correlated these with environmental variables. We generated chloroplast genome sequences to correct for phylogenetic covariance and here present a new phylogeny for 81 of the species. We confirm and extend earlier studies to show that Andropogon and Schizachyrium are not monophyletic. In addition, we find all morphological and ecological characters are homoplasious but variable among clades. For example, sessile spikelet length is positively correlated with awn length when all accessions are considered, but when separated by clade, the relationship is positive for three sub-clades and negative for three others. Climate variables showed no correlation with morphological variation in the spikelet pair; only very weak effects of temperature and precipitation were detected on macrohair density. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the Anthropocene'. -
Image-Based Phenotyping for Identification of QTL Determining Fruit Shape and Size in American Cranberry (Vaccinium Macrocarpon L.) (2018)
Luis Diaz-Garcia, Giovanny Covarrubias-Pazaran, Brandon Schlautman, Edward Grygleski, Juan ZalapaAbstract
Image-based phenotyping methodologies are powerful tools to determine quality parameters for fruit breeders and processors. The fruit size and shape of American cranberry (Vaccinium macrocarpon L.) are particularly important characteristics that determine the harvests’ processing value and potential end-use products (e.g., juice vs. sweetened dried cranberries). However, cranberry fruit size and shape attributes can be difficult and time consuming for breeders and processors to measure, especially when relying on manual measurements and visual ratings. Therefore, in this study, we implemented image-based phenotyping techniques for gathering data regarding basic cranberry fruit parameters such as length, width, length-to-width ratio, and eccentricity. Additionally, we applied a persistent homology algorithm to better characterize complex shape parameters. Using this high-throughput artificial vision approach, we characterized fruit from 351 progeny from a full-sib cranberry population over three field seasons. Using a covariate analysis to maximize the identification of well-supported quantitative trait loci (QTL), we found 252 single QTL in a 3-year period for cranberry fruit size and shape descriptors from which 20% were consistently found in all years. The present study highlights the potential for the identified QTL and the image-based methods to serve as a basis for future explorations of the genetic architecture of fruit size and shape in cranberry and other fruit crops. -
Topological Data Analysis of Zebrafish Patterns (2020)
Melissa R. McGuirl, Alexandria Volkening, Björn SandstedeAbstract
Self-organized pattern behavior is ubiquitous throughout nature, from fish schooling to collective cell dynamics during organism development. Qualitatively these patterns display impressive consistency, yet variability inevitably exists within pattern-forming systems on both microscopic and macroscopic scales. Quantifying variability and measuring pattern features can inform the underlying agent interactions and allow for predictive analyses. Nevertheless, current methods for analyzing patterns that arise from collective behavior capture only macroscopic features or rely on either manual inspection or smoothing algorithms that lose the underlying agent-based nature of the data. Here we introduce methods based on topological data analysis and interpretable machine learning for quantifying both agent-level features and global pattern attributes on a large scale. Because the zebrafish is a model organism for skin pattern formation, we focus specifically on analyzing its skin patterns as a means of illustrating our approach. Using a recent agent-based model, we simulate thousands of wild-type and mutant zebrafish patterns and apply our methodology to better understand pattern variability in zebrafish. Our methodology is able to quantify the differential impact of stochasticity in cell interactions on wild-type and mutant patterns, and we use our methods to predict stripe and spot statistics as a function of varying cellular communication. Our work provides an approach to automatically quantifying biological patterns and analyzing agent-based dynamics so that we can now answer critical questions in pattern formation at a much larger scale. -
PI-Net: A Deep Learning Approach to Extract Topological Persistence Images (2020)
Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan TuragaAbstract
Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illumination, and more. However, key bottlenecks to their large scale adoption are computational expenditure and difficulty incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We design two separate convolutional neural network architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PINet respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed PI-Net architectures on two applications: human activity recognition using tri-axial accelerometer sensor data and image classification. We demonstrate the ease of fusion of PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net. -
Topological Data Analysis of Collective and Individual Epithelial Cells Using Persistent Homology of Loops (2021)
Dhananjay Bhaskar, William Y. Zhang, Ian Y. WongAbstract
Interacting, self-propelled particles such as epithelial cells can dynamically self-organize into complex multicellular patterns, which are challenging to classify without a priori information. Classically, different phases and phase transitions have been described based on local ordering, which may not capture structural features at larger length scales. Instead, topological data analysis (TDA) determines the stability of spatial connectivity at varying length scales (i.e. persistent homology), and can compare different particle configurations based on the “cost” of reorganizing one configuration into another. Here, we demonstrate a topology-based machine learning approach for unsupervised profiling of individual and collective phases based on large-scale loops. We show that these topological loops (i.e. dimension 1 homology) are robust to variations in particle number and density, particularly in comparison to connected components (i.e. dimension 0 homology). We use TDA to map out phase diagrams for simulated particles with varying adhesion and propulsion, at constant population size as well as when proliferation is permitted. Next, we use this approach to profile our recent experiments on the clustering of epithelial cells in varying growth factor conditions, which are compared to our simulations. Finally, we characterize the robustness of this approach at varying length scales, with sparse sampling, and over time. Overall, we envision TDA will be broadly applicable as a model-agnostic approach to analyze active systems with varying population size, from cytoskeletal motors to motile cells to flocking or swarming animals. -
The Shape of Cancer Relapse: Topological Data Analysis Predicts Recurrence in Paediatric Acute Lymphoblastic Leukaemia (2021)
Salvador Chulián, Bernadette J. Stolz, Álvaro Martínez-Rubio, Cristina Blázquez Goñi, Juan F. Rodríguez Gutiérrez, Teresa Caballero Velázquez, Águeda Molinos Quintana, Manuel Ramírez Orellana, Ana Castillo Robleda, José Luis Fuster Soler, Alfredo Minguela Puras, María Victoria Martínez Sánchez, María Rosa, Víctor M. Pérez-García, Helen ByrneAbstract
Acute Lymphoblastic Leukaemia (ALL) is the most frequent paediatric cancer. Modern therapies have improved survival rates, but approximately 15-20 % of patients relapse. At present, patients’ risk of relapse are assessed by projecting high-dimensional flow cytometry data onto a subset of biomarkers and manually estimating the shape of this reduced data. Here, we apply methods from topological data analysis (TDA), which quantify shape in data via features such as connected components and loops, to pre-treatment ALL datasets with known outcomes. We combine these fully unsupervised analyses with machine learning to identify features in the pre-treatment data that are prognostic for risk of relapse. We find significant topological differences between relapsing and non-relapsing patients and confirm the predictive power of CD10, CD20, CD38, and CD45. Further, we are able to use the TDA descriptors to predict patients who relapsed. We propose three prognostic pipelines that readily extend to other haematological malignancies. Teaser Topology reveals features in flow cytometry data which predict relapse of patients with acute lymphoblastic leukemia -
Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis (2019)
Lorin Crawford, Anthea Monod, Andrew X. Chen, Sayan Mukherjee, Raúl RabadánAbstract
Glioblastoma multiforme (GBM) is an aggressive form of human brain cancer that is under active study in the field of cancer biology. Its rapid progression and the relative time cost of obtaining molecular data make other readily available forms of data, such as images, an important resource for actionable measures in patients. Our goal is to use information given by medical images taken from GBM patients in statistical settings. To do this, we design a novel statistic—the smooth Euler characteristic transform (SECT)—that quantifies magnetic resonance images of tumors. Due to its well-defined inner product structure, the SECT can be used in a wider range of functional and nonparametric modeling approaches than other previously proposed topological summary statistics. When applied to a cohort of GBM patients, we find that the SECT is a better predictor of clinical outcomes than both existing tumor shape quantifications and common molecular assays. Specifically, we demonstrate that SECT features alone explain more of the variance in GBM patient survival than gene expression, volumetric features, and morphometric features. The main takeaways from our findings are thus 2-fold. First, they suggest that images contain valuable information that can play an important role in clinical prognosis and other medical decisions. Second, they show that the SECT is a viable tool for the broader study of medical imaging informatics. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement. -
Measuring Hidden Phenotype: Quantifying the Shape of Barley Seeds Using the Euler Characteristic Transform (2021)
Erik J. Amézquita, Michelle Y. Quigley, Tim Ophelders, Jacob B. Landis, Daniel Koenig, Elizabeth Munch, Daniel H. ChitwoodAbstract
Shape plays a fundamental role in biology. Traditional phenotypic analysis methods measure some features but fail to measure the information embedded in shape comprehensively. To extract, compare, and analyze this information embedded in a robust and concise way, we turn to Topological Data Analysis (TDA), specifically the Euler Characteristic Transform. TDA measures shape comprehensively using mathematical representations based on algebraic topology features. To study its use, we compute both traditional and topological shape descriptors to quantify the morphology of 3121 barley seeds scanned with X-ray Computed Tomography (CT) technology at 127 micron resolution. The Euler Characteristic Transform measures shape by analyzing topological features of an object at thresholds across a number of directional axes. A Kruskal-Wallis analysis of the information encoded by the topological signature reveals that the Euler Characteristic Transform picks up successfully the shape of the crease and bottom of the seeds. Moreover, while traditional shape descriptors can cluster the seeds based on their accession, topological shape descriptors can cluster them further based on their panicle. We then successfully train a support vector machine (SVM) to classify 28 different accessions of barley based exclusively on the shape of their grains. We observe that combining both traditional and topological descriptors classifies barley seeds better than using just traditional descriptors alone. This improvement suggests that TDA is thus a powerful complement to traditional morphometrics to comprehensively describe a multitude of “hidden” shape nuances which are otherwise not detected. -
Advancing Precision Medicine: Algebraic Topology and Differential Geometry in Radiology and Computational Pathology (2024)
Richard M. Levenson, Yashbir Singh, Bastian Rieck, Ashok Choudhary, Gunnar Carlsson, Deepa Sarkar, Quincy A. Hathaway, Colleen Farrelly, Jennifer Rozenblit, Prateek Prasanna, Bradley EricksonAbstract
Precision medicine aims to provide personalized care based on individual patient characteristics, rather than guideline-directed therapies for groups of diseases or patient demographics. Images—both radiology- and pathology-derived—are a major source of information on presence, type, and status of disease. Exploring the mathematical relationship of pixels in medical imaging (“radiomics”) and cellular-scale structures in digital pathology slides (“pathomics”) offers powerful tools for extracting both qualitative and, increasingly, quantitative data. These analytical approaches, however, may be significantly enhanced by applying additional methods arising from fields of mathematics such as differential geometry and algebraic topology that remain underexplored in this context. Geometry’s strength lies in its ability to provide precise local measurements, such as curvature, that can be crucial for identifying abnormalities at multiple spatial levels. These measurements can augment the quantitative features extracted in conventional radiomics, leading to more nuanced diagnostics. By contrast, topology serves as a robust shape descriptor, capturing essential features such as connected components and holes. The field of topological data analysis was initially founded to explore the shape of data, with functional network connectivity in the brain being a prominent example. Increasingly, its tools are now being used to explore organizational patterns of physical structures in medical images and digitized pathology slides. By leveraging tools from both differential geometry and algebraic topology, researchers and clinicians may be able to obtain a more comprehensive, multi-layered understanding of medical images and contribute to precision medicine’s armamentarium -
Topological Descriptors Help Predict Guest Adsorption in Nanoporous Materials (2020)
Aditi S. Krishnapriyan, Maciej Haranczyk, Dmitriy MorozovAbstract
Machine learning has emerged as an attractive alternative to experiments and simulations for predicting material properties. Usually, such an approach relies on specific domain knowledge for feature design: each learning target requires careful selection of features that an expert recognizes as important for the specific task. The major drawback of this approach is that computation of only a few structural features has been implemented so far, and it is difficult to tell a priori which features are important for a particular application. The latter problem has been empirically observed for predictors of guest uptake in nanoporous materials: local and global porosity features become dominant descriptors at low and high pressures, respectively. We investigate a feature representation of materials using tools from topological data analysis. Specifically, we use persistent homology to describe the geometry of nanoporous materials at various scales. We combine our topological descriptor with traditional structural features and investigate the relative importance of each to the prediction tasks. We demonstrate an application of this feature representation by predicting methane adsorption in zeolites, for pressures in the range of 1-200 bar. Our results not only show a considerable improvement compared to the baseline, but they also highlight that topological features capture information complementary to the structural features: this is especially important for the adsorption at low pressure, a task particularly difficult for the traditional features. Furthermore, by investigation of the importance of individual topological features in the adsorption model, we are able to pinpoint the location of the pores that correlate best to adsorption at different pressure, contributing to our atom-level understanding of structure-property relationships. -
Steinhaus Filtration and Stable Paths in the Mapper (2020)
Dustin L. Arendt, Matthew Broussard, Bala Krishnamoorthy, Nathaniel SaulAbstract
Two central concepts from topological data analysis are persistence and the Mapper construction. Persistence employs a sequence of objects built on data called a filtration. A Mapper produces insightful summaries of data, and has found widespread applications in diverse areas. We define a new filtration called the cover filtration built from a single cover based on a generalized Steinhaus distance, which is a generalization of Jaccard distance. We prove a stability result: the cover filtrations of two covers are \$\alpha/m\$ interleaved, where \$\alpha\$ is a bound on bottleneck distance between covers and \$m\$ is the size of smallest set in either cover. We also show our construction is equivalent to the Cech filtration under certain settings, and the Vietoris-Rips filtration completely determines the cover filtration in all cases. We then develop a theory for stable paths within this filtration. Unlike standard results on stability in topological persistence, our definition of path stability aligns exactly with the above result on stability of cover filtration. We demonstrate how our framework can be employed in a variety of applications where a metric is not obvious but a cover is readily available. First we present a new model for recommendation systems using cover filtration. For an explicit example, stable paths identified on a movies data set represent sequences of movies constituting gentle transitions from one genre to another. As a second application in explainable machine learning, we apply the Mapper for model induction, providing explanations in the form of paths between subpopulations. Stable paths in the Mapper from a supervised machine learning model trained on the FashionMNIST data set provide improved explanations of relationships between subpopulations of images. -
Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology (2021)
Nicole Bussola, Bruno Papa, Ombretta Melaiu, Aurora Castellano, Doriana Fruci, Giuseppe JurmanAbstract
We introduce here a novel machine learning (ML) framework to address the issue of the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on tissue digital slides stained with the CD3 T-cell marker. The training set consists of 3782 images extracted from an original collection of 54 whole slide images (WSIs), manually annotated for a total of 73,751 lymphocytes. Resampling strategies, data augmentation, and transfer learning approaches are adopted to warrant reproducibility and to reduce the risk of overfitting and selection bias. Topological data analysis (TDA) is then used to define activation maps from different layers of the neural network at different stages of the training process, described by persistence diagrams (PD) and Betti curves. TDA is further integrated with the uniform manifold approximation and projection (UMAP) dimensionality reduction and the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm for clustering, by the deep features, the relevant subgroups and structures, across different levels of the neural network. Finally, the recent TwoNN approach is leveraged to study the variation of the intrinsic dimensionality of the U-Net model. As the main task, the proposed pipeline is employed to evaluate the density of lymphocytes over the whole tissue area of the WSIs. The model achieves good results with mean absolute error 3.1 on test set, showing significant agreement between densities estimated by our EUNet model and by trained pathologists, thus indicating the potentialities of a promising new strategy in the quantification of the immune content in NB specimens. Moreover, the UMAP algorithm unveiled interesting patterns compatible with pathological characteristics, also highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process. All the experiments were run on the Microsoft Azure cloud platform. -
Continuous Indexing of Fibrosis (CIF): Improving the Assessment and Classification of MPN Patients (2022)
Hosuk Ryou, Korsuk Sirinukunwattana, Alan Aberdeen, Gillian Grindstaff, Bernadette Stolz, Helen Byrne, Heather A. Harrington, Nikolaos Sousos, Anna L. Godfrey, Claire N. Harrison, Bethan Psaila, Adam J. Mead, Gabrielle Rees, Gareth D. H. Turner, Jens Rittscher, Daniel RoystonAbstract
The detection and grading of fibrosis in myeloproliferative neoplasms (MPN) is an important component of disease classification, prognostication and disease monitoring. However, current fibrosis grading systems are only semi-quantitative and fail to capture sample heterogeneity. To improve the detection, quantitation and representation of reticulin fibrosis, we developed a machine learning (ML) approach using bone marrow trephine (BMT) samples (n = 107) from patients diagnosed with MPN or a reactive / nonneoplastic marrow. The resulting Continuous Indexing of Fibrosis (CIF) enhances the detection and monitoring of fibrosis within BMTs, and aids the discrimination of MPN subtypes. When combined with megakaryocyte feature analysis, CIF discriminates between the frequently challenging differential diagnosis of essential thrombocythemia (ET) and pre-fibrotic myelofibrosis (pre-PMF) with high predictive accuracy [area under the curve = 0.94]. CIF also shows significant promise in the identification of MPN patients at risk of disease progression; analysis of samples from 35 patients diagnosed with ET and enrolled in the Primary Thrombocythemia-1 (PT-1) trial identified features predictive of post-ET myelofibrosis (area under the curve = 0.77). In addition to these clinical applications, automated analysis of fibrosis has clear potential to further refine disease classification boundaries and inform future studies of the micro-environmental factors driving disease initiation and progression in MPN and other stem cell disorders. The image analysis methods used to generate CIF can be readily integrated with those of other key morphological features in MPNs, including megakaryocyte morphology, that lie beyond the scope of conventional histological assessment. Key PointsMachine learning enables an objective and quantitative description of reticulin fibrosis within the bone marrow of patients with myeloproliferative neoplasms (MPN),Automated analysis and Continuous Indexing of Fibrosis (CIF) captures heterogeneity within MPN samples and has utility in refined classification and disease monitoringQuantitative fibrosis assessment combined with topological data analysis may help to predict patients at increased risk of progression to post-ET myelofibrosis, and assist in the discrimination of ET and pre-fibrotic PMF (pre-PMF) -
A Primer on Topological Data Analysis to Support Image Analysis Tasks in Environmental Science (2023)
Lander Ver Hoef, Henry Adams, Emily J. King, Imme Ebert-UphoffAbstract
Abstract Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely, persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the sugar, fish, flower, and gravel dataset produced for the study of mesoscale organization of clouds by Rasp et al. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and we explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that readers of this paper will leave with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results they obtain from applying the included GitHub example code. Significance Statement Information such as the geometric structure and texture of image data can greatly support the inference of the physical state of an observed Earth system, for example, in remote sensing to determine whether wildfires are active or to identify local climate zones. Persistent homology is a branch of topological data analysis that allows one to extract such information in an interpretable way—unlike black-box methods like deep neural networks. The purpose of this paper is to explain in an intuitive manner what persistent homology is and how researchers in environmental science can use it to create interpretable models. We demonstrate the approach to identify certain cloud patterns from satellite imagery and find that the resulting model is indeed interpretable.