🍩 Database of Original & Non-Theoretical Uses of Topology
(found 52 matches in 0.019557s)
-
-
Persistence Weighted Gaussian Kernel for Topological Data Analysis (2016)
Genki Kusano, Yasuaki Hiraoka, Kenji FukumizuAbstract
Topological data analysis (TDA) is an emerging mathematical concept for characterizing shapes in complex data. In TDA, persistence diagrams are widely recognized as a useful descriptor of data, and... -
Kernel Method for Persistence Diagrams via Kernel Embedding and Weight Factor (2017)
Genki Kusano, Kenji Fukumizu, Yasuaki Hiraoka -
Topological Approaches to Skin Disease Image Analysis (2018)
Yu-Min Chung, Chuan-Shen Hu, Austin Lawson, Clifford Smyth -
Transfer Learning for Autonomous Chatter Detection in Machining (2022)
Melih C. Yesilli, Firas A. Khasawneh, Brian P. MannAbstract
Large-amplitude chatter vibrations are one of the most important phenomena in machining processes. It is often detrimental in cutting operations causing a poor surface finish and decreased tool life. Therefore, chatter detection using machine learning has been an active research area over the last decade. Three challenges can be identified in applying machine learning for chatter detection at large in industry: an insufficient understanding of the universality of chatter features across different processes, the need for automating feature extraction, and the existence of limited data for each specific workpiece-machine tool combination, e.g., when machining one-off products. These three challenges can be grouped under the umbrella of transfer learning, which is concerned with studying how knowledge gained from one setting can be leveraged to obtain information in new settings. This paper studies automating chatter detection by evaluating transfer learning of prominent as well as novel chatter detection methods. We investigate chatter classification accuracy using a variety of features extracted from turning and milling experiments with different cutting configurations. The studied methods include Fast Fourier Transform (FFT), Power Spectral Density (PSD), the Auto-correlation Function (ACF), and decomposition based tools such as Wavelet Packet Transform (WPT) and Ensemble Empirical Mode Decomposition (EEMD). We also examine more recent approaches based on Topological Data Analysis (TDA) and similarity measures of time series based on Discrete Time Warping (DTW). We evaluate transfer learning potential of each approach by training and testing both within and across the turning and milling data sets. Four supervised classification algorithms are explored: support vector machine (SVM), logistic regression, random forest classification, and gradient boosting. In addition to accuracy, we also comment on the automation potential of feature extraction for each approach which is integral to creating autonomous manufacturing centers. Our results show that carefully chosen time-frequency features can lead to high classification accuracies albeit at the cost of requiring manual pre-processing and the tagging of an expert user. On the other hand, we found that the TDA and DTW approaches can provide accuracies and F1-scores on par with the time-frequency methods without the need for manual preprocessing via completely automatic pipelines. Further, we discovered that the DTW approach outperforms all other methods when trained using the milling data and tested on the turning data. Therefore, TDA and DTW approaches may be preferred over the time-frequency-based approaches for fully automated chatter detection schemes. DTW and TDA also can be more advantageous when pooling data from either limited workpiece-machine tool combinations, or from small data sets of one-off processes. -
Pore Configuration Landscape of Granular Crystallization (2017)
Mohammad Saadatfar, Hiroshi Takeuchi, Vanessa Robins, Nicolas Francois, Yisuaki HiraokaAbstract
Emergence and growth of crystalline domains in granular media remains under-explored. Here, the authors analyse tomographic snapshots from partially recrystallized packings of spheres using persistent homology and find agreement with proposed transitions based on continuous deformation of octahedral and tetrahedral voids. -
Topological Feature Tracking for Submesoscale Eddies (2022)
Sam Voisin, Jay Hineman, James B. Polly, Gary Koplik, Ken Ball, Paul Bendich, Joseph D‘Addezio, Gregg A. Jacobs, Tamay Özgökmen -
Shortened Persistent Homology for a Biomedical Retrieval System With Relevance Feedback (2018)
Alessia Angeli, Massimo Ferri, Eleonora Monti, Ivan Tomba -
Lipschitz Functions Have Lp-Stable Persistence (2010)
David Cohen-Steiner, Herbert Edelsbrunner, John Harer, Yuriy MileykoAbstract
We prove two stability results for Lipschitz functions on triangulable, compact metric spaces and consider applications of both to problems in systems biology. Given two functions, the first result is formulated in terms of the Wasserstein distance between their persistence diagrams and the second in terms of their total persistence. -
Topological Detection of Phenomenological Bifurcations With Unreliable Kernel Density Estimates (2024)
Sunia Tanweer, Firas A. KhasawnehAbstract
Phenomenological (P-type) bifurcations are qualitative changes in stochastic dynamical systems whereby the stationary probability density function (PDF) changes its topology. The current state of the art for detecting these bifurcations requires reliable kernel density estimates computed from an ensemble of system realizations. However, in several real world signals such as Big Data, only a single system realization is available—making it impossible to estimate a reliable kernel density. This study presents an approach for detecting P-type bifurcations using unreliable density estimates. The approach creates an ensemble of objects from Topological Data Analysis (TDA) called persistence diagrams from the system’s sole realization and statistically analyzes the resulting set. We compare several methods for replicating the original persistence diagram including Gibbs point process modelling, Pairwise Interaction Point Modelling, and subsampling. We show that for the purpose of predicting a bifurcation, the simple method of subsampling exceeds the other two methods of point process modelling in performance. -
Graph Filtration Learning (2020)
Christoph Hofer, Florian Graf, Bastian Rieck, Marc Niethammer, Roland KwittAbstract
We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable, filter function. We establish the theoretical foundation for differentiating through the persistent homology computation. Empirically, we show that this type of readout operation compares favorably to previous techniques, especially when the graph connectivity structure is informative for the learning problem. -
Capturing Shape Information With Multi-Scale Topological Loss Terms For 3D Reconstruction (2022)
Dominik J. E. Waibel, Scott Atwell, Matthias Meier, Carsten Marr, Bastian RieckAbstract
Reconstructing 3D objects from 2D images is both challenging for our brains and machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets. -
Determining Structural Properties of Artificial Neural Networks Using Algebraic Topology (2021)
David Pérez Fernández, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marta VillegasAbstract
Artificial Neural Networks (ANNs) are widely used for approximating complex functions. The process that is usually followed to define the most appropriate architecture for an ANN given a specific function is mostly empirical. Once this architecture has been defined, weights are usually optimized according to the error function. On the other hand, we observe that ANNs can be represented as graphs and their topological 'fingerprints' can be obtained using Persistent Homology (PH). In this paper, we describe a proposal focused on designing more principled architecture search procedures. To do this, different architectures for solving problems related to a heterogeneous set of datasets have been analyzed. The results of the evaluation corroborate that PH effectively characterizes the ANN invariants: when ANN density (layers and neurons) or sample feeding order is the only difference, PH topological invariants appear; in the opposite direction in different sub-problems (i.e. different labels), PH varies. This approach based on topological analysis helps towards the goal of designing more principled architecture search procedures and having a better understanding of ANNs. -
Topological Graph Neural Networks (2021)
Max Horn, Edward De Brouwer, Michael Moor, Yves Moreau, Bastian Rieck, Karsten BorgwardtAbstract
Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to eminent substructures, such as cycles. We present TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive in terms of the Weisfeiler--Lehman test of isomorphism. Augmenting GNNs with our layer leads to beneficial predictive performance, both on synthetic data sets, which can be trivially classified by humans but not by ordinary GNNs, and on real-world data. -
Characterizing Scales of Genetic Recombination and Antibiotic Resistance in Pathogenic Bacteria Using Topological Data Analysis (2014)
Kevin J. Emmett, Raul RabadanAbstract
Pathogenic bacteria present a large disease burden on human health. Control of these pathogens is hampered by rampant lateral gene transfer, whereby pathogenic strains may acquire genes conferring resistance to common antibiotics. Here we introduce tools from topological data analysis to characterize the frequency and scale of lateral gene transfer in bacteria, focusing on a set of pathogens of significant public health relevance. As a case study, we examine the spread of antibiotic resistance in Staphylococcus aureus. Finally, we consider the possible role of the human microbiome as a reservoir for antibiotic resistance genes. -
Topology of Force Networks in Granular Media Under Impact (2017)
M. X. Lim, R. P. BehringerAbstract
We investigate the evolution of the force network in experimental systems of two-dimensional granular materials under impact. We use the first Betti number, , and persistence diagrams, as measures of the topological properties of the force network. We show that the structure of the network has a complex, hysteretic dependence on both the intruder acceleration and the total force response of the granular material. can also distinguish between the nonlinear formation and relaxation of the force network. In addition, using the persistence diagram of the force network, we show that the size of the loops in the force network has a Poisson-like distribution, the characteristic size of which changes over the course of the impact. -
A Simplified Algorithm for Identifying Abnormal Changes in Dynamic Networks (2022)
Bouchaib Azamir, Driss Bennis, Bertrand MichelAbstract
Topological data analysis has recently been applied to the study of dynamic networks. In this context, an algorithm was introduced and helps, among other things, to detect early warning signals of abnormal changes in the dynamic network under study. However, the complexity of this algorithm increases significantly once the database studied grows. In this paper, we propose a simplification of the algorithm without affecting its performance. We give various applications and simulations of the new algorithm on some weighted networks. The obtained results show clearly the efficiency of the introduced approach. Moreover, in some cases, the proposed algorithm makes it possible to highlight local information and sometimes early warning signals of local abnormal changes. -
A Stable Multi-Scale Kernel for Topological Machine Learning (2015)
Jan Reininghaus, Stefan Huber, Ulrich Bauer, Roland KwittAbstract
Topological data analysis offers a rich source of valuable information to study vision problems. Yet, so far we lack a theoretically sound connection to popular kernel-based learning techniques, such as kernel SVMs or kernel PCA. In this work, we establish such a connection by designing a multi-scale kernel for persistence diagrams, a stable summary representation of topological features in data. We show that this kernel is positive definite and prove its stability with respect to the 1-Wasserstein distance. Experiments on two benchmark datasets for 3D shape classification/retrieval and texture recognition show considerable performance gains of the proposed method compared to an alternative approach that is based on the recently introduced persistence landscapes. -
Topological Machine Learning for Multivariate Time Series (2020)
Chengyuan Wu, Carol Anne HargreavesAbstract
We develop a framework for analyzing multivariate time series using topological data analysis (TDA) methods. The proposed methodology involves converting the multivariate time series to point cloud data, calculating Wasserstein distances between the persistence diagrams and using the \$k\$-nearest neighbors algorithm (\$k\$-NN) for supervised machine learning. Two methods (symmetry-breaking and anchor points) are also introduced to enable TDA to better analyze data with heterogeneous features that are sensitive to translation, rotation, or choice of coordinates. We apply our methods to room occupancy detection based on 5 time-dependent variables (temperature, humidity, light, CO2 and humidity ratio). Experimental results show that topological methods are effective in predicting room occupancy during a time window. We also apply our methods to an Activity Recognition dataset and obtained good results. -
Robust Crossings Detection in Noisy Signals Using Topological Signal Processing (2024)
Sunia Tanweer, Firas A. Khasawneh, Elizabeth MunchAbstract
This article explores a novel method of bracketing zero-crossings for both 1-D functions and discretely sampled time series by the application of 0-D persistent homology from algebraic topology. We introduce an algorithm and demonstrate its capability of detecting crossing in noisy signals across various sampling frequencies. Compared to other software-based methods for crossing-detection in signals, our approach is typically faster, shows a higher accuracy, and has the unique ability to identify all roots within the provided interval instead of detecting only one out of all. We also discuss different options for mathematically estimating the persistence threshold— a parameter which impacts and controls the correct bracketing of roots. Finally, we explore the potential of extending our algorithm to higher dimensions. -
Geometry and Topology of the Space of Sonar Target Echos (2018)
Michael Robinson, Sean Fennell, Brian DiZio, Jennifer DumiakAbstract
Successful synthetic aperture sonar target classification depends on the “shape” of the scatterers within a target signature. This article presents a workflow that computes a target-to-target distance from persistence diagrams, since the “shape” of a signature informs its persistence diagram in a structure-preserving way. The target-to-target distances derived from persistence diagrams compare favorably against those derived from spectral features and have the advantage of being substantially more compact. While spectral features produce clusters associated to each target type that are reasonably dense and well formed, the clusters are not well-separated from one another. In rather dramatic contrast, a distance derived from persistence diagrams results in highly separated clusters at the expense of some misclassification of outliers. -
Topological Data Analysis: Concepts, Computation, and Applications in Chemical Engineering (2021)
Alexander D. Smith, Paweł Dłotko, Victor M. ZavalaAbstract
A primary hypothesis that drives scientific and engineering studies is that data has structure. The dominant paradigms for describing such structure are statistics (e.g., moments, correlation functions) and signal processing (e.g., convolutional neural nets, Fourier series). Topological Data Analysis (TDA) is a field of mathematics that analyzes data from a fundamentally different perspective. TDA represents datasets as geometric objects and provides dimensionality reduction techniques that project such objects onto low-dimensional descriptors. The key properties of these descriptors (also known as topological features) are that they provide multiscale information and that they are stable under perturbations (e.g., noise, translation, and rotation). In this work, we review the key mathematical concepts and methods of TDA and present different applications in chemical engineering. -
Computing Robustness and Persistence for Images (2010)
P. Bendich, H. Edelsbrunner, M. KerberAbstract
We are interested in 3-dimensional images given as arrays of voxels with intensity values. Extending these values to a continuous function, we study the robustness of homology classes in its level and interlevel sets, that is, the amount of perturbation needed to destroy these classes. The structure of the homology classes and their robustness, over all level and interlevel sets, can be visualized by a triangular diagram of dots obtained by computing the extended persistence of the function. We give a fast hierarchical algorithm using the dual complexes of oct-tree approximations of the function. In addition, we show that for balanced oct-trees, the dual complexes are geometrically realized in R3 and can thus be used to construct level and interlevel sets. We apply these tools to study 3-dimensional images of plant root systems. -
Hyperparameter Optimization of Topological Features for Machine Learning Applications (2019)
Francis Motta, Christopher Tralie, Rossella Bedini, Fabiano Bini, Gilberto Bini, Hamed Eramian, Marcio Gameiro, Steve Haase, Hugh Haddox, John Harer, Nick Leiby, Franco Marinozzi, Scott Novotney, Gabe Rocklin, Jed Singer, Devin Strickland, Matt VaughnAbstract
This paper describes a general pipeline for generating optimal vector representations of topological features of data for use with machine learning algorithms. This pipeline can be viewed as a costly black-box function defined over a complex configuration space, each point of which specifies both how features are generated and how predictive models are trained on those features. We propose using state-of-the-art Bayesian optimization algorithms to inform the choice of topological vectorization hyperparameters while simultaneously choosing learning model parameters. We demonstrate the need for and effectiveness of this pipeline using two difficult biological learning problems, and illustrate the nontrivial interactions between topological feature generation and learning model hyperparameters. -
Persistent Homology and Many-Body Atomic Structure for Medium-Range Order in the Glass (2015)
Takenobu Nakamura, Yasuaki Hiraoka, Akihiko Hirata, Emerson G. Escolar, Yasumasa NishiuraAbstract
The characterization of the medium-range (MRO) order in amorphous materials and its relation to the short-range order is discussed. A new topological approach to extract a hierarchical structure of amorphous materials is presented, which is robust against small perturbations and allows us to distinguish it from periodic or random configurations. This method is called the persistence diagram (PD) and introduces scales to many-body atomic structures to facilitate size and shape characterization. We first illustrate the representation of perfect crystalline and random structures in PDs. Then, the MRO in amorphous silica is characterized using the appropriate PD. The PD approach compresses the size of the data set significantly, to much smaller geometrical summaries, and has considerable potential for application to a wide range of materials, including complex molecular liquids, granular materials, and metallic glasses. -
Towards a Philological Metric Through a Topological Data Analysis Approach (2020)
Eduardo Paluzo-Hidalgo, Rocio Gonzalez-Diaz, Miguel A. Gutiérrez-NaranjoAbstract
The canon of the baroque Spanish literature has been thoroughly studied with philological techniques. The major representatives of the poetry of this epoch are Francisco de Quevedo and Luis de Góngora y Argote. They are commonly classified by the literary experts in two different streams: Quevedo belongs to the Conceptismo and G\ńgora to the Culteranismo. Besides, traditionally, even if Quevedo is considered the most representative of the Conceptismo, Lope de Vega is also considered to be, at least, closely related to this literary trend. In this paper, we use Topological Data Analysis techniques to provide a first approach to a metric distance between the literary style of these poets. As a consequence, we reach results that are under the literary experts' criteria, locating the literary style of Lope de Vega, closer to the one of Quevedo than to the one of G\'ǵora. -
Statistical Topological Data Analysis - A Kernel Perspective (2015)
Roland Kwitt, Stefan Huber, Marc Niethammer, Weili Lin, Ulrich BauerAbstract
We consider the problem of statistical computations with persistence diagrams, a summary representation of topological features in data. These diagrams encode persistent homology, a widely used invariant in topological data analysis. While several avenues towards a statistical treatment of the diagrams have been explored recently, we follow an alternative route that is motivated by the success of methods based on the embedding of probability measures into reproducing kernel Hilbert spaces. In fact, a positive definite kernel on persistence diagrams has recently been proposed, connecting persistent homology to popular kernel-based learning techniques such as support vector machines. However, important properties of that kernel enabling a principled use in the context of probability measure embeddings remain to be explored. Our contribution is to close this gap by proving universality of a variant of the original kernel, and to demonstrate its effective use in two-sample hypothesis testing on synthetic as well as real-world data. -
Four-Dimensional Observation of Ductile Fracture in Sintered Iron Using Synchrotron X-Ray Laminography (2019)
Y. Ozaki, Y. Mugita, M. Aramaki, O. Furukimi, S. Oue, F. Jiang, T. Tsuji, A. Takeuchi, M. Uesugi, K. AshizukaAbstract
Synchrotron X-ray laminography was used to examine the time-dependent evolution of the three-dimensional (3D) morphology of micropores in sintered iron during the tensile test. 3D snapshots showed that the networked open pores grow wider than 20 µm along the tensile direction, resulting in the internal necking of the specimen. Subsequently, these pores initiated the cracks perpendicular to the tensile direction by coalescing with the surrounding pre-existing microvoids or with the secondary-generated voids immediately before fracture. Topological analysis of the barycentric positions of these microvoids showed that they form the two-dimensional networks within the ∼20 µm of radius area. These observations strongly indicate that the microvoid coalescence could occur on shear planes formed close to the enlarged open pores or between closed pores by strain accumulation and play an important role in the crack initiation. -
Pore Geometry Characterization by Persistent Homology Theory (2018)
Fei Jiang, Takeshi Tsuji, Tomoyuki ShiraiAbstract
Rock pore geometry has heterogeneous characteristics and is scale dependent. This feature in a geological formation differs significantly from artificial materials and makes it difficult to predict hydrologic and elastic properties. To characterize pore heterogeneity, we propose an evaluation method that exploits the recently developed persistent homology theory. In the proposed method, complex pore geometry is first represented as sphere cloud data using a pore-network extraction method. Then, a persistence diagram (PD) is calculated from the point cloud, which represents the spatial distribution of pore bodies. A new parameter (distance index H) derived from the PD is proposed to characterize the degree of rock heterogeneity. Low H value indicates high heterogeneity. A new empirical equation using this index H is proposed to predict the effective elastic modulus of porous media. The results indicate that the proposed PD analysis is very efficient for extracting topological feature of pore geometry. -
Topological Data Analysis for Electric Motor Eccentricity Fault Detection (2022)
Bingnan Wang, Chungwei Lin, Hiroshi Inoue, Makoto KanemaruAbstract
In this paper, we develop topological data analysis (TDA) method for motor current signature analysis (MCSA), and apply it to induction motor eccentricity fault detection. We introduce TDA and present the procedure of extracting topological features from time-domain data that will be represented using persistence diagrams and vectorized Betti sequences. The procedure is applied to induction machine phase current signal analysis, and shown to be highly effective in differentiating signals from different eccentricity levels. With TDA, we are able to use a simple regression model that can predict the fault levels with reasonable accuracy, even for the data of eccentricity levels that are not seen in the training data. The proposed method is model-free, and only requires a small segment of time-domain data to make prediction. These advantages make it attractive for a wide range of fault detection applications. -
Topological Descriptors Help Predict Guest Adsorption in Nanoporous Materials (2020)
Aditi S. Krishnapriyan, Maciej Haranczyk, Dmitriy MorozovAbstract
Machine learning has emerged as an attractive alternative to experiments and simulations for predicting material properties. Usually, such an approach relies on specific domain knowledge for feature design: each learning target requires careful selection of features that an expert recognizes as important for the specific task. The major drawback of this approach is that computation of only a few structural features has been implemented so far, and it is difficult to tell a priori which features are important for a particular application. The latter problem has been empirically observed for predictors of guest uptake in nanoporous materials: local and global porosity features become dominant descriptors at low and high pressures, respectively. We investigate a feature representation of materials using tools from topological data analysis. Specifically, we use persistent homology to describe the geometry of nanoporous materials at various scales. We combine our topological descriptor with traditional structural features and investigate the relative importance of each to the prediction tasks. We demonstrate an application of this feature representation by predicting methane adsorption in zeolites, for pressures in the range of 1-200 bar. Our results not only show a considerable improvement compared to the baseline, but they also highlight that topological features capture information complementary to the structural features: this is especially important for the adsorption at low pressure, a task particularly difficult for the traditional features. Furthermore, by investigation of the importance of individual topological features in the adsorption model, we are able to pinpoint the location of the pores that correlate best to adsorption at different pressure, contributing to our atom-level understanding of structure-property relationships. -
Acute Lymphoblastic Leukemia Classification Using Persistent Homology (2024)
Waqar Hussain Shah, Abdullah Baloch, Rider Jaimes-Reátegui, Sohail Iqbal, Syeda Rafia Fatima, Alexander N. PisarchikAbstract
Acute Lymphoblastic Leukemia (ALL) is a prevalent form of childhood blood cancer characterized by the proliferation of immature white blood cells that rapidly replace normal cells in the bone marrow. The exponential growth of these leukemic cells can be fatal if not treated promptly. Classifying lymphoblasts and healthy cells poses a significant challenge, even for domain experts, due to their morphological similarities. Automated computer analysis of ALL can provide substantial support in this domain and potentially save numerous lives. In this paper, we propose a novel classification approach that involves analyzing shapes and extracting topological features of ALL cells. We employ persistent homology to capture these topological features. Our technique accurately and efficiently detects and classifies leukemia blast cells, achieving a recall of 98.2% and an F1-score of 94.6%. This approach has the potential to significantly enhance leukemia diagnosis and therapy. -
Constructing Shape Spaces From a Topological Perspective (2017)
Christoph Hofer, Roland Kwitt, Marc Niethammer, Yvonne Höller, Eugen Trinka, Andreas UhlAbstract
We consider the task of constructing (metric) shape space(s) from a topological perspective. In particular, we present a generic construction scheme and demonstrate how to apply this scheme when shape is interpreted as the differences that remain after factoring out translation, scaling and rotation. This is achieved by leveraging a recently proposed injective functional transform of 2D/3D (binary) objects, based on persistent homology. The resulting shape space is then equipped with a similarity measure that is (1) by design robust to noise and (2) fulfills all metric axioms. From a practical point of view, analyses of object shape can then be carried out directly on segmented objects obtained from some imaging modality without any preprocessing, such as alignment, smoothing, or landmark selection. We demonstrate the utility of the approach on the problem of distinguishing segmented hippocampi from normal controls vs. patients with Alzheimer’s disease in a challenging setup where volume changes are no longer discriminative. -
TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection From Chest X-Ray Images (2021)
Mustafa Hajij, Ghada Zamzmi, Fawwaz BataynehAbstract
Topological Data Analysis (TDA) has emerged recently as a robust tool to extract and compare the structure of datasets. TDA identifies features in data (e.g., connected components and holes) and assigns a quantitative measure to these features. Several studies reported that topological features extracted by TDA tools provide unique information about the data, discover new insights, and determine which feature is more related to the outcome. On the other hand, the overwhelming success of deep neural networks in learning patterns and relationships has been proven on various data applications including images. To capture the characteristics of both worlds, we propose TDA-Net, a novel ensemble network that fuses topological and deep features for the purpose of enhancing model generalizability and accuracy. We apply the proposed TDA-Net to a critical application, which is the automated detection of COVID-19 from CXR images. Experimental results showed that the proposed network achieved excellent performance and suggested the applicability of our method in practice. -
Analysis of Kolmogorov Flow and Rayleigh–Bénard Convection Using Persistent Homology (2016)
Miroslav Kramár, Rachel Levanger, Jeffrey Tithof, Balachandra Suri, Mu Xu, Mark Paul, Michael F. Schatz, Konstantin MischaikowAbstract
We use persistent homology to build a quantitative understanding of large complex systems that are driven far-from-equilibrium. In particular, we analyze image time series of flow field patterns from numerical simulations of two important problems in fluid dynamics: Kolmogorov flow and Rayleigh–Bénard convection. For each image we compute a persistence diagram to yield a reduced description of the flow field; by applying different metrics to the space of persistence diagrams, we relate characteristic features in persistence diagrams to the geometry of the corresponding flow patterns. We also examine the dynamics of the flow patterns by a second application of persistent homology to the time series of persistence diagrams. We demonstrate that persistent homology provides an effective method both for quotienting out symmetries in families of solutions and for identifying multiscale recurrent dynamics. Our approach is quite general and it is anticipated to be applicable to a broad range of open problems exhibiting complex spatio-temporal behavior. -
Uncovering the Topology of Time-Varying fMRI Data Using Cubical Persistence (2020)
Bastian Rieck, Tristan Yates, Christian Bock, Karsten Borgwardt, Guy Wolf, Nicholas Turk-Browne, Smita KrishnaswamyAbstract
Functional magnetic resonance imaging (fMRI) is a crucial technology for gaining insights into cognitive processes in humans. Data amassed from fMRI measurements result in volumetric data sets that vary over time. However, analysing such data presents a challenge due to the large degree of noise and person-to-person variation in how information is represented in the brain. To address this challenge, we present a novel topological approach that encodes each time point in an fMRI data set as a persistence diagram of topological features, i.e. high-dimensional voids present in the data. This representation naturally does not rely on voxel-by-voxel correspondence and is robust to noise. We show that these time-varying persistence diagrams can be clustered to find meaningful groupings between participants, and that they are also useful in studying within-subject brain state trajectories of subjects performing a particular task. Here, we apply both clustering and trajectory analysis techniques to a group of participants watching the movie 'Partly Cloudy'. We observe significant differences in both brain state trajectories and overall topological activity between adults and children watching the same movie. -
A Topological Measurement of Protein Compressibility (2015)
Marcio Gameiro, Yasuaki Hiraoka, Shunsuke Izumi, Miroslav Kramar, Konstantin Mischaikow, Vidit NandaAbstract
In this paper we partially clarify the relation between the compressibility of a protein and its molecular geometric structure. To identify and understand the relevant topological features within a given protein, we model its molecule as an alpha filtration and hence obtain multi-scale insight into the structure of its tunnels and cavities. The persistence diagrams of this alpha filtration capture the sizes and robustness of such tunnels and cavities in a compact and meaningful manner. From these persistence diagrams, we extract a measure of compressibility derived from those topological features whose relevance is suggested by physical and chemical properties. Due to recent advances in combinatorial topology, this measure is efficiently and directly computable from information found in the Protein Data Bank (PDB). Our main result establishes a clear linear correlation between the topological measure and the experimentally-determined compressibility of most proteins for which both PDB information and experimental compressibility data are available. Finally, we establish that both the topological measurement and the linear correlation are stable with respect to small perturbations in the input data, such as those arising from experimental errors in compressibility and X-ray crystallography experiments. -
Topological Data Analysis and Diagnostics of Compressible Magnetohydrodynamic Turbulence (2018)
Irina Makarenko, Paul Bushby, Andrew Fletcher, Robin Henderson, Nikolay Makarenko, Anvar ShukurovAbstract
The predictions of mean-field electrodynamics can now be probed using direct numerical simulations of random flows and magnetic fields. When modelling astrophysical magnetohydrodynamics, it is important to verify that such simulations are in agreement with observations. One of the main challenges in this area is to identify robust quantitative measures to compare structures found in simulations with those inferred from astrophysical observations. A similar challenge is to compare quantitatively results from different simulations. Topological data analysis offers a range of techniques, including the Betti numbers and persistence diagrams, that can be used to facilitate such a comparison. After describing these tools, we first apply them to synthetic random fields and demonstrate that, when the data are standardized in a straightforward manner, some topological measures are insensitive to either large-scale trends or the resolution of the data. Focusing upon one particular astrophysical example, we apply topological data analysis to H i observations of the turbulent interstellar medium (ISM) in the Milky Way and to recent magnetohydrodynamic simulations of the random, strongly compressible ISM. We stress that these topological techniques are generic and could be applied to any complex, multi-dimensional random field. -
Atom-Specific Persistent Homology and Its Application to Protein Flexibility Analysis (2020)
David Bramer, Guo-Wei WeiAbstract
Recently, persistent homology has had tremendous success in biomolecular data analysis. It works by examining the topological relationship or connectivity of a group of atoms in a molecule at a variety of scales, then rendering a family of topological representations of the molecule. However, persistent homology is rarely employed for the analysis of atomic properties, such as biomolecular flexibility analysis or B-factor prediction. This work introduces atom-specific persistent homology to provide a local atomic level representation of a molecule via a global topological tool. This is achieved through the construction of a pair of conjugated sets of atoms and corresponding conjugated simplicial complexes, as well as conjugated topological spaces. The difference between the topological invariants of the pair of conjugated sets is measured by Bottleneck and Wasserstein metrics and leads to an atom-specific topological representation of individual atomic properties in a molecule. Atom-specific topological features are integrated with various machine learning algorithms, including gradient boosting trees and convolutional neural network for protein thermal fluctuation analysis and B-factor prediction. Extensive numerical results indicate the proposed method provides a powerful topological tool for analyzing and predicting localized information in complex macromolecules. -
Data-Driven and Automatic Surface Texture Analysis Using Persistent Homology (2021)
Melih C. Yesilli, Firas A. KhasawnehAbstract
Surface roughness plays an important role in analyzing engineering surfaces. It quantifies the surface topography and can be used to determine whether the resulting surface finish is acceptable or not. Nevertheless, while several existing tools and standards are available for computing surface roughness, these methods rely heavily on user input thus slowing down the analysis and increasing manufacturing costs. Therefore, fast and automatic determination of the roughness level is essential to avoid costs resulting from surfaces with unacceptable finish, and user-intensive analysis. In this study, we propose a Topological Data Analysis (TDA) based approach to classify the roughness level of synthetic surfaces using both their areal images and profiles. We utilize persistent homology from TDA to generate persistence diagrams that encapsulate information on the shape of the surface. We then obtain feature matrices for each surface or profile using Carlsson coordinates, persistence images, and template functions. We compare our results to two widely used methods in the literature: Fast Fourier Transform (FFT) and Gaussian filtering. The results show that our approach yields mean accuracies as high as 97%. We also show that, in contrast to existing surface analysis tools, our TDA-based approach is fully automatable and provides adaptive feature extraction. -
Persistence-Based Pooling for Shape Pose Recognition (2016)
Thomas Bonis, Maks Ovsjanikov, Steve Oudot, Frédéric ChazalAbstract
In this paper, we propose a novel pooling approach for shape classification and recognition using the bag-of-words pipeline, based on topological persistence, a recent tool from Topological Data Analysis. Our technique extends the standard max-pooling, which summarizes the distribution of a visual feature with a single number, thereby losing any notion of spatiality. Instead, we propose to use topological persistence, and the derived persistence diagrams, to provide significantly more informative and spatially sensitive characterizations of the feature functions, which can lead to better recognition performance. Unfortunately, despite their conceptual appeal, persistence diagrams are difficult to handle, since they are not naturally represented as vectors in Euclidean space and even the standard metric, the bottleneck distance is not easy to compute. Furthermore, classical distances between diagrams, such as the bottleneck and Wasserstein distances, do not allow to build positive definite kernels that can be used for learning. To handle this issue, we provide a novel way to transform persistence diagrams into vectors, in which comparisons are trivial. Finally, we demonstrate the performance of our construction on the Non-Rigid 3D Human Models SHREC 2014 dataset, where we show that topological pooling can provide significant improvements over the standard pooling methods for the shape pose recognition within the bag-of-words pipeline. -
Efficient Planning of Multi-Robot Collective Transport Using Graph Reinforcement Learning With Higher Order Topological Abstraction (2023)
Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma ChowdhuryAbstract
Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport or MRTA-CT - here tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload constraints. For large instances of these problems involving 100s-1000's of tasks and 10s-100s of robots, traditional non-learning solvers are often time-inefficient, and emerging learning-based policies do not scale well to larger-sized problems without costly retraining. To address this gap, we use a recently proposed encoder-decoder graph neural network involving Capsule networks and multi-head attention mechanism, and innovatively add topological descriptors (TD) as new features to improve transferability to unseen problems of similar and larger size. Persistent homology is used to derive the TD, and proximal policy optimization is used to train our TD-augmented graph neural network. The resulting policy model compares favorably to state-of-the-art non-learning baselines while being much faster. The benefit of using TD is readily evident when scaling to test problems of size larger than those used in training. -
Exploring Surface Texture Quantification in Piezo Vibration Striking Treatment (PVST) Using Topological Measures (2022)
Melih C. Yesilli, Max M. Chumley, Jisheng Chen, Firas A. Khasawneh, Yang GuoAbstract
Abstract. Surface texture influences wear and tribological properties of manufactured parts, and it plays a critical role in end-user products. Therefore, quantifying the order or structure of a manufactured surface provides important information on the quality and life expectancy of the product. Although texture can be intentionally introduced to enhance aesthetics or to satisfy a design function, sometimes it is an inevitable byproduct of surface treatment processes such as Piezo Vibration Striking Treatment (PVST). Measures of order for surfaces have been characterized using statistical, spectral, and geometric approaches. For nearly hexagonal lattices, topological tools have also been used to measure the surface order. This paper explores utilizing tools from Topological Data Analysis for measuring surface texture. We compute measures of order based on optical digital microscope images of surfaces treated using PVST. These measures are applied to the grid obtained from estimating the centers of tool impacts, and they quantify the grid’s deviations from the nominal one. Our results show that TDA provides a convenient framework for characterization of pattern type that bypasses some limitations of existing tools such as difficult manual processing of the data and the need for an expert user to analyze and interpret the surface images. -
PI-Net: A Deep Learning Approach to Extract Topological Persistence Images (2020)
Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan TuragaAbstract
Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as view-point, illumination, and more. However, key bottlenecks to their large scale adoption are computational expenditure and difficulty incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We design two separate convolutional neural network architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PINet respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed PI-Net architectures on two applications: human activity recognition using tri-axial accelerometer sensor data and image classification. We demonstrate the ease of fusion of PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net. -
A Topological Framework for Identifying Phenomenological Bifurcations in Stochastic Dynamical Systems (2024)
Sunia Tanweer, Firas A. Khasawneh, Elizabeth Munch, Joshua R. TempelmanAbstract
Changes in the parameters of dynamical systems can cause the state of the system to shift between different qualitative regimes. These shifts, known as bifurcations, are critical to study as they can indicate when the system is about to undergo harmful changes in its behavior. In stochastic dynamical systems, there is particular interest in P-type (phenomenological) bifurcations, which can include transitions from a monostable state to multi-stable states, the appearance of stochastic limit cycles and other features in the probability density function (PDF) of the system’s state. Current practices are limited to systems with small state spaces, cannot detect all possible behaviors of the PDFs and mandate human intervention for visually identifying the change in the PDF. In contrast, this study presents a new approach based on Topological Data Analysis that uses superlevel persistence to mathematically quantify P-type bifurcations in stochastic systems through a “homological bifurcation plot”—which shows the changing ranks of 0th and 1st homology groups, through Betti vectors. Using these plots, we demonstrate the successful detection of P-bifurcations on the stochastic Duffing, Raleigh-Vander Pol and Quintic Oscillators given their analytical PDFs, and elaborate on how to generate an estimated homological bifurcation plot given a kernel density estimate (KDE) of these systems by employing a tool for finding topological consistency between PDFs and KDEs. -
Pattern Characterization Using Topological Data Analysis: Application to Piezo Vibration Striking Treatment (2023)
Max M. Chumley, Melih C. Yesilli, Jisheng Chen, Firas A. Khasawneh, Yang GuoAbstract
Quantifying patterns in visual or tactile textures provides important information about the process or phenomena that generated these patterns. In manufacturing, these patterns can be intentionally introduced as a design feature, or they can be a byproduct of a specific process. Since surface texture has significant impact on the mechanical properties and the longevity of the workpiece, it is important to develop tools for quantifying surface patterns and, when applicable, comparing them to their nominal counterparts. While existing tools may be able to indicate the existence of a pattern, they typically do not provide more information about the pattern structure, or how much it deviates from a nominal pattern. Further, prior works do not provide automatic or algorithmic approaches for quantifying other pattern characteristics such as depths’ consistency, and variations in the pattern motifs at different level sets. This paper leverages persistent homology from Topological Data Analysis (TDA) to derive noise-robust scores for quantifying motifs’ depth and roundness in a pattern. Specifically, sublevel persistence is used to derive scores that quantify the consistency of indentation depths at any level set in Piezo Vibration Striking Treatment (PVST) surfaces. Moreover, we combine sublevel persistence with the distance transform to quantify the consistency of the indentation radii, and to compare them with the nominal ones. Although the tool in our PVST experiments had a semi-spherical profile, we present a generalization of our approach to tools/motifs of arbitrary shapes thus making our method applicable to other pattern-generating manufacturing processes. -
Understanding Diffraction Patterns of Glassy, Liquid and Amorphous Materials via Persistent Homology Analyses (2019)
Yohei Onodera, Shinji Kohara, Shuta Tahara, Atsunobu Masuno, Hiroyuki Inoue, Motoki Shiga, Akihiko Hirata, Koichi Tsuchiya, Yasuaki Hiraoka, Ippei Obayashi, Koji Ohara, Akitoshi Mizuno, Osami SakataAbstract
The structure of glassy, liquid, and amorphous materials is still not well understood, due to the insufficient structural information from diffraction data. In this article, attempts are made to understand the origin of diffraction peaks, particularly of the first sharp diffraction peak (FSDP, Q1), the principal peak (PP, Q2), and the third peak (Q3), observed in the measured diffraction patterns of disordered materials whose structure contains tetrahedral motifs. It is confirmed that the FSDP (Q1) is not a signature of the formation of a network, because an FSDP is observed in tetrahedral molecular liquids. It is found that the PP (Q2) reflects orientational correlations of tetrahedra. Q3, that can be observed in all disordered materials, even in common liquid metals, stems from simple pair correlations. Moreover, information on the topology of disordered materials was revealed by utilizing persistent homology analyses. The persistence diagram of silica (SiO2) glass suggests that the shape of rings in the glass is similar not only to those in the crystalline phase with comparable density (α-cristobalite), but also to rings present in crystalline phases with higher density (α-quartz and coesite); this is thought to be the signature of disorder. Furthermore, we have succeeded in revealing the differences, in terms of persistent homology, between tetrahedral networks and tetrahedral molecular liquids, and the difference/similarity between liquid and amorphous (glassy) states. Our series of analyses demonstrated that a combination of diffraction data and persistent homology analyses is a useful tool for allowing us to uncover structural features hidden in halo pattern of disordered materials. -
Algebraic Topology-Based Machine Learning Using MRI Predicts Outcomes in Primary Sclerosing Cholangitis (2022)
Yashbir Singh, William A. Jons, John E. Eaton, Mette Vesterhus, Tom Karlsen, Ida Bjoerk, Andreas Abildgaard, Kristin Kaasen Jorgensen, Folseraas Trine, Derek Little, Aliya F. Gulamhusein, Kosta Petrovic, Anne Negard, Gian Marco Conte, Joseph D. Sobek, Jaidip Jagtap, Sudhakar K. Venkatesh, Gregory J. Gores, Nicholas F. LaRusso, Konstantinos N. Lazaridis, Bradley J. EricksonAbstract
Background: Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease that can lead to cirrhosis and hepatic decompensation. However, predicting future outcomes in patients with PSC is challenging. Our aim was to extract magnetic resonance imaging (MRI) features that predict the development of hepatic decompensation by applying algebraic topology-based machine learning (ML). Methods: We conducted a retrospective multicenter study among adults with large duct PSC who underwent MRI. A topological data analysis-inspired nonlinear framework was used to predict the risk of hepatic decompensation, which was motivated by algebraic topology theory-based ML. The topological representations (persistence images) were employed as input for classifcation to predict who developed early hepatic decompensation within one year after their baseline MRI. Results: We reviewed 590 patients; 298 were excluded due to poor image quality or inadequate liver coverage, leaving 292 potentially eligible subjects, of which 169 subjects were included in the study. We trained our model using contrast-enhanced delayed phase T1-weighted images on a single center derivation cohort consisting of 54 patients (hepatic decompensation, n = 21; no hepatic decompensation, n = 33) and a multicenter independent validation cohort of 115 individuals (hepatic decompensation, n = 31; no hepatic decompensation, n = 84). When our model was applied in the independent validation cohort, it remained predictive of early hepatic decompensation (area under the receiver operating characteristic curve = 0.84). Conclusions: Algebraic topology-based ML is a methodological approach that can predict outcomes in patients with PSC and has the potential for application in other chronic liver diseases -
Persistent Homology of the Cosmic Web. I: Hierarchical Topology in \$\Lambda\$CDM Cosmologies (2021)
Georg Wilding, Keimpe Nevenzeel, Rien van de Weygaert, Gert Vegter, Pratyush Pranav, Bernard J. T. Jones, Konstantinos Efstathiou, Job FeldbruggeAbstract
Using a set of \$\Lambda\$CDM simulations of cosmic structure formation, we study the evolving connectivity and changing topological structure of the cosmic web using state-of-the-art tools of multiscale topological data analysis (TDA). We follow the development of the cosmic web topology in terms of the evolution of Betti number curves and feature persistence diagrams of the three (topological) classes of structural features: matter concentrations, filaments and tunnels, and voids. The Betti curves specify the prominence of features as a function of density level, and their evolution with cosmic epoch reflects the changing network connections between these structural features. The persistence diagrams quantify the longevity and stability of topological features. In this study we establish, for the first time, the link between persistence diagrams, the features they show, and the gravitationally driven cosmic structure formation process. By following the diagrams' development over cosmic time, the link between the multiscale topology of the cosmic web and the hierarchical buildup of cosmic structure is established. The sharp apexes in the diagrams are intimately related to key transitions in the structure formation process. The apex in the matter concentration diagrams coincides with the density level at which, typically, they detach from the Hubble expansion and begin to collapse. At that level many individual islands merge to form the network of the cosmic web and a large number of filaments and tunnels emerge to establish its connecting bridges. The location trends of the apex possess a self-similar character that can be related to the cosmic web's hierarchical buildup. We find that persistence diagrams provide a significantly higher and more profound level of information on the structure formation process than more global summary statistics like Euler characteristic or Betti numbers. -
Determining Clinically Relevant Features in Cytometry Data Using Persistent Homology (2022)
Soham Mukherjee, Darren Wethington, Tamal K. Dey, Jayajit DasAbstract
Cytometry experiments yield high-dimensional point cloud data that is difficult to interpret manually. Boolean gating techniques coupled with comparisons of relative abundances of cellular subsets is the current standard for cytometry data analysis. However, this approach is unable to capture more subtle topological features hidden in data, especially if those features are further masked by data transforms or significant batch effects or donor-to-donor variations in clinical data. We present that persistent homology, a mathematical structure that summarizes the topological features, can distinguish different sources of data, such as from groups of healthy donors or patients, effectively. Analysis of publicly available cytometry data describing non-naïve CD8+ T cells in COVID-19 patients and healthy controls shows that systematic structural differences exist between single cell protein expressions in COVID-19 patients and healthy controls. We identify proteins of interest by a decision-tree based classifier, sample points randomly and compute persistence diagrams from these sampled points. The resulting persistence diagrams identify regions in cytometry datasets of varying density and identify protruded structures such as ‘elbows’. We compute Wasserstein distances between these persistence diagrams for random pairs of healthy controls and COVID-19 patients and find that systematic structural differences exist between COVID-19 patients and healthy controls in the expression data for T-bet, Eomes, and Ki-67. Further analysis shows that expression of T-bet and Eomes are significantly downregulated in COVID-19 patient non-naïve CD8+ T cells compared to healthy controls. This counter-intuitive finding may indicate that canonical effector CD8+ T cells are less prevalent in COVID-19 patients than healthy controls. This method is applicable to any cytometry dataset for discovering novel insights through topological data analysis which may be difficult to ascertain otherwise with a standard gating strategy or existing bioinformatic tools.Community Resources
-
Quantification of the Immune Content in Neuroblastoma: Deep Learning and Topological Data Analysis in Digital Pathology (2021)
Nicole Bussola, Bruno Papa, Ombretta Melaiu, Aurora Castellano, Doriana Fruci, Giuseppe JurmanAbstract
We introduce here a novel machine learning (ML) framework to address the issue of the quantitative assessment of the immune content in neuroblastoma (NB) specimens. First, the EUNet, a U-Net with an EfficientNet encoder, is trained to detect lymphocytes on tissue digital slides stained with the CD3 T-cell marker. The training set consists of 3782 images extracted from an original collection of 54 whole slide images (WSIs), manually annotated for a total of 73,751 lymphocytes. Resampling strategies, data augmentation, and transfer learning approaches are adopted to warrant reproducibility and to reduce the risk of overfitting and selection bias. Topological data analysis (TDA) is then used to define activation maps from different layers of the neural network at different stages of the training process, described by persistence diagrams (PD) and Betti curves. TDA is further integrated with the uniform manifold approximation and projection (UMAP) dimensionality reduction and the hierarchical density-based spatial clustering of applications with noise (HDBSCAN) algorithm for clustering, by the deep features, the relevant subgroups and structures, across different levels of the neural network. Finally, the recent TwoNN approach is leveraged to study the variation of the intrinsic dimensionality of the U-Net model. As the main task, the proposed pipeline is employed to evaluate the density of lymphocytes over the whole tissue area of the WSIs. The model achieves good results with mean absolute error 3.1 on test set, showing significant agreement between densities estimated by our EUNet model and by trained pathologists, thus indicating the potentialities of a promising new strategy in the quantification of the immune content in NB specimens. Moreover, the UMAP algorithm unveiled interesting patterns compatible with pathological characteristics, also highlighting novel insights into the dynamics of the intrinsic dataset dimensionality at different stages of the training process. All the experiments were run on the Microsoft Azure cloud platform. -
Segmentation of Biomedical Images by a Computational Topology Framework (2017)
Rodrigo Rojas Moraleda, Wei Xiong, Niels Halama, Katja Breitkopf-Heinlein, Steven Steven, Luis Salinas, Dieter W. Heermann, Nektarios A. ValousAbstract
The segmentation of cell nuclei is an important step towards the automated analysis of histological images. The presence of a large number of nuclei in whole-slide images necessitates methods that are computationally tractable in addition to being effective. In this work, a method is developed for the robust segmentation of cell nuclei in histological images based on the principles of persistent homology. More specifically, an abstract simplicial homology approach for image segmentation is established. Essentially, the approach deals with the persistence of disconnected sets in the image, thus identifying salient regions that express patterns of persistence. By introducing an image representation based on topological features, the task of segmentation is less dependent on variations of color or texture. This results in a novel approach that generalizes well and provides stable performance. The method conceptualizes regions of interest (cell nuclei) pertinent to their topological features in a successful manner. The time cost of the proposed approach is lower-bounded by an almost linear behavior and upper-bounded by O(n2) in a worst-case scenario. Time complexity matches a quasilinear behavior which is O(n1+ɛ) for ε \textless 1. Images acquired from histological sections of liver tissue are used as a case study to demonstrate the effectiveness of the approach. The histological landscape consists of hepatocytes and non-parenchymal cells. The accuracy of the proposed methodology is verified against an automated workflow created by the output of a conventional filter bank (validated by experts) and the supervised training of a random forest classifier. The results are obtained on a per-object basis. The proposed workflow successfully detected both hepatocyte and non-parenchymal cell nuclei with an accuracy of 84.6%, and hepatocyte cell nuclei only with an accuracy of 86.2%. A public histological dataset with supplied ground-truth data is also used for evaluating the performance of the proposed approach (accuracy: 94.5%). Further validations are carried out with a publicly available dataset and ground-truth data from the Gland Segmentation in Colon Histology Images Challenge (GlaS) contest. The proposed method is useful for obtaining unsupervised robust initial segmentations that can be further integrated in image/data processing and management pipelines. The development of a fully automated system supporting a human expert provides tangible benefits in the context of clinical decision-making.