Innate and Adaptive T Cells in Asthmatic Patients: Relationship to Severity and Disease Mechanisms (2015)

Timothy SC Hinks, Xiaoying Zhou, Karl J. Staples, Borislav D. Dimitrov, Alexander Manta, Tanya Petrossian, Pek Y. Lum, Caroline G. Smith, Jon A. Ward, Peter H. Howarth, Andrew F. Walls, Stephan D. Gadola, Ratko Djukanović

A Barcode Shape Descriptor for Curve Point Cloud Data (2004)

Anne Collins, Afra Zomorodian, Gunnar Carlsson, Leonidas J. Guibas

Abstract

In this paper, we present a complete computational pipeline for extracting a compact shape descriptor for curve point cloud data (PCD). Our shape descriptor, called a barcode, is based on a blend of techniques from differential geometry and algebraic topology. We also provide a metric over the space of barcodes, enabling fast comparison of PCDs for shape recognition and clustering. To demonstrate the feasibility of our approach, we implement our pipeline and provide experimental evidence in shape classification and parametrization.

Fractal Dimension Estimation With Persistent Homology: A Comparative Study (2020)

Jonathan Jaquette, Benjamin Schweinhart

Abstract

We propose that the recently defined persistent homology dimensions are a practical tool for fractal dimension estimation of point samples. We implement an algorithm to estimate the persistent homology dimension, and compare its performance to classical methods to compute the correlation and box-counting dimensions in examples of self-similar fractals, chaotic attractors, and an empirical dataset. The performance of the 0-dimensional persistent homology dimension is comparable to that of the correlation dimension, and better than box-counting.
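
As a rough illustration of the 0-dimensional case, the sketch below relies on the standard fact that the finite 0-dimensional persistence intervals of a point sample correspond to the edge lengths of its Euclidean minimum spanning tree. If the sum of the edge lengths raised to the power alpha scales like n^beta in the sample size n, the dimension estimate is alpha / (1 - beta). The function names, the subsampling scheme, and the least-squares fit are assumptions made for this example and are not taken from the paper's implementation.

```python
import math
import random


def mst_edge_lengths(points):
    """Prim's algorithm on the complete Euclidean graph; returns the n-1 MST edge lengths."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # best[v] = distance from v to the current tree
    best[0] = 0.0
    lengths = []
    for step in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if step > 0:
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def ph0_dimension(points, sizes, alpha=1.0, seed=0):
    """Estimate a dimension from the log-log slope of E_alpha(n) against n.

    Assumption for this sketch: one random subsample per size in `sizes`.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for n in sizes:
        sample = rng.sample(points, n)
        e_alpha = sum(length ** alpha for length in mst_edge_lengths(sample))
        xs.append(math.log(n))
        ys.append(math.log(e_alpha))
    # Ordinary least-squares slope of log E_alpha(n) versus log n.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return alpha / (1.0 - slope)
```

For points sampled uniformly from a square the estimate should land near 2, and for points on a line segment near 1. The quadratic-time Prim loop keeps the example dependency-free but would be too slow for large samples.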

Community Resources

Code

Bayesian Computation Meets Topology (2024)

Julius von Rohrscheidt, Bastian Rieck, Sebastian M. Schmon

Abstract

Computational topology recently started to emerge as a novel paradigm for characterising the ‘shape’ of high-dimensional data, leading to powerful algorithms in (un)supervised representation learning. While capable of capturing prominent features at multiple scales, topological methods cannot readily be used for Bayesian inference. We develop a novel approach that bridges this gap, making it possible to perform parameter estimation in a Bayesian framework, using topology-based loss functions. Our method affords easy integration into topological machine learning algorithms. We demonstrate its efficacy for parameter estimation in different simulation settings.

Characterizing Fluid Dynamical Systems Using Euler Characteristic Surface and Euler Metric (2023)

A. Roy, R. A. I. Haque, A. J. Mitra, S. Tarafdar, T. Dutta

Abstract

The Euler characteristic (χ), a topological invariant, helps to understand the topology of a network or complex. We demonstrate that the multi-scale topological information of dynamically evolving fluid flow systems can be crystallized into their Euler characteristic surfaces χ_s(r, t). Furthermore, we demonstrate that the Euler Metric (EM), introduced by the authors, can be utilized to identify the stability regime of a given flow pattern, besides distinguishing between different flow systems. The potential of the Euler characteristic surface and the Euler metric is demonstrated first by analyzing a simulated deterministic dynamical system and then by analyzing experimental flow patterns that develop in micrometer-sized drying droplets.

Euler Characteristic Surfaces: A Stable Multiscale Topological Summary of Time Series Data (2024)

Anamika Roy, Atish J. Mitra, Tapati Dutta

Abstract

We present Euler Characteristic Surfaces as a multiscale spatiotemporal topological summary of time series data encapsulating the topology of the system at different time instants and length scales. Euler Characteristic Surfaces with an appropriate metric are used to quantify stability and locate critical changes in a dynamical system with respect to variations in a parameter, while being substantially computationally cheaper than available alternative methods such as persistent homology. The stability of the construction is demonstrated by a quantitative comparison bound with persistent homology, and a quantitative stability bound under small changes in time is established. The proposed construction is used to analyze two different kinds of simulated disordered flow situations.

Robust Crossings Detection in Noisy Signals Using Topological Signal Processing (2024)

Sunia Tanweer, Firas A. Khasawneh, Elizabeth Munch

Abstract

This article explores a novel method of bracketing zero-crossings for both 1-D functions and discretely sampled time series by the application of 0-D persistent homology from algebraic topology. We introduce an algorithm and demonstrate its capability of detecting crossings in noisy signals across various sampling frequencies. Compared to other software-based methods for crossing detection in signals, our approach is typically faster, shows a higher accuracy, and has the unique ability to identify all roots within the provided interval rather than only one of them. We also discuss different options for mathematically estimating the persistence threshold, a parameter which controls the correct bracketing of roots. Finally, we explore the potential of extending our algorithm to higher dimensions.
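
The building block named in this abstract, 0-D persistence of a sampled signal, can be sketched with a union-find pass over the samples in increasing order of value. This is only the generic sublevel-set computation with the usual elder rule, not the authors' bracketing algorithm. The function names and the strict-inequality threshold filter are assumptions made for this example.

```python
import math


def sublevel_persistence_0d(values):
    """0-D sublevel-set persistence pairs (birth, death) of a non-empty 1-D sequence."""
    n = len(values)
    parent = [None] * n  # None marks a sample that has not been added yet
    birth = [0.0] * n    # birth value, meaningful at component roots

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=values.__getitem__):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the current value.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((min(values), math.inf))  # the global minimum never dies
    return sorted(pairs)


def significant_pairs(pairs, threshold):
    """Keep pairs whose persistence (death - birth) exceeds the threshold (assumed filter)."""
    return [(b, d) for b, d in pairs if d - b > threshold]
```

Each finite pair matches a local minimum with the local maximum at which its component merges into an older one, so raising the threshold discards the small oscillations that noise introduces.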

Gene Expression Data Classification Using Topology and Machine Learning Models (2022)

Tamal K. Dey, Sayan Mandal, Soham Mukherjee

Abstract

Interpretation of high-throughput gene expression data continues to require mathematical tools in data analysis that recognize the shape of the data in high dimensions. Topological data analysis (TDA) has recently been successful in extracting robust features in several applications dealing with high-dimensional constructs. In this work, we utilize some recent developments in TDA to curate gene expression data. Our work differs from its predecessors in two aspects: (1) Traditional TDA pipelines use topological signatures called barcodes to enhance feature vectors which are used for classification. In contrast, this work involves curating relevant features to obtain somewhat better representatives with the help of TDA. These representatives of the entire data facilitate better comprehension of the phenotype labels. (2) Most of the earlier works employ barcodes obtained using topological summaries as fingerprints for the data. Even though they are stable signatures, there exists no direct mapping between the data and said barcodes.

Community Resources

Code

Some Applications of TDA on Financial Markets (2022)

Miguel Angel Ruiz-Ortiz, José Carlos Gómez-Larrañaga, Jesús Rodríguez-Viorato

Abstract

Topological Data Analysis (TDA) has had many applications. However, financial markets have been studied only slightly through TDA. Here we present a quick review of some recent applications of TDA to financial markets and propose a new turbulence index based on persistent homology, the fundamental tool of TDA, that seems to capture critical transitions in financial data, based on our experiment with S&P 500 data before the stock market crash of February 20, 2020, caused by the COVID-19 pandemic. We review applications in the early detection of turbulence periods in financial markets and how TDA can help to gain new insights while investing and to obtain superior risk-adjusted returns compared with investing strategies using classical turbulence indices such as the VIX and Chow's index based on the Mahalanobis distance. Furthermore, we include an introduction to persistent homology so that the reader can understand this paper without prior knowledge of TDA.

Position: Topological Deep Learning Is the New Frontier for Relational Learning (2024)

Theodore Papamarkou, Tolga Birdal, Michael M. Bronstein, Gunnar E. Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Lio, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guowei Wei, Ghada Zamzmi

Abstract

Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities. At the same time, this paper serves as an invitation to the scientific community to actively participate in TDL research to unlock the potential of this emerging field.

Explainable Machine Learning Approach to Yield and Quality Improvements Using Deep Topological Data Analytics (2023)

Janhavi Giri, Attila Lengyel

Abstract

In wafer fabrication, data is collected and analyzed to prevent process deviations that could affect product quality and wafer yield. However, the high-dimensional, sparse, and imbalanced nature of the data poses significant challenges to yield and quality root cause analysis. Deep Topological Data Analysis (DTDA) is an unsupervised machine learning method that clusters and models the data in the form of geometric objects such as graphs and their higher-dimensional versions. This method reduces the multidimensional dataset to two-dimensional networks or graphs, where each node represents a cluster of samples with similar characteristics, and an edge represents the presence of overlapping characteristics between the connecting nodes. DTDA provides insights into the necessary data elements required to conduct accurate analysis and helps engineers identify the features contributing to yield and quality issues, enabling corrective actions. Moreover, the approach prevents the waste of engineering resources and mitigates the impact on final manufacturing cost.
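
The graph described here, with clusters as nodes and shared samples as edges, resembles the output of the Mapper construction from topological data analysis. The sketch below is a minimal Mapper-style example under that assumption. It is not the DTDA method itself, and the one-dimensional lens, the interval cover, the single-linkage clustering rule, and all parameter names are choices made only for illustration.

```python
import math


def _single_linkage(points, members, eps):
    """Connected components of the graph linking member points at distance <= eps."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.25, eps=0.5):
    """Return (nodes, edges): nodes are sets of point indices, edges join nodes sharing points."""
    lo, hi = min(lens), max(lens)
    width = (hi - lo) / n_intervals
    pad = overlap * width  # each interval is widened by this amount on both sides
    nodes = []
    for k in range(n_intervals):
        a = lo + k * width - pad
        b = lo + (k + 1) * width + pad
        members = [i for i, f in enumerate(lens) if a <= f <= b]
        nodes.extend(_single_linkage(points, members, eps))
    edges = {
        (u, v)
        for u in range(len(nodes))
        for v in range(u + 1, len(nodes))
        if nodes[u] & nodes[v]
    }
    return nodes, edges
```

With ten equally spaced points on a line, the x coordinate as the lens, and three overlapping intervals, the result is a path of three nodes, because only neighbouring intervals share a sample.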

Revisiting Abnormalities in Brain Network Architecture Underlying Autism Using Topology-Inspired Statistical Inference (2018)

Sourabh Palande, Vipin Jose, Brandon Zielinski, Jeffrey Anderson, P. Thomas Fletcher, Bei Wang

Abstract

A large body of evidence relates autism with abnormal structural and functional brain connectivity. Structural covariance magnetic resonance imaging (scMRI) is a technique that maps brain regions with covarying gray matter densities across subjects. It provides a way to probe the anatomical structure underlying intrinsic connectivity networks (ICNs) through analysis of gray matter signal covariance. In this article, we apply topological data analysis in conjunction with scMRI to explore network-specific differences in the gray matter structure in subjects with autism versus age-, gender-, and IQ-matched controls. Specifically, we investigate topological differences in gray matter structure captured by structural correlation graphs derived from three ICNs strongly implicated in autism, namely the salience network, default mode network, and executive control network. By combining topological data analysis with statistical inference, our results provide evidence of statistically significant network-specific structural abnormalities in autism.

Topological Data Analysis: Concepts, Computation, and Applications in Chemical Engineering (2021)

Alexander D. Smith, Paweł Dłotko, Victor M. Zavala

Abstract

A primary hypothesis that drives scientific and engineering studies is that data has structure. The dominant paradigms for describing such structure are statistics (e.g., moments, correlation functions) and signal processing (e.g., convolutional neural nets, Fourier series). Topological Data Analysis (TDA) is a field of mathematics that analyzes data from a fundamentally different perspective. TDA represents datasets as geometric objects and provides dimensionality reduction techniques that project such objects onto low-dimensional descriptors. The key properties of these descriptors (also known as topological features) are that they provide multiscale information and that they are stable under perturbations (e.g., noise, translation, and rotation). In this work, we review the key mathematical concepts and methods of TDA and present different applications in chemical engineering.

Acute Lymphoblastic Leukemia Classification Using Persistent Homology (2024)

Waqar Hussain Shah, Abdullah Baloch, Rider Jaimes-Reátegui, Sohail Iqbal, Syeda Rafia Fatima, Alexander N. Pisarchik

Abstract

Acute Lymphoblastic Leukemia (ALL) is a prevalent form of childhood blood cancer characterized by the proliferation of immature white blood cells that rapidly replace normal cells in the bone marrow. The exponential growth of these leukemic cells can be fatal if not treated promptly. Classifying lymphoblasts and healthy cells poses a significant challenge, even for domain experts, due to their morphological similarities. Automated computer analysis of ALL can provide substantial support in this domain and potentially save numerous lives. In this paper, we propose a novel classification approach that involves analyzing shapes and extracting topological features of ALL cells. We employ persistent homology to capture these topological features. Our technique accurately and efficiently detects and classifies leukemia blast cells, achieving a recall of 98.2% and an F1-score of 94.6%. This approach has the potential to significantly enhance leukemia diagnosis and therapy.

Topological Detection of Phenomenological Bifurcations With Unreliable Kernel Density Estimates (2024)

Sunia Tanweer, Firas A. Khasawneh

Abstract

Phenomenological (P-type) bifurcations are qualitative changes in stochastic dynamical systems whereby the stationary probability density function (PDF) changes its topology. The current state of the art for detecting these bifurcations requires reliable kernel density estimates computed from an ensemble of system realizations. However, for several real-world signals such as Big Data, only a single system realization is available, making it impossible to estimate a reliable kernel density. This study presents an approach for detecting P-type bifurcations using unreliable density estimates. The approach creates an ensemble of objects from Topological Data Analysis (TDA) called persistence diagrams from the system’s sole realization and statistically analyzes the resulting set. We compare several methods for replicating the original persistence diagram including Gibbs point process modelling, Pairwise Interaction Point Modelling, and subsampling. We show that for the purpose of predicting a bifurcation, the simple method of subsampling exceeds the other two methods of point process modelling in performance.

SuPerPoV: Score and Evolution of the Stratospheric Polar Vortex via Persistent Homology (2026)

Jake Cordes, Barbara Giunti, Zheng Wu

Abstract

Classifying the stratospheric polar vortex provides predictability for surface weather on extended-range timescales. Definitions of these events proposed in over 60 years of study depend on empirically chosen parameters and yield different results when one of them changes. Moreover, as previous definitions are based on static thresholds, it is not straightforward to use them to study the spatiotemporal evolution of the vortex. We introduce SuPerPoV, a score system that computes displacement and split ratios using tools from applied topology. The computation is entirely threshold-free, open source, and does not require familiarity with applied topology. The scores generally recover previous definitions and are output for a user-defined number of days, thus showing the evolution of the event. SuPerPoV offers a paradigm shift in the study of the polar vortex, hopefully bringing a deeper understanding of the polar vortex and related extreme events, such as sudden stratospheric warmings.

Rapid and Precise Topological Comparison With Merge Tree Neural Networks (2024)

Yu Qin, Brittany Terese Fasy, Carola Wenk, Brian Summa

Abstract

Merge trees are a valuable tool in the scientific visualization of scalar fields; however, current methods for merge tree comparisons are computationally expensive, primarily due to the exhaustive matching between tree nodes. To address this challenge, we introduce the Merge Tree Neural Network (MTNN), a learned neural network model designed for merge tree comparison. The MTNN enables rapid and high-quality similarity computation. We first demonstrate how to train graph neural networks, which emerged as effective encoders for graphs, in order to produce embeddings of merge trees in vector spaces for efficient similarity comparison. Next, we formulate the novel MTNN model that further improves the similarity comparisons by integrating the tree and node embeddings with a new topological attention mechanism. We demonstrate the effectiveness of our model on real-world data in different domains and examine our model's generalizability across various datasets. Our experimental analysis demonstrates our approach's superiority in accuracy and efficiency. In particular, we speed up the prior state-of-the-art by more than 100× on the benchmark datasets while maintaining an error rate below 0.1%.

Statistical Topology of Bond Networks With Applications to Silica (2020)

B. Schweinhart, D. Rodney, J. K. Mason

Abstract

Whereas knowledge of a crystalline material's unit cell is fundamental to understanding the material's properties and behavior, there are no obvious analogs to unit cells for disordered materials despite the frequent existence of considerable medium-range order. This article views a material's structure as a collection of local atomic environments that are sampled from some underlying probability distribution of such environments, with the advantage of offering a unified description of both ordered and disordered materials. Crystalline materials can then be regarded as special cases where the underlying probability distribution is highly concentrated around the traditional unit cell. The H_1 barcode is proposed as a descriptor of local atomic environments suitable for disordered bond networks and is applied with three other descriptors to molecular dynamics simulations of silica glasses. Each descriptor reliably distinguishes the structure of glasses produced at different cooling rates, with the H_1 barcode and coordination profile providing the best separation. The approach is generally applicable to any system that can be represented as a sparse graph.

Community Resources

Code

Towards a Philological Metric Through a Topological Data Analysis Approach (2020)

Eduardo Paluzo-Hidalgo, Rocio Gonzalez-Diaz, Miguel A. Gutiérrez-Naranjo

Abstract

The canon of baroque Spanish literature has been thoroughly studied with philological techniques. The major representatives of the poetry of this epoch are Francisco de Quevedo and Luis de Góngora y Argote. They are commonly classified by literary experts into two different streams: Quevedo belongs to Conceptismo and Góngora to Culteranismo. Besides, traditionally, even if Quevedo is considered the most representative of Conceptismo, Lope de Vega is also considered to be, at least, closely related to this literary trend. In this paper, we use Topological Data Analysis techniques to provide a first approach to a metric distance between the literary styles of these poets. As a consequence, we reach results that agree with the literary experts' criteria, locating the literary style of Lope de Vega closer to that of Quevedo than to that of Góngora.

Community Resources

Data

Multidimensional Persistence in Biomolecular Data (2015)

Kelin Xia, Guo-Wei Wei

Abstract

Persistent homology has emerged as a popular technique for the topological simplification of big data, including biomolecular data. Multidimensional persistence bears considerable promise to bridge the gap between geometry and topology. However, its practical and robust construction has been a challenge. We introduce two families of multidimensional persistence, namely pseudo-multidimensional persistence and multiscale multidimensional persistence. The former is generated via the repeated applications of persistent homology filtration to high dimensional data, such as results from molecular dynamics or partial differential equations. The latter is constructed via isotropic and anisotropic scales that create new simplicial complexes and associated topological spaces. The utility, robustness and efficiency of the proposed topological methods are demonstrated via protein folding, protein flexibility analysis, the topological denoising of cryo-electron microscopy data, and the scale dependence of nano particles. Topological transition between partially folded and unfolded proteins has been observed in multidimensional persistence. The separation between noise topological signatures and molecular topological fingerprints is achieved by the Laplace-Beltrami flow. The multiscale multidimensional persistent homology reveals relative local features in Betti-0 invariants and the relatively global characteristics of Betti-1 and Betti-2 invariants.

Dissecting Glial Scar Formation by Spatial Point Pattern and Topological Data Analysis (2024)

Daniel Manrique-Castano, Dhananjay Bhaskar, Ayman ElAli

Abstract

Glial scar formation represents a fundamental response to central nervous system (CNS) injuries. It is mainly characterized by a well-defined spatial rearrangement of reactive astrocytes and microglia. The mechanisms underlying glial scar formation have been extensively studied, yet quantitative descriptors of the spatial arrangement of reactive glial cells remain limited. Here, we present a novel approach using point pattern analysis (PPA) and topological data analysis (TDA) to quantify spatial patterns of reactive glial cells after experimental ischemic stroke in mice. We provide open and reproducible tools using R and Julia to quantify spatial intensity, cell covariance and conditional distribution, cell-to-cell interactions, and short/long-scale arrangement, which collectively disentangle the arrangement patterns of the glial scar. This approach unravels a substantial divergence in the distribution of GFAP+ and IBA1+ cells after injury that conventional analysis methods cannot fully characterize. PPA and TDA are valuable tools for studying the complex spatial arrangement of reactive glia and other nervous cells following CNS injuries and have potential applications for evaluating glial-targeted restorative therapies.

Community Resources

Code
Data

Geometric Anomaly Detection in Data (2020)

Bernadette J. Stolz, Jared Tanner, Heather A. Harrington, Vidit Nanda

Abstract

The quest for low-dimensional models which approximate high-dimensional data is pervasive across the physical, natural, and social sciences. The dominant paradigm underlying most standard modeling techniques assumes that the data are concentrated near a single unknown manifold of relatively small intrinsic dimension. Here, we present a systematic framework for detecting interfaces and related anomalies in data which may fail to satisfy the manifold hypothesis. By computing the local topology of small regions around each data point, we are able to partition a given dataset into disjoint classes, each of which can be individually approximated by a single manifold. Since these manifolds may have different intrinsic dimensions, local topology discovers singular regions in data even when none of the points have been sampled precisely from the singularities. We showcase this method by identifying the intersection of two surfaces in the 24-dimensional space of cyclo-octane conformations and by locating all of the self-intersections of a Henneberg minimal surface immersed in 3-dimensional space. Due to the local nature of the topological computations, the algorithmic burden of performing such data stratification is readily distributable across several processors.

Persistent Homology in Cosmic Shear - II. A Tomographic Analysis of DES-Y1 (2022)

Sven Heydenreich, Benjamin Brück, Pierre Burger, Joachim Harnois-Déraps, Sandra Unruh, Tiago Castro, Klaus Dolag, Nicolas Martinet

Abstract

We demonstrate how to use persistent homology for cosmological parameter inference in a tomographic cosmic shear survey. We obtain the first cosmological parameter constraints from persistent homology by applying our method to the first-year data of the Dark Energy Survey. To obtain these constraints, we analyse the topological structure of the matter distribution by extracting persistence diagrams from signal-to-noise maps of aperture masses. This presents a natural extension to the widely used peak count statistics. Extracting the persistence diagrams from the cosmo-SLICS, a suite of N-body simulations with variable cosmological parameters, we interpolate the signal using Gaussian processes and marginalise over the most relevant systematic effects, including intrinsic alignments and baryonic effects. Our constraint on the structure growth parameter is in full agreement with other late-time probes. We also constrain the intrinsic alignment parameter to A = 1.54 ± 0.52, which constitutes a detection of the intrinsic alignment effect at almost 3σ.

Exploring the Geometry and Topology of Neural Network Loss Landscapes (2022)

Stefan Horoi, Jessie Huang, Bastian Rieck, Guillaume Lajoie, Guy Wolf, Smita Krishnaswamy

Abstract

Recent work has established clear links between the generalization performance of trained neural networks and the geometry of their loss landscape near the local minima to which they converge. This suggests that qualitative and quantitative examination of the loss landscape geometry could yield insights about neural network generalization performance during training. To this end, researchers have proposed visualizing the loss landscape through the use of simple dimensionality reduction techniques. However, such visualization methods have been limited by their linear nature and only capture features in one or two dimensions, thus restricting sampling of the loss landscape to lines or planes. Here, we expand and improve upon these in three ways. First, we present a novel “jump and retrain” procedure for sampling relevant portions of the loss landscape. We show that the resulting sampled data holds more meaningful information about the network’s ability to generalize. Next, we show that non-linear dimensionality reduction of the jump and retrain trajectories via PHATE, a trajectory and manifold-preserving method, allows us to visualize differences between networks that are generalizing well vs poorly. Finally, we combine PHATE trajectories with a computational homology characterization to quantify trajectory differences.

Topology-Informed Machine Learning for Efficient Prediction of Solid Oxide Fuel Cell Electrode Polarization (2025)

Maksym Szemer, Szymon Buchaniec, Tomasz Prokop, Grzegorz Brus

Abstract

Machine learning has emerged as a potent computational tool for expediting research and development in solid oxide fuel cell electrodes. The effective application of machine learning for performance prediction requires transforming electrode microstructure into a format compatible with artificial neural networks. Input data may range from a comprehensive digital material representation of the electrode to a selected set of microstructural parameters. The chosen representation significantly influences the performance and results of the network. Here, we show a novel approach utilizing persistence representation derived from computational topology. Using 500 microstructures and current–voltage characteristics obtained with three-dimensional first-principles simulations, we have prepared an artificial neural network model that can replicate current–voltage characteristics of unseen microstructures based on their persistent image representation. The artificial neural network can accurately predict the polarization curve of solid oxide fuel cell electrodes. The presented method incorporates complex microstructural information from the digital material representation while requiring substantially less computational resources (preprocessing and prediction time ≈ 1 min) compared to our high-fidelity simulations (simulation time ≈ 1 h) to obtain a single current-potential characteristic for one microstructure.

Community Resources

Code

Multiresolution Persistent Homology for Excessively Large Biomolecular Datasets (2015)

Kelin Xia, Zhixiong Zhao, Guo-Wei Wei

Abstract

Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large scale datasets with appropriate resolution. We utilize the flexibility-rigidity index to assess the topological connectivity of the data set and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated by a hexagonal fractal image which has three distinct scales. We further demonstrate the proposed method for extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed, which would otherwise be inaccessible to the normal point cloud method and unreliable using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which, to our knowledge, is the first time that persistent homology has been used for practical protein domain analysis. The proposed multiresolution topological method has potential applications in arbitrary data sets, such as social networks, biological networks, and graphs.

Persistent Homology of Time-Dependent Functional Networks Constructed From Coupled Time Series (2017)

Bernadette J. Stolz, Heather A. Harrington, Mason A. Porter

Abstract

We use topological data analysis to study “functional networks” that we construct from time-series data from both experimental and synthetic sources. We use persistent homology with a weight rank clique filtration to gain insights into these functional networks, and we use persistence landscapes to interpret our results. Our first example uses time-series output from networks of coupled Kuramoto oscillators. Our second example consists of biological data in the form of functional magnetic resonance imaging data that were acquired from human subjects during a simple motor-learning task in which subjects were monitored for three days during a five-day period. With these examples, we demonstrate that (1) using persistent homology to study functional networks provides fascinating insights into their properties and (2) the position of the features in a filtration can sometimes play a more vital role than persistence in the interpretation of topological features, even though conventionally the latter is used to distinguish between signal and noise. We find that persistent homology can detect differences in synchronization patterns in our data sets over time, giving insight both on changes in community structure in the networks and on increased synchronization between brain regions that form loops in a functional network during motor learning. For the motor-learning data, persistence landscapes also reveal that on average the majority of changes in the network loops take place on the second of the three days of the learning process.
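
For readers unfamiliar with persistence landscapes, the sketch below evaluates the k-th landscape function of a persistence diagram at a point t, using the standard definition: the k-th largest value of max(0, min(t - b, d - t)) over all (birth, death) pairs. The function name and the list-of-tuples input format are assumptions made for this example.

```python
def landscape(diagram, k, t):
    """k-th persistence landscape function (k = 1 is the outermost) evaluated at t.

    `diagram` is an iterable of finite (birth, death) pairs.
    """
    # Each pair contributes a "tent" that peaks at its midpoint with height (d - b) / 2.
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0
```

For the diagram [(0, 4), (1, 3)], the first landscape at t = 2 takes the value 2 and the second takes the value 1. Averaging such functions over many diagrams is what makes landscapes convenient for statistics across subjects or days.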

Understanding Flow Features in Drying Droplets via Euler Characteristic Surfaces—A Topological Tool (2020)

A. Roy, R. A. I. Haque, A. J. Mitra, M. Dutta Choudhury, S. Tarafdar, T. Dutta

Abstract

In this paper, we propose a mathematical picture of flow in a drying multiphase droplet. The system studied consists of a suspension of microscopic polystyrene beads in water. The time development of the drying process is described by defining the “Euler characteristic surface,” which provides a multiscale topological map of this dynamical system. A novel method is adopted to analyze the images extracted from experimental video sequences. Experimental image data are converted to binary data through appropriate Gaussian filters and optimal thresholding and analyzed using the Euler characteristic determined on a hexagonal lattice. In order to do a multiscale analysis of the extracted image, we introduce the concept of Euler characteristic at a specific scale r > 0. This multiscale time evolution of the connectivity information on aggregates of polystyrene beads in water is summarized in an Euler characteristic surface and, subsequently, in an Euler characteristic level curve plot. We introduce a metric between Euler characteristic surfaces as a possible similarity measure between two flow situations. The constructions proposed by us are used to interpret flow patterns (and their stability) generated on the upper surface of the drying droplet interface. The philosophy behind the topological tools developed in this work is to produce low-dimensional signatures of dynamical systems, which may be used to efficiently summarize and distinguish topological information in various types of flow situations.
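The Euler characteristic of a thresholded image can be sketched in a few lines. The example below assumes a square pixel lattice (the paper uses a hexagonal one) and treats each foreground pixel as a closed unit square, so that the characteristic is vertices minus edges plus faces. The function name `euler_characteristic` is hypothetical.

```python
def euler_characteristic(image):
    """Euler characteristic V - E + F of a binary image.

    Each foreground pixel is a closed unit square; corners and sides shared
    between pixels are counted once. Under this convention, pixels that touch
    only diagonally are connected.
    """
    vertices, edges, faces = set(), set(), set()
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if not value:
                continue
            faces.add((r, c))
            # Four corners of the pixel.
            for dr in (0, 1):
                for dc in (0, 1):
                    vertices.add((r + dr, c + dc))
            # Top and bottom sides, keyed by their left endpoint.
            edges.add(("h", r, c))
            edges.add(("h", r + 1, c))
            # Left and right sides, keyed by their upper endpoint.
            edges.add(("v", r, c))
            edges.add(("v", r, c + 1))
    return len(vertices) - len(edges) + len(faces)


ring = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
chi_ring = euler_characteristic(ring)          # one component, one hole -> 0
chi_pair = euler_characteristic([[1, 0, 1]])   # two components, no holes -> 2
```

Repeating this computation after dilating the foreground by increasing radii, frame by frame, would give one possible discrete analogue of a surface indexed by scale and time.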

Toroidal Topology of Population Activity in Grid Cells (2022)

Richard J. Gardner, Erik Hermansen, Marius Pachitariu, Yoram Burak, Nils A. Baas, Benjamin A. Dunn, May-Britt Moser, Edvard I. Moser

Abstract

The medial entorhinal cortex is part of a neural system for mapping the position of an individual within a physical environment [1]. Grid cells, a key component of this system, fire in a characteristic hexagonal pattern of locations [2], and are organized in modules [3] that collectively form a population code for the animal’s allocentric position [1]. The invariance of the correlation structure of this population code across environments [4,5] and behavioural states [6,7], independent of specific sensory inputs, has pointed to intrinsic, recurrently connected continuous attractor networks (CANs) as a possible substrate of the grid pattern [1,8–11]. However, whether grid cell networks show continuous attractor dynamics, and how they interface with inputs from the environment, has remained unclear owing to the small samples of cells obtained so far. Here, using simultaneous recordings from many hundreds of grid cells and subsequent topological data analysis, we show that the joint activity of grid cells from an individual module resides on a toroidal manifold, as expected in a two-dimensional CAN. Positions on the torus correspond to positions of the moving animal in the environment. Individual cells are preferentially active at singular positions on the torus. Their positions are maintained between environments and from wakefulness to sleep, as predicted by CAN models for grid cells but not by alternative feedforward models [12]. This demonstration of network dynamics on a toroidal manifold provides a population-level visualization of CAN dynamics in grid cells.

Community Resources

Data

Classification of COVID-19 via Homology of CT-SCAN (2021)

Sohail Iqbal, H. Fareed Ahmed, Talha Qaiser, Muhammad Imran Qureshi, Nasir Rajpoot

Abstract

During the worldwide spread of SARS-CoV-2 (COVID-19) infection, it is of utmost importance to detect the disease at an early stage, especially in the hot spots of this epidemic. There are more than 110 million infected cases worldwide so far. Due to its promptness and effective results, computed tomography (CT)-scan imaging is preferred to the reverse-transcription polymerase chain reaction (RT-PCR). Early detection and isolation of the patient is the only possible way of controlling the spread of the disease. Automated analysis of CT-scans can provide enormous support in this process. In this article, we propose a novel approach to detect SARS-CoV-2 using CT-scan images. Our method is based on a very intuitive and natural idea of analyzing shapes, an attempt to mimic a professional medic. We mainly trace SARS-CoV-2 features by quantifying their topological properties. We primarily use a tool called persistent homology, from Topological Data Analysis (TDA), to compute these topological properties. We train and test our model on the "SARS-CoV-2 CT-scan dataset" (Soares et al., 2020), an open-source dataset containing 2,481 CT-scans of normal and COVID-19 patients. Our model yielded an overall benchmark F1 score of 99.42%, accuracy of 99.416%, precision of 99.41%, and recall of 99.42%. The TDA techniques have great potential that can be utilized for efficient and prompt detection of COVID-19. The immense potential of TDA may be exploited in clinics for rapid and safe detection of COVID-19 globally, in particular in the low and middle-income countries where RT-PCR labs and/or kits are in a serious crisis.

Persistent Homology in Cosmic Shear: Constraining Parameters With Topological Data Analysis (2021)

Sven Heydenreich, Benjamin Brück, Joachim Harnois-Déraps

Abstract

In recent years, cosmic shear has emerged as a powerful tool for studying the statistical distribution of matter in our Universe. Apart from the standard two-point correlation functions, several alternative methods such as peak count statistics offer competitive results. Here we show that persistent homology, a tool from topological data analysis, can extract more cosmological information than previous methods from the same data set. For this, we use persistent Betti numbers to efficiently summarise the full topological structure of weak lensing aperture mass maps. This method can be seen as an extension of the peak count statistics, in which we additionally capture information about the environment surrounding the maxima. We first demonstrate the performance in a mock analysis of the KiDS+VIKING-450 data: We extract the Betti functions from a suite of N-body simulations and use these to train a Gaussian process emulator that provides rapid model predictions; we next run a Markov chain Monte Carlo analysis on independent mock data to infer the cosmological parameters and their uncertainties. When comparing our results, we recover the input cosmology and achieve a constraining power on S₈ that is 3% tighter than that on peak count statistics. Performing the same analysis on 100 deg² of Euclid-like simulations, we are able to improve the constraints on S₈ and Ωₘ by 19% and 12%, respectively, while breaking some of the degeneracy between S₈ and the dark energy equation of state. To our knowledge, the methods presented here are the most powerful topological tools for constraining cosmological parameters with lensing data.

Quantifying Genetic Innovation: Mathematical Foundations for the Topological Study of Reticulate Evolution (2020)

Michael Lesnick, Raúl Rabadán, Daniel I. S. Rosenbloom

Abstract

A topological approach to the study of genetic recombination, based on persistent homology, was introduced by Chan, Carlsson, and Rabadán in 2013. This associates a sequence of signatures called barcodes to genomic data sampled from an evolutionary history. In this paper, we develop theoretical foundations for this approach. First, we present a novel formulation of the underlying inference problem. Specifically, we introduce and study the novelty profile, a simple, stable statistic of an evolutionary history which not only counts recombination events but also quantifies how recombination creates genetic diversity. We propose that the (hitherto implicit) goal of the topological approach to recombination is the estimation of novelty profiles. We then study the problem of obtaining a lower bound on the novelty profile using barcodes. We focus on a low-recombination regime, where the evolutionary history can be described by a directed acyclic graph called a galled tree, which differs from a tree only by isolated topological defects. We show that in this regime, under a complete sampling assumption, the $1^{\mathrm{st}}$ barcode yields a lower bound on the novelty profile, and hence on the number of recombination events. For $i > 1$, the $i^{\mathrm{th}}$ barcode is empty. In addition, we use a stability principle to strengthen these results to ones which hold for any subsample of an arbitrary evolutionary history. To establish these results, we describe the topology of the Vietoris–Rips filtrations arising from evolutionary histories indexed by galled trees. As a step towards a probabilistic theory, we also show that for a random history indexed by a fixed galled tree and satisfying biologically reasonable conditions, the intervals of the $1^{\mathrm{st}}$ barcode are independent random variables. Using simulations, we explore the sensitivity of these intervals to recombination.

Inference of Ancestral Recombination Graphs Through Topological Data Analysis (2016)

Pablo G. Cámara, Arnold J. Levine, Raúl Rabadán

Abstract

The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands.

Evolution occurs through different mechanisms, including point mutations, gene duplication, horizontal gene transfer, and recombinations. Some of these mechanisms cannot be captured by tree graphs. We present a framework, based on the mathematical tools of computational topology, that can explicitly accommodate both recombination and mutation events across the evolutionary history of a sample of genomic sequences. This approach generates a new type of summary graph and algebraic structures that provide quantitative information on the evolutionary scale and frequency of recombination events. The accompanying software, TARGet, is applied to several examples, including migration between sexually-reproducing populations, human recombination, and recombination in Darwin’s finches.

Determining Clinically Relevant Features in Cytometry Data Using Persistent Homology (2022)

Soham Mukherjee, Darren Wethington, Tamal K. Dey, Jayajit Das

Abstract

Cytometry experiments yield high-dimensional point cloud data that are difficult to interpret manually. Boolean gating techniques coupled with comparisons of relative abundances of cellular subsets are the current standard for cytometry data analysis. However, this approach is unable to capture more subtle topological features hidden in data, especially if those features are further masked by data transforms or significant batch effects or donor-to-donor variations in clinical data. We show that persistent homology, a mathematical structure that summarizes the topological features, can effectively distinguish different sources of data, such as groups of healthy donors or patients. Analysis of publicly available cytometry data describing non-naïve CD8+ T cells in COVID-19 patients and healthy controls shows that systematic structural differences exist between single cell protein expressions in COVID-19 patients and healthy controls. We identify proteins of interest by a decision-tree based classifier, sample points randomly and compute persistence diagrams from these sampled points. The resulting persistence diagrams identify regions in cytometry datasets of varying density and identify protruded structures such as ‘elbows’. We compute Wasserstein distances between these persistence diagrams for random pairs of healthy controls and COVID-19 patients and find that systematic structural differences exist between COVID-19 patients and healthy controls in the expression data for T-bet, Eomes, and Ki-67. Further analysis shows that expression of T-bet and Eomes is significantly downregulated in COVID-19 patient non-naïve CD8+ T cells compared to healthy controls. This counter-intuitive finding may indicate that canonical effector CD8+ T cells are less prevalent in COVID-19 patients than healthy controls. This method is applicable to any cytometry dataset for discovering novel insights through topological data analysis which may be difficult to ascertain otherwise with a standard gating strategy or existing bioinformatic tools.
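For readers unfamiliar with the Wasserstein distance between persistence diagrams, the sketch below computes it by brute force for very small diagrams. It is not the authors' implementation: it assumes an L-infinity ground metric, allows any point to be matched to the diagonal, and tries every matching, which is only feasible for a handful of points. Real analyses would use an optimized library instead. The function name `wasserstein_distance` is hypothetical.

```python
from itertools import permutations


def wasserstein_distance(diagram_a, diagram_b, p=1):
    """p-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched to
    each other or to the diagonal; the optimal matching is found by
    exhaustive search over permutations.
    """
    def to_diagonal(point):
        birth, death = point
        return (death - birth) / 2.0  # L-infinity distance to the diagonal

    def ground(u, v):
        return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

    n, m = len(diagram_a), len(diagram_b)
    size = n + m
    # Rows: points of A, then m diagonal slots.
    # Columns: points of B, then n diagonal slots.
    cost = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                cost[i][j] = ground(diagram_a[i], diagram_b[j]) ** p
            elif i < n:
                cost[i][j] = to_diagonal(diagram_a[i]) ** p
            elif j < m:
                cost[i][j] = to_diagonal(diagram_b[j]) ** p
            # Diagonal-to-diagonal matches cost nothing.
    best = min(
        sum(cost[i][perm[i]] for i in range(size))
        for perm in permutations(range(size))
    )
    return best ** (1.0 / p)


healthy = [(0.0, 4.0)]
patient = [(0.0, 3.0)]
distance = wasserstein_distance(healthy, patient)  # matching the two points costs 1.0
```

Matching the two points directly costs 1.0, whereas sending both to the diagonal would cost 2.0 + 1.5, so the direct matching is optimal here.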

Community Resources

Code
Data

Feasibility of Topological Data Analysis for Event-Related fMRI (2019)

Cameron T. Ellis, Michael Lesnick, Gregory Henselman-Petrusek, Bryn Keller, Jonathan D. Cohen

Abstract

Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multivoxel patterns in the brain. However, the methods for detecting these representations are limited. Topological data analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology, a popular TDA tool that identifies topological features in data and quantifies their robustness, can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.

How do we represent the world? In cognitive neuroscience it is typical to think representations are points in high-dimensional space. In order to study these kinds of spaces it is necessary to have tools that capture the organization of high-dimensional data. Topological data analysis (TDA) holds promise for detecting unique types of geometric features in patterns of data. Although potentially useful, TDA has not been applied to event-related fMRI data. Here we utilized a popular tool from TDA, persistent homology, to recover topological signals from event-related fMRI data. We simulated realistic fMRI data and explored the parameters under which persistent homology can successfully extract signal. We also provided extensive code and recommendations for how to make the most out of TDA for fMRI analysis.

🍩 Database of Original & Non-Theoretical Uses of Topology
