🍩 Database of Original & Non-Theoretical Uses of Topology

(found 11 matches in 0.002976s)
  1. Topological Singularity Detection at Multiple Scales (2023)

    Julius Von Rohrscheidt, Bastian Rieck
    Abstract: The manifold hypothesis, which assumes that data lies on or close to an unknown manifold of low intrinsic dimension, is a staple of modern machine learning research. However, recent work has shown that real-world data exhibits distinct non-manifold structures, i.e. singularities, that can lead to erroneous findings. Detecting such singularities is therefore crucial as a precursor to interpolation and inference tasks. We address this issue by developing a topological framework that (i) quantifies the local intrinsic dimension, and (ii) yields a Euclidicity score for assessing the 'manifoldness' of a point along multiple scales. Our approach identifies singularities of complex spaces, while also capturing singular structures and local geometric complexity in image data.
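
    The paper's Euclidicity score is built on local persistent homology; as a rough stand-in that only illustrates the multi-scale idea, the sketch below estimates local intrinsic dimension via local PCA at several neighbourhood sizes (the function name, scales, and variance threshold are all illustrative assumptions, not the paper's method):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def local_intrinsic_dimension(X, point_idx, scales=(10, 25, 50), var_threshold=0.95):
    """Estimate local intrinsic dimension around one point at several
    neighbourhood sizes via local PCA (a simplified stand-in for the
    paper's persistent-homology-based Euclidicity score)."""
    nn = NearestNeighbors(n_neighbors=max(scales)).fit(X)
    _, idx = nn.kneighbors(X[point_idx : point_idx + 1])
    estimates = {}
    for k in scales:
        neighbourhood = X[idx[0, :k]]
        evr = np.cumsum(PCA().fit(neighbourhood).explained_variance_ratio_)
        estimates[k] = int(np.searchsorted(evr, var_threshold) + 1)
    return estimates  # strong disagreement across scales hints at a singularity

# Toy example: two planes glued along a line form a singular set.
rng = np.random.default_rng(0)
plane1 = np.c_[rng.uniform(-1, 1, (200, 2)), np.zeros(200)]
plane2 = np.c_[rng.uniform(-1, 1, 200), np.zeros(200), rng.uniform(-1, 1, 200)]
X = np.vstack([plane1, plane2])
print(local_intrinsic_dimension(X, point_idx=0))
```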
  2. Capturing Shape Information With Multi-Scale Topological Loss Terms For 3D Reconstruction (2022)

    Dominik J. E. Waibel, Scott Atwell, Matthias Meier, Carsten Marr, Bastian Rieck
    Abstract: Reconstructing 3D objects from 2D images is challenging both for our brains and for machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets.
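
    A minimal sketch of the forward computation this loss rests on, assuming the gudhi library (with the POT package for the Wasserstein backend) is available; the paper's actual loss is additionally differentiable through the persistence computation, which this sketch omits:

```python
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance  # requires POT

def topological_loss(volume_pred, volume_true, homology_dim=0):
    """Compare two 3D volumes by the optimal-transport distance between
    the persistence diagrams of their cubical complexes (forward pass only)."""
    diagrams = []
    for vol in (volume_pred, volume_true):
        cc = gudhi.CubicalComplex(top_dimensional_cells=vol)
        cc.persistence()
        dgm = cc.persistence_intervals_in_dimension(homology_dim)
        diagrams.append(dgm[np.isfinite(dgm[:, 1])])  # drop essential intervals
    return wasserstein_distance(diagrams[0], diagrams[1], order=1.0)

rng = np.random.default_rng(0)
pred, true = rng.random((16, 16, 16)), rng.random((16, 16, 16))
print(topological_loss(pred, true))
```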
  3. Exploring the Geometry and Topology of Neural Network Loss Landscapes (2022)

    Stefan Horoi, Jessie Huang, Bastian Rieck, Guillaume Lajoie, Guy Wolf, Smita Krishnaswamy
    Abstract: Recent work has established clear links between the generalization performance of trained neural networks and the geometry of their loss landscape near the local minima to which they converge. This suggests that qualitative and quantitative examination of the loss landscape geometry could yield insights about neural network generalization performance during training. To this end, researchers have proposed visualizing the loss landscape through the use of simple dimensionality reduction techniques. However, such visualization methods have been limited by their linear nature and only capture features in one or two dimensions, thus restricting sampling of the loss landscape to lines or planes. Here, we expand and improve upon these methods in three ways. First, we present a novel “jump and retrain” procedure for sampling relevant portions of the loss landscape. We show that the resulting sampled data holds more meaningful information about the network’s ability to generalize. Next, we show that non-linear dimensionality reduction of the jump and retrain trajectories via PHATE, a trajectory and manifold-preserving method, allows us to visualize differences between networks that are generalizing well vs. poorly. Finally, we combine PHATE trajectories with a computational homology characterization to quantify trajectory differences.
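
    The jump-and-retrain sampling is model-specific, but the embedding step can be illustrated directly; the sketch below, using the `phate` package on stand-in data, shows only the visualization half of the pipeline:

```python
import numpy as np
import phate

# Suppose `trajectories` stacks flattened network weights sampled along
# several "jump and retrain" runs (random data stands in for real runs here).
rng = np.random.default_rng(0)
trajectories = rng.normal(size=(300, 5000))  # 300 sampled points, 5000 weights

# PHATE is the trajectory- and manifold-preserving non-linear embedding the
# paper uses in place of linear projections onto lines or planes.
embedding = phate.PHATE(n_components=2, random_state=0).fit_transform(trajectories)
print(embedding.shape)  # (300, 2)
```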
  4. Time-Inhomogeneous Diffusion Geometry and Topology (2022)

    Guillaume Huguet, Alexander Tong, Bastian Rieck, Jessie Huang, Manik Kuchroo, Matthew Hirn, Guy Wolf, Smita Krishnaswamy
    Abstract: Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic diffusion homology. We use this intrinsic topology as well as an ambient topology to study how the data changes over diffusion time. We demonstrate both homologies in well-understood toy examples. Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
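
    A minimal numpy sketch of the condensation loop described here, with an illustrative fixed kernel bandwidth (`epsilon`); the paper analyses more refined operator choices:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def diffusion_condensation(X, epsilon=0.1, n_steps=20):
    """At each step, recompute a Gaussian affinity kernel on the *current*
    data, row-normalise it into a diffusion operator, and apply it to the
    data. Rebuilding the operator every step makes the process
    time-inhomogeneous."""
    snapshots = [X.copy()]
    for _ in range(n_steps):
        D = squareform(pdist(X))
        K = np.exp(-(D ** 2) / epsilon)
        P = K / K.sum(axis=1, keepdims=True)  # row-stochastic diffusion operator
        X = P @ X
        snapshots.append(X.copy())
    return snapshots  # a multiscale sequence of condensed representations

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (30, 2)), rng.normal(1, 0.1, (30, 2))])
snaps = diffusion_condensation(X)
print(snaps[-1].round(2)[:3])  # points collapse toward cluster centroids
```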
  5. Filtration Curves for Graph Representation (2021)

    Leslie O'Bray, Bastian Rieck, Karsten Borgwardt
    Abstract: The two predominant approaches to graph comparison in recent years are based on (i) enumerating matching subgraphs or (ii) comparing neighborhoods of nodes. In this work, we complement these two perspectives with a third way of representing graphs: using filtration curves from topological data analysis that capture both edge weight information and global graph structure. Filtration curves are highly efficient to compute and lead to expressive representations of graphs, which we demonstrate on graph classification benchmark datasets. Our work opens the door to a new form of graph representation in data mining.
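
    A hedged sketch of the idea with networkx, using the number of connected components as the tracked descriptor (the paper considers several descriptors):

```python
import networkx as nx
import numpy as np

def filtration_curve(G, thresholds):
    """Sweep a threshold over edge weights and record a graph descriptor
    at each step, yielding a curve that encodes both edge weights and
    global graph structure."""
    curve = []
    for t in thresholds:
        H = nx.Graph()
        H.add_nodes_from(G.nodes)
        H.add_edges_from((u, v) for u, v, w in G.edges(data="weight") if w <= t)
        curve.append(nx.number_connected_components(H))
    return np.array(curve)

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 0.2), (1, 2, 0.5), (2, 3, 0.9), (3, 0, 0.7)])
weights = sorted(w for _, _, w in G.edges(data="weight"))
print(filtration_curve(G, thresholds=weights))  # [3 2 1 1]
```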
  6. Topological Graph Neural Networks (2021)

    Max Horn, Edward De Brouwer, Michael Moor, Yves Moreau, Bastian Rieck, Karsten Borgwardt
    Abstract: Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to eminent substructures, such as cycles. We present TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive in terms of the Weisfeiler–Lehman test of isomorphism. Augmenting GNNs with our layer leads to beneficial predictive performance, both on synthetic data sets, which can be trivially classified by humans but not by ordinary GNNs, and on real-world data.
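
    TOGL itself is a differentiable layer with learnable filtrations; the sketch below shows only the underlying computation it builds on, namely 0-dimensional persistent homology of a graph under a fixed node-level filtration, via union-find:

```python
def zero_dim_graph_persistence(node_values, edges):
    """0-dim persistence of a graph: nodes enter at their filtration value,
    an edge enters once both endpoints have appeared, and each merge kills
    the younger of the two components (elder rule)."""
    parent = list(range(len(node_values)))
    birth = list(node_values)  # birth value per component, indexed by root

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    pairs = []
    # Process edges in order of appearance (max of endpoint values).
    for u, v in sorted(edges, key=lambda e: max(node_values[e[0]], node_values[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        death = max(node_values[u], node_values[v])
        if birth[ru] > birth[rv]:   # ensure rv is the younger root
            ru, rv = rv, ru
        pairs.append((birth[rv], death))
        parent[rv] = ru
    return pairs  # one essential (never-dying) pair per component is omitted

values = [0.1, 0.4, 0.2, 0.9]
edges = [(0, 1), (1, 2), (2, 3)]
print(zero_dim_graph_persistence(values, edges))
# -> [(0.4, 0.4), (0.2, 0.4), (0.9, 0.9)]
```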
  7. Graph Filtration Learning (2020)

    Christoph Hofer, Florian Graf, Bastian Rieck, Marc Niethammer, Roland Kwitt
    Abstract: We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable filter function. We establish the theoretical foundation for differentiating through the persistent homology computation. Empirically, we show that this type of readout operation compares favorably to previous techniques, especially when the graph connectivity structure is informative for the learning problem.
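
    Adapting the union-find sketch from the TOGL entry above to return node indices makes the differentiability argument concrete: persistence pairs computed on detached values can gather from the differentiable value tensor, so gradients reach the learnable filter. A hedged PyTorch sketch, not the authors' implementation:

```python
import torch
import torch.nn as nn

def persistence_pairs_by_index(values, edges):
    """0-dim persistence returning (birth_node, death_node) *indices* so a
    differentiable readout can gather from the original value tensor."""
    parent = list(range(len(values)))
    birth_node = list(range(len(values)))  # earliest-born node per component

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    pairs = []
    for u, v in sorted(edges, key=lambda e: max(values[e[0]], values[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if values[birth_node[ru]] > values[birth_node[rv]]:
            ru, rv = rv, ru            # elder rule: rv is the younger root
        death = u if values[u] > values[v] else v
        pairs.append((birth_node[rv], death))
        parent[rv] = ru
    return pairs

class FiltrationReadout(nn.Module):
    def __init__(self, in_dim):
        super().__init__()
        # Learnable, real-valued filter function on node features.
        self.filter = nn.Sequential(nn.Linear(in_dim, 1), nn.Sigmoid())

    def forward(self, x, edges):
        values = self.filter(x).squeeze(-1)
        pairs = persistence_pairs_by_index(values.detach().tolist(), edges)
        idx = torch.tensor(pairs, dtype=torch.long)
        # Persistence of each pair, differentiable w.r.t. the filter weights.
        return values[idx[:, 1]] - values[idx[:, 0]]

x = torch.randn(4, 3)
edges = [(0, 1), (1, 2), (2, 3)]
print(FiltrationReadout(3)(x, edges))
```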
  8. Topological Autoencoders (2020)

    Michael Moor, Max Horn, Bastian Rieck, Karsten Borgwardt
    Abstract: We propose a novel approach for preserving topological structures of the input space in latent representations of autoencoders. Using persistent homology, a technique from topological data analysis, we calculate topological signatures of both the input and latent space to derive a topological loss term. Under weak theoretical assumptions, we construct this loss in a differentiable manner, such that the encoding learns to retain multi-scale connectivity information. We show that our approach is theoretically well-founded and that it exhibits favourable latent representations on a synthetic manifold as well as on real-world image data sets, while preserving low reconstruction errors.
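
    For dimension 0, the persistence pairing of a point cloud coincides with the edges of its minimum spanning tree, which makes the loss easy to sketch; the exact weighting below is an assumption, not necessarily the paper's formulation:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def topo_loss_0dim(X, Z):
    """Compare the distances selected by each space's 0-dim persistence
    pairing (its MST edges) against the same entries in the other space,
    so the latent space is pushed to retain the input's connectivity."""
    DX, DZ = squareform(pdist(X)), squareform(pdist(Z))
    mst_X = minimum_spanning_tree(DX).tocoo()
    mst_Z = minimum_spanning_tree(DZ).tocoo()
    lx = sum((DX[i, j] - DZ[i, j]) ** 2 for i, j in zip(mst_X.row, mst_X.col))
    lz = sum((DX[i, j] - DZ[i, j]) ** 2 for i, j in zip(mst_Z.row, mst_Z.col))
    return 0.5 * (lx + lz)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))   # input space
Z = X[:, :2]                    # a stand-in "latent" projection
print(topo_loss_0dim(X, Z))
```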
  9. Uncovering the Topology of Time-Varying fMRI Data Using Cubical Persistence (2020)

    Bastian Rieck, Tristan Yates, Christian Bock, Karsten Borgwardt, Guy Wolf, Nicholas Turk-Browne, Smita Krishnaswamy
    Abstract: Functional magnetic resonance imaging (fMRI) is a crucial technology for gaining insights into cognitive processes in humans. Data amassed from fMRI measurements result in volumetric data sets that vary over time. However, analysing such data presents a challenge due to the large degree of noise and person-to-person variation in how information is represented in the brain. To address this challenge, we present a novel topological approach that encodes each time point in an fMRI data set as a persistence diagram of topological features, i.e. high-dimensional voids present in the data. This representation naturally does not rely on voxel-by-voxel correspondence and is robust to noise. We show that these time-varying persistence diagrams can be clustered to find meaningful groupings between participants, and that they are also useful in studying within-subject brain state trajectories of subjects performing a particular task. Here, we apply both clustering and trajectory analysis techniques to a group of participants watching the movie 'Partly Cloudy'. We observe significant differences in both brain state trajectories and overall topological activity between adults and children watching the same movie.
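
    A minimal sketch of the per-time-point encoding, assuming gudhi is available; dimension 2 captures the voids mentioned in the abstract, and the bottleneck distance matrix is one plausible input to a clustering step, not necessarily the paper's choice:

```python
import numpy as np
import gudhi

def frame_diagrams(volumes, dim=2):
    """Encode each time point of a (T, X, Y, Z) scan as the persistence
    diagram of its cubical complex; dim=2 tracks voids."""
    diagrams = []
    for vol in volumes:
        cc = gudhi.CubicalComplex(top_dimensional_cells=vol)
        cc.persistence()
        dgm = cc.persistence_intervals_in_dimension(dim)
        diagrams.append(dgm[np.isfinite(dgm).all(axis=1)])
    return diagrams

rng = np.random.default_rng(0)
scan = rng.random((5, 8, 8, 8))  # toy stand-in for an fMRI run
dgms = frame_diagrams(scan)
# Pairwise distances between time points, usable for clustering/trajectories.
D = np.array([[gudhi.bottleneck_distance(a, b) for b in dgms] for a in dgms])
print(D.round(3))
```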
  10. A Persistent Weisfeiler-Lehman Procedure for Graph Classification (2019)

    Bastian Rieck, Christian Bock, Karsten Borgwardt
    Abstract: The Weisfeiler–Lehman graph kernel exhibits competitive performance in many graph classification tasks. However, its subtree features are not able to capture connected components and cycles, topological features known for characterising graphs. To extract such features, we leverage propagated node label information and transform unweighted graphs into metric ones. This permits us to augment the subtree features with topological information obtained using persistent homology, a concept from topological data analysis. Our method, which we formalise as a generalisation of Weisfeiler–Lehman subtree features, exhibits favourable classification accuracy and its improvements in predictive performance are mainly driven by including cycle information.
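
    A simplified stand-in for the metric-graph construction, weighting each edge by how often its endpoints' Weisfeiler–Lehman labels disagree across iterations (the paper derives weights from distances between propagated label multisets):

```python
import networkx as nx

def wl_edge_weights(G, labels, iterations=2):
    """Run WL label propagation and accumulate, per edge, how often its
    endpoints disagree, turning an unweighted graph into a metric one
    whose filtration persistent homology can consume."""
    weights = {e: 0 for e in G.edges}
    for _ in range(iterations):
        for u, v in G.edges:
            weights[(u, v)] += int(labels[u] != labels[v])
        # WL relabelling: hash each node's label with its neighbours'
        # sorted labels.
        labels = {
            u: hash((labels[u], tuple(sorted(labels[w] for w in G[u]))))
            for u in G
        }
    return weights

G = nx.cycle_graph(6)
labels = {u: u % 2 for u in G}
print(wl_edge_weights(G, labels))  # alternating labels: every edge disagrees
```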
  11. Topological Machine Learning With Persistence Indicator Functions (2019)

    Bastian Rieck, Filip Sadlo, Heike Leitte
    Abstract: Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions (PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data.
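
    PIFs are simple enough to sketch directly: count the intervals alive at each threshold, then compare the resulting step functions. The grid-based evaluation below is an illustrative simplification of the linear-time sweep over interval endpoints:

```python
import numpy as np

def pif(diagram, grid):
    """Persistence indicator function: for each threshold t, count the
    persistence intervals with birth <= t < death, yielding a step
    function that summarises the diagram."""
    births, deaths = diagram[:, 0], diagram[:, 1]
    return np.array([np.sum((births <= t) & (t < deaths)) for t in grid])

def pif_distance(d1, d2, grid):
    """L1 distance between two PIFs via the rectangle rule; the same
    quantity underlies a kernel-based similarity measure."""
    diff = np.abs(pif(d1, grid) - pif(d2, grid))
    return float(np.sum(diff[:-1] * np.diff(grid)))

d1 = np.array([[0.0, 1.0], [0.2, 0.8], [0.5, 0.6]])
d2 = np.array([[0.0, 0.5], [0.1, 0.9]])
grid = np.linspace(0.0, 1.0, 101)
print(pif(d1, grid)[:8])        # counts of active intervals at low thresholds
print(pif_distance(d1, d2, grid))
```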