ToDD: Topological Compound Fingerprinting in Computer-Aided Drug Discovery (2022)

Andaç Demir, Baris Coskunuzer, Yulia Gel, Ignacio Segovia-Dominguez, Yuzhou Chen, Bulent Kiziltan

Community Resources

Data
Video

Differentiable Euler Characteristic Transforms for Shape Classification (2023)

Abstract

The _Euler Characteristic Transform_ (ECT) is a powerful invariant, combining geometrical and topological characteristics of shapes and graphs. However, the ECT was hitherto unable to learn task-specific representations. We overcome this issue and develop a novel computational layer that enables learning the ECT in an end-to-end fashion. Our method, the _Differentiable Euler Characteristic Transform_ (DECT), is fast and computationally efficient, while exhibiting performance on a par with more complex models in both graph and point cloud classification tasks. Moreover, we show that this seemingly simple statistic provides the same topological expressivity as more complex topological deep learning layers.
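The abstract does not spell out how the ECT is computed, so the following is only a minimal sketch of the classical, non-differentiable transform for a graph embedded in Euclidean space. It assumes the usual definition: for each direction, vertices are filtered by their height along that direction, an edge enters once both endpoints are present, and the Euler characteristic (vertices minus edges) is recorded at each threshold. The function names `euler_characteristic_curve` and `ect` are hypothetical, and the learnable layer proposed in the paper is not reproduced here.

```python
def euler_characteristic_curve(points, edges, direction, thresholds):
    """Euler characteristic (V - E) of the sublevel graph at each threshold."""
    # Height of every vertex along the chosen direction (a dot product).
    heights = [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]
    curve = []
    for t in thresholds:
        n_vertices = sum(1 for h in heights if h <= t)
        # An edge is present once both of its endpoints are present.
        n_edges = sum(1 for u, v in edges if max(heights[u], heights[v]) <= t)
        curve.append(n_vertices - n_edges)
    return curve


def ect(points, edges, directions, thresholds):
    """One Euler characteristic curve per direction."""
    return [
        euler_characteristic_curve(points, edges, d, thresholds)
        for d in directions
    ]


# Illustrative input (an assumption): a unit square drawn as a 4-cycle.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
directions = [(1.0, 0.0), (0.0, 1.0)]
thresholds = [-0.5, 0.0, 0.5, 1.0]

result = ect(points, edges, directions, thresholds)
print(result)
```

The hard comparison `h <= t` is what makes this version piecewise constant. A differentiable variant would plausibly replace it with a smooth approximation, but that choice is an assumption, not something stated in the abstract.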

Filtration Surfaces for Dynamic Graph Classification (2023)

Franz Srambical, Bastian Rieck

Abstract

Existing approaches for classifying dynamic graphs either lift graph kernels to the temporal domain, or use graph neural networks (GNNs). However, current baselines have scalability issues, cannot handle a changing node set, or do not take edge weight information into account. We propose filtration surfaces, a novel method that is scalable and flexible, to alleviate said restrictions. We experimentally validate the efficacy of our model and show that filtration surfaces outperform previous state-of-the-art baselines on datasets that rely on edge weight information. Our method does so while being either completely parameter-free or having at most one parameter, and yielding the lowest overall standard deviation among similarly scalable methods.

Complexes of Tournaments, Directionality Filtrations and Persistent Homology (2020)

Dejan Govc, Ran Levi, Jason P. Smith

Abstract

Complete digraphs are referred to in the combinatorics literature as tournaments. We consider a family of semi-simplicial complexes, that we refer to as "tournaplexes", whose simplices are tournaments. In particular, given a digraph $\mathcal{G}$, we associate with it a "flag tournaplex" which is a tournaplex containing the directed flag complex of $\mathcal{G}$, but also the geometric realisation of cliques that are not directed. We define several types of filtrations on tournaplexes, and exploiting persistent homology, we observe that flag tournaplexes provide finer means of distinguishing graph dynamics than the directed flag complex. We then demonstrate the power of these ideas by applying them to graph data arising from the Blue Brain Project's digital reconstruction of a rat's neocortex.

Stable Signatures for Dynamic Graphs and Dynamic Metric Spaces via Zigzag Persistence (2018)

Woojin Kim, Facundo Memoli

Abstract

When studying flocking/swarming behaviors in animals, one is interested in quantifying and comparing the dynamics of the clustering induced by the coalescence and disbanding of animals in different groups. In a similar vein, studying the dynamics of social networks leads to the problem of characterizing groups/communities as they form and disperse throughout time. Motivated by this, we study the problem of obtaining persistent-homology-based summaries of time-dependent data. Given a finite dynamic graph (DG), we first construct a zigzag persistence module arising from linearizing the dynamic transitive graph naturally induced from the input DG. Based on standard results, we then obtain a persistence diagram or barcode from this zigzag persistence module. We prove that these barcodes are stable under perturbations in the input DG under a suitable distance between DGs that we identify. More precisely, our stability theorem can be interpreted as providing a lower bound for the distance between DGs. Since it relies on barcodes, and their bottleneck distance, this lower bound can be computed in polynomial time from the DG inputs. Since DGs arise from applying the Rips functor (with a fixed threshold) to dynamic metric spaces, we are also able to derive related stable invariants for this richer class of dynamic objects. Along the way, we propose a summarization of dynamic graphs that captures their time-dependent clustering features which we call formigrams. These set-valued functions generalize the notion of dendrogram, a prevalent tool for hierarchical clustering. In order to elucidate the relationship between our distance between two DGs and the bottleneck distance between their associated barcodes, we exploit recent advances in the stability of zigzag persistence due to Botnan and Lesnick, and to Bjerkevik.

Simplicial Representation Learning With Neural $K$-Forms (2023)

Kelly Maggs, Celia Hacker, Bastian Rieck

Abstract

Geometric deep learning extends deep learning to incorporate information about the geometry and topology of data, especially in complex domains like graphs. Despite the popularity of message passing in this field, it has limitations such as the need for graph rewiring, ambiguity in interpreting data, and over-smoothing. In this paper, we take a different approach, focusing on leveraging geometric information from simplicial complexes embedded in $\mathbb{R}^n$ using node coordinates. We use differential $k$-forms in $\mathbb{R}^n$ to create representations of simplices, offering interpretability and geometric consistency without message passing. This approach also enables us to apply differential geometry tools and achieve universal approximation. Our method is efficient, versatile, and applicable to various input complexes, including graphs, simplicial complexes, and cell complexes. It outperforms existing message passing neural networks in harnessing information from geometrical graphs with node features serving as coordinates.

Deep Learning With Topological Signatures (2017)

Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl

Curvature Filtrations for Graph Generative Model Evaluation (2023)

Joshua Southern, Jeremy Wayland, Michael Bronstein, Bastian Rieck

Filtration Curves for Graph Representation (2021)

Leslie O'Bray, Bastian Rieck, Karsten Borgwardt

Abstract

The two predominant approaches to graph comparison in recent years are based on (i) enumerating matching subgraphs or (ii) comparing neighborhoods of nodes. In this work, we complement these two perspectives with a third way of representing graphs: using filtration curves from topological data analysis that capture both edge weight information and global graph structure. Filtration curves are highly efficient to compute and lead to expressive representations of graphs, which we demonstrate on graph classification benchmark datasets. Our work opens the door to a new form of graph representation in data mining.
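As a rough illustration of the idea, the sketch below builds one possible filtration curve for a weighted graph: edges are added in order of increasing weight, and a graph descriptor is recorded at each distinct weight. The descriptor used here, the number of connected components, is an assumption chosen for simplicity, since the abstract does not name the descriptors used. The function name `filtration_curve` is hypothetical.

```python
def filtration_curve(n_nodes, weighted_edges):
    """Return (weight, number of connected components) for each distinct weight.

    Edges are inserted in order of increasing weight; components are tracked
    with a union-find structure.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Find the representative of x, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n_nodes
    curve = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
        if curve and curve[-1][0] == w:
            # Several edges share this weight: keep one entry per threshold.
            curve[-1] = (w, components)
        else:
            curve.append((w, components))
    return curve


# Illustrative input (an assumption): four nodes, edges given as (u, v, weight).
weighted_edges = [(0, 1, 0.2), (1, 2, 0.5), (0, 2, 0.7), (2, 3, 0.9)]
curve = filtration_curve(4, weighted_edges)
print(curve)
```

The edge with weight 0.7 closes a cycle, so it leaves the component count unchanged. A descriptor that is sensitive to cycles would register it.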

Graph Filtration Learning (2020)

Christoph Hofer, Florian Graf, Bastian Rieck, Marc Niethammer, Roland Kwitt

Abstract

We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable, filter function. We establish the theoretical foundation for differentiating through the persistent homology computation. Empirically, we show that this type of readout operation compares favorably to previous techniques, especially when the graph connectivity structure is informative for the learning problem.
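To make the persistent homology step concrete, here is a minimal sketch of zero-dimensional persistence for a graph under a fixed, real-valued filter function on its nodes. It assumes a sublevel-set filtration in which an edge appears at the larger of its two endpoint values, and it uses the standard elder rule with a union-find structure. In the paper the filter function is learnable and gradients flow through the computation; neither is shown here, and the name `zero_dim_persistence` is hypothetical.

```python
def zero_dim_persistence(filter_values, edges):
    """Zero-dimensional persistence pairs (birth, death) of a node-filtered graph."""
    n = len(filter_values)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def edge_value(edge):
        # An edge enters the filtration when its later endpoint does.
        return max(filter_values[edge[0]], filter_values[edge[1]])

    pairs = []
    for u, v in sorted(edges, key=edge_value):
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue  # The edge closes a cycle; no component dies.
        # Elder rule: the component born later dies at this edge.
        if filter_values[root_u] > filter_values[root_v]:
            root_u, root_v = root_v, root_u
        # root_u is the older root, so it stays the representative and
        # keeps the minimum filter value of the merged component.
        parent[root_v] = root_u
        pairs.append((filter_values[root_v], edge_value((u, v))))

    # Components that never merge persist forever.
    for root in {find(i) for i in range(n)}:
        pairs.append((filter_values[root], float("inf")))
    return sorted(pairs)


# Illustrative input (an assumption): a path 0 - 1 - 2 with a peak in the middle.
filter_values = [0.0, 2.0, 1.0]
edges = [(0, 1), (1, 2)]
pairs = zero_dim_persistence(filter_values, edges)
print(pairs)
```

A readout built on this would map the resulting pairs to a fixed-size vector. How that mapping is done is a design choice left out of this sketch.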

Simplicial Neural Networks (2020)

Stefania Ebli, Michaël Defferrard, Gard Spreemann

Abstract

We present simplicial neural networks (SNNs), a generalization of graph neural networks to data that live on a class of topological spaces called simplicial complexes. These are natural multi-dimensional extensions of graphs that encode not only pairwise relationships but also higher-order interactions between vertices, allowing us to consider richer data, including vector fields and $n$-fold collaboration networks. We define an appropriate notion of convolution that we leverage to construct the desired convolutional neural networks. We test the SNNs on the task of imputing missing data on coauthorship complexes.

A Persistent Weisfeiler-Lehman Procedure for Graph Classification (2019)

Bastian Rieck, Christian Bock, Karsten Borgwardt

Abstract

The Weisfeiler–Lehman graph kernel exhibits competitive performance in many graph classification tasks. However, its subtree features are not able to capture connected components and cycles, topological features known for characterising graphs. To extract such features, we leverage propagated node label information and transform unweighted graphs into metric ones. This permits us to augment the subtree features with topological information obtained using persistent homology, a concept from topological data analysis. Our method, which we formalise as a generalisation of Weisfeiler–Lehman subtree features, exhibits favourable classification accuracy and its improvements in predictive performance are mainly driven by including cycle information.
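The label propagation that the method builds on can be sketched as a single Weisfeiler–Lehman relabeling step: each node's new label is determined by its own label together with the sorted multiset of its neighbours' labels. This is only the standard relabeling; the conversion to a metric graph and the persistent homology features described in the abstract are not shown. The function name `wl_iteration` and the integer label encoding are assumptions.

```python
def wl_iteration(adjacency, labels):
    """Perform one Weisfeiler-Lehman relabeling step.

    adjacency: dict mapping each node to a list of its neighbours.
    labels: dict mapping each node to an integer label.
    Returns a dict of new, compressed integer labels.
    """
    # Signature = (own label, sorted multiset of neighbour labels).
    signatures = {
        node: (labels[node], tuple(sorted(labels[nb] for nb in neighbours)))
        for node, neighbours in adjacency.items()
    }
    # Compress each distinct signature to a small integer, in sorted order
    # so that the result is deterministic.
    compressed = {
        sig: index for index, sig in enumerate(sorted(set(signatures.values())))
    }
    return {node: compressed[sig] for node, sig in signatures.items()}


# Illustrative input (an assumption): a path 0 - 1 - 2 with uniform labels.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: 0, 1: 0, 2: 0}
new_labels = wl_iteration(adjacency, labels)
print(new_labels)
```

After one step the middle node is separated from the two end nodes, which share a label because their neighbourhoods look the same.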

Tree Decomposition of Reeb Graphs, Parametrized Complexity, and Applications to Phylogenetics (2020)

Anastasios Stefanou

Abstract

Inspired by the interval decomposition of persistence modules and the extended Newick format of phylogenetic networks, we show that, inside the larger category of partially ordered Reeb graphs, every Reeb graph with $n$ leaves and first Betti number $s$ can be identified with a coproduct of at most $2^s$ partially ordered trees with $(n+s)$ leaves. Reeb graphs are therefore classified up to isomorphism by their tree-decomposition. An implication of this result is that the isomorphism problem for Reeb graphs is fixed parameter tractable when the parameter is the first Betti number. We propose partially ordered Reeb graphs as a model for time consistent phylogenetic networks and propose a certain Hausdorff distance as a metric on these structures.

Graph Classification via Heat Diffusion on Simplicial Complexes (2020)

Mehmet Emin Aktas, Esra Akbas

Abstract

In this paper, we study the graph classification problem in vertex-labeled graphs. Our main goal is to classify the graphs comparing their higher-order structures thanks to heat diffusion on their simplices. We first represent vertex-labeled graphs as simplex-weighted super-graphs. We then define the diffusion Fréchet function over their simplices to encode the higher-order network topology and finally reach our goal by combining the function values with machine learning algorithms. Our experiments on real-world bioinformatics networks show that using the diffusion Fréchet function on simplices is promising in graph classification and more effective than the baseline methods. To the best of our knowledge, this paper is the first paper in the literature using heat diffusion on higher-dimensional simplices in a graph mining problem. We believe that our method can be extended to different graph mining domains, not only the graph classification problem.

Persistent Homology for Path Planning in Uncertain Environments (2015)

S. Bhattacharya, R. Ghrist, V. Kumar

Abstract

We address the fundamental problem of goal-directed path planning in an uncertain environment represented as a probability (of occupancy) map. Most methods generally use a threshold to reduce the grayscale map to a binary map before applying off-the-shelf techniques to find the best path. This raises the somewhat ill-posed question, what is the right (optimal) value to threshold the map? We instead suggest a persistent homology approach to the problem: a topological approach in which we seek the homology class of trajectories that is most persistent for the given probability map. In other words, we want the class of trajectories that is free of obstacles over the largest range of threshold values. In order to make this problem tractable, we use homology with $\mathbb{Z}_2$ coefficients (instead of the standard $\mathbb{Z}$ coefficients), and describe how graph search-based algorithms can be used to find trajectories in different homology classes. Our simulation results demonstrate the efficiency and practical applicability of the algorithm proposed in this paper.

Statistical Topology of Bond Networks With Applications to Silica (2020)

B. Schweinhart, D. Rodney, J. K. Mason

Abstract

Whereas knowledge of a crystalline material's unit cell is fundamental to understanding the material's properties and behavior, there are no obvious analogs to unit cells for disordered materials despite the frequent existence of considerable medium-range order. This article views a material's structure as a collection of local atomic environments that are sampled from some underlying probability distribution of such environments, with the advantage of offering a unified description of both ordered and disordered materials. Crystalline materials can then be regarded as special cases where the underlying probability distribution is highly concentrated around the traditional unit cell. The $H_1$ barcode is proposed as a descriptor of local atomic environments suitable for disordered bond networks and is applied with three other descriptors to molecular dynamics simulations of silica glasses. Each descriptor reliably distinguishes the structure of glasses produced at different cooling rates, with the $H_1$ barcode and coordination profile providing the best separation. The approach is generally applicable to any system that can be represented as a sparse graph.

Community Resources

Code

Path Homology as a Stronger Analogue of Cyclomatic Complexity (2020)

Steve Huntsman

Abstract

Cyclomatic complexity is an incompletely specified but mathematically principled software metric that can be usefully applied to both source and binary code. We consider the application of path homology as a stronger analogue of cyclomatic complexity. We have implemented an algorithm to compute path homology in arbitrary dimension and applied it to several classes of relevant flow graphs, including randomly generated flow graphs representing structured and unstructured control flow. We also compared path homology and cyclomatic complexity on a set of disassembled binaries obtained from the grep utility. There exist control flow graphs realizable at the assembly level with nontrivial path homology in arbitrary dimension. We exhibit several classes of examples in this vein while also experimentally demonstrating that path homology gives identical results to cyclomatic complexity for at least one detailed notion of structured control flow. We also experimentally demonstrate that the two notions differ on disassembled binaries, and we highlight an example of extreme disagreement. Path homology empirically generalizes cyclomatic complexity for an elementary notion of structured code and appears to identify more structurally relevant features of control flow in general. Path homology therefore has the potential to substantially improve upon cyclomatic complexity.
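For reference, the baseline metric can be sketched with the commonly used formula M = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components of the control flow graph. The abstract does not state which convention the authors use, so this formula is an assumption, and path homology itself is not computed here. The helper names are hypothetical.

```python
def count_components(nodes, edges):
    """Number of connected components, ignoring edge direction."""
    parent = {node: node for node in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(node) for node in nodes})


def cyclomatic_complexity(nodes, edges):
    """Cyclomatic complexity M = E - N + 2P of a control flow graph."""
    return len(edges) - len(nodes) + 2 * count_components(nodes, edges)


# Illustrative input (an assumption): the control flow graph of one if/else.
nodes = ["entry", "cond", "then", "else", "exit"]
edges = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
print(cyclomatic_complexity(nodes, edges))
```

Under this convention, straight-line code with no branching has complexity 1, and each additional binary decision adds 1.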

Efficient Planning of Multi-Robot Collective Transport Using Graph Reinforcement Learning With Higher Order Topological Abstraction (2023)

Steve Paul, Wenyuan Li, Brian Smyth, Yuzhou Chen, Yulia Gel, Souma Chowdhury

Abstract

Efficient multi-robot task allocation (MRTA) is fundamental to various time-sensitive applications such as disaster response, warehouse operations, and construction. This paper tackles a particular class of these problems that we call MRTA-collective transport, or MRTA-CT, in which tasks present varying workloads and deadlines, and robots are subject to flight range, communication range, and payload constraints. For large instances of these problems involving hundreds to thousands of tasks and tens to hundreds of robots, traditional non-learning solvers are often time-inefficient, and emerging learning-based policies do not scale well to larger-sized problems without costly retraining. To address this gap, we use a recently proposed encoder-decoder graph neural network involving Capsule networks and a multi-head attention mechanism, and innovatively add topological descriptors (TD) as new features to improve transferability to unseen problems of similar and larger size. Persistent homology is used to derive the TD, and proximal policy optimization is used to train our TD-augmented graph neural network. The resulting policy model compares favorably to state-of-the-art non-learning baselines while being much faster. The benefit of using TD is readily evident when scaling to test problems of size larger than those used in training.

Learning Representations of Persistence Barcodes (2019)

Christoph D. Hofer, Roland Kwitt, Marc Niethammer

Abstract

We consider the problem of supervised learning with summary representations of topological features in data. In particular, we focus on persistent homology, the prevalent tool used in topological data analysis. As the summary representations, referred to as barcodes or persistence diagrams, come in the unusual format of multisets, equipped with computationally expensive metrics, they cannot readily be processed with conventional learning techniques. While different approaches to address this problem have been proposed, either in the context of kernel-based learning, or via carefully designed vectorization techniques, it remains an open problem how to leverage advances in representation learning via deep neural networks. Appropriately handling topological summaries as input to neural networks would address the disadvantage of previous strategies which handle this type of data in a task-agnostic manner. In particular, we propose an approach that is designed to learn a task-specific representation of barcodes. In other words, we aim to learn a representation that adapts to the learning problem while, at the same time, preserving theoretical properties (such as stability). This is done by projecting barcodes into a finite dimensional vector space using a collection of parametrized functionals, so-called structure elements, for which we provide a generic construction scheme. A theoretical analysis of this approach reveals sufficient conditions to preserve stability, and also shows that different choices of structure elements lead to great differences with respect to their suitability for numerical optimization. When implemented as a neural network input layer, our approach demonstrates compelling performance on various types of problems, including graph classification and eigenvalue prediction, the classification of 2D/3D object shapes, and recognizing activities from EEG signals.

🍩 Database of Original & Non-Theoretical Uses of Topology
