A Persistent Weisfeiler-Lehman Procedure for Graph Classification (2019)

Bastian Rieck, Christian Bock, Karsten Borgwardt

Abstract

The Weisfeiler–Lehman graph kernel exhibits competitive performance in many graph classification tasks. However, its subtree features are not able to capture connected components and cycles, topological features known for characterising graphs. To extract such features, we leverage propagated node label information and transform unweighted graphs into metric ones. This permits us to augment the subtree features with topological information obtained using persistent homology, a concept from topological data analysis. Our method, which we formalise as a generalisation of Weisfeiler–Lehman subtree features, exhibits favourable classification accuracy and its improvements in predictive performance are mainly driven by including cycle information.
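
As a rough illustration of the limitation this abstract describes, the sketch below runs the standard Weisfeiler–Lehman relabelling step (not the persistent extension proposed in the paper) on two graphs with uniform starting labels: a single 6-cycle and two disjoint triangles. The helper names `wl_iteration` and `cycle_graph` are hypothetical and chosen only for this example.

```python
def wl_iteration(adjacency, labels):
    """One Weisfeiler-Lehman relabelling step on a single graph.

    adjacency: dict mapping node -> list of neighbours
    labels:    dict mapping node -> sortable, hashable label
    """
    # Signature = own label plus the sorted multiset of neighbour labels.
    signatures = {
        node: (labels[node], tuple(sorted(labels[nb] for nb in adjacency[node])))
        for node in adjacency
    }
    # Compress distinct signatures to small integers (sorted for determinism).
    # Note: this compression is per graph; comparing different graphs in
    # general requires a label dictionary shared across all of them.
    compressed = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
    return {node: compressed[sig] for node, sig in signatures.items()}


def cycle_graph(nodes):
    """Adjacency dict of a cycle through `nodes` in the given order."""
    n = len(nodes)
    return {nodes[i]: [nodes[i - 1], nodes[(i + 1) % n]] for i in range(n)}


hexagon = cycle_graph([0, 1, 2, 3, 4, 5])                             # one component, one 6-cycle
two_triangles = {**cycle_graph([0, 1, 2]), **cycle_graph([3, 4, 5])}  # two components
start = {node: 0 for node in range(6)}

hexagon_labels = wl_iteration(hexagon, start)
triangle_labels = wl_iteration(two_triangles, start)
```

Every node in both graphs has exactly two neighbours with the same label, so both graphs receive identical label multisets at every iteration, even though they differ in the number of connected components and in cycle length.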

Topological Graph Neural Networks (2021)

Max Horn, Edward De Brouwer, Michael Moor, Yves Moreau, Bastian Rieck, Karsten Borgwardt

Abstract

Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to eminent substructures, such as cycles. We present TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive in terms of the Weisfeiler–Lehman test of isomorphism. Augmenting GNNs with our layer leads to beneficial predictive performance, both on synthetic data sets, which can be trivially classified by humans but not by ordinary GNNs, and on real-world data.

Visual Detection of Structural Changes in Time-Varying Graphs Using Persistent Homology (2018)

Mustafa Hajij, Bei Wang, Carlos Scheidegger, Paul Rosen

Abstract

Topological data analysis is an emerging area in exploratory data analysis and data mining. Its main tool, persistent homology, has become a popular technique to study the structure of complex, high-dimensional data. In this paper, we propose a novel method using persistent homology to quantify structural changes in time-varying graphs. Specifically, we transform each instance of the time-varying graph into a metric space, extract topological features using persistent homology, and compare those features over time. We provide a visualization that assists in time-varying graph exploration and helps to identify patterns of behavior within the data. To validate our approach, we conduct several case studies on real-world datasets and show how our method can find cyclic patterns, deviations from those patterns, and one-time events in time-varying graphs. We also examine whether a persistence-based similarity measure satisfies a set of well-established, desirable properties for graph metrics.
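
The abstract does not say which metric is placed on each graph instance. As a minimal sketch under the assumption that shortest-path distance is used (one common choice), the following turns a weighted undirected graph into a distance matrix with the Floyd–Warshall algorithm. The function name `shortest_path_metric` is hypothetical.

```python
import math


def shortest_path_metric(num_nodes, weighted_edges):
    """All-pairs shortest-path distances for an undirected weighted graph.

    weighted_edges: iterable of (u, v, weight) with nodes numbered 0..num_nodes-1.
    Disconnected pairs keep distance math.inf.
    """
    dist = [
        [0.0 if i == j else math.inf for j in range(num_nodes)]
        for i in range(num_nodes)
    ]
    for u, v, w in weighted_edges:
        # Keep the lightest edge if the same pair appears more than once.
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    # Floyd-Warshall: allow each node k in turn as an intermediate stop.
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# One snapshot of a (hypothetical) time-varying graph: a path 0 - 1 - 2.
snapshot = [(0, 1, 1.0), (1, 2, 2.0)]
distances = shortest_path_metric(3, snapshot)
```

A distance matrix like this, computed per time step, is the kind of input that a persistent homology library would then accept.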

ChainNet: Learning on Blockchain Graphs With Topological Features (2019)

N. C. Abay, C. G. Akcora, Y. R. Gel, M. Kantarcioglu, U. D. Islambekov, Y. Tian, B. Thuraisingham

Resting-State fMRI Functional Connectivity: Big Data Preprocessing Pipelines and Topological Data Analysis (2017)

Angkoon Phinyomark, Esther Ibáñez-Marcelo, Giovanni Petri

Reconstructing Linearly Embedded Graphs: A First Step to Stratified Space Learning (2021)

Yossi Bokor, Christopher Williams, Katharine Turner

Topology-Based Signal Separation (2004)

V. Robins, N. Rooney, E. Bradley

The Topological Basis of Function in Flow and Mechanical Networks (2019)

Jason Rocks, Andrea Liu, Eleni Katifori

Disrupted Resting State Network of Fibromyalgia in Theta Frequency (2018)

Mi Kyung Choe, Manyoel Lim, June Sic Kim, Dong Soo Lee, Chun Kee Chung

Filtration Curves for Graph Representation (2021)

Leslie O'Bray, Bastian Rieck, Karsten Borgwardt

Abstract

The two predominant approaches to graph comparison in recent years are based on (i) enumerating matching subgraphs or (ii) comparing neighborhoods of nodes. In this work, we complement these two perspectives with a third way of representing graphs: using filtration curves from topological data analysis that capture both edge weight information and global graph structure. Filtration curves are highly efficient to compute and lead to expressive representations of graphs, which we demonstrate on graph classification benchmark datasets. Our work opens the door to a new form of graph representation in data mining.
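
To make the idea of a filtration curve concrete, here is a minimal sketch that adds edges in order of increasing weight and records one graph descriptor at each threshold. The descriptor used here, the number of connected components, is an assumption made for illustration; the abstract does not specify which descriptors the authors evaluate. The function name `filtration_curve` is hypothetical.

```python
def filtration_curve(num_nodes, weighted_edges):
    """Return [(threshold, n_components)] as edges enter by increasing weight.

    weighted_edges: iterable of (u, v, weight) with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_nodes
    curve = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merging two components
            components -= 1
        if curve and curve[-1][0] == w:
            curve[-1] = (w, components)  # same threshold: keep the final count
        else:
            curve.append((w, components))
    return curve


edges = [(0, 1, 0.5), (2, 3, 1.0), (1, 2, 2.0), (0, 3, 2.0)]
curve = filtration_curve(4, edges)
```

The resulting list is a step function of the threshold, so two graphs can be compared by evaluating their curves on a shared set of thresholds.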

Graph Filtration Learning (2020)

Christoph Hofer, Florian Graf, Bastian Rieck, Marc Niethammer, Roland Kwitt

Abstract

We propose an approach to learning with graph-structured data in the problem domain of graph classification. In particular, we present a novel type of readout operation to aggregate node features into a graph-level representation. To this end, we leverage persistent homology computed via a real-valued, learnable, filter function. We establish the theoretical foundation for differentiating through the persistent homology computation. Empirically, we show that this type of readout operation compares favorably to previous techniques, especially when the graph connectivity structure is informative for the learning problem.

Tree Decomposition of Reeb Graphs, Parametrized Complexity, and Applications to Phylogenetics (2020)

Anastasios Stefanou

Abstract

Inspired by the interval decomposition of persistence modules and the extended Newick format of phylogenetic networks, we show that, inside the larger category of partially ordered Reeb graphs, every Reeb graph with $n$ leaves and first Betti number $s$ can be identified with a coproduct of at most $2^s$ partially ordered trees with $(n+s)$ leaves. Reeb graphs are therefore classified up to isomorphism by their tree-decomposition. An implication of this result is that the isomorphism problem for Reeb graphs is fixed parameter tractable when the parameter is the first Betti number. We propose partially ordered Reeb graphs as a model for time consistent phylogenetic networks and propose a certain Hausdorff distance as a metric on these structures.

Graph Classification via Heat Diffusion on Simplicial Complexes (2020)

Mehmet Emin Aktas, Esra Akbas

Abstract

In this paper, we study the graph classification problem in vertex-labeled graphs. Our main goal is to classify the graphs comparing their higher-order structures thanks to heat diffusion on their simplices. We first represent vertex-labeled graphs as simplex-weighted super-graphs. We then define the diffusion Fréchet function over their simplices to encode the higher-order network topology and finally reach our goal by combining the function values with machine learning algorithms. Our experiments on real-world bioinformatics networks show that using diffusion Fréchet function on simplices is promising in graph classification and more effective than the baseline methods. To the best of our knowledge, this paper is the first paper in the literature using heat diffusion on higher-dimensional simplices in a graph mining problem. We believe that our method can be extended to different graph mining domains, not only the graph classification problem.

Topological Extraction and Tracking of Defects in Crystal Structures (2011)

Sebastian Grottel, Carlos A. Dietrich, João L. D. Comba, Thomas Ertl

Abstract

Interfaces between materials with different mechanical properties play an important role in technical applications. Nowadays molecular dynamics simulations are used to observe the behavior of such compound materials at the atomic level. Due to different atom crystal sizes, dislocations in the atom crystal structure occur once external forces are applied, and it has been observed that studying the change of these dislocations can provide further understanding of macroscopic attributes like elasticity and plasticity. Standard visualization techniques such as the rendering of individual atoms work for 2D data or sectional views; however, visualizing dislocations in 3D using such methods usually fails due to occlusion and clutter. In this work we propose to extract and visualize the structure of dislocations, which summarizes the commonly employed filtered atomistic renderings into a concise representation. The benefits of our approach are clearer images while retaining relevant data and easier visual tracking of topological changes over time.

Geometric Feature Performance Under Downsampling for EEG Classification Tasks (2021)

Bryan Bischof, Eric Bunch

Abstract

We experimentally investigate a collection of feature engineering pipelines for use with a CNN for classifying eyes-open or eyes-closed from electroencephalogram (EEG) time-series from the Bonn dataset. Using the Takens' embedding--a geometric representation of time-series--we construct simplicial complexes from EEG data. We then compare $\epsilon$-series of Betti-numbers and $\epsilon$-series of graph spectra (a novel construction)--two topological invariants of the latent geometry from these complexes--to raw time series of the EEG to fill in a gap in the literature for benchmarking. These methods, inspired by Topological Data Analysis, are used for feature engineering to capture local geometry of the time-series. Additionally, we test these feature pipelines' robustness to downsampling and data reduction. This paper seeks to establish clearer expectations for both time-series classification via geometric features, and how CNNs for time-series respond to data of degraded resolution.
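
For readers unfamiliar with the Takens embedding mentioned above, the following is a minimal sketch of the standard time-delay construction, which maps each sample x_t to the vector (x_t, x_{t+delay}, ..., x_{t+(dimension-1)*delay}). The function name and the default parameter values are assumptions for illustration; the abstract does not state which embedding dimension or delay the authors use.

```python
def takens_embedding(series, dimension=3, delay=1):
    """Time-delay embedding of a 1-D series into `dimension`-dimensional points."""
    n_points = len(series) - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("series too short for this dimension and delay")
    # Point t collects `dimension` samples spaced `delay` steps apart.
    return [
        tuple(series[t + k * delay] for k in range(dimension))
        for t in range(n_points)
    ]


signal = [0, 1, 2, 3, 4, 5]
points = takens_embedding(signal, dimension=3, delay=2)
```

The resulting point cloud is what a simplicial complex (for example, a Vietoris–Rips complex at scale $\epsilon$) would be built on.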

A Topological Machine Learning Pipeline for Classification (2022)

Francesco Conti, Davide Moroni, Maria Antonietta Pascali

Abstract

In this work, we develop a pipeline that associates Persistence Diagrams to digital data via the most appropriate filtration for the type of data considered. Using a grid search approach, this pipeline determines optimal representation methods and parameters. The development of such a topological pipeline for Machine Learning involves two crucial steps that strongly affect its performance: firstly, digital data must be represented as an algebraic object with a proper associated filtration in order to compute its topological summary, the Persistence Diagram. Secondly, the persistence diagram must be transformed with suitable representation methods in order to be introduced in a Machine Learning algorithm. We assess the performance of our pipeline, and in parallel, we compare the different representation methods on popular benchmark datasets. This work is a first step toward both an easy and ready-to-use pipeline for data classification using persistent homology and Machine Learning, and to understand the theoretical reasons why, given a dataset and a task to be performed, a pair (filtration, topological representation) is better than another.

PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction (2020)

Nicolas Swenson, Aditi S. Krishnapriyan, Aydin Buluc, Dmitriy Morozov, Katherine Yelick

Abstract

Understanding protein structure-function relationships is a key challenge in computational biology, with applications across the biotechnology and pharmaceutical industries. While it is known that protein structure directly impacts protein function, many functional prediction tasks use only protein sequence. In this work, we isolate protein structure to make functional annotations for proteins in the Protein Data Bank in order to study the expressiveness of different structure-based prediction schemes. We present PersGNN - an end-to-end trainable deep learning model that combines graph representation learning with topological data analysis to capture a complex set of both local and global structural features. While variations of these techniques have been successfully applied to proteins before, we demonstrate that our hybridized approach, PersGNN, outperforms either method on its own as well as a baseline neural network that learns from the same information. PersGNN achieves a 9.3% boost in area under the precision recall curve (AUPR) compared to the best individual model, as well as high F1 scores across different gene ontology categories, indicating the transferability of this approach.

Connectivity in fMRI: Blind Spots and Breakthroughs (2018)

Victor Solo, Jean-Baptiste Poline, Martin A. Lindquist, Sean L. Simpson, F. DuBois Bowman, Moo K. Chung, Ben Cassidy

Abstract

In recent years, driven by scientific and clinical concerns, there has been an increased interest in the analysis of functional brain networks. The goal of these analyses is to better understand how brain regions interact, how this depends upon experimental conditions and behavioral measures and how anomalies (disease) can be recognized. In this work we provide, firstly, a brief review of some of the main existing methods of functional brain network analysis. But rather than compare them, as a traditional review would do, we instead draw attention to their significant limitations and blind spots. Then, secondly, relevant experts sketch a number of emerging methods, which can break through these limitations. In particular we discuss five such methods. The first two, stochastic block models and exponential random graph models, provide an inferential basis for network analysis lacking in the exploratory graph analysis methods. The other three address: network comparison via persistent homology, time-varying connectivity that distinguishes sample fluctuations from neural fluctuations, and network system identification that draws inferential strength from temporal autocorrelation.

Musical Stylistic Analysis: A Study of Intervallic Transition Graphs via Persistent Homology (2022)

Martín Mijangos, Alessandro Bravetti, Pablo Padilla

Abstract

Topological data analysis has been recently applied to investigate stylistic signatures and trends in musical compositions. A useful tool in this area is Persistent Homology. In this paper, we develop a novel method to represent a weighted directed graph as a finite metric space and then use persistent homology to extract useful features. We apply this method to weighted directed graphs obtained from pitch transition information of a given musical fragment and apply these techniques to the study of stylistic trends. In particular, we are interested in using these tools to make quantitative stylistic comparisons. As a first illustration, we analyze a selection of string quartets by Haydn, Mozart and Beethoven and discuss possible implications of our results in terms of different approaches by these composers to stylistic exploration and variety. We observe that Haydn is stylistically the most conservative, followed by Mozart, while Beethoven is the most innovative, expanding and modifying the string quartet as a musical form. Finally we also compare the variability of different genres, namely minuets, allegros, prestos and adagios, by a given composer and conclude that the minuet is the most stable form of the string quartet movements.

HERMES: Persistent Spectral Graph Software (2020)

Rui Wang, Rundong Zhao, Emily Ribando-Gros, Jiahui Chen, Yiying Tong, Guo-Wei Wei

Abstract

Persistent homology (PH) is one of the most popular tools in topological data analysis (TDA), while graph theory has had a significant impact on data science. Our earlier work introduced the persistent spectral graph (PSG) theory as a unified multiscale paradigm to encompass TDA and geometric analysis. In PSG theory, families of persistent Laplacians (PLs) corresponding to various topological dimensions are constructed via a filtration to sample a given dataset at multiple scales. The harmonic spectra from the null spaces of PLs offer the same topological invariants, namely persistent Betti numbers, at various dimensions as those provided by PH, while the non-harmonic spectra of PLs give rise to additional geometric analysis of the shape of the data. In this work, we develop an open-source software package, called highly efficient robust multidimensional evolutionary spectra (HERMES), to enable broad applications of PSGs in science, engineering, and technology. To ensure the reliability and robustness of HERMES, we have validated the software with simple geometric shapes and complex datasets from three-dimensional (3D) protein structures. We found that the smallest non-zero eigenvalues are very sensitive to data abnormality.

The Growing Topology of the C. Elegans Connectome (2020)

Alec Helm, Ann S. Blevins, Danielle S. Bassett

Abstract

Probing the developing neural circuitry in Caenorhabditis elegans has enhanced our understanding of nervous systems. The C. elegans connectome, like those of other species, is characterized by a rich club of densely connected neurons embedded within a small-world architecture. This organization of neuronal connections, captured by quantitative network statistics, provides insight into the system's capacity to perform integrative computations. Yet these network measures are limited in their ability to detect weakly connected motifs, such as topological cavities, that may support the system's capacity to perform segregated computations. We address this limitation by using persistent homology to track the evolution of topological cavities in the growing C. elegans connectome throughout neural development, and assess the degree to which the growing connectome's topology is resistant to biological noise. We show that the developing connectome topology is both relatively robust to changes in neuron birth times and not captured by similar growth models. Additionally, we quantify the consequence of a neuron's specific birth time and ask if this metric tracks other biological properties of neurons. Our results suggest that the connectome's growing topology is a robust feature of the developing connectome that is distinct from other network properties, and that the growing topology is particularly sensitive to the exact birth times of a small set of predominantly motor neurons. By utilizing novel measurements that track biological features, we anticipate that our study will be helpful in the construction of more accurate models of neuronal development in C. elegans.

Learning Representations of Persistence Barcodes (2019)

Christoph D. Hofer, Roland Kwitt, Marc Niethammer

Abstract

We consider the problem of supervised learning with summary representations of topological features in data. In particular, we focus on persistent homology, the prevalent tool used in topological data analysis. As the summary representations, referred to as barcodes or persistence diagrams, come in the unusual format of multisets, equipped with computationally expensive metrics, they cannot readily be processed with conventional learning techniques. While different approaches to address this problem have been proposed, either in the context of kernel-based learning, or via carefully designed vectorization techniques, it remains an open problem how to leverage advances in representation learning via deep neural networks. Appropriately handling topological summaries as input to neural networks would address the disadvantage of previous strategies which handle this type of data in a task-agnostic manner. In particular, we propose an approach that is designed to learn a task-specific representation of barcodes. In other words, we aim to learn a representation that adapts to the learning problem while, at the same time, preserving theoretical properties (such as stability). This is done by projecting barcodes into a finite dimensional vector space using a collection of parametrized functionals, so-called structure elements, for which we provide a generic construction scheme. A theoretical analysis of this approach reveals sufficient conditions to preserve stability, and also shows that different choices of structure elements lead to great differences with respect to their suitability for numerical optimization. When implemented as a neural network input layer, our approach demonstrates compelling performance on various types of problems, including graph classification and eigenvalue prediction, the classification of 2D/3D object shapes and recognizing activities from EEG signals.

🍩 Database of Original & Non-Theoretical Uses of Topology
