🍩 Database of Original & Non-Theoretical Uses of Topology


Topological Graph Neural Networks (2021)
Max Horn, Edward De Brouwer, Michael Moor, Yves Moreau, Bastian Rieck, Karsten Borgwardt
Abstract
Graph neural networks (GNNs) are a powerful architecture for tackling graph learning tasks, yet have been shown to be oblivious to eminent substructures, such as cycles. We present TOGL, a novel layer that incorporates global topological information of a graph using persistent homology. TOGL can be easily integrated into any type of GNN and is strictly more expressive in terms of the Weisfeiler-Lehman test of isomorphism. Augmenting GNNs with our layer leads to beneficial predictive performance, both on synthetic data sets, which can be trivially classified by humans but not by ordinary GNNs, and on real-world data.
Geometric Feature Performance Under Downsampling for EEG Classification Tasks (2021)
Bryan Bischof, Eric Bunch
Abstract
We experimentally investigate a collection of feature engineering pipelines for use with a CNN for classifying eyes-open or eyes-closed from electroencephalogram (EEG) time-series from the Bonn dataset. Using the Takens' embedding, a geometric representation of time-series, we construct simplicial complexes from EEG data. We then compare $\epsilon$-series of Betti-numbers and $\epsilon$-series of graph spectra (a novel construction), two topological invariants of the latent geometry from these complexes, to raw time series of the EEG to fill in a gap in the literature for benchmarking. These methods, inspired by Topological Data Analysis, are used for feature engineering to capture local geometry of the time-series. Additionally, we test these feature pipelines' robustness to downsampling and data reduction. This paper seeks to establish clearer expectations for both time-series classification via geometric features, and how CNNs for time-series respond to data of degraded resolution.
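The Takens' embedding mentioned in this abstract turns a scalar time series into a point cloud of delay vectors. The following is only a minimal sketch of that general idea, not the authors' pipeline; the function name `delay_embedding` and the parameter choices are hypothetical.

```python
def delay_embedding(series, dimension, delay):
    """Map a scalar series to delay vectors
    (x[t], x[t + delay], ..., x[t + (dimension - 1) * delay])."""
    if dimension < 1 or delay < 1:
        raise ValueError("dimension and delay must be positive integers")
    # Number of samples spanned by one delay vector beyond its first entry.
    span = (dimension - 1) * delay
    return [
        tuple(series[t + k * delay] for k in range(dimension))
        for t in range(len(series) - span)
    ]


# Toy signal; real EEG data would be a much longer list of samples.
signal = [0, 1, 2, 3, 4, 5]
points = delay_embedding(signal, dimension=3, delay=2)
print(points)  # [(0, 2, 4), (1, 3, 5)]
```

The resulting points could then be passed to any routine that builds a simplicial complex on a point cloud; that step is outside this sketch.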
A Topological Framework for Deep Learning (2020)
Mustafa Hajij, Kyle Istvan
Abstract
We utilize classical facts from topology to show that the classification problem in machine learning is always solvable under very mild conditions. Furthermore, we show that a softmax classification network acts on an input topological space by a finite sequence of topological moves to achieve the classification task. Moreover, given a training dataset, we show how topological formalism can be used to suggest the appropriate architectural choices for neural networks designed to be trained as classifiers on the data. Finally, we show how the architecture of a neural network cannot be chosen independently from the shape of the underlying data. To demonstrate these results, we provide example datasets and show how they are acted upon by neural nets from this topological perspective. 
Cell Complex Neural Networks (2020)
Mustafa Hajij, Kyle Istvan, Ghada Zamzami
Abstract
Cell complexes are topological spaces constructed from simple blocks called cells. They generalize graphs, simplicial complexes, and polyhedral complexes that form important domains for practical applications. We propose a general, combinatorial, and unifying construction for performing neural network-type computations on cell complexes. Furthermore, we introduce intercellular message passing schemes, message passing schemes on cell complexes that take the topology of the underlying space into account. In particular, our method generalizes many of the most popular types of graph neural networks.
Simplicial Neural Networks (2020)
Stefania Ebli, Michaël Defferrard, Gard Spreemann
Abstract
We present simplicial neural networks (SNNs), a generalization of graph neural networks to data that live on a class of topological spaces called simplicial complexes. These are natural multidimensional extensions of graphs that encode not only pairwise relationships but also higher-order interactions between vertices, allowing us to consider richer data, including vector fields and $n$-fold collaboration networks. We define an appropriate notion of convolution that we leverage to construct the desired convolutional neural networks. We test the SNNs on the task of imputing missing data on coauthorship complexes.
Topological Machine Learning for Multivariate Time Series (2020)
Chengyuan Wu, Carol Anne Hargreaves
Abstract
We develop a framework for analyzing multivariate time series using topological data analysis (TDA) methods. The proposed methodology involves converting the multivariate time series to point cloud data, calculating Wasserstein distances between the persistence diagrams and using the $k$-nearest neighbors algorithm ($k$-NN) for supervised machine learning. Two methods (symmetry-breaking and anchor points) are also introduced to enable TDA to better analyze data with heterogeneous features that are sensitive to translation, rotation, or choice of coordinates. We apply our methods to room occupancy detection based on 5 time-dependent variables (temperature, humidity, light, CO2 and humidity ratio). Experimental results show that topological methods are effective in predicting room occupancy during a time window. We also apply our methods to an Activity Recognition dataset and obtained good results.
Generalized Penalty for Circular Coordinate Representation (2020)
Hengrui Luo, Alice Patania, Jisu Kim, Mikael Vejdemo-Johansson
Abstract
Topological Data Analysis (TDA) provides novel approaches that allow us to analyze the geometrical shapes and topological structures of a dataset. As one important application, TDA can be used for data visualization and dimension reduction. We follow the framework of circular coordinate representation, which allows us to perform dimension reduction and visualization for high-dimensional datasets on a torus using persistent cohomology. In this paper, we propose a method to adapt the circular coordinate framework to take into account sparsity in high-dimensional applications. We use a generalized penalty function instead of an $L_2$ penalty in the traditional circular coordinate algorithm. We provide simulation experiments and real data analysis to support our claim that circular coordinates with generalized penalty will accommodate the sparsity in high-dimensional datasets under different sampling schemes while preserving the topological structures.
Fibers of Failure: Classifying Errors in Predictive Processes (2020)
Leo S. Carlsson, Mikael Vejdemo-Johansson, Gunnar Carlsson, Pär G. Jönsson
Abstract
Predictive models are used in many different fields of science and engineering and are always prone to make faulty predictions. These faulty predictions can be more or less malignant depending on the model application. We describe fibers of failure (FiFa), a method to classify failure modes of predictive processes. Our method uses Mapper, an algorithm from topological data analysis (TDA), to build a graphical model of input data stratified by prediction errors. We demonstrate two ways to use the failure mode groupings: either to produce a correction layer that adjusts predictions by similarity to the failure modes; or to inspect members of the failure modes to illustrate and investigate what characterizes each failure mode. We demonstrate FiFa on two scenarios: a convolutional neural network (CNN) predicting MNIST images with added noise, and an artificial neural network (ANN) predicting the electrical energy consumption of an electric arc furnace (EAF). The correction layer on the CNN model improved its prediction accuracy significantly while the inspection of failure modes for the EAF model provided guiding insights into the domain-specific reasons behind several high-error regions.
Capturing Dynamics of Time-Varying Data via Topology (2020)
Lu Xian, Henry Adams, Chad M. Topaz, Lori Ziegelmeier
Abstract
One approach to understanding complex data is to study its shape through the lens of algebraic topology. While the early development of topological data analysis focused primarily on static data, in recent years, theoretical and applied studies have turned to data that varies in time. A time-varying collection of metric spaces as formed, for example, by a moving school of fish or flock of birds, can contain a vast amount of information. There is often a need to simplify or summarize the dynamic behavior. We provide an introduction to topological summaries of time-varying metric spaces including vineyards [17], crocker plots [52], and multiparameter rank functions [34]. We then introduce a new tool to summarize time-varying metric spaces: a crocker stack. Crocker stacks are convenient for visualization, amenable to machine learning, and satisfy a desirable stability property which we prove. We demonstrate the utility of crocker stacks for a parameter identification task involving an influential model of biological aggregations [54]. Altogether, we aim to bring the broader applied mathematics community up-to-date on topological summaries of time-varying metric spaces.
Uncovering the Topology of Time-Varying fMRI Data Using Cubical Persistence (2020)
Bastian Rieck, Tristan Yates, Christian Bock, Karsten Borgwardt, Guy Wolf, Nicholas Turk-Browne, Smita Krishnaswamy
Abstract
Functional magnetic resonance imaging (fMRI) is a crucial technology for gaining insights into cognitive processes in humans. Data amassed from fMRI measurements result in volumetric data sets that vary over time. However, analysing such data presents a challenge due to the large degree of noise and person-to-person variation in how information is represented in the brain. To address this challenge, we present a novel topological approach that encodes each time point in an fMRI data set as a persistence diagram of topological features, i.e. high-dimensional voids present in the data. This representation naturally does not rely on voxel-by-voxel correspondence and is robust to noise. We show that these time-varying persistence diagrams can be clustered to find meaningful groupings between participants, and that they are also useful in studying within-subject brain state trajectories of subjects performing a particular task. Here, we apply both clustering and trajectory analysis techniques to a group of participants watching the movie 'Partly Cloudy'. We observe significant differences in both brain state trajectories and overall topological activity between adults and children watching the same movie.
PI-Net: A Deep Learning Approach to Extract Topological Persistence Images (2020)
Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga
Abstract
Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. This is greatly attributed to the robustness topological representations provide against different types of physical nuisance variables seen in real-world data, such as viewpoint, illumination, and more. However, key bottlenecks to their large scale adoption are computational expenditure and difficulty incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We design two separate convolutional neural network architectures, one designed to take in multivariate time series signals as input and another that accepts multichannel images as input. We call these networks Signal PI-Net and Image PI-Net respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed PI-Net architectures on two applications: human activity recognition using triaxial accelerometer sensor data and image classification. We demonstrate the ease of fusion of PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PINet.
Prediction in Cancer Genomics Using Topological Signatures and Machine Learning (2020)
Georgina Gonzalez, Arina Ushakova, Radmila Sazdanovic, Javier Arsuaga
Abstract
Copy Number Aberrations, gains and losses of genomic regions, are a hallmark of cancer and can be experimentally detected using microarray comparative genomic hybridization (aCGH). In previous works, we developed a topology based method to analyze aCGH data whose output are regions of the genome where copy number is altered in patients with a predetermined cancer phenotype. We call this method Topological Analysis of array CGH (TAaCGH). Here we combine TAaCGH with machine learning techniques to build classifiers using copy number aberrations. We chose logistic regression on two different binary phenotypes related to breast cancer to illustrate this approach. The first case consists of patients with over-expression of the ERBB2 gene. Over-expression of ERBB2 is commonly regulated by a copy number gain in chromosome arm 17q. TAaCGH found the region 17q11-q22 associated with the phenotype and using logistic regression we reduced this region to 17q12-q21.31 correctly classifying 78% of the ERBB2 positive individuals (sensitivity) in a validation data set. We also analyzed over-expression in Estrogen Receptor (ER), a second phenotype commonly observed in breast cancer patients and found that the region 5p14.3-12 together with six full arms were associated with the phenotype. Our method identified 4p, 6p and 16q as the strongest predictors correctly classifying 76% of ER positives in our validation data set. However, for this set there was a significant increase in the false positive rate (specificity). We suggest that topological and machine learning methods can be combined for prediction of phenotypes using genetic data.
Topological Descriptors Help Predict Guest Adsorption in Nanoporous Materials (2020)
Aditi S. Krishnapriyan, Maciej Haranczyk, Dmitriy Morozov
Abstract
Machine learning has emerged as an attractive alternative to experiments and simulations for predicting material properties. Usually, such an approach relies on specific domain knowledge for feature design: each learning target requires careful selection of features that an expert recognizes as important for the specific task. The major drawback of this approach is that computation of only a few structural features has been implemented so far, and it is difficult to tell a priori which features are important for a particular application. The latter problem has been empirically observed for predictors of guest uptake in nanoporous materials: local and global porosity features become dominant descriptors at low and high pressures, respectively. We investigate a feature representation of materials using tools from topological data analysis. Specifically, we use persistent homology to describe the geometry of nanoporous materials at various scales. We combine our topological descriptor with traditional structural features and investigate the relative importance of each to the prediction tasks. We demonstrate an application of this feature representation by predicting methane adsorption in zeolites, for pressures in the range of 1-200 bar. Our results not only show a considerable improvement compared to the baseline, but they also highlight that topological features capture information complementary to the structural features: this is especially important for the adsorption at low pressure, a task particularly difficult for the traditional features. Furthermore, by investigation of the importance of individual topological features in the adsorption model, we are able to pinpoint the location of the pores that correlate best to adsorption at different pressure, contributing to our atom-level understanding of structure-property relationships.
Steinhaus Filtration and Stable Paths in the Mapper (2020)
Dustin L. Arendt, Matthew Broussard, Bala Krishnamoorthy, Nathaniel Saul
Abstract
Two central concepts from topological data analysis are persistence and the Mapper construction. Persistence employs a sequence of objects built on data called a filtration. A Mapper produces insightful summaries of data, and has found widespread applications in diverse areas. We define a new filtration called the cover filtration built from a single cover based on a generalized Steinhaus distance, which is a generalization of Jaccard distance. We prove a stability result: the cover filtrations of two covers are $\alpha/m$ interleaved, where $\alpha$ is a bound on bottleneck distance between covers and $m$ is the size of smallest set in either cover. We also show our construction is equivalent to the Cech filtration under certain settings, and the Vietoris-Rips filtration completely determines the cover filtration in all cases. We then develop a theory for stable paths within this filtration. Unlike standard results on stability in topological persistence, our definition of path stability aligns exactly with the above result on stability of cover filtration. We demonstrate how our framework can be employed in a variety of applications where a metric is not obvious but a cover is readily available. First we present a new model for recommendation systems using cover filtration. For an explicit example, stable paths identified on a movies data set represent sequences of movies constituting gentle transitions from one genre to another. As a second application in explainable machine learning, we apply the Mapper for model induction, providing explanations in the form of paths between subpopulations. Stable paths in the Mapper from a supervised machine learning model trained on the Fashion-MNIST data set provide improved explanations of relationships between subpopulations of images.
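For readers unfamiliar with the distances named in this abstract, the sketch below shows the classical pairwise Steinhaus distance between two finite sets, which reduces to the Jaccard distance under the counting measure. It does not reproduce the paper's generalized (multi-set) Steinhaus distance or the cover filtration; the function name and the optional `weight` mapping are hypothetical.

```python
def steinhaus_distance(a, b, weight=None):
    """Pairwise Steinhaus distance 1 - mu(A & B) / mu(A | B).

    With weight=None, mu is the counting measure and the result is the
    Jaccard distance. Otherwise, weight maps each element to a
    non-negative mass and mu sums those masses.
    """
    a, b = set(a), set(b)
    if weight is None:
        measure = len
    else:
        measure = lambda s: sum(weight[x] for x in s)
    union = measure(a | b)
    if union == 0:
        # Convention chosen here: two measure-zero sets are at distance 0.
        return 0.0
    return 1.0 - measure(a & b) / union


# Two overlapping cover elements sharing 2 of 4 points in total.
print(steinhaus_distance({1, 2, 3}, {2, 3, 4}))  # 0.5

# Giving the shared points more mass pulls the sets closer together.
masses = {1: 1, 2: 3, 3: 3, 4: 1}
print(steinhaus_distance({1, 2, 3}, {2, 3, 4}, weight=masses))  # 0.25
```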
Text Classification via Network Topology: A Case Study on the Holy Quran (2019)
Mehmet Emin Aktas, Esra Akbas
Abstract
Due to the growth in the number of texts and documents available online, machine learning based text classification systems are getting more popular recently. Feature extraction, converting unstructured text into a structured feature space, is one of the essential tasks for text classification. In this paper, we propose a novel feature extraction approach for text classification using the network representation of text, network topology, and machine learning techniques. We present experimental results on classifying the Holy Quran chapters based on the place each chapter was revealed to illustrate the effectiveness of the approach. 
Hyperparameter Optimization of Topological Features for Machine Learning Applications (2019)
Francis Motta, Christopher Tralie, Rossella Bedini, Fabiano Bini, Gilberto Bini, Hamed Eramian, Marcio Gameiro, Steve Haase, Hugh Haddox, John Harer, Nick Leiby, Franco Marinozzi, Scott Novotney, Gabe Rocklin, Jed Singer, Devin Strickland, Matt Vaughn
Abstract
This paper describes a general pipeline for generating optimal vector representations of topological features of data for use with machine learning algorithms. This pipeline can be viewed as a costly black-box function defined over a complex configuration space, each point of which specifies both how features are generated and how predictive models are trained on those features. We propose using state-of-the-art Bayesian optimization algorithms to inform the choice of topological vectorization hyperparameters while simultaneously choosing learning model parameters. We demonstrate the need for and effectiveness of this pipeline using two difficult biological learning problems, and illustrate the nontrivial interactions between topological feature generation and learning model hyperparameters.
A Topological Data Analysis Based Classification Method for Multiple Measurements (2019)
Henri Riihimäki, Wojciech Chachólski, Jakob Theorell, Jan Hillert, Ryan Ramanujam
Abstract
Background: Machine learning models for repeated measurements are limited. Using topological data analysis (TDA), we present a classifier for repeated measurements which samples from the data space and builds a network graph based on the data topology. When applying this to two case studies, accuracy exceeds alternative models with additional benefits such as reporting data subsets with high purity along with feature values.
Results: For 300 examples of 3 tree species, the accuracy reached 80% after 30 datapoints, which was improved to 90% after increased sampling to 400 datapoints. Using data from 100 examples of each of 6 point processes, the classifier achieved 96.8% accuracy. In both datasets, the TDA classifier outperformed an alternative model.
Conclusions: This algorithm and software can be beneficial for repeated measurement data common in biological sciences, as both an accurate classifier and a feature selection tool.
Persistence Images: A Stable Vector Representation of Persistent Homology (2017)
Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, Lori Ziegelmeier
Abstract
Many data sets can be viewed as a noisy sampling of an underlying space, and tools from topological data analysis can characterize this structure for the purpose of knowledge discovery. One such tool is persistent homology, which provides a multiscale description of the homological features within a data set. A useful representation of this homological information is a persistence diagram (PD). Efforts have been made to map PDs into spaces with additional structure valuable to machine learning tasks. We convert a PD to a finite-dimensional vector representation which we call a persistence image (PI), and prove the stability of this transformation with respect to small perturbations in the inputs. The discriminatory power of PIs is compared against existing methods, showing significant performance gains. We explore the use of PIs with vector-based machine learning tools, such as linear sparse support vector machines, which identify features containing discriminating topological information. Finally, high accuracy inference of parameter values from the dynamic output of a discrete dynamical system (the linked twist map) and a partial differential equation (the anisotropic Kuramoto-Sivashinsky equation) provide a novel application of the discriminatory power of PIs.
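The conversion from a persistence diagram to a fixed-length vector described in this abstract can be illustrated roughly as follows. This is a simplified sketch under stated assumptions, not the authors' implementation: each point is moved to (birth, persistence) coordinates, weighted by its persistence (an assumed linear weight that vanishes on the diagonal), and smoothed by a Gaussian that is sampled at pixel centers rather than integrated over each pixel. All names and parameters are hypothetical.

```python
import math


def persistence_image(diagram, resolution, birth_range, pers_range, sigma):
    """Rasterize a persistence diagram into a resolution x resolution grid.

    diagram: iterable of (birth, death) pairs.
    Rows index persistence, columns index birth.
    """
    b_lo, b_hi = birth_range
    p_lo, p_hi = pers_range
    b_step = (b_hi - b_lo) / resolution
    p_step = (p_hi - p_lo) / resolution
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    image = [[0.0] * resolution for _ in range(resolution)]
    for birth, death in diagram:
        pers = death - birth
        weight = pers  # assumed weighting: zero for points on the diagonal
        for i in range(resolution):
            p = p_lo + (i + 0.5) * p_step  # pixel-center persistence value
            for j in range(resolution):
                b = b_lo + (j + 0.5) * b_step  # pixel-center birth value
                sq_dist = (b - birth) ** 2 + (p - pers) ** 2
                image[i][j] += weight * norm * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return image


def flatten(image):
    """Concatenate grid rows into one feature vector."""
    return [value for row in image for value in row]


diagram = [(0.0, 1.0), (0.2, 0.5)]
vector = flatten(persistence_image(diagram, 4, (0.0, 1.0), (0.0, 1.0), sigma=0.2))
print(len(vector))  # 16
```

The flattened vector can then be fed to any vector-based learner; the grid size and `sigma` act as hyperparameters of the representation.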

Omics-Based Strategies in Precision Medicine: Toward a Paradigm Shift in Inborn Errors of Metabolism Investigations (2016)
Abdellah Tebani, Carlos Afonso, Stéphane Marret, Soumeya Bekri
Abstract
The rise of technologies that simultaneously measure thousands of data points represents the heart of systems biology. These technologies have had a huge impact on the discovery of next-generation diagnostics, biomarkers, and drugs in the precision medicine era. Systems biology aims to achieve systemic exploration of complex interactions in biological systems. Driven by high-throughput omics technologies and the computational surge, it enables multi-scale and insightful overviews of cells, organisms, and populations. Precision medicine capitalizes on these conceptual and technological advancements and stands on two main pillars: data generation and data modeling. High-throughput omics technologies allow the retrieval of comprehensive and holistic biological information, whereas computational capabilities enable high-dimensional data modeling and, therefore, accessible and user-friendly visualization. Furthermore, bioinformatics has enabled comprehensive multi-omics and clinical data integration for insightful interpretation. Despite their promise, the translation of these technologies into clinically actionable tools has been slow. In this review, we present state-of-the-art multi-omics data analysis strategies in a clinical context. The challenges of omics-based biomarker translation are discussed. Perspectives regarding the use of multi-omics approaches for inborn errors of metabolism (IEM) are presented by introducing a new paradigm shift in addressing IEM investigations in the post-genomic era.