🍩 Database of Original & Non-Theoretical Uses of Topology


Machine Learning and Topological Data Analysis Identify Unique Features of Human Papillae in 3D Scans (2023)
Rayna Andreeva, Anwesha Sarkar, Rik Sarkar
The tongue surface houses a range of papillae that are integral to the mechanics and chemistry of taste and textural sensation. Although the gustatory function of papillae is well investigated, the uniqueness of papillae within and across individuals remains elusive. Here, we present the first machine learning framework on 3D microscopic scans of human papillae (n = 2092), uncovering the uniqueness of geometric and topological features of papillae. The finer differences in shapes of papillae are investigated computationally based on a number of features derived from discrete differential geometry and computational topology. Interpretable machine learning techniques show that persistent homology features of the papillae shape are the most effective in predicting the biological variables. Models trained on these features with small volumes of data samples predict the type of papillae with an accuracy of 85%. The papillae type classification models can map the spatial arrangement of filiform and fungiform papillae on a surface. Remarkably, the papillae are found to be distinctive across individuals, and an individual can be identified with an accuracy of 48% among the 15 participants from a single papilla. Collectively, this is the first evidence demonstrating that tongue papillae can serve as a unique identifier, inspiring new research directions for food preferences and oral diagnostics.
The Persistence of Large Scale Structures I: Primordial Non-Gaussianity (2020)
Matteo Biagetti, Alex Cole, Gary Shiu
We develop an analysis pipeline for characterizing the topology of large scale structure and extracting cosmological constraints based on persistent homology. Persistent homology is a technique from topological data analysis that quantifies the multiscale topology of a data set, in our context unifying the contributions of clusters, filament loops, and cosmic voids to cosmological constraints. We describe how this method captures the imprint of primordial local non-Gaussianity on the late-time distribution of dark matter halos, using a set of N-body simulations as a proxy for real data analysis. For our best single statistic, running the pipeline on several cubic volumes of size $40~(\mathrm{Gpc}/h)^3$, we detect $f_{\rm NL}^{\rm loc}=10$ at $97.5\%$ confidence on $\sim 85\%$ of the volumes. Additionally we test our ability to resolve degeneracies between the topological signature of $f_{\rm NL}^{\rm loc}$ and variation of $\sigma_8$ and argue that correctly identifying nonzero $f_{\rm NL}^{\rm loc}$ in this case is possible via an optimal template method. Our method relies on information living at $\mathcal{O}(10)$ Mpc/h, a complementary scale with respect to commonly used methods such as the scale-dependent bias in the halo/galaxy power spectrum. Therefore, while still requiring a large volume, our method does not require sampling long-wavelength modes to constrain primordial non-Gaussianity. Moreover, our statistics are interpretable: we are able to reproduce previous results in certain limits and we make new predictions for unexplored observables, such as filament loops formed by dark matter halos in a simulation box.
Felix: A Topology Based Framework for Visual Exploration of Cosmic Filaments (2016)
Nithin Shivshankar, Pratyush Pranav, Vijay Natarajan, Rien van de Weygaert, E. G. Patrick Bos, Steven Rieder
The large-scale structure of the universe is comprised of virialized blob-like clusters, linear filaments, sheet-like walls and huge near empty three-dimensional voids. Characterizing the large scale universe is essential to our understanding of the formation and evolution of galaxies. The density ranges of clusters, walls and voids are relatively well separated when compared to filaments, which span a relatively larger range. The large scale filamentary network thus forms an intricate part of the cosmic web. In this paper, we describe Felix, a topology based framework for visual exploration of filaments in the cosmic web. The filamentary structure is represented by the ascending manifold geometry of the 2-saddles in the Morse-Smale complex of the density field. We generate a hierarchy of Morse-Smale complexes and query for filaments based on the density ranges at the end points of the filaments. The query is processed efficiently over the entire hierarchical Morse-Smale complex, allowing for interactive visualization. We apply Felix to computer simulations based on the heuristic Voronoi kinematic model and the standard $\Lambda$CDM cosmology, and demonstrate its usefulness through two case studies. First, we extract cosmic filaments within and across cluster-like regions in Voronoi kinematic simulation datasets. We demonstrate that we produce similar results to existing structure finders. Filaments that form the spine of the cosmic web, which exist in high density regions in the current epoch, are isolated using Felix. Also, filaments present in void-like regions are isolated and visualized. These filamentary structures are often overshadowed by higher density range filaments and are not easily characterizable and extractable using other filament extraction methodologies.
Determining Clinically Relevant Features in Cytometry Data Using Persistent Homology (2022)
Soham Mukherjee, Darren Wethington, Tamal K. Dey, Jayajit Das
Cytometry experiments yield high-dimensional point cloud data that is difficult to interpret manually. Boolean gating techniques coupled with comparisons of relative abundances of cellular subsets are the current standard for cytometry data analysis. However, this approach is unable to capture more subtle topological features hidden in data, especially if those features are further masked by data transforms or significant batch effects or donor-to-donor variations in clinical data. We show that persistent homology, a mathematical structure that summarizes the topological features, can effectively distinguish different sources of data, such as groups of healthy donors or patients. Analysis of publicly available cytometry data describing non-naïve CD8+ T cells in COVID-19 patients and healthy controls shows that systematic structural differences exist between single cell protein expressions in COVID-19 patients and healthy controls. We identify proteins of interest by a decision-tree based classifier, sample points randomly and compute persistence diagrams from these sampled points. The resulting persistence diagrams identify regions in cytometry datasets of varying density and identify protruded structures such as 'elbows'. We compute Wasserstein distances between these persistence diagrams for random pairs of healthy controls and COVID-19 patients and find that systematic structural differences exist between COVID-19 patients and healthy controls in the expression data for T-bet, Eomes, and Ki-67. Further analysis shows that expression of T-bet and Eomes is significantly downregulated in COVID-19 patient non-naïve CD8+ T cells compared to healthy controls. This counter-intuitive finding may indicate that canonical effector CD8+ T cells are less prevalent in COVID-19 patients than healthy controls.
This method is applicable to any cytometry dataset for discovering novel insights through topological data analysis which may be difficult to ascertain otherwise with a standard gating strategy or existing bioinformatic tools.
Community Resources

Persistent Homology and the Branching Topologies of Plants (2017)
Mao Li, Keith Duncan, Christopher N. Topp, Daniel H. Chitwood 
Topological Characteristics of Oil and Gas Reservoirs and Their Applications (2017)
V. A. Baikov, R. R. Gilmanov, I. A. Taimanov, A. A. Yakovlev
We demonstrate applications of topological characteristics of oil and gas reservoirs considered as three-dimensional bodies to geological modeling.
Alpha, Betti and the Megaparsec Universe: On the Topology of the Cosmic Web (2011)
Rien Van De Weygaert, Gert Vegter, Herbert Edelsbrunner, Bernard J. T. Jones, Pratyush Pranav, Changbom Park, Wojciech A. Hellwing, Bob Eldering, Nico Kruithof, E. G. P. Bos, Johan Hidding, Job Feldbrugge, Eline Ten Have, Matti Van Engelen, Manuel Caroli, Monique Teillaud
We study the topology of the Megaparsec Cosmic Web in terms of the scale-dependent Betti numbers, which formalize the topological information content of...
A Method to the Madness: Using Persistent Homology to Measure Plant Morphology (2018)
Emily R. Larson 
Topological Data Analysis Quantifies Biological Nano-Structure From Single Molecule Localization Microscopy (2020)
Jeremy A. Pike, Abdullah O. Khan, Chiara Pallini, Steven G. Thomas, Markus Mund, Jonas Ries, Natalie S. Poulter, Iain B. Styles
Motivation. Localization microscopy data is represented by a set of spatial coordinates, each corresponding to a single detection, that form a point cl
Reconstructing Linearly Embedded Graphs: A First Step to Stratified Space Learning (2021)
Yossi Bokor, Christopher Williams, Katharine Turner
Community Resources

Skyler (2023)
Yossi Bokor Bleile
Julia package for recovering stratified spaces underlying point clouds. 
Parametric Inference Using Persistence Diagrams: a Case Study in Population Genetics (2014)
Kevin Emmett, Daniel Rosenbloom, Pablo Camara, Raul Rabadan
Persistent homology computes topological invariants from point cloud data. Recent work has focused on developing statistical methods for data analysis in this framework. We show that, in certain models, parametric inference can be performed using statistics defined on the computed invariants. We develop this idea with a model from population genetics, the coalescent with recombination. We apply our model to an influenza dataset, identifying two scales of topological structure which have a distinct biological interpretation.
A Barcode Shape Descriptor for Curve Point Cloud Data (2004)
Anne Collins, Afra Zomorodian, Gunnar Carlsson, Leonidas J. Guibas
In this paper, we present a complete computational pipeline for extracting a compact shape descriptor for curve point cloud data (PCD). Our shape descriptor, called a barcode, is based on a blend of techniques from differential geometry and algebraic topology. We also provide a metric over the space of barcodes, enabling fast comparison of PCDs for shape recognition and clustering. To demonstrate the feasibility of our approach, we implement our pipeline and provide experimental evidence in shape classification and parametrization. 
Hierarchical Clustering and Zeroth Persistent Homology (2020)
İsmail Güzel, Atabey Kaygun
In this article, we show that hierarchical clustering and the zeroth persistent homology deliver the same topological information about a given data set. We show this fact using cophenetic matrices constructed out of the filtered Vietoris-Rips complex of the data set at hand. As in any cophenetic matrix, one can also display the interrelations of zeroth homology classes via a rooted tree, also known as a dendrogram. Since homological cophenetic matrices can be calculated for higher homologies, one can also sketch similar dendrograms for higher persistent homology classes.
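The correspondence described in this abstract can be checked on a toy example. The sketch below is not the authors' cophenetic-matrix construction; it is a minimal illustration, assuming the Vietoris-Rips convention in which an edge enters the filtration at a scale equal to the pairwise distance. The function names `h0_death_times` and `single_linkage_heights` are hypothetical and chosen only for this example.

```python
import itertools
import math


def pairwise_edges(points):
    """Return all (distance, i, j) triples, sorted by distance."""
    edges = []
    for i, j in itertools.combinations(range(len(points)), 2):
        edges.append((math.dist(points[i], points[j]), i, j))
    edges.sort()
    return edges


def h0_death_times(points):
    """Death scales of the finite H0 classes (all are born at scale 0).

    A component dies when an edge first joins it to another component,
    which is tracked here with a union-find structure.
    """
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for dist, i, j in pairwise_edges(points):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(dist)
    return deaths


def single_linkage_heights(points):
    """Merge heights of naive single-linkage agglomerative clustering."""
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        best = None
        for a, b in itertools.combinations(range(len(clusters)), 2):
            d = min(math.dist(points[p], points[q])
                    for p in clusters[a] for q in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a stays valid
        heights.append(d)
    return heights


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
print(h0_death_times(points))
print(single_linkage_heights(points))
```

Both functions return the same list of scales for this input, which is the zeroth-homology case of the statement; the higher-homology dendrograms mentioned in the abstract are not covered by this sketch.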
Interpretable Phase Detection and Classification With Persistent Homology (2020)
Alex Cole, Gregory J. Loges, Gary Shiu
We apply persistent homology to the task of discovering and characterizing phase transitions, using lattice spin models from statistical physics for working examples. Persistence images provide a useful representation of the homological data for conducting statistical tasks. To identify the phase transitions, a simple logistic regression on these images is sufficient for the models we consider, and interpretable order parameters are then read from the weights of the regression. Magnetization, frustration and vortex-antivortex structure are identified as relevant features for characterizing phase transitions.
Topological Singularity Detection at Multiple Scales (2023)
Julius von Rohrscheidt, Bastian Rieck
The manifold hypothesis, which assumes that data lies on or close to an unknown manifold of low intrinsic dimension, is a staple of modern machine learning research. However, recent work has shown that real-world data exhibits distinct non-manifold structures, i.e. singularities, that can lead to erroneous findings. Detecting such singularities is therefore crucial as a precursor to interpolation and inference tasks. We address this issue by developing a topological framework that (i) quantifies the local intrinsic dimension, and (ii) yields a Euclidicity score for assessing the 'manifoldness' of a point along multiple scales. Our approach identifies singularities of complex spaces, while also capturing singular structures and local geometric complexity in image data.
Hypothesis Testing for Shapes Using Vectorized Persistence Diagrams (2020)
Chul Moon, Nicole A. Lazar
Topological data analysis involves the statistical characterization of the shape of data. Persistent homology is a primary tool of topological data analysis, which can be used to analyze those topological features and perform statistical inference. In this paper, we present a two-stage hypothesis test for vectorized persistence diagrams. The first stage filters elements in the vectorized persistence diagrams to reduce false positives. The second stage consists of multiple hypothesis tests, with false positives controlled by false discovery rates. We demonstrate applications of the proposed procedure on simulated point clouds and three-dimensional rock image data. Our results show that the proposed hypothesis tests can provide flexible and informative inferences on the shape of data with lower computational cost compared to the permutation test.
Topological Data Analysis: Concepts, Computation, and Applications in Chemical Engineering (2021)
Alexander D. Smith, Paweł Dłotko, Victor M. Zavala
A primary hypothesis that drives scientific and engineering studies is that data has structure. The dominant paradigms for describing such structure are statistics (e.g., moments, correlation functions) and signal processing (e.g., convolutional neural nets, Fourier series). Topological Data Analysis (TDA) is a field of mathematics that analyzes data from a fundamentally different perspective. TDA represents datasets as geometric objects and provides dimensionality reduction techniques that project such objects onto low-dimensional descriptors. The key properties of these descriptors (also known as topological features) are that they provide multiscale information and that they are stable under perturbations (e.g., noise, translation, and rotation). In this work, we review the key mathematical concepts and methods of TDA and present different applications in chemical engineering.
Motor Eccentricity Fault Detection: Physics-Based and Data-Driven Approaches (2023)
Bingnan Wang, Hiroshi Inoue, Makoto Kanemaru
Fault detection using motor current signature analysis (MCSA) is attractive for industrial applications due to its simplicity, with no additional sensor installation required. However, current components associated with faults are often very subtle and much smaller than the supply frequency component, making it challenging to detect and quantify fault levels. In this paper, we present our work on quantitative eccentricity fault diagnosis technologies for electric motors, including a physical-model approach using improved winding function theory, which can simulate motor dynamics under faulty conditions and agrees well with experiment data, and a data-driven approach using topological data analysis (TDA), which can effectively differentiate signals measured at different eccentricity levels. The advantages and limitations of each approach are discussed. Both methods can be extended to the detection and quantification of other types of electric motor faults.
Contagion Dynamics for Manifold Learning (2020)
Barbara I. Mahler
Contagion maps exploit activation times in threshold contagions to assign vectors in high-dimensional Euclidean space to the nodes of a network. A point cloud that is the image of a contagion map reflects both the structure underlying the network and the spreading behaviour of the contagion on it. Intuitively, such a point cloud exhibits features of the network's underlying structure if the contagion spreads along that structure, an observation which suggests contagion maps as a viable manifold-learning technique. We test contagion maps as a manifold-learning tool on a number of different real-world and synthetic data sets, and we compare their performance to that of Isomap, one of the most well-known manifold-learning algorithms. We find that, under certain conditions, contagion maps are able to reliably detect underlying manifold structure in noisy data, while Isomap fails due to noise-induced error. This consolidates contagion maps as a technique for manifold learning.
Shape Terra: Mechanical Feature Recognition Based on a Persistent Heat Signature (2017)
Ramy Harik, Yang Shi, Stephen Baek
This paper presents a novel approach to recognizing mechanical features through a multiscale persistent heat signature similarity identification technique. First, heat signature is computed using a modified Laplacian in the application of the heat kernel. Regularly, matrices tend to include an indicator to the manifold curvature (the cotangent in our case), but we add a mesh uniformity factor to overcome mesh proportionality and skewness. Second, once heat retention values are computed, we apply persistent homology to extract significant subsets of the global mesh at different time intervals. Subsets are computed based on similarity of heat retention levels and/or retention values. Third, we present a multiscale persistence identification approach where we scan the part at different persistence levels to detect the presence of a feature. Once features are recognized and their geometrical descriptors identified, the next stage in future work will be feature matching. 
Finding Universal Structures in Quantum Many-Body Dynamics via Persistent Homology (2020)
Daniel Spitz, Jürgen Berges, Markus K. Oberthaler, Anna Wienhard
Inspired by topological data analysis techniques, we introduce persistent homology observables and apply them in a geometric analysis of the dynamics of quantum field theories. As a prototype application, we consider simulated data of a two-dimensional Bose gas far from equilibrium. We discover a continuous spectrum of dynamical scaling exponents, which provides a refined classification of nonequilibrium universal phenomena. A possible explanation of the underlying processes is provided in terms of mixing wave turbulence and vortex kinetics components in point clouds. We find that the persistent homology scaling exponents are inherently linked to the geometry of the system, as the derivation of a packing relation reveals. The approach opens new ways of analyzing quantum many-body dynamics in terms of robust topological structures beyond standard field theoretic techniques.
A Topology-Based Object Representation for Clasping, Latching and Hooking (2013)
J. A. Stork, F. T. Pokorny, D. Kragic
We present a loop-based topological object representation for objects with holes. The representation is used to model object parts suitable for grasping, e.g. handles, and it incorporates local volume information about these. Furthermore, we present a grasp synthesis framework that utilizes this representation for synthesizing caging grasps that are robust under measurement noise. The approach is complementary to a local contact-based force-closure analysis as it depends on global topological features of the object. We perform an extensive evaluation with four robotic hands on synthetic data. Additionally, we provide real world experiments using a Kinect sensor on two robotic platforms: a Schunk dexterous hand attached to a Kuka robot arm as well as a Nao humanoid robot. In the case of the Nao platform, we provide initial experiments showing that our approach can be used to plan whole arm hooking as well as caging grasps involving only one hand.
A Probabilistic Topological Approach to Feature Identification Using a Stochastic Robotic Swarm (2018)
Ragesh K. Ramachandran, Sean Wilson, Spring Berman
This paper presents a novel automated approach to quantifying the topological features of an unknown environment using a swarm of robots with local sensing and limited or no access to global position information. The robots randomly explore the environment and record a time series of their estimated position and the covariance matrix associated with this estimate. After the robots' deployment, a point cloud indicating the free space of the environment is extracted from their aggregated data. Tools from topological data analysis, in particular the concept of persistent homology, are applied to a subset of the point cloud to construct barcode diagrams, which are used to determine the numbers of different types of features in the domain. We demonstrate that our approach can correctly identify the number of topological features in simulations with zero to four features and in multi-robot experiments with one to three features.
Statistical Topological Data Analysis - A Kernel Perspective (2015)
Roland Kwitt, Stefan Huber, Marc Niethammer, Weili Lin, Ulrich Bauer
We consider the problem of statistical computations with persistence diagrams, a summary representation of topological features in data. These diagrams encode persistent homology, a widely used invariant in topological data analysis. While several avenues towards a statistical treatment of the diagrams have been explored recently, we follow an alternative route that is motivated by the success of methods based on the embedding of probability measures into reproducing kernel Hilbert spaces. In fact, a positive definite kernel on persistence diagrams has recently been proposed, connecting persistent homology to popular kernel-based learning techniques such as support vector machines. However, important properties of that kernel enabling a principled use in the context of probability measure embeddings remain to be explored. Our contribution is to close this gap by proving universality of a variant of the original kernel, and to demonstrate its effective use in two-sample hypothesis testing on synthetic as well as real-world data.
Topological Analysis of Population Activity in Visual Cortex (2008)
Gurjeet Singh, Facundo Memoli, Tigran Ishkhanov, Guillermo Sapiro, Gunnar Carlsson, Dario L. Ringach
Information in the cortex is thought to be represented by the joint activity of neurons. Here we describe how fundamental questions about neural representation can be cast in terms of the topological structure of population activity. A new method, based on the concept of persistent homology, is introduced and applied to the study of population activity in primary visual cortex (V1). We found that the topological structure of activity patterns when the cortex is spontaneously active is similar to that evoked by natural image stimulation and consistent with the topology of a two-sphere. We discuss how this structure could emerge from the functional organization of orientation and spatial frequency maps and their mutual relationship. Our findings extend prior results on the relationship between spontaneous and evoked activity in V1 and illustrate how computational topology can help tackle elementary questions about the representation of information in the nervous system.
Topological Data Analysis of Biological Aggregation Models (2015)
Chad M. Topaz, Lori Ziegelmeier, Tom Halverson
We apply tools from topological data analysis to two mathematical models inspired by biological aggregations such as bird flocks, fish schools, and insect swarms. Our data consists of numerical simulation output from the models of Vicsek and D'Orsogna. These models are dynamical systems describing the movement of agents who interact via alignment, attraction, and/or repulsion. Each simulation time frame is a point cloud in position-velocity space. We analyze the topological structure of these point clouds, interpreting the persistent homology by calculating the first few Betti numbers. These Betti numbers count connected components, topological circles, and trapped volumes present in the data. To interpret our results, we introduce a visualization that displays Betti numbers over simulation time and topological persistence scale. We compare our topological results to order parameters typically used to quantify the global behavior of aggregations, such as polarization and angular momentum. The topological calculations reveal events and structure not captured by the order parameters.
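To make the three counts named in this abstract concrete, the following sketch computes Betti numbers of a small, fixed simplicial complex over GF(2) from ranks of boundary matrices. It is an illustration only, not the authors' pipeline: it handles a single complex rather than a filtration over persistence scale, and the helper names `gf2_rank` and `betti_numbers` are hypothetical.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row that has it.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(simplices, max_dim):
    """Betti numbers b_0..b_max_dim of a simplicial complex over GF(2).

    `simplices` must list every simplex exactly once, including all of
    its faces, as a tuple of vertex ids.
    """
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))

    def boundary_rank(k):
        # Rank of the boundary map from k-simplices to (k-1)-simplices.
        if k <= 0 or k not in by_dim:
            return 0
        index = {f: i for i, f in enumerate(by_dim.get(k - 1, []))}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):  # the k+1 codimension-1 faces
                mask |= 1 << index[face]
            rows.append(mask)
        return gf2_rank(rows)

    # b_k = dim C_k - rank(boundary_k) - rank(boundary_{k+1})
    return [len(by_dim.get(k, [])) - boundary_rank(k) - boundary_rank(k + 1)
            for k in range(max_dim + 1)]


# Hollow triangle: one connected component and one topological circle.
circle = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
print(betti_numbers(circle, 1))

# Hollow tetrahedron: one component, no circles, one trapped volume.
vertices = [(i,) for i in range(4)]
edges = list(combinations(range(4), 2))
triangles = list(combinations(range(4), 3))
print(betti_numbers(vertices + edges + triangles, 2))
```

The first complex has Betti numbers (1, 1) and the second (1, 0, 1), matching the reading of the zeroth, first, and second Betti numbers as components, circles, and trapped volumes.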
A Topological Perspective on Regimes in Dynamical Systems (2021)
Kristian Strommen, Matthew Chantry, Joshua Dorrington, Nina Otter
The existence and behaviour of so-called 'regimes' has been extensively studied in dynamical systems ranging from simple toy models to the atmosphere itself, due to their potential of drastically simplifying complex and chaotic dynamics. Nevertheless, no agreed-upon and clear-cut definition of a 'regime' or a 'regime system' exists in the literature. We argue here for a definition which equates the existence of regimes in a system with the existence of non-trivial topological structure. We show, using persistent homology, a tool in topological data analysis, that this definition is computationally tractable and practically informative, and that it accounts for a variety of different examples. We further show that alternative, more strict definitions based on clustering and/or temporal persistence criteria fail to account for one or more examples of dynamical systems typically thought of as having regimes. We finally discuss how our methodology can shed light on regime behaviour in the atmosphere, and discuss future prospects.
Topological Data Analysis on Simple English Wikipedia Articles (2020)
Matthew Wright, Xiaojun Zheng
Single-parameter persistent homology, a key tool in topological data analysis, has been widely applied to data problems, with statistical techniques that quantify the significance of the results. In contrast, statistical techniques for two-parameter persistence, while highly desirable for real-world applications, have scarcely been considered. We present three statistical approaches for comparing geometric data using two-parameter persistent homology, and we demonstrate the applicability of these approaches on high-dimensional point-cloud data obtained from Simple English Wikipedia articles. These approaches rely on the Hilbert function, matching distance, and barcodes obtained from two-parameter persistence modules computed from the point-cloud data. We demonstrate the applicability of our methods by distinguishing certain subsets of the Wikipedia data, and by comparison with random data. Results include insights into the construction of null distributions and stability of our methods with respect to noisy data. Our statistical methods are broadly applicable for analysis of geometric data indexed by a real-valued parameter.
Topological Detection of Phenomenological Bifurcations With Unreliable Kernel Density Estimates (2024)
Sunia Tanweer, Firas A. Khasawneh
Phenomenological (P-type) bifurcations are qualitative changes in stochastic dynamical systems whereby the stationary probability density function (PDF) changes its topology. The current state of the art for detecting these bifurcations requires reliable kernel density estimates computed from an ensemble of system realizations. However, in several real world signals such as Big Data, only a single system realization is available, making it impossible to estimate a reliable kernel density. This study presents an approach for detecting P-type bifurcations using unreliable density estimates. The approach creates an ensemble of objects from Topological Data Analysis (TDA) called persistence diagrams from the system's sole realization and statistically analyzes the resulting set. We compare several methods for replicating the original persistence diagram including Gibbs point process modelling, Pairwise Interaction Point Modelling, and subsampling. We show that for the purpose of predicting a bifurcation, the simple method of subsampling exceeds the other two methods of point process modelling in performance.
Raw Material Flow Optimization as a Capacitated Vehicle Routing Problem: A Visual Benchmarking Approach for Sustainable Manufacturing (2017)
Michele Dassisti, Yasamin Eslami, Matin Mohaghegh
Optimising material flows to increase efficiency while reducing relative resource consumption is one of the most pressing problems today. The focus of this study is to propose a new visual benchmarking approach to select the best material-flow path from the depot to the production lines, referring to the well-known Capacitated Vehicle Routing Problem (CVRP). An example industrial case study is considered to this aim. Two different solution techniques were adopted (namely Mixed Integer Linear Programming and Ant Colony Optimization) in searching for optimal solutions to the CVRP. The proposed visual benchmarking, based on the persistent homology approach, supported the comparison of the optimal solutions based on the entropy of the output in different scenarios. Finally, based on the nonstandard measurements of Crossing Length Percentage (CLP), the visual benchmarking procedure makes it possible to find the most practical and applicable solution to the CVRP by considering the visual attractiveness and the quality of the routes.
A Novel Approach for Wafer Defect Pattern Classification Based on Topological Data Analysis (2023)
Seungchan Ko, Dowan Koo
In semiconductor manufacturing, wafer map defect patterns provide critical information for facility maintenance and yield management, so the classification of defect patterns is one of the most important tasks in the manufacturing process. In this paper, we propose a novel way to represent the shape of the defect pattern as a finite-dimensional vector, which will be used as an input for a neural network algorithm for classification. The main idea is to extract the topological features of each pattern by using the theory of persistent homology from topological data analysis (TDA). Through some experiments with a simulated dataset, we show that the proposed method is faster and much more efficient in training with higher accuracy, compared with the method using convolutional neural networks (CNN), which is the most common approach for wafer map defect pattern classification. Moreover, it was shown that our method outperforms the CNN-based method when the training data are scarce and imbalanced.
Analysis of Kolmogorov Flow and Rayleigh–Bénard Convection Using Persistent Homology (2016)
Miroslav Kramár, Rachel Levanger, Jeffrey Tithof, Balachandra Suri, Mu Xu, Mark Paul, Michael F. Schatz, Konstantin Mischaikow
We use persistent homology to build a quantitative understanding of large complex systems that are driven far-from-equilibrium. In particular, we analyze image time series of flow field patterns from numerical simulations of two important problems in fluid dynamics: Kolmogorov flow and Rayleigh–Bénard convection. For each image we compute a persistence diagram to yield a reduced description of the flow field; by applying different metrics to the space of persistence diagrams, we relate characteristic features in persistence diagrams to the geometry of the corresponding flow patterns. We also examine the dynamics of the flow patterns by a second application of persistent homology to the time series of persistence diagrams. We demonstrate that persistent homology provides an effective method both for quotienting out symmetries in families of solutions and for identifying multiscale recurrent dynamics. Our approach is quite general and it is anticipated to be applicable to a broad range of open problems exhibiting complex spatiotemporal behavior.
Rule Generation for Classifying SLT Failed Parts (2022)
Ho-Chieh Hsu, Cheng-Che Lu, Shih-Wei Wang, Kelly Jones, Kai-Chiang Wu, Mango C.T. Chao
Abstract
System-level test (SLT) has recently gained visibility as integrated circuits become harder and harder to test fully due to increasing transistor density and circuit design complexity. Although SLT is effective for reducing test escapes, little diagnostic information can be obtained for product improvement. In this paper, we propose an unsupervised learning (UL) method to resolve this issue by discovering correlative, potentially systematic defects during the SLT phase. Toward this end, HDBSCAN [1] is used for clustering SLT failed devices in a low-dimensional space created by UMAP [2]. Decision trees are subsequently applied to explain the HDBSCAN results by generating explainable quantitative rules, e.g., inequality constraints, providing domain experts additional information for advanced diagnosis. Experiments on industrial data demonstrate that the proposed methodology can effectively cluster SLT failed devices and then explain the clustering results with a promising accuracy of above 90%. Our methodology is also scalable and fast, requiring two to five orders of magnitude lower runtime than the method presented in [3].
The Accumulated Persistence Function, a New Useful Functional Summary Statistic for Topological Data Analysis, With a View to Brain Artery Trees and Spatial Point Process Applications (2019)
C.A.N. Biscio, J. Møller
Abstract
We start with a simple introduction to topological data analysis where the most popular tool is called a persistence diagram. Briefly, a persistence diagram is a multiset of points in the plane describing the persistence of topological features of a compact set when a scale parameter varies. Since statistical methods are difficult to apply directly on persistence diagrams, various alternative functional summary statistics have been suggested, but either they do not contain the full information of the persistence diagram or they are two-dimensional functions. We suggest a new functional summary statistic that is one-dimensional and hence easier to handle, and which under mild conditions contains the full information of the persistence diagram. Its usefulness is illustrated in statistical settings concerned with point clouds and brain artery trees. The supplementary materials include additional methods and examples, technical details, and the R code used for all examples. © 2019 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
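A rough illustration of what a one-dimensional functional summary of a persistence diagram can look like: the sketch below accumulates the lifetimes of diagram points in order of their mean age. It assumes the accumulated persistence function has the form APF(m) = sum of (d_i - b_i) over all points with (b_i + d_i)/2 <= m, which the abstract itself does not spell out. The function name and the toy diagram are hypothetical, and Python is used here for illustration even though the paper's own code is in R.

```python
def accumulated_persistence(diagram, m):
    """Sum of lifetimes (death - birth) over points whose mean age is <= m.

    Assumed form of the accumulated persistence function; `diagram` is a
    list of (birth, death) pairs.
    """
    total = 0.0
    for birth, death in diagram:
        mean_age = (birth + death) / 2.0
        if mean_age <= m:
            total += death - birth
    return total


# Hypothetical toy diagram: mean ages 0.5, 1.5, 1.25; lifetimes 1.0, 2.0, 0.5.
toy_diagram = [(0.0, 1.0), (0.5, 2.5), (1.0, 1.5)]
apf_values = [accumulated_persistence(toy_diagram, m) for m in (0.4, 0.5, 1.25, 2.0)]
```

Because the result is a non-decreasing step function of a single variable m, curves from several diagrams can be averaged or compared pointwise.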
Persistent Homology of Geospatial Data: A Case Study With Voting (2021)
Michelle Feng, Mason A. Porter
Abstract
A crucial step in the analysis of persistent homology is the transformation of data into an appropriate topological object (which, in our case, is a simplicial complex). Software packages for computing persistent homology typically construct Vietoris-Rips or other distance-based simplicial complexes on point clouds because they are relatively easy to compute. We investigate alternative methods of constructing simplicial complexes and the effects of making associated choices during simplicial-complex construction on the output of persistent-homology algorithms. We present two new methods for constructing simplicial complexes from two-dimensional geospatial data (such as maps). We apply these methods to a California precinct-level voting data set, and we thereby demonstrate that our new constructions can capture geometric characteristics that are missed by distance-based constructions. Our new constructions can thus yield more interpretable persistence modules and barcodes for geospatial data. In particular, they are able to distinguish short-persistence features that occur only for a narrow range of distance scales (e.g., voting patterns in densely populated cities) from short-persistence noise by incorporating information about other spatial relationships between regions.
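For context on the distance-based baseline mentioned above, here is a minimal sketch of dimension-0 persistence for a Vietoris-Rips filtration on a point cloud, computed with a union-find structure. It is not one of the paper's new constructions. It assumes an edge enters the complex at the Euclidean distance between its endpoints (some packages use half that value), and the function name and toy points are hypothetical.

```python
import math
from itertools import combinations


def rips_h0_deaths(points):
    """Scales at which connected components merge in a Vietoris-Rips filtration.

    Every component is born at scale 0; the returned list holds the death
    scales in increasing order. One component never dies and is not listed.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise edges, sorted by the scale at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Hypothetical toy data: two tight pairs of points far apart from each other.
toy_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 2.0)]
deaths = rips_h0_deaths(toy_points)
```

The only input is pairwise distance, so two regions that are close on the map merge early regardless of any other spatial relationship between them, which is the kind of limitation the abstract points to.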
Using Persistent Homology and Dynamical Distances to Analyze Protein Binding (2016)
Violeta Kovacev-Nikolic, Peter Bubenik, Dragan Nikolić, Giseon Heo
Abstract
Persistent homology captures the evolution of topological features of a model as a parameter changes. The most commonly used summary statistics of persistent homology are the barcode and the persistence diagram. Another summary statistic, the persistence landscape, was recently introduced by Bubenik. It is a functional summary, so it is easy to calculate sample means and variances, and it is straightforward to construct various test statistics. Implementing a permutation test, we detect conformational changes between closed and open forms of the maltose-binding protein, a large biomolecule consisting of 370 amino acid residues. Furthermore, persistence landscapes can be applied to machine learning methods. A hyperplane from a support vector machine shows the clear separation between the closed and open protein conformations. Moreover, because our approach captures dynamical properties of the protein, our results may help in identifying residues susceptible to ligand binding; we show that the majority of active site residues and allosteric pathway residues are located in the vicinity of the most persistent loop in the corresponding filtered Vietoris-Rips complex. This finding was not observed in the classical anisotropic network model.
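To make the remark about sample means concrete, the sketch below evaluates a persistence landscape pointwise and averages it over several diagrams. It uses the standard definition, in which the k-th landscape at t is the k-th largest value of max(0, min(t - b, d - t)) over the diagram points (b, d). The function names and toy diagrams are hypothetical and are not taken from the paper's data.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape (k = 1 is the top layer)."""
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, t):
    """Pointwise average of the k-th landscape over a sample of diagrams."""
    return sum(landscape(dg, k, t) for dg in diagrams) / len(diagrams)


# Hypothetical toy diagrams given as (birth, death) pairs.
diagram_a = [(0.0, 4.0), (1.0, 3.0)]
diagram_b = [(0.0, 2.0)]
average_at_2 = mean_landscape([diagram_a, diagram_b], k=1, t=2.0)
```

Since landscapes are ordinary real-valued functions, averaging them is a pointwise operation, whereas a multiset of diagram points has no equally direct notion of a mean.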
From Trees to Barcodes and Back Again: Theoretical and Statistical Perspectives (2020)
Lida Kanari, Adélie Garin, Kathryn Hess
Abstract
Methods of topological data analysis have been successfully applied in a wide range of fields to provide useful summaries of the structure of complex data sets in terms of topological descriptors, such as persistence diagrams. While there are many powerful techniques for computing topological descriptors, the inverse problem, i.e., recovering the input data from topological descriptors, has proved to be challenging. In this article we study in detail the Topological Morphology Descriptor (TMD), which assigns a persistence diagram to any tree embedded in Euclidean space, and a sort of stochastic inverse to the TMD, the Topological Neuron Synthesis (TNS) algorithm, gaining both theoretical and computational insights into the relation between the two. We propose a new approach to classify barcodes using symmetric groups, which provides a concrete language to formulate our results. We investigate to what extent the TNS recovers a geometric tree from its TMD and describe the effect of different types of noise on the process of tree generation from persistence diagrams. We prove moreover that the TNS algorithm is stable with respect to specific types of noise. 
Multidimensional Persistence in Biomolecular Data (2015)
Kelin Xia, Guo-Wei Wei
Abstract
Persistent homology has emerged as a popular technique for the topological simplification of big data, including biomolecular data. Multidimensional persistence bears considerable promise to bridge the gap between geometry and topology. However, its practical and robust construction has been a challenge. We introduce two families of multidimensional persistence, namely pseudo-multidimensional persistence and multiscale multidimensional persistence. The former is generated via the repeated applications of persistent homology filtration to high dimensional data, such as results from molecular dynamics or partial differential equations. The latter is constructed via isotropic and anisotropic scales that create new simplicial complexes and associated topological spaces. The utility, robustness and efficiency of the proposed topological methods are demonstrated via protein folding, protein flexibility analysis, the topological denoising of cryo-electron microscopy data, and the scale dependence of nano particles. Topological transition between partial folded and unfolded proteins has been observed in multidimensional persistence. The separation between noise topological signatures and molecular topological fingerprints is achieved by the Laplace-Beltrami flow. The multiscale multidimensional persistent homology reveals relative local features in Betti-0 invariants and the relatively global characteristics of Betti-1 and Betti-2 invariants.
Geometric Anomaly Detection in Data (2020)
Bernadette J. Stolz, Jared Tanner, Heather A. Harrington, Vidit Nanda
Abstract
The quest for low-dimensional models which approximate high-dimensional data is pervasive across the physical, natural, and social sciences. The dominant paradigm underlying most standard modeling techniques assumes that the data are concentrated near a single unknown manifold of relatively small intrinsic dimension. Here, we present a systematic framework for detecting interfaces and related anomalies in data which may fail to satisfy the manifold hypothesis. By computing the local topology of small regions around each data point, we are able to partition a given dataset into disjoint classes, each of which can be individually approximated by a single manifold. Since these manifolds may have different intrinsic dimensions, local topology discovers singular regions in data even when none of the points have been sampled precisely from the singularities. We showcase this method by identifying the intersection of two surfaces in the 24-dimensional space of cyclo-octane conformations and by locating all of the self-intersections of a Henneberg minimal surface immersed in 3-dimensional space. Due to the local nature of the topological computations, the algorithmic burden of performing such data stratification is readily distributable across several processors.
HERMES: Persistent Spectral Graph Software (2020)
Rui Wang, Rundong Zhao, Emily Ribando-Gros, Jiahui Chen, Yiying Tong, Guo-Wei Wei
Abstract
Persistent homology (PH) is one of the most popular tools in topological data analysis (TDA), while graph theory has had a significant impact on data science. Our earlier work introduced the persistent spectral graph (PSG) theory as a unified multiscale paradigm to encompass TDA and geometric analysis. In PSG theory, families of persistent Laplacians (PLs) corresponding to various topological dimensions are constructed via a filtration to sample a given dataset at multiple scales. The harmonic spectra from the null spaces of PLs offer the same topological invariants, namely persistent Betti numbers, at various dimensions as those provided by PH, while the non-harmonic spectra of PLs give rise to additional geometric analysis of the shape of the data. In this work, we develop an open-source software package, called highly efficient robust multidimensional evolutionary spectra (HERMES), to enable broad applications of PSGs in science, engineering, and technology. To ensure the reliability and robustness of HERMES, we have validated the software with simple geometric shapes and complex datasets from three-dimensional (3D) protein structures. We found that the smallest non-zero eigenvalues are very sensitive to data abnormality.
Persistent Homology Based Graph Convolution Network for Fine-Grained 3D Shape Segmentation (2021)
Chi-Chong Wong, Chi-Man Vong
Abstract
Fine-grained 3D segmentation is an important task in 3D object understanding, especially in applications such as intelligent manufacturing or parts analysis for 3D objects. However, many challenges involved in this problem are yet to be solved, such as i) interpreting the complex structures located in different regions of 3D objects; ii) capturing fine-grained structures with sufficient topology correctness. Current deep learning and graph machine learning methods fail to tackle such challenges and thus provide inferior performance in fine-grained 3D analysis. In this work, methods in topological data analysis are combined with a geometric deep learning model for the task of fine-grained segmentation of 3D objects. We propose a novel neural network model called Persistent Homology based Graph Convolution Network (PHGCN), which i) integrates persistent homology into a graph convolution network to capture multiscale structural information that can accurately represent complex structures of 3D objects; ii) applies a novel Persistence Diagram Loss (ℒ_PD) that provides sufficient topology correctness for segmentation over the fine-grained structures. Extensive experiments on fine-grained 3D segmentation validate the effectiveness of the proposed PHGCN model and show significant improvements over current state-of-the-art methods.
CD8 T-Cell Reactivity to Islet Antigens Is Unique to Type 1 While CD4 T-Cell Reactivity Exists in Both Type 1 and Type 2 Diabetes (2014)
Ghanashyam Sarikonda, Jeremy Pettus, Sonal Phatak, Sowbarnika Sachithanantham, Jacqueline F. Miller, Johnna D. Wesley, Eithon Cadag, Ji Chae, Lakshmi Ganesan, Ronna Mallios, Steve Edelman, Bjoern Peters, Matthias von Herrath
Abstract
Previous cross-sectional analyses demonstrated that CD8+ and CD4+ T-cell reactivity to islet-specific antigens was more prevalent in T1D subjects than in healthy donors (HD). Here, we examined T1D-associated epitope-specific CD4+ T-cell cytokine production and autoreactive CD8+ T-cell frequency on a monthly basis for one year in 10 HD, 33 subjects with T1D, and 15 subjects with T2D. Autoreactive CD4+ T-cells from both T1D and T2D subjects produced more IFN-γ when stimulated than cells from HD. In contrast, higher frequencies of islet antigen-specific CD8+ T-cells were detected only in T1D. These observations support the hypothesis that general beta-cell stress drives autoreactive CD4+ T-cell activity while islet overexpression of MHC class I commonly seen in T1D mediates amplification of CD8+ T-cells and more rapid beta-cell loss. In conclusion, CD4+ T-cell autoreactivity appears to be present in both T1D and T2D while autoreactive CD8+ T-cells are unique to T1D. Thus, autoreactive CD8+ cells may serve as a more T1D-specific biomarker.
Data-Driven and Automatic Surface Texture Analysis Using Persistent Homology (2021)
Melih C. Yesilli, Firas A. Khasawneh
Abstract
Surface roughness plays an important role in analyzing engineering surfaces. It quantifies the surface topography and can be used to determine whether the resulting surface finish is acceptable or not. Nevertheless, while several existing tools and standards are available for computing surface roughness, these methods rely heavily on user input, thus slowing down the analysis and increasing manufacturing costs. Therefore, fast and automatic determination of the roughness level is essential to avoid costs resulting from surfaces with unacceptable finish, and user-intensive analysis. In this study, we propose a Topological Data Analysis (TDA) based approach to classify the roughness level of synthetic surfaces using both their areal images and profiles. We utilize persistent homology from TDA to generate persistence diagrams that encapsulate information on the shape of the surface. We then obtain feature matrices for each surface or profile using Carlsson coordinates, persistence images, and template functions. We compare our results to two widely used methods in the literature: Fast Fourier Transform (FFT) and Gaussian filtering. The results show that our approach yields mean accuracies as high as 97%. We also show that, in contrast to existing surface analysis tools, our TDA-based approach is fully automatable and provides adaptive feature extraction.
Relational Persistent Homology for Multispecies Data With Application to the Tumor Microenvironment (2023)
Bernadette J. Stolz, Jagdeep Dhesi, Joshua A. Bull, Heather A. Harrington, Helen M. Byrne, Iris H. R. Yoon
Abstract
Topological data analysis (TDA) is an active field of mathematics for quantifying shape in complex data. Standard methods in TDA such as persistent homology (PH) are typically focused on the analysis of data consisting of a single entity (e.g., cells or molecular species). However, state-of-the-art data collection techniques now generate exquisitely detailed multispecies data, prompting a need for methods that can examine and quantify the relations among them. Such heterogeneous data types arise in many contexts, ranging from biomedical imaging and geospatial analysis to species ecology. Here, we propose two methods for encoding spatial relations among different data types that are based on Dowker complexes and Witness complexes. We apply the methods to synthetic multispecies data of a tumor microenvironment and analyze topological features that capture relations between different cell types, e.g., blood vessels, macrophages, tumor cells, and necrotic cells. We demonstrate that relational topological features can extract biological insight, including the dominant immune cell phenotype (an important predictor of patient prognosis) and the parameter regimes of a data-generating model. The methods provide a quantitative perspective on the relational analysis of multispecies spatial data, overcome the limits of traditional PH, and are readily computable.
Exploring Surface Texture Quantification in Piezo Vibration Striking Treatment (PVST) Using Topological Measures (2022)
Melih C. Yesilli, Max M. Chumley, Jisheng Chen, Firas A. Khasawneh, Yang Guo
Abstract
Surface texture influences wear and tribological properties of manufactured parts, and it plays a critical role in end-user products. Therefore, quantifying the order or structure of a manufactured surface provides important information on the quality and life expectancy of the product. Although texture can be intentionally introduced to enhance aesthetics or to satisfy a design function, sometimes it is an inevitable byproduct of surface treatment processes such as Piezo Vibration Striking Treatment (PVST). Measures of order for surfaces have been characterized using statistical, spectral, and geometric approaches. For nearly hexagonal lattices, topological tools have also been used to measure the surface order. This paper explores utilizing tools from Topological Data Analysis for measuring surface texture. We compute measures of order based on optical digital microscope images of surfaces treated using PVST. These measures are applied to the grid obtained from estimating the centers of tool impacts, and they quantify the grid’s deviations from the nominal one. Our results show that TDA provides a convenient framework for characterization of pattern type that bypasses some limitations of existing tools such as difficult manual processing of the data and the need for an expert user to analyze and interpret the surface images.
Persistent Homology Analysis of Protein Structure, Flexibility, and Folding (2014)
Kelin Xia, Guo-Wei Wei
Abstract
Proteins are the most important biomolecules for living organisms. The understanding of protein structure, function, dynamics, and transport is one of the most challenging tasks in biological science. In the present work, persistent homology is, for the first time, introduced for extracting molecular topological fingerprints (MTFs) based on the persistence of molecular topological invariants. MTFs are utilized for protein characterization, identification, and classification. The method of slicing is proposed to track the geometric origin of protein topological invariants. Both all-atom and coarse-grained representations of MTFs are constructed. A new cutoff-like filtration is proposed to shed light on the optimal cutoff distance in elastic network models. On the basis of the correlation between protein compactness, rigidity, and connectivity, we propose an accumulated bar length generated from persistent topological invariants for the quantitative modeling of protein flexibility. To this end, a correlation matrix-based filtration is developed. This approach gives rise to an accurate prediction of the optimal characteristic distance used in protein B-factor analysis. Finally, MTFs are employed to characterize protein topological evolution during protein folding and quantitatively predict the protein folding stability. Excellent consistency between our persistent homology prediction and molecular dynamics simulation is found. This work reveals the topology–function relationship of proteins. Copyright © 2014 John Wiley & Sons, Ltd.
Multiresolution Persistent Homology for Excessively Large Biomolecular Datasets (2015)
Kelin Xia, Zhixiong Zhao, Guo-Wei Wei
Abstract
Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large scale datasets with appropriate resolution. We utilize the flexibility-rigidity index to assess the topological connectivity of the data set and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated by a hexagonal fractal image which has three distinct scales. We further demonstrate the proposed method for extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed, which would otherwise be inaccessible to the normal point cloud method and unreliable by using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to protein domain classification, which is, to our knowledge, the first time that persistent homology has been used for practical protein domain analysis. The proposed multiresolution topological method has potential applications in arbitrary data sets, such as social networks, biological networks, and graphs.
Branching and Circular Features in High Dimensional Data (2011)
B. Wang, B. Summa, V. Pascucci, M. Vejdemo-Johansson
Abstract
Large observations and simulations in scientific research give rise to high-dimensional data sets that present many challenges and opportunities in data analysis and visualization. Researchers in application domains such as engineering, computational biology, climate study, imaging and motion capture are faced with the problem of how to discover compact representations of high-dimensional data while preserving their intrinsic structure. In many applications, the original data is projected onto low-dimensional space via dimensionality reduction techniques prior to modeling. One problem with this approach is that the projection step in the process can fail to preserve structure in the data that is only apparent in high dimensions. Conversely, such techniques may create structural illusions in the projection, implying structure not present in the original high-dimensional data. Our solution is to utilize topological techniques to recover important structures in high-dimensional data that contains non-trivial topology. Specifically, we are interested in high-dimensional branching structures. We construct local circle-valued coordinate functions to represent such features. Subsequently, we perform dimensionality reduction on the data while ensuring such structures are visually preserved. Additionally, we study the effects of global circular structures on visualizations. Our results reveal never-before-seen structures on real-world data sets from a variety of applications.
Topological Data Analysis of Financial Time Series: Landscapes of Crashes (2017)
Marian Gidea, Yuri Katz
Abstract
We explore the evolution of daily returns of four major US stock market indices during the technology crash of 2000, and the financial crisis of 2007-2009. Our methodology is based on topological data analysis (TDA). We use persistent homology to detect and quantify topological patterns that appear in multidimensional time series. Using a sliding window, we extract time-dependent point cloud data sets, to which we associate a topological space. We detect transient loops that appear in this space, and we measure their persistence. This is encoded in real-valued functions referred to as 'persistence landscapes'. We quantify the temporal changes in persistence landscapes via their $L^p$-norms. We test this procedure on multidimensional time series generated by various non-linear and non-equilibrium models. We find that, in the vicinity of financial meltdowns, the $L^p$-norms exhibit strong growth prior to the primary peak, which ascends during a crash. Remarkably, the average spectral density at low frequencies of the time series of $L^p$-norms of the persistence landscapes demonstrates a strong rising trend for 250 trading days prior to either the dot-com crash on 03/10/2000 or the Lehman bankruptcy on 09/15/2008. Our study suggests that TDA provides a new type of econometric analysis, which goes beyond the standard statistical measures. The method can be used to detect early warning signals of imminent market crashes. We believe that this approach can be used beyond the analysis of financial time series presented here.
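Two of the steps described above are simple enough to sketch without a topology library: cutting a multidimensional time series into sliding-window point clouds, and approximating the $L^p$-norm of a function sampled on a uniform grid. The persistent homology computation that sits between these two steps is omitted. Function names, the window width, and the toy returns are hypothetical, and the Riemann-sum approximation is an assumption of this sketch rather than the authors' stated procedure.

```python
def sliding_windows(series, width):
    """Point clouds of `width` consecutive observations, one per start index."""
    return [series[start:start + width] for start in range(len(series) - width + 1)]


def lp_norm(samples, step, p):
    """Riemann-sum approximation of the L^p norm of a function sampled every `step`."""
    return (sum(abs(value) ** p for value in samples) * step) ** (1.0 / p)


# Hypothetical daily returns of two indices: each observation is a point in R^2.
toy_returns = [(0.1, 0.2), (0.0, -0.1), (0.3, 0.1), (-0.2, 0.0)]
clouds = sliding_windows(toy_returns, width=3)

# Hypothetical samples of a landscape function on a grid with spacing 1.0.
norm_l2 = lp_norm([3.0, 4.0], step=1.0, p=2)
```

Applying `lp_norm` to the landscape of each window in turn would give one number per window, that is, a scalar time series of the kind the abstract tracks ahead of a crash.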
A Topological Framework for Identifying Phenomenological Bifurcations in Stochastic Dynamical Systems (2024)
Sunia Tanweer, Firas A. Khasawneh, Elizabeth Munch, Joshua R. Tempelman
Abstract
Changes in the parameters of dynamical systems can cause the state of the system to shift between different qualitative regimes. These shifts, known as bifurcations, are critical to study as they can indicate when the system is about to undergo harmful changes in its behavior. In stochastic dynamical systems, there is particular interest in P-type (phenomenological) bifurcations, which can include transitions from a monostable state to multi-stable states, the appearance of stochastic limit cycles and other features in the probability density function (PDF) of the system’s state. Current practices are limited to systems with small state spaces, cannot detect all possible behaviors of the PDFs and mandate human intervention for visually identifying the change in the PDF. In contrast, this study presents a new approach based on Topological Data Analysis that uses superlevel persistence to mathematically quantify P-type bifurcations in stochastic systems through a “homological bifurcation plot”, which shows the changing ranks of the 0th and 1st homology groups through Betti vectors. Using these plots, we demonstrate the successful detection of P-bifurcations on the stochastic Duffing, Rayleigh-Van der Pol and Quintic Oscillators given their analytical PDFs, and elaborate on how to generate an estimated homological bifurcation plot given a kernel density estimate (KDE) of these systems by employing a tool for finding topological consistency between PDFs and KDEs.
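As a simplified illustration of the Betti vector idea, the sketch below counts the connected components (rank of the 0th homology group) of superlevel sets of a density sampled on a one-dimensional grid, for a list of thresholds. It is a toy under stated assumptions, not the paper's pipeline: the densities are hypothetical unnormalized samples, only dimension 0 is handled, and the function names are invented for this example.

```python
def betti0_superlevel(values, level):
    """Number of maximal runs of consecutive grid points with value >= level."""
    count = 0
    inside = False
    for value in values:
        above = value >= level
        if above and not inside:  # a new component starts here
            count += 1
        inside = above
    return count


def betti0_vector(values, levels):
    """Betti-0 of the superlevel set at each threshold in `levels`."""
    return [betti0_superlevel(values, level) for level in levels]


# Hypothetical sampled densities: one peak versus two peaks.
monostable = [0.0, 0.2, 0.6, 1.0, 0.6, 0.2, 0.0]
bistable = [0.0, 0.8, 0.3, 0.1, 0.3, 0.8, 0.0]
levels = [0.9, 0.5, 0.2, 0.05]

mono_vector = betti0_vector(monostable, levels)
bi_vector = betti0_vector(bistable, levels)
```

A monostable density never shows more than one component, while the bistable one shows two over a range of thresholds; plotting such vectors against a system parameter is the intuition behind a homological bifurcation plot.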
Time-Inhomogeneous Diffusion Geometry and Topology (2022)
Guillaume Huguet, Alexander Tong, Bastian Rieck, Jessie Huang, Manik Kuchroo, Matthew Hirn, Guy Wolf, Smita Krishnaswamy
Abstract
Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic diffusion homology. We use this intrinsic topology as well as an ambient topology to study how the data changes over diffusion time. We demonstrate both homologies in well-understood toy examples. Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
A Topological Approach to Selecting Models of Biological Experiments (2019)
M. Ulmer, Lori Ziegelmeier, Chad M. Topaz
Abstract
We use topological data analysis as a tool to analyze the fit of mathematical models to experimental data. This study is built on data obtained from motion tracking groups of aphids in [Nilsen et al., PLOS One, 2013] and two random walk models that were proposed to describe the data. One model incorporates social interactions between the insects via a functional dependence on an aphid’s distance to its nearest neighbor. The second model is a control model that ignores this dependence. We compare data from each model to data from experiment by performing statistical tests based on three different sets of measures. First, we use time series of order parameters commonly used in collective motion studies. These order parameters measure the overall polarization and angular momentum of the group, and do not rely on a priori knowledge of the models that produced the data. Second, we use order parameter time series that do rely on a priori knowledge, namely average distance to nearest neighbor and percentage of aphids moving. Third, we use computational persistent homology to calculate topological signatures of the data. Analysis of the a priori order parameters indicates that the interactive model better describes the experimental data than the control model does. The topological approach performs as well as these a priori order parameters and better than the other order parameters, suggesting the utility of the topological approach in the absence of specific knowledge of mechanisms underlying the data. 
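For readers unfamiliar with the order parameters mentioned above, the following sketch computes a group polarization from two-dimensional velocities, using a definition common in collective motion studies: the length of the average unit heading vector. Whether the paper uses exactly this normalization is an assumption, as is the choice to skip stationary individuals; the function name and toy velocities are hypothetical.

```python
import math


def polarization(velocities):
    """Length of the average unit heading vector, a value between 0 and 1."""
    headings = []
    for vx, vy in velocities:
        speed = math.hypot(vx, vy)
        if speed > 0.0:  # assumption: stationary individuals are skipped
            headings.append((vx / speed, vy / speed))
    if not headings:
        return 0.0
    mean_x = sum(h[0] for h in headings) / len(headings)
    mean_y = sum(h[1] for h in headings) / len(headings)
    return math.hypot(mean_x, mean_y)


# Hypothetical velocity snapshots at a single time step.
aligned = [(1.0, 0.0), (2.0, 0.0), (0.5, 0.0)]
opposed = [(1.0, 0.0), (-1.0, 0.0)]
```

Evaluating this at every frame gives a time series of the kind the abstract compares between the models and the experiment. It needs only headings, not knowledge of how the models were built, which is why the abstract groups it with the measures that do not rely on a priori knowledge.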
Topological Data Analysis for the Characterization of Atomic Scale Morphology From Atom Probe Tomography Images (2018)
Tianmu Zhang, Scott R. Broderick, Krishna Rajan
Abstract
Atom probe tomography (APT) represents a revolutionary characterization tool for materials that combines atomic imaging with a time-of-flight (TOF) mass spectrometer to provide direct-space, three-dimensional, atomic scale resolution images of materials with the chemical identities of hundreds of millions of atoms. It involves the controlled removal of atoms from a specimen’s surface by field evaporation and then sequentially analyzing them with a position-sensitive detector and TOF mass spectrometer. A paradox in APT is that while, on the one hand, it provides an unprecedented level of imaging resolution in three dimensions, it is very difficult to obtain an accurate perspective of morphology or shape outlined by atoms of similar chemistry and microstructure. The origins of this problem are numerous, including incomplete detection of atoms and the complexity of the evaporation fields of atoms at or near interfaces. Hence, unlike scattering techniques such as electron microscopy, interfaces appear diffused, not sharp. This, in turn, makes it challenging to visualize and quantitatively interpret the microstructure at the “meso” scale, where one is interested in the shape and form of the interfaces and their associated chemical gradients. It is here that the application of informatics at the nanoscale and statistical learning methods plays a critical role in both defining the level of uncertainty and helping to make quantitative, statistically objective interpretations where heuristics often dominate. In this chapter, we show how the tools of Topological Data Analysis provide a new and powerful tool in the field of nanoinformatics for materials characterization.
Topological Data Analysis as a Morphometric Method: Using Persistent Homology to Demarcate a Leaf Morphospace (2018)
Mao Li, Hong An, Ruthie Angelovici, Clement Bagaza, Albert Batushansky, Lynn Clark, Viktoriya Coneva, Michael J. Donoghue, Erika Edwards, Diego Fajardo, Hui Fang, Margaret H. Frank, Timothy Gallaher, Sarah Gebken, Theresa Hill, Shelley Jansky, Baljinder Kaur, Phillip C. Klahs, Laura L. Klein, Vasu Kuraparthy, Jason Londo, Zoë Migicovsky, Allison Miller, Rebekah Mohn, Sean Myles, Wagner C. Otoni, J. C. Pires, Edmond Rieffer, Sam Schmerler, Elizabeth Spriggs, Christopher N. Topp, Allen Van Deynze, Kuang Zhang, Linglong Zhu, Braden M. Zink, Daniel H. Chitwood
Abstract
Current morphometric methods that comprehensively measure shape cannot compare the disparate leaf shapes found in seed plants and are sensitive to processing artifacts. We explore the use of persistent homology, a topological method applied as a filtration across simplicial complexes (or more simply, a method to measure topological features of spaces across different spatial resolutions), to overcome these limitations. The described method isolates subsets of shape features and measures the spatial relationship of neighboring pixel densities in a shape. We apply the method to the analysis of 182,707 leaves, both published and unpublished, representing 141 plant families collected from 75 sites throughout the world. By measuring leaves from throughout the seed plants using persistent homology, a defined morphospace comparing all leaves is demarcated. Clear differences in shape between major phylogenetic groups are detected and estimates of leaf shape diversity within plant families are made. The approach predicts plant family above chance. The application of a persistent homology method, using topological features, to measure leaf shape allows for a unified morphometric framework to measure plant form, including shapes, textures, patterns, and branching architectures. 
Multivariate Data Analysis Using Persistence-Based Filtering and Topological Signatures (2012)
B. Rieck, H. Mara, H. Leitte
Abstract
The extraction of significant structures in arbitrary high-dimensional data sets is a challenging task. Moreover, classifying data points as noise in order to reduce a data set bears special relevance for many application domains. Standard methods such as clustering serve to reduce problem complexity by providing the user with classes of similar entities. However, they usually do not highlight relations between different entities and require a stopping criterion, e.g. the number of clusters to be detected. In this paper, we present a visualization pipeline based on recent advancements in algebraic topology. More precisely, we employ methods from persistent homology that enable topological data analysis on high-dimensional data sets. Our pipeline inherently copes with noisy data and data sets of arbitrary dimensions. It extracts central structures of a data set in a hierarchical manner by using a persistence-based filtering algorithm that is theoretically well-founded. We furthermore introduce persistence rings, a novel visualization technique for a class of topological features (the persistence intervals) of large data sets. Persistence rings provide a unique topological signature of a data set, which helps in recognizing similarities. In addition, we provide interactive visualization techniques that assist the user in evaluating the parameter space of our method in order to extract relevant structures. We describe and evaluate our analysis pipeline by means of two very distinct classes of data sets: First, a class of synthetic data sets containing topological objects is employed to highlight the interaction capabilities of our method. Second, in order to affirm the utility of our technique, we analyse a class of high-dimensional real-world data sets arising from current research in cultural heritage.
Applications of Persistent Homology to Time Varying Systems (2013)
Elizabeth Munch
Abstract
This dissertation extends the theory of persistent homology to time varying systems. Most of the previous work has been dedicated to using this powerful tool in topological data analysis to study static point clouds. In particular, given a point cloud, we can construct its persistence diagram. Since the diagram varies continuously as the point cloud varies continuously, we study the space of time varying persistence diagrams, called vineyards when they were introduced by Cohen-Steiner, Edelsbrunner, and Morozov. We will first show that with a good choice of metric, these vineyards are stable for small perturbations of their associated point clouds. We will also define a new mean for a set of persistence diagrams based on the work of Mileyko et al. which, unlike the previously defined mean, is continuous for geodesic vineyards. Next, we study the sensor network problem posed by Ghrist and de Silva, and their application of persistent homology to understand when a set of sensors covers a given region. Giving each of these sensors a probability of failure over time, we show that an exact computation of the probability of failure of the whole system is NP-hard, but give an algorithm which can predict failure in the case of a monitored system. Finally, we apply these methods to an automated system which can cluster agents moving in aerial images by their behaviors. We build a data structure for storing and querying the information in real-time, and define behavior vectors which quantify behaviors of interest. This clustering by behavior can be used to find groups of interest, for which we can also quantify behaviors in order to determine whether the group is working together to achieve a common goal, and we speculate that this work can be extended to improving tracking algorithms as well as behavioral predictors.
Identification of Copy Number Aberrations in Breast Cancer Subtypes Using Persistence Topology (2015)
Javier Arsuaga, Tyler Borrman, Raymond Cavalcante, Georgina Gonzalez, Catherine Park
Abstract
DNA copy number aberrations (CNAs) are of biological and medical interest because they help identify regulatory mechanisms underlying tumor initiation and evolution. Identification of tumor-driving CNAs (driver CNAs), however, remains a challenging task, because they are frequently hidden by CNAs that are the product of random events that take place during tumor evolution. Experimental detection of CNAs is commonly accomplished through array comparative genomic hybridization (aCGH) assays followed by supervised and/or unsupervised statistical methods that combine the segmented profiles of all patients to identify driver CNAs. Here, we extend a previously presented supervised algorithm for the identification of CNAs that is based on a topological representation of the data. Our method associates a two-dimensional (2D) point cloud with each aCGH profile and generates a sequence of simplicial complexes, mathematical objects that generalize the concept of a graph. This representation of the data permits segmenting the data at different resolutions and identifying CNAs by interrogating the topological properties of these simplicial complexes. We tested our approach on a published dataset with the goal of identifying specific breast cancer CNAs associated with specific molecular subtypes. Identification of CNAs associated with each subtype was performed by analyzing each subtype separately from the others and by taking the rest of the subtypes as the control. Our results found a new amplification in 11q at the location of the progesterone receptor in the Luminal A subtype. Aberrations in the Luminal B subtype were found only upon removal of the basal-like subtype from the control set. Under those conditions, all regions found in the original publication, except for 17q, were confirmed; all aberrations, except those in chromosome arms 8q and 12q, were confirmed in the basal-like subtype. These two chromosome arms, however, were detected only upon removal of three patients with exceedingly large copy number values. More importantly, we detected 10 and 21 additional regions in the Luminal B and basal-like subtypes, respectively. Most of the additional regions were either validated on an independent dataset and/or using GISTIC. Furthermore, we found three new CNAs in the basal-like subtype: a combination of gains and losses in 1p, a gain in 2p and a loss in 14q. Based on these results, we suggest that topological approaches that incorporate multiresolution analyses and that interrogate topological properties of the data can help in the identification of copy number changes in cancer.
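To make the idea of interrogating a sequence of simplicial complexes built on a 2D point cloud more concrete, here is a minimal sketch of zero-dimensional persistence (connected components) in a Vietoris-Rips style filtration. It is not the authors' algorithm: the function name `h0_persistence`, the use of edge length as the filtration scale, and the toy points are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the scales at which connected components merge.

    Every point starts as its own component at scale 0. Edges enter the
    filtration in order of length, and each edge that joins two distinct
    components records one merge scale. The component that never merges
    is omitted from the result.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Toy cloud (made-up values): two tight pairs separated by a wide gap.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(h0_persistence(cloud))
```

On this toy cloud the two short merges correspond to the tight pairs, and the single long-lived component reflects the wide gap between them. Reading such long-lived features off at different scales is the general mechanism that multiresolution topological analyses rely on.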