TDA Methods 9 entries

Applied TDA Methods

Statistical and computational methods for applying topological data analysis to real datasets, including software references and probability theory for persistence diagrams.

Introductory

  1. Adams, H., Tausz, A., & Vejdemo-Johansson, M. (2014). JavaPlex: A research software package for persistent (co)homology. In H. Hong & C. Yap (Eds.), Mathematical Software – ICMS 2014 (pp. 129–136). Springer.

    Introductory

    JavaPlex was one of the first widely used research software packages for TDA and remains valuable for teaching because its code is clearly documented. The associated tutorials (available separately) walk through persistent homology computation step by step in MATLAB, making them an excellent companion for readers working through the Edelsbrunner–Harer textbook.

Intermediate

  1. Chazal, F., & Michel, B. (2021). An introduction to topological data analysis: Fundamental and practical aspects for data scientists. Frontiers in Artificial Intelligence, 4, 667963.

    Intermediate

    A comprehensive and recent survey of TDA for data scientists. Chazal and Michel cover persistent homology, mapper, Betti numbers, stability theorems, and statistical inference, with worked examples throughout. This should be the primary reference for practitioners who want to understand what TDA computes and why the results are statistically meaningful.

  2. Bubenik, P. (2015). Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16(1), 77–102.

    Intermediate

    Bubenik introduces the persistence landscape — a functional summary of a persistence diagram that lives in a Hilbert space. This construction enables the application of standard statistical tools (means, variances, hypothesis tests) to topological summaries without the metric complications that affect the Wasserstein or bottleneck distances. A key paper for applied work.
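
    As a rough illustration of the construction described above, the sketch below evaluates landscape functions on a grid and averages them pointwise. It assumes the usual definition, in which the k-th landscape at t is the k-th largest value of max(0, min(t - birth, death - t)) over the diagram. The function names and the toy diagrams are hypothetical and are not taken from the paper or any library.

```python
def landscape(diagram, k, t):
    """Value at t of the k-th persistence landscape function (k = 1, 2, ...)."""
    # Each (birth, death) pair contributes a tent function peaking at its midpoint.
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in diagram),
        reverse=True,
    )
    # The k-th landscape is the k-th largest tent value, or 0 if there are fewer tents.
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, grid):
    """Pointwise average of the k-th landscape over several diagrams."""
    return [
        sum(landscape(dgm, k, t) for dgm in diagrams) / len(diagrams)
        for t in grid
    ]


# Hypothetical toy diagrams, for illustration only.
diagram_a = [(0.0, 4.0), (1.0, 3.0)]
diagram_b = [(0.0, 2.0)]
grid = [0.0, 1.0, 2.0, 3.0, 4.0]

lam1_a = [landscape(diagram_a, 1, t) for t in grid]  # [0.0, 1.0, 2.0, 1.0, 0.0]
lam2_a = [landscape(diagram_a, 2, t) for t in grid]  # [0.0, 0.0, 1.0, 0.0, 0.0]
mean1 = mean_landscape([diagram_a, diagram_b], 1, grid)  # [0.0, 1.0, 1.0, 0.5, 0.0]
```

    Because each landscape is an ordinary function sampled on a grid, the average is an ordinary pointwise mean, which is what makes standard statistical tools applicable.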

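  3. Bauer, U. (2021). Ripser: Efficient computation of Vietoris–Rips persistence barcodes. Journal of Applied and Computational Topology, 5(3), 391–423.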

    Intermediate

    Bauer's Ripser is the state-of-the-art software for computing Vietoris–Rips persistence barcodes, orders of magnitude faster than earlier implementations. This paper describes the algorithmic innovations (clearing, cohomology, apparent pairs) that make Ripser fast. For anyone computing persistent homology in practice, this is the primary software reference.
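
    To make the idea of a Vietoris–Rips barcode concrete, here is a minimal sketch that computes only the 0-dimensional bars with a union-find sweep over edges sorted by length. It is not Ripser's algorithm: none of the clearing, cohomology, or apparent-pairs optimisations appear, and higher-dimensional homology is not handled. The function name rips_h0_barcode and the sample points are hypothetical, and the filtration scale is assumed to be the pairwise distance.

```python
from itertools import combinations
from math import dist


def rips_h0_barcode(points):
    """Finite H0 bars (birth, death) of a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies at the length of the
    edge that first merges it into another component. The single bar that
    never dies is omitted from the result.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first (assumes Euclidean distance).
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j      # merge the two components
            bars.append((0.0, length))   # one component dies at this scale
    return bars


# Two tight pairs of points that are far apart from each other.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
bars = rips_h0_barcode(points)  # [(0.0, 1.0), (0.0, 1.0), (0.0, 4.0)]
```

    The long bar ending at 4.0 reflects the gap between the two clusters, while the short bars reflect within-cluster spacing.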

  4. The GUDHI Project. (2021). GUDHI User and Reference Manual (Version 3.4.0). GUDHI Editorial Board. https://gudhi.inria.fr/doc/3.4.0/

    Intermediate

    GUDHI is the primary C++/Python library for TDA developed at Inria. It implements Čech, Rips, Alpha, and cubical complexes alongside persistent homology computation and the mapper algorithm. The reference manual is dense, but the Python interface tutorials are accessible. GUDHI is the recommended library for research-grade TDA work beyond what Ripser covers.

  5. Otter, N., Porter, M. A., Tillmann, U., Grindrod, P., & Harrington, H. A. (2017). A roadmap for the computation of persistent homology. EPJ Data Science, 6(1), 17.

    Intermediate

    A practical guide to the computational choices involved in TDA: which filtration to use, which software to choose, and how to handle noise and parameter selection. Otter et al. compare filtration types (Rips, Čech, alpha) and software packages (Ripser, GUDHI, Dionysus, JavaPlex) with concrete benchmarks. An invaluable methodological companion for applied work.

  6. Hensel, F., Moor, M., & Rieck, B. (2021). A survey of topological machine learning methods. Frontiers in Artificial Intelligence, 4, 681108.

    Intermediate

    A broad survey of how topological features are integrated into machine learning pipelines, covering persistence images, landscapes, Betti curves, and graph-level TDA. Useful for understanding how the statistical summaries introduced by Bubenik and Turner are deployed in practice, and for identifying appropriate feature representations for downstream modelling tasks.
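
    As one small example of the feature representations mentioned above, a Betti curve can be sketched as a count of the bars alive at each grid value. The snippet assumes half-open intervals [birth, death), and the function name and toy diagram are hypothetical.

```python
def betti_curve(diagram, grid):
    """Number of intervals [birth, death) containing each grid value."""
    return [
        sum(1 for birth, death in diagram if birth <= t < death)
        for t in grid
    ]


# Hypothetical diagram with one bar that never dies.
diagram = [(0.0, 4.0), (1.0, 3.0), (2.0, float("inf"))]
grid = [0.0, 1.0, 2.0, 3.0, 4.0]

curve = betti_curve(diagram, grid)  # [1, 2, 3, 2, 1]
```

    The result is a fixed-length vector, so it can be passed directly to a standard classifier or regressor.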

Advanced

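  1. Turner, K., Mileyko, Y., Mukherjee, S., & Harer, J. (2014). Fréchet means for distributions of persistence diagrams. Discrete & Computational Geometry, 52(1), 44–70.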

    Advanced

    Turner, Mileyko, Mukherjee, and Harer develop Fréchet means for distributions of persistence diagrams, establishing a rigorous framework for averaging and comparing collections of diagrams. This paper is essential for anyone wanting to do statistics on topological summaries — for example, comparing the persistent homology of poverty maps across regions.

  2. Mileyko, Y., Mukherjee, S., & Harer, J. (2011). Probability measures on the space of persistence diagrams. Inverse Problems, 27(12), 124007.

    Advanced

    Establishes the foundational probability theory for persistence diagrams. The authors show that the space of diagrams equipped with the Wasserstein metric is complete and separable, which is what allows expectations (Fréchet means) and variances to be defined on it. This paper provides the theoretical basis for Turner et al. (2014) and Bubenik (2015), and is essential reading for anyone working on statistical inference with TDA.
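
    For readers who have not met the Wasserstein metric on diagrams, the following brute-force sketch may help. It tries every matching between two small diagrams, allowing any point to be matched to the diagonal. It assumes the L-infinity ground distance, under which a point (birth, death) lies at distance (death - birth) / 2 from the diagonal; other conventions exist. The function name is hypothetical, and the factorial cost makes this suitable only for toy inputs.

```python
from itertools import permutations


def wasserstein_distance(dgm_x, dgm_y, p=2):
    """p-Wasserstein distance between two small diagrams, by exhaustive search."""
    # None stands for "the diagonal". Padding each side with one diagonal slot
    # per point on the other side lets every point be left unmatched.
    left = list(dgm_x) + [None] * len(dgm_y)
    right = list(dgm_y) + [None] * len(dgm_x)

    def cost(a, b):
        if a is None and b is None:
            return 0.0                      # diagonal to diagonal is free
        if a is None or b is None:
            birth, death = a if b is None else b
            return ((death - birth) / 2.0) ** p   # point to diagonal
        # Point to point, using the L-infinity ground distance.
        return max(abs(a[0] - b[0]), abs(a[1] - b[1])) ** p

    best = min(
        sum(cost(a, right[j]) for a, j in zip(left, perm))
        for perm in permutations(range(len(right)))
    )
    return best ** (1.0 / p)


w_close = wasserstein_distance([(0.0, 4.0)], [(0.0, 3.0)])  # 1.0
w_empty = wasserstein_distance([(0.0, 4.0)], [])            # 2.0
```

    In the first call it is cheaper to match the two points to each other than to send both to the diagonal. In the second, the only option is to match the single point to the diagonal.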