Persistent Homology
Key Points
- Persistent homology tracks how topological features (components, loops, voids) appear and disappear as a scale parameter grows over a filtered simplicial complex.
- A filtration is an increasing sequence of complexes in which every simplex enters at a specific scale, but only after all of its faces.
- The core computation reduces a boundary matrix over a field (often \(\mathbb{Z}_2\)) to obtain birth–death pairs and barcodes/diagrams.
- Vietoris–Rips filtrations are built from point clouds by inserting simplices when all pairwise distances fall below a threshold; the filtration value of a simplex is its maximum edge length.
- Zero columns in the reduced boundary matrix create features; nonzero columns kill previously created features, forming birth–death pairs.
- Long bars often indicate meaningful shape while short bars are more likely noise, and stability theorems quantify this robustness.
- Computing only \(H_0\) (connected components) is very fast with union–find and is equivalent to Kruskal's MST process.
- Full persistent homology can be cubic in the number of simplices, so careful filtration design and low-dimensional truncation are crucial.
Prerequisites
- Basic linear algebra — Understanding vector spaces, matrices, kernels, images, and rank is needed for chain groups and boundary matrix reduction.
- Graph theory and combinatorics — Simplicial complexes generalize graphs; counting and constructing simplices relies on combinatorial reasoning.
- Point-set topology (introductory) — Concepts like connectedness and loops clarify the meaning of H0 and H1 features.
- Algorithms and data structures — Efficient implementations need sorting, hash maps, union–find, and sparse representations.
- Metric spaces and distances — Rips filtrations depend on pairwise distances; properties like the triangle inequality matter.
- Computational geometry (basics) — Alternative filtrations (alpha complexes) use Delaunay/Voronoi structures and geometric primitives.
Detailed Explanation
Overview
Persistent homology is a method from topological data analysis (TDA) that summarizes the shape of data across scales. Given data (for example, a point cloud), we do not fix a single scale; instead, we build a family of shapes, called a filtration, by steadily relaxing a scale parameter. As the scale grows, new connections form, loops appear, and voids are filled. Persistent homology records when each feature is born and when it dies, producing a multiscale signature of the data such as a barcode (a set of intervals) or a persistence diagram (points in the plane).

Practically, we first turn data into a simplicial complex (e.g., Vietoris–Rips or alpha complex), assign each simplex a filtration value that respects the face condition (faces appear no later than the simplex), sort simplices by this order, build a boundary matrix over a field, and perform a matrix reduction. The pivot structure of the reduced matrix reveals birth–death pairs across homology dimensions (\(H_0\) for components, \(H_1\) for loops, \(H_2\) for voids, etc.). The result is robust: small perturbations of the input cause small changes in the diagram. Persistent homology has become a standard tool for understanding structure in high-dimensional and noisy data.
Intuition & Analogies
Imagine slowly flooding a landscape. At first, isolated puddles form in the lowest areas; these are connected components (\(H_0\)). As the water level rises, puddles merge, and some components die: the moment two pools connect, one of their identities is lost. Occasionally, water can surround a hill, creating a loop (\(H_1\)); later, as the level keeps rising, that loop is filled in and dies. In higher dimensions, the same story repeats with voids (\(H_2\)), like air pockets in a sponge, which appear and later disappear. Persistent homology is the logbook of this flood, noting the water levels (scales) at which each feature appears (birth) and vanishes (death). For point clouds, replace the water with growing balls around each point. Where two balls touch, an edge appears; cycles of edges not yet filled by triangles form loops; as the balls keep growing, triangles spanned by mutually touching triplets fill in and kill those loops. If a loop persists over a large range of radii, it is like a ring-shaped canyon, hard to destroy, so it shows as a long bar and likely reflects real structure. Short bars often correspond to small, local irregularities, ripples in the terrain, more likely caused by noise. The beauty is that we don't have to pick a single radius (which can be arbitrary and fragile); instead, we read structure across all radii and trust the features that endure.
Formal Definition
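The body of this section is missing; a compact version of the standard definitions, using the same notation as the Key Formulas section below, reads as follows.

```latex
% Filtration: a nested family of simplicial complexes indexed by scale,
K_{t_0} \subseteq K_{t_1} \subseteq \cdots \subseteq K_{t_m} = K,
\qquad t_0 \le t_1 \le \cdots \le t_m.

% Homology at each scale t, over a field (here \mathbb{Z}_2):
H_k(K_t) \;=\; \ker \partial_k \,/\, \operatorname{im} \partial_{k+1}.

% Inclusions K_s \hookrightarrow K_t for s \le t induce linear maps
\varphi_s^t : H_k(K_s) \longrightarrow H_k(K_t).

% A class is born at b if it appears in H_k(K_b) but not in the image of any
% \varphi_{b'}^{\,b} with b' < b; it dies at d when its image under
% \varphi_b^d becomes zero or merges into the image of an older class.
% The multiset of intervals [b, d) over all classes is the barcode,
% equivalently plotted as points (b, d) in the persistence diagram.
```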
When to Use
Use persistent homology when you need scale-robust shape descriptors of data. Typical scenarios include: (1) point clouds in \(\mathbb{R}^d\), where you suspect clusters, loops, or voids (e.g., sensor trajectories, molecular conformations); (2) images or voxel data, where cubical complexes capture tunnels and cavities; (3) graphs and networks, where \(H_0\) persistence provides a principled multiscale clustering (via Kruskal/MST) and higher-dimensional features can be studied on clique complexes; (4) time series, via sliding-window embeddings that turn dynamics into loops/tori; (5) model comparison, where persistence diagrams allow quantitative comparison through distances such as the bottleneck distance. Choose Vietoris–Rips for generic metric data (simple to build, but combinatorially heavy), alpha complexes for Euclidean point clouds (geometrically faithful and sparser), or cubical complexes for gridded data. Compute only as high a dimension as needed (often \(H_0\)–\(H_1\)), and restrict the filtration range to control complexity. Persistent homology is especially valuable when you need invariance to isometries and robustness to noise, and when conventional clustering or dimensionality reduction misses global structure like loops.
⚠️ Common Mistakes
- Violating the face condition: inserting a simplex before its faces leads to invalid filtrations and wrong pairings. Always ensure every \(k\)-simplex enters no earlier than all of its \((k-1)\)-faces.
- Mishandling ties: when multiple simplices share the same filtration value, inconsistent ordering can change pairings. Break ties deterministically, e.g., by increasing dimension, then lexicographic vertex order.
- Ignoring coefficient fields: working over \(\mathbb{Z}_2\) simplifies signs; using other fields changes results (especially in the presence of torsion). State and fix the field upfront.
- Overbuilding Rips complexes: they blow up combinatorially; computing up to dimension 1 or 2 and capping the radius often suffices.
- Misinterpreting infinite bars: the infinite \(H_0\) bar corresponds to the final connected component (the whole space), not a special feature.
- Numerical instability: distances that are nearly equal can cause large tie sets; snap values with a tolerance and use a stable sort.
- Memory blowups: dense boundary matrices are huge; prefer sparse column representations and avoid constructing unnecessary higher-dimensional simplices.
- Conflating homology dimensions: births in column reduction happen at the column's own dimension; deaths recorded by a column of dimension \(k\) correspond to features in \(H_{k-1}\). Keep this dimension shift clear.
Key Formulas
Face Condition for Filtrations
\[ \tau \subseteq \sigma \;\Longrightarrow\; f(\tau) \le f(\sigma) \]
Explanation: A simplex can only appear after all of its faces. This guarantees each filtration level is a valid simplicial complex and enables correct persistence pairing.
Vietoris–Rips Complex
\[ \mathrm{VR}_t(X) = \{\, \sigma \subseteq X : d(x,y) \le t \ \text{for all}\ x, y \in \sigma \,\} \]
Explanation: A simplex is included if all its edges are short. In a filtration, one sets the filtration value of \(\sigma\) to the maximum edge length inside it.
Chain Groups and Boundary Maps
\[ \partial_k : C_k \to C_{k-1}, \qquad \partial_{k-1} \circ \partial_k = 0 \]
Explanation: Chains are vectors over simplices; boundary maps send k-chains to (k−1)-chains and compose to zero. This structure defines cycles and boundaries.
Homology at Scale t
\[ H_k(K_t) = \ker \partial_k \,/\, \operatorname{im} \partial_{k+1} \]
Explanation: Homology groups capture k-dimensional holes present at filtration level t by modding out boundaries from cycles.
Betti Numbers
\[ \beta_k(t) = \dim H_k(K_t) \]
Explanation: The number of independent k-dimensional features at time t. These counts can change only when a simplex enters the filtration.
Persistence Module Maps
\[ \varphi_s^t : H_k(K_s) \to H_k(K_t), \qquad s \le t \]
Explanation: Inclusion-induced maps track how homology classes evolve across scales. Their structure decomposes into interval modules, giving barcodes.
Stability Theorem (Bottleneck Distance)
\[ d_B\big(\mathrm{Dgm}(f), \mathrm{Dgm}(g)\big) \le \|f - g\|_\infty \]
Explanation: Small perturbations of the filtration function move the persistence diagram by at most the perturbation size. This justifies the robustness of persistent homology.
Combinatorial Growth
\[ \#\{k\text{-simplices on } n \text{ vertices}\} = \binom{n}{k+1} \]
Explanation: The number of simplices grows rapidly with n and k. This limits practical computations to low dimensions and moderate n.
Boundary Matrix Reduction Cost
\[ O(m^3) \ \text{worst case for an } m \times m \text{ boundary matrix} \]
Explanation: Reducing an m-by-m boundary matrix densely can be cubic. Sparse methods and heuristics often make it much faster in practice.
H0 via Union–Find
\[ O\big(E \log E + E\,\alpha(n)\big) \ \text{for } E \text{ edges on } n \text{ points} \]
Explanation: Zero-dimensional persistence using Kruskal’s process is near-linear in the number of edges, where \(\alpha\) is the inverse Ackermann function.
Complexity Analysis
Building a Vietoris–Rips filtration up to dimension k on n points can produce up to \(\binom{n}{k+1}\) simplices, and reducing the resulting \(m \times m\) boundary matrix costs \(O(m^3)\) in the worst case, though sparse column representations and reduction heuristics are typically far faster. Zero-dimensional persistence alone is much cheaper: sorting the edges dominates at \(O(E \log E)\), followed by near-linear union–find merges. In practice one caps the maximum radius, truncates the dimension (often at 1 or 2), and prefers sparser complexes such as alpha complexes for Euclidean data.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// This educational example builds a Vietoris–Rips filtration up to dimension 2
// from a small 2D point cloud, reduces the boundary matrix over Z2, and prints
// persistence intervals for H0 and H1. It is NOT optimized and intended for small inputs.

struct Simplex {
    vector<int> verts; // sorted vertex indices
    int dim;           // dimension = verts.size()-1
    double filt;       // filtration value (for Rips: max edge length in the simplex)
};

// Key encoding for a simplex's vertex set (sorted) to map to its index
static string keyOf(const vector<int>& v){
    string s; s.reserve(v.size()*3);
    for(size_t i=0;i<v.size();++i){
        if(i) s.push_back(',');
        s += to_string(v[i]);
    }
    return s;
}

// Symmetric difference (XOR) of two sorted vectors of ints
static vector<int> symdiff(const vector<int>& a, const vector<int>& b){
    vector<int> out; out.reserve(a.size()+b.size());
    size_t i=0, j=0;
    while(i<a.size() || j<b.size()){
        if(j==b.size() || (i<a.size() && a[i]<b[j])){ out.push_back(a[i++]); }
        else if(i==a.size() || b[j]<a[i]){ out.push_back(b[j++]); }
        else { // equal -> cancel in Z2
            ++i; ++j;
        }
    }
    return out;
}

// Compute Euclidean distances
static vector<vector<double>> pairwiseDistances(const vector<pair<double,double>>& pts){
    int n = (int)pts.size();
    vector<vector<double>> D(n, vector<double>(n, 0.0));
    for(int i=0;i<n;i++){
        for(int j=i+1;j<n;j++){
            double dx = pts[i].first - pts[j].first;
            double dy = pts[i].second - pts[j].second;
            double d = sqrt(dx*dx + dy*dy);
            D[i][j]=D[j][i]=d;
        }
    }
    return D;
}

int main(){
    // Example point cloud: noisy circle-like points
    vector<pair<double,double>> P = {
        {1.0,0.0},{0.707,0.707},{0.0,1.0},{-0.707,0.707},{-1.0,0.0},
        {-0.707,-0.707},{0.0,-1.0},{0.707,-0.707},{1.2,0.1},{-1.1,-0.1}
    };
    int n = (int)P.size();
    auto D = pairwiseDistances(P);

    // Build Vietoris–Rips filtration up to dim=2 with a radius cap to keep size small
    double maxR = 1.1; // cap (adjust to control size). Triangles/edges with filt > maxR are skipped.

    vector<Simplex> simplices;
    simplices.reserve(n + n*(n-1)/2);

    // 0-simplices (vertices) with filtration 0
    for(int i=0;i<n;i++){
        simplices.push_back({{i}, 0, 0.0});
    }

    // 1-simplices (edges)
    for(int i=0;i<n;i++){
        for(int j=i+1;j<n;j++){
            double w = D[i][j];
            if(w <= maxR){
                simplices.push_back({{i,j}, 1, w});
            }
        }
    }

    // 2-simplices (triangles): include if all three edges <= maxR (i.e., max edge length <= maxR)
    for(int i=0;i<n;i++){
        for(int j=i+1;j<n;j++){
            if(D[i][j] > maxR) continue;
            for(int k=j+1;k<n;k++){
                double m = max({D[i][j], D[i][k], D[j][k]});
                if(m <= maxR){
                    simplices.push_back({{i,j,k}, 2, m});
                }
            }
        }
    }

    // Sort by (filtration, dimension, lex vertices) to enforce a consistent order
    stable_sort(simplices.begin(), simplices.end(), [](const Simplex& a, const Simplex& b){
        if (a.filt != b.filt) return a.filt < b.filt;
        if (a.dim != b.dim) return a.dim < b.dim; // faces first on ties
        return a.verts < b.verts;
    });

    int m = (int)simplices.size();

    // Map simplex key -> index in sorted order
    unordered_map<string,int> indexOf; indexOf.reserve(m*2);
    for(int i=0;i<m;i++){
        indexOf[keyOf(simplices[i].verts)] = i;
    }

    // Build boundary columns: store each column as a sorted list of row indices with 1s (Z2)
    vector<vector<int>> boundary(m);
    for(int col=0; col<m; col++){
        const auto& s = simplices[col];
        if(s.dim == 0) continue; // vertex has empty boundary
        // faces by removing one vertex at a time
        for(int t=0; t<(int)s.verts.size(); t++){
            vector<int> face = s.verts;
            face.erase(face.begin()+t);
            auto it = indexOf.find(keyOf(face));
            if(it == indexOf.end()){
                cerr << "Error: face not found in filtration (violates face condition)\n";
                return 1;
            }
            boundary[col].push_back(it->second);
        }
        // Sort rows to enable XOR via symmetric difference
        sort(boundary[col].begin(), boundary[col].end());
    }

    // Column reduction over Z2
    vector<int> low(m, -1);          // low[j] = pivot row of reduced column j, or -1 if zero column
    unordered_map<int,int> pivotCol; // pivot row -> column having that low
    pivotCol.reserve(m*2);

    for(int j=0; j<m; j++){
        auto& colVec = boundary[j];
        // compute low(j)
        auto getLow = [&](){ return colVec.empty() ? -1 : colVec.back(); };
        int lj = getLow();
        while(lj != -1){
            auto it = pivotCol.find(lj);
            if(it == pivotCol.end()) break;
            int k = it->second; // existing column with same low
            // XOR columns: col_j = col_j + col_k over Z2
            colVec = symdiff(colVec, boundary[k]);
            lj = getLow();
        }
        low[j] = lj;
        if(lj != -1){
            pivotCol[lj] = j; // j kills the class born at row lj
        }
    }

    // Collect intervals:
    // - If column j is zero after reduction (low[j]==-1), it creates an H_{dim(s_j)} class at
    //   filt(s_j) with infinite death, unless killed later via another column's pairing.
    // - If low[j]==i, then column j (dim d) kills a class in H_{d-1} born at column i's filtration.

    struct Interval { int dim; double birth, death; bool infinite; };
    vector<Interval> intervals;

    // Create finite intervals from nonzero columns
    for(int j=0;j<m;j++){
        if(low[j] != -1){
            int i = low[j];
            int d = simplices[j].dim - 1; // killed feature dimension
            if(d >= 0){
                intervals.push_back({d, simplices[i].filt, simplices[j].filt, false});
            }
        }
    }
    // Create infinite intervals from zero columns that were never used as a pivot row.
    // A zero column at dim d births an H_d class; if its index appears as some low[j],
    // it is killed; otherwise it persists to infinity.
    vector<char> isKilled(m, 0);
    for(int j=0;j<m;j++) if(low[j]!=-1) isKilled[low[j]] = 1;
    for(int i=0;i<m;i++){
        // Only zero columns are births; nonzero columns correspond to deaths of lower-dim classes
        if(!isKilled[i] && low[i] == -1){
            int d = simplices[i].dim;
            intervals.push_back({d, simplices[i].filt, numeric_limits<double>::infinity(), true});
        }
    }

    // Print results grouped by dimension
    cout.setf(std::ios::fixed); cout<<setprecision(4);
    cout << "Intervals (dimension 0):\n";
    for(const auto& I: intervals){
        if(I.dim==0){
            cout << " [" << I.birth << ", " << (I.infinite? string("inf"): to_string(I.death)) << ")\n";
        }
    }
    cout << "Intervals (dimension 1):\n";
    for(const auto& I: intervals){
        if(I.dim==1){
            cout << " [" << I.birth << ", " << (I.infinite? string("inf"): to_string(I.death)) << ")\n";
        }
    }

    return 0;
}
This program constructs a Vietoris–Rips filtration up to dimension 2 from a small 2D point set. Each 0-simplex has filtration value 0; each edge's filtration value is its length; each triangle's is the maximum of its three edge lengths. After sorting simplices by (filtration, dimension, lexicographic vertices), the program builds a sparse boundary matrix over \(\mathbb{Z}_2\), performs the standard column-reduction algorithm, and extracts intervals. A nonzero reduced column of dimension d pairs with its pivot row to give an \(H_{d-1}\) interval [birth_of_pivot, filtration_of_column). A zero column of dimension d creates an \(H_d\) class, which persists unless a later column kills it (reflected as that column's pivot). The output lists H0 and H1 intervals.
#include <bits/stdc++.h>
using namespace std;

// Compute zero-dimensional persistence (H0) from a point cloud by
// sorting all edges by length and unioning components (Kruskal).
// Each union corresponds to a death time for one H0 bar; one bar persists to infinity.

struct DSU {
    vector<int> p, r;
    DSU(int n=0){ init(n); }
    void init(int n){ p.resize(n); r.assign(n,0); iota(p.begin(), p.end(), 0); }
    int find(int x){ return p[x]==x ? x : p[x]=find(p[x]); }
    bool unite(int a, int b){
        a=find(a); b=find(b);
        if(a==b) return false;
        if(r[a]<r[b]) swap(a,b);
        p[b]=a;
        if(r[a]==r[b]) r[a]++;
        return true;
    }
};

int main(){
    vector<pair<double,double>> P = {
        {0,0},{1,0},{2,0},{0,1},{1,1},{2,1}
    }; // 6 points in a grid-like pattern
    int n = (int)P.size();

    // Build all edges with their lengths
    struct Edge{ int u, v; double w; };
    vector<Edge> E;
    for(int i=0;i<n;i++) for(int j=i+1;j<n;j++){
        double dx=P[i].first-P[j].first, dy=P[i].second-P[j].second;
        double d=hypot(dx,dy);
        E.push_back({i,j,d});
    }
    sort(E.begin(), E.end(), [](const Edge& a, const Edge& b){ return a.w < b.w; });

    // Initially each vertex births an H0 class at time 0.
    // Each successful union kills one class at the current edge weight.
    DSU dsu(n);
    vector<pair<double,double>> intervals; // [birth, death)
    int components = n;
    for(const auto& e : E){
        if(dsu.unite(e.u, e.v)){
            intervals.push_back({0.0, e.w}); // one bar dies now (birth=0)
            components--;
            if(components==1) break; // remaining bar is infinite
        }
    }

    cout.setf(std::ios::fixed); cout<<setprecision(4);
    cout << "H0 intervals from Kruskal/Union-Find:\n";
    for(const auto& I : intervals){
        cout << " [" << I.first << ", " << I.second << ")\n";
    }
    cout << " [0.0000, inf)\n"; // final component persists
    return 0;
}
This code computes H0 persistence by sorting all pairwise edges and performing union–find merges, which is exactly Kruskal's MST process. Each time the smallest available edge merges two components, one H0 bar dies at that edge length; every bar is born at 0 because each vertex enters the filtration at value 0. One component survives forever, yielding the infinite interval [0, ∞). This approach scales to large datasets and is the standard fast method for connected-component persistence.